This suggests the SIMDs are now 2-wide in Vega...?
The alleged diagram probably means Vega has one 8-wide, two 4-wide and one 2-wide pipeline, which, surprisingly, pretty much follows the patent that was linked above. This gives a variable wavefront size from 8 wide to 32 wide, assuming the same 4-cycle cadence. But it also means each "NCU" would get only 18 lanes. So one might expect multiple of these NCUs to form a larger block that shares at least the LDS.
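To make the lane arithmetic above concrete, here is a minimal Python sketch. The pipeline widths are simply the ones read off the alleged diagram in this post, the 4-cycle cadence is the same assumption as stated above, and the function and constant names are made up for illustration.

```python
# Pipeline widths as read from the alleged diagram (unconfirmed).
PIPELINE_WIDTHS = [8, 4, 4, 2]
# Assumption: the 4-cycle cadence carries over unchanged.
CADENCE_CYCLES = 4


def total_lanes(widths):
    """Number of physical SIMD lanes across all pipelines."""
    return sum(widths)


def wavefront_sizes(widths, cadence=CADENCE_CYCLES):
    """Wavefront size each pipeline covers if one instruction takes `cadence` cycles."""
    return [w * cadence for w in widths]


print(total_lanes(PIPELINE_WIDTHS))      # 18
print(wavefront_sizes(PIPELINE_WIDTHS))  # [32, 16, 16, 8]
```

The 2-wide pipeline gives the 8-wide lower bound and the 8-wide pipeline gives the 32-wide upper bound.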
Though anyone could have made that in Paint...
All this shows is that the card is working and driver QC is probably close to shipping.
I wish they had used a more punishing game than Battlefront, but if this is in Ultra settings + TAA then not even the Pascal Titan X can achieve solid 60FPS.
Though this could be a special new build of Battlefront with DX12.
But I guess tomorrow we'll know almost everything there is to know about the new cards.
What would the drawbacks be since it wasn't done before?
Complexity is one drawback, in terms of steps in the execution loop, hardware dedicated to scheduling, register file allocation/access choices, and potentially peak throughput if following the patent diagram exactly (only 14 lanes in vector units, only one SIMD's worth in a CU).
The patent casts a decently wide net, with every parameter being physically or dynamically variable: number of scalar, high-performance scalar, and vector units, their actual widths versus partial gating, etc.
I kind of hope it's not a marketing slide, and is just someone trying to explain part of the idea.
Looks more like some kind of review guide or white paper if it is real. Slides are usually full of fanciness, aren't they?
I don't follow the portion about not wasting SIMD space in the variable scenario. The visual language still seems to indicate 4 independent SIMDs, but unless SIMD lanes are migratory or AMD has discovered an 8-4-2-4 pattern to wavefront coverage, I don't see how it saves SIMD space.
Some kind of migration or forwarding seems to be the case if it is real, as implied by "not wasting space" with variable-width SIMD. If it were just clock gating, it could simply say power saving in 16-lane SIMDs. This might also explain the smaller number of hardware lanes in an NCU (complexity in data path and instruction scheduling).
The 18-lane thing doesn't quite fit unless quads stopped being a thing.
Are quads still a concrete concept in the CU domain though? They are essentially four consecutive work-items.
Also, isn't Next Generation Compute Unit shortened to NGCU?
There is no obligation to form abbreviations from all the first letters though. Next-generation Compute Unit can be shortened to NCU as well.
Maybe a review guide, although then I still hope not. That might be down to a personal bias against automotive analogies, however.
There are some elements to the design that show optimizations for data swizzling between quads, and it's a reasonable expectation in a graphics context that a lot of work will be coming in a granularity of 4. A physically two-wide SIMD drawn in a similar position as a formerly independent 16-wide is creating a scenario where there's over-subscription when a quad needs to fit, or under-subscription if well-packed graphics wavefronts have to ignore it.
The marketing may have been served well if that hyphen were added. That's more of a nitpick where I think it adds an iffy impression, like the MS-Paint level of the graphic in general.
But the alleged diagram doesn't imply the instruction pipelining though. If the four-cycle lockstep execution is here to stay, that means at minimum the SIMD would be running an 8-wide wavefront, which fits two quads.
The semi truck illustration is just something someone made on Reddit anyway. The patent shows an 8 + 4 + 2 + 1 + 1 = 16 example configuration.
Too bad. Still hope the variable SIMD thing would be in the real Vega. Less than 17 hours to go.
That would split a quad across clocks, which may not have been necessary before with operations that do work on a quad granularity like interpolation or the quad-swizzle DPP ops. Then there are some elements of the GPU's graphics hardware that work on quad granularity as well. They could be buffered, but that seemingly adds complexity just to be different.
That's good in my opinion, because I hope it's inaccurate enough to keep Vega interesting--just not too inaccurate.
Although the diagram doesn't give two of those ALUs a vector register file to draw from.
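As a rough illustration of the peak-throughput concern with the patent's 8 + 4 + 2 + 1 + 1 example configuration mentioned in this thread, here is a small Python sketch. It assumes the two 1-wide units are the ones without a vector register file, and compares against a GCN CU with four 16-wide SIMDs; the constant names are made up for illustration.

```python
# Example configuration quoted in the thread: 8 + 4 + 2 + 1 + 1 = 16.
# Assumption: the two 1-wide units are the ones without a vector register file.
VECTOR_WIDTHS = [8, 4, 2]
SCALAR_WIDTHS = [1, 1]

# A GCN compute unit has four 16-wide SIMDs.
GCN_SIMD_WIDTH = 16
GCN_SIMDS_PER_CU = 4

vector_lanes = sum(VECTOR_WIDTHS)
all_lanes = vector_lanes + sum(SCALAR_WIDTHS)
gcn_lanes = GCN_SIMD_WIDTH * GCN_SIMDS_PER_CU

print(vector_lanes)            # 14
print(all_lanes)               # 16
print(gcn_lanes // all_lanes)  # 4
```

So a block following the patent example exactly would have about one GCN SIMD's worth of lanes, a quarter of a current CU.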
Someone has taken a screenshot of the settings:
http://videocardz.com/65343/amd-demos-star-wars-battlefront-on-ryzen-and-vega-at-ces2017
It's FXAA, and the FOV is very small at 55º (which is why they're showing it in 3rd-person, I guess), so I think it'll be very hard to find a comparable test on the web.
I would actually prefer variable wavefront sizes realized by a variable amount of looping with a narrower SIMD (like vec4). Okay, it stays a bit more granular (if one keeps latency = throughput = 4 cycles, one would get wavefront sizes of at least 16), but one could keep a lot of the other stuff intact. For the smaller wavefronts one needs relatively more scalar ALUs in the CU (optimally still one per 4 vALUs). But that should be a relatively small investment.
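A minimal Python sketch of that looping idea, assuming a 4-wide SIMD and a minimum of 4 cycles per instruction as described above. The upper bound of 16 cycles is an arbitrary choice so that the largest size matches GCN's 64-wide wavefront, and the function name is made up for illustration.

```python
def looped_wavefront_sizes(simd_width=4, min_cycles=4, max_cycles=16):
    """Wavefront sizes reachable by looping a narrow SIMD a variable number of cycles.

    min_cycles models the latency = throughput = 4 cycles constraint;
    max_cycles is a hypothetical cap, not something taken from the patent.
    """
    return [simd_width * cycles for cycles in range(min_cycles, max_cycles + 1)]


sizes = looped_wavefront_sizes()
print(sizes[0])   # 16
print(sizes[-1])  # 64
print(sizes[:4])  # [16, 20, 24, 28]
```

The smallest wavefront is 16 wide, as noted above, and sizes then grow in steps of the SIMD width.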
Wavefronts can dynamically change width based on branch divergence, and the patent admits there is cost and uncertainty in deciding whether or not to change SIMD allocation to react to it.
I played the beta demo extensively but now I just got it for the PS4 (15€ IIRC) to get access to the X-Wing VR demo.
Do you people actually game? That is the standard vertical FOV for all DICE games since at least BF3, maybe even BF:BC2.
You also have to consider the map when looking at other benchmarks; Endor is one of the more GPU-taxing maps.
I don't have SW:BF installed but here is BF4 and BF1
Why would a FOV slider be for vertical FOV? That makes no sense.
Regardless, it never crossed my mind that the FoV option shown in those settings was for vertical FoV. I've never seen a game with that setting before.
What do they call horizontal FoV?
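For reference, a vertical FOV converts to a horizontal one through the aspect ratio: hFOV = 2 * atan(tan(vFOV / 2) * aspect). A minimal Python sketch, assuming a 16:9 display and that the 55º in the screenshot really is vertical FOV as claimed above; the function name is made up for illustration.

```python
import math


def horizontal_fov(vertical_fov_deg, aspect_ratio):
    """Convert a vertical FOV in degrees to a horizontal FOV in degrees."""
    half_vertical = math.radians(vertical_fov_deg) / 2
    half_horizontal = math.atan(math.tan(half_vertical) * aspect_ratio)
    return math.degrees(2 * half_horizontal)


# Assumption: 16:9 display.
print(round(horizontal_fov(55, 16 / 9), 1))  # 85.6
```

If the slider is vertical, 55º is therefore not a particularly narrow view on a widescreen display.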
This patent will be used in Navi, not Vega.