AMD Vega 10, Vega 11, Vega 12 and Vega 20 Rumors and Discussion

Thank you 3dilettante for your explanations. But if the rasterizer falls back on such a simple task, do you mean special software (driver, game, BIOS?) is needed so that the rasterizer does not fall back?
Another possibility is that the early driver sometimes misdetects applications and chooses the wrong approach. That should be easy to fix, and if the DSBR were enabled more broadly, it could speed up some currently misdetected games as well.

It's around 526 mm² according to GamersNexus:
http://www.gamersnexus.net/news-pc/2972-amd-vega-frontier-edition-tear-down-die-size-and-more
900 mm² for the whole package.
You forgot to quote this just above the list:
"As for measurements, we took a pair of calipers to get some rough measurements. These may be off by 0.25-0.5mm."
I do not know what kind of calipers they used, but taking into account their own assessment of their margin of error, we are looking at ~504-550 mm² according to gamersnexus.net.
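To put numbers on that error band, here is a minimal Python sketch; the roughly square die shape is my assumption for illustration, the ±0.25-0.5 mm tolerance is GamersNexus' own:

```python
import math

measured_area = 526.0            # mm^2, the caliper-derived figure
side = math.sqrt(measured_area)  # ~22.9 mm, assuming a square die

for err in (0.25, 0.5):          # GamersNexus' stated tolerance per dimension
    low = (side - err) ** 2
    high = (side + err) ** 2
    print(f"+/-{err} mm -> {low:.0f} - {high:.0f} mm^2")
# +/-0.5 mm gives roughly 503 - 549 mm^2, matching the ~504-550 range above
```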
 
You forgot to quote this just above the list:
"As for measurements, we took a pair of calipers to get some rough measurements. These may be off by 0.25-0.5mm."
I do not know what kind of calipers they used, but taking into account their own assessment of their margin of error, we are looking at ~504-550 mm² according to gamersnexus.net.

Measurement at 14:35 in the video.

 
Did a quick measurement as well using the PCPer image; it comes out to 526 mm²:

[Image: die measurement overlaid on the PCPer photo]


Pretty sure that's the actual size of the die, or extremely close to that.
 
So what happens in a test that compares the performance of back-to-front versus front-to-back rendering?
 
Rasterization is done over tiles that stay on-chip, but hidden surface removal or culling pixels before they are shaded is not part of that.

That is why multiple triangles show up as in-progress even though they would all be at different depths. It doesn't matter if they issue out of order, as long as at the very end their pixel outputs are correctly stored or rejected based on depth. That means the counter used in the test to control the number of pixels rendered still increases even if the screen doesn't show the pixels.
This may mean this test isn't going to capture Vega's behavior the same way.

Vega's rasterizer is supposed to collect triangles, and prevent pixels that are blocked by a closer primitive from being shaded.
If Vega's new mode were on, what would this test actually do?
You can only discard primitives which don't influence anything. You can't do it with transparent geometry, or when the shader does more than just write to the framebuffer under the z test. That means that in the case of the triangle bin test (which writes to a UAV independent of the outcome of the z test) no HSR can be performed. The test looks exclusively at the binning behaviour.
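To make those test mechanics concrete, here is a minimal CPU-side sketch; this is my own toy model, not the actual test code (the real test is a pixel shader bumping an atomic UAV counter; the 8x8 screen, quad "triangles" and 4x4 bin size are invented for illustration):

```python
from itertools import product

W, H, TILE = 8, 8, 4  # toy screen and bin size

def immediate_order(tris):
    # classic mode: finish each triangle across the whole screen, in submit order
    for t, pixels in enumerate(tris):
        for p in pixels:
            yield t, p

def binned_order(tris):
    # binning mode: visit one screen tile at a time, replaying every
    # triangle's coverage inside that tile before moving on
    for ty, tx in product(range(0, H, TILE), range(0, W, TILE)):
        for t, pixels in enumerate(tris):
            for (x, y) in pixels:
                if tx <= x < tx + TILE and ty <= y < ty + TILE:
                    yield t, (x, y)

def run_test(order, tris, limit):
    counter = 0        # stands in for the atomic UAV counter
    survivors = set()
    for t, p in order(tris):
        counter += 1   # increments for every rasterized pixel...
        if counter <= limit:
            survivors.add(p)  # ...but pixels past 'limit' are killed
    return survivors

quad = [(x, y) for y in range(H) for x in range(W)]  # two full-screen "triangles"
tris = [quad, quad]

# With a budget of one screen's worth of pixels, the survivors reveal the
# traversal order: the whole screen (all of triangle 0) in immediate mode,
# but only the first two tiles (covered by both triangles) in binned mode.
print(len(run_test(immediate_order, tris, W * H)))  # 64
print(len(run_test(binned_order, tris, W * H)))     # 32
```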
If the per-pixel counter is treated as a reason to reduce the number of triangles in a batch to 1, one possible outcome is that the GPU cannot move on to the next triangle until the current batch is done. That starts to look sequential even if the rasterizer is fully enabled.
There is no reason why a UAV access should disable binning (and reducing the batch size to 1 amounts to disabling binning).
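A toy illustration of that legality argument, continuing the sketch above (assumption: "HSR" here just means skipping occluded pixels before the shader runs):

```python
zbuf = {}        # pixel -> closest depth seen so far
fb = {}          # pixel -> color
uav_counter = 0  # stands in for the UAV the test writes to

def shade_pure(p, depth, color):
    # pure z-tested color write: skipping an occluded pixel is unobservable,
    # so the hardware is free to drop it before shading (HSR is legal)
    if depth < zbuf.get(p, float("inf")):
        zbuf[p] = depth
        fb[p] = color

def shade_with_uav(p, depth, color):
    global uav_counter
    uav_counter += 1  # side effect happens regardless of the z test; dropping
                      # this invocation would change the UAV contents, so the
                      # shader must run for every covered pixel (no HSR) --
                      # yet nothing here forces binning itself off
    if depth < zbuf.get(p, float("inf")):
        zbuf[p] = depth
        fb[p] = color
```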

========================

Technically, the mentioned test does not show the tiled rasterizer in action even in NV's case. It shows a weird side effect of it.
It basically shows you roughly the order in which the rasterizer operates on the submitted triangles in the different screen areas. As this order inherently changes with binning in place, I wouldn't call it a weird side effect.
All triangles are still rasterized, pixel shaders executed, and pixels drawn. As mentioned, the test is effectively counting pixels (it increments an atomic value), and at some point, when the specified number of pixels is reached, it starts killing off pixels. Since you can vary the pixel count at which pixels start being killed, you can determine the order in which the rasterizer walks over the screen. And that order is "funky" in the case of Maxwell/Pascal.
In my opinion there is nothing funky about operating on one screen tile after another. It visualizes the binning in operation pretty directly.
The test case itself is nothing that some sort of tiling would speed up. If you're rendering, say, 100 triangles and writing something per pixel with an atomic value, that's already a reason to turn any pixel-killing optimizations off. What if the developer wanted to count the pixels those 100 triangles cover and doesn't care about the overdraw (because he specifically doesn't use a Z-buffer)?
Right, HSR can't work in this specific test because of the UAV write. But you still save a ton of external bandwidth. With tiling, each tile has to be written to memory exactly once (the tiles sit in on-chip cache beforehand), irrespective of the number of triangles drawn. With AMD's more "classical" behaviour you write the framebuffer once per triangle. So with 100 triangles, nV's solution uses only ~1% of the framebuffer memory bandwidth of the current AMD behaviour.
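The arithmetic behind that 1% figure, as a quick sketch (assuming 100 fully overlapping triangles and counting only framebuffer write traffic, as in the post above):

```python
tris = 100
# classic: every triangle's covered pixels are exported to the framebuffer in memory
classic_writes_per_pixel = tris
# binned: the tile sits in on-chip cache while all triangles render into it,
# and is flushed to memory once at the end
binned_writes_per_pixel = 1
print(binned_writes_per_pixel / classic_writes_per_pixel)  # 0.01 -> ~1%
```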
 
In my opinion there is nothing funky about operating on one screen tile after another. It visualizes the binning in operation pretty directly.
I know; I only put "funky" in quotes because, in comparison to other common GPUs, it jumps out.

Right, HSR can't work in this specific test because of the UAV write. But you still save a ton of external bandwidth. With tiling, each tile has to be written to memory exactly once (the tiles sit in on-chip cache beforehand), irrespective of the number of triangles drawn. With AMD's more "classical" behaviour you write the framebuffer once per triangle. So with 100 triangles, nV's solution uses only ~1% of the framebuffer memory bandwidth of the current AMD behaviour.
Right, NV saves a ton of bandwidth. But they also do quite a few things with their approach, and we have no clue what AMD actually does, or whether it does everything NV is doing. I get more of a "save shading time and color exports" vibe than what NV is doing: that is, bin triangles into tiles, check whether multiple triangles touch the same pixels, and shade/write only the top one (even if all of them were to pass a hi-z/z-buffer test). But this test does nothing to test for that.
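A sketch of that kind of binned HSR, continuing the toy model from earlier (assumptions: opaque geometry only, per-bin triangle lists with depth known before shading):

```python
def shade_bin_with_hsr(bin_tris):
    # bin_tris: list of (depth, color, pixels) touching one tile, opaque only
    top = {}  # pixel -> (depth, color) of the closest triangle seen so far
    for depth, color, pixels in bin_tris:
        for p in pixels:
            if p not in top or depth < top[p][0]:
                top[p] = (depth, color)
    # shade/export once per covered pixel instead of once per triangle
    # touching it: both shading work and color exports are saved
    return {p: color for p, (depth, color) in top.items()}
```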
 
Looking at the (very few) responses AMD is giving, the RX Vega should come with some kind of black magic and be at least ~30% more powerful in gaming. I'm not hoping for that; I think 20% would be high but possible, but more than that just from software is like wanting a miracle.
 
Right, NV saves a ton of bandwidth. But they also do quite a few things with their approach, and we have no clue what AMD actually does, or whether it does everything NV is doing. I get more of a "save shading time and color exports" vibe than what NV is doing: that is, bin triangles into tiles, check whether multiple triangles touch the same pixels, and shade/write only the top one (even if all of them were to pass a hi-z/z-buffer test). But this test does nothing to test for that.
Binning is basically a prerequisite for doing HSR within a tile (which is more efficient than just an early z test) before the pixel shader runs. This test shows that binning is not active there. If, and how effectively, HSR will work with Vega is another question. But to answer that question, binning has to be working first, because otherwise the GPU can't do HSR on the binned primitives. Binning should be a net positive even without HSR most of the time. HSR comes on top of it: it not only reduces the external bandwidth requirements (which binning does even without HSR) but also saves actual work for the shader array. These are basically two separate things, where HSR depends on a working binning rasterizer (but not the other way around).

=======================================

Looking at the (very few) responses AMD is giving, the RX Vega should come with some kind of black magic and be at least ~30% more powerful in gaming. I'm not hoping for that; I think 20% would be high but possible, but more than that just from software is like wanting a miracle.
Well, I agree that it is very optimistic to hope for this. But assuming that the current state really isn't using any of the architectural improvements over Fiji (besides the L2 backing of the ROPs), as the results are very much identical to what one would expect from a heavily overclocked Fury X (~1.4 GHz with slightly more exotic cooling), it wouldn't be completely impossible. That assumption requires quite a bit of faith, though. But keep in mind that nV claims their tiled approach saves them ~60% (!) of the external memory bandwidth (in combination with DCC, which Fiji was barely taking advantage of). Add in some work saved by means of HSR within bins when applicable, the allegedly overhauled work distribution in the frontend, and some slight IPC improvements in the CUs, and it may start to appear feasible. The alternative would be that AMD has severely messed up the Vega design.
 
Eh, if they are not willing to commit to a hard release date in the near future, it is the right move.
No contest here. The problem is: they already committed at Computex.

They are in the midst of another mismanaged product launch, and instead of reviewing their outside communications to make sure it says what it needs to say, they push something out and have to revise it a few hours later.
You'd think that they'd know by now that anything in a tweet, email, press release, or forum reply(!) gets put under a microscope.

FWIW, Capsaicin events have historically been developer-focused events.
True. But Lisa Su said clearly at Computex that they'd launch RX Vega at Siggraph this year.

Will they have an additional Siggraph event?
 
It couldn't really get much worse. That aside, though, do you have anything concrete to base those high hopes on?
I guess if I said just "yes", that would count as trolling, right? I trust that the influx of capable people in a variety of roles into RTG will have an effect. Scott, Rys and now Damien, to name the last few I am aware of.
 
I trust that the influx of capable people in a variety of roles into RTG will have an effect. Scott, Rys and now Damien, to name the last few I am aware of.
I had high hopes too when Scott joined RTG, but so far I don't see his effect at all; he even disappeared from the scene completely before the Vega FE launch.

Given what we know so far, do you still believe RX Vega will (on some level) be as disruptive as the 9700 Pro was 15 years ago?
 
Given what we know so far, do you still believe RX Vega will (on some level) be as disruptive as the 9700 Pro was 15 years ago?
As I've explained earlier, I intentionally wrote "impact" when comparing Vega to R300. That is only very loosely coupled to performance. AFAIR, for example, the first DX11 chips that were seeded to developers were not Cypress, but Juniper - and I am not saying that Vega FE is a cut-down or mainstream GPU, only that pure performance is not everything.
 
Marc will recruit someone good, I've no doubt about that.

BTW, what happened to "225 W of power"?

[Image: leaked AMD Vega 10 specifications slide]


If I look at the FLOPS number, at the time they were targeting a slightly slower chip, but if that small bump costs 75 more watts, damn, just go back to the initial target frequency...
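For reference, the arithmetic behind such FLOPS figures, as a sketch: the 4096-ALU count and the 1.6 GHz Vega FE boost clock are public specs, while the ~1.46 GHz figure is just what would be needed to hit the slide's apparent 12 TFLOPS, not a confirmed target:

```python
def tflops(alus, clock_ghz, flops_per_alu_clk=2):  # 2: an FMA counts as two FLOPs
    return alus * clock_ghz * flops_per_alu_clk / 1000.0

print(tflops(4096, 1.465))  # ~12.0 TFLOPS at the slide's 225 W
print(tflops(4096, 1.600))  # ~13.1 TFLOPS at Vega FE's boost clock and 300 W
```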
 