AMD Radeon RDNA2 Navi (RX 6500, 6600, 6700, 6800, 6900 XT)

AMD marketing keeping up with the GOOD WORK from Raja times, yay!

I'd rather fire all those Twitter clowns with their $10 256b 4k disruptors. AMD will never learn...
 
HBM2 version with 2.5GHz+ clocks when?

zJBwNYXab6TMJQ5vdNmVeX-970-80.png.webp
 
You should tell also to Scott Wasson and a ton of other sites still reporting the same thing
https://techreport.com/review/21404/crysis-2-tessellation-too-much-of-a-good-thing/

I'm also curious about how @Dictator could demonstrate that Scott Wasson's screenshots of the concrete slabs in wireframe mode are a myth. I don't think he faked those screenshots, so maybe there's something else. It's definitely the first time I've heard about those being a myth.
I do remember someone saying the super-tessellated water in land levels was only visible because the wireframe was turned on (though that still didn't explain why the water had so much geometry), and therefore the water wouldn't have a performance impact during normal playthroughs. I don't remember any rebuttal of the concrete slabs and many other models in the levels.


It's a performance issue. Nvidia's RT cores offload more work from the shaders and have their own caches, and Ampere has twice the triangle performance.
AFAIK, triangle performance nowadays takes a secondary role to culling performance. How does triangle performance affect hybrid RT more than "simple" rasterization?

As for caches, AMD did mention that Infinity Cache would make a significant difference in RT performance, though I don't know if this needs some kind of software optimization or if the GPU's L3 has significantly higher latency than the cache in Ampere's RT cores.
 
If you look closely, the marks are at 128, 96 and 64MB, so I think AMD gave us a spoiler there.

Which are the sweet spots (where the curve flattens) for 1440p and 1080p.

Yep I did notice that of course and even 48MB has a reasonably high cache hit rate for 1080p so even that is possible.
HBM2 version with 2.5GHz+ clocks when?

zJBwNYXab6TMJQ5vdNmVeX-970-80.png.webp

N21 does not support HBM but overclocked AIB versions are coming on the 25th.
 
Yep I did notice that of course and even 48MB has a reasonably high cache hit rate for 1080p so even that is possible.

64MB has very similar hit rates on 1440p compared to 128MB on 4K (55%?), and to expand on that:

96MB: ~50% on 4K, 60% on 1440p, ~68% on FHD
64MB: ~45% on 4K, ~55% on 1440p, ~65% on FHD
48MB: ~30% on 4K, ~47% on 1440p, ~60% on FHD
32MB: ~23% on 4K, ~45% on 1440p, ~55% on FHD
24MB: ~20% on 4K, ~27% on 1440p, ~48% on FHD


At 1080p, even the 18MB(?) cache gets significant hit rates of up to ~40%, and for a mobile SoC that is limited to e.g. 128-bit LPDDR5 this could make a world of difference.
It also doesn't look like 96MB would have a significant advantage over 64MB if the GPU is targeting 1440p, so perhaps Navi 22 may use only 64MB.

And as for Navi 23, it's reportedly coming with HBM2, and the Infinity Cache is there to compensate for a lower off-chip bandwidth. It could be using just 32MB LLC + HBM2e and that way it could have a similar "effective bandwidth" to the Navi 22 with 64MB LLC + 192bit GDDR6.


Of course, what's missing here is the fact that a smaller cache supposedly means lower LLC bandwidth, unless AMD is keen on using more slices with lower capacity each (which in turn wouldn't scale down linearly in die area).
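
To put rough numbers on the "effective bandwidth" idea, here is a minimal sketch of a toy model, not anything AMD has published. It assumes misses are served by DRAM and hits by the LLC, so the sustainable request rate is bounded by both. The function names, the 16 Gbps GDDR6 data rate and the optional LLC bandwidth cap are my own assumptions for illustration; the hit rates are the eyeballed 1440p values from the list above.

```python
# Toy model of "effective bandwidth" with a last-level cache (LLC).
# Every number here is either an eyeballed hit rate from the post above
# or an illustrative assumption, not an AMD specification.


def dram_bandwidth_gbs(bus_width_bits, data_rate_gbps):
    """Peak DRAM bandwidth in GB/s: bus width in bytes times per-pin data rate."""
    return bus_width_bits / 8 * data_rate_gbps


def effective_bandwidth_gbs(dram_gbs, hit_rate, llc_gbs=None):
    """Total request bandwidth T the memory system can sustain.

    Misses go to DRAM:  T * (1 - hit_rate) <= dram_gbs
    Hits go to the LLC: T * hit_rate       <= llc_gbs (only if llc_gbs is given)
    """
    if not 0.0 <= hit_rate < 1.0:
        raise ValueError("hit_rate must be in [0, 1)")
    limit = dram_gbs / (1.0 - hit_rate)
    if llc_gbs is not None and hit_rate > 0.0:
        limit = min(limit, llc_gbs / hit_rate)
    return limit


# Approximate 1440p hit rates read off the slide (cache size in MB -> hit rate).
HIT_RATE_1440P = {96: 0.60, 64: 0.55, 48: 0.47, 32: 0.45, 24: 0.27}

# Assumption: a 192-bit GDDR6 bus at 16 Gbps per pin.
gddr6_192 = dram_bandwidth_gbs(192, 16)

for size_mb, hit in HIT_RATE_1440P.items():
    eff = effective_bandwidth_gbs(gddr6_192, hit)
    print(f"{size_mb:3d} MB LLC: ~{eff:.0f} GB/s effective at 1440p")
```

Passing `llc_gbs` shows the caveat about smaller caches: if fewer slices also mean less LLC bandwidth, the hit path itself can become the limit.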





You should tell also to Scott Wasson and a ton of other sites still reporting the same thing
https://techreport.com/review/21404/crysis-2-tessellation-too-much-of-a-good-thing/

BTW, here's a video with more of the super tessellated models in Crysis 2:

 
Why do you say that? HUB always came across as fair and balanced to me.

I'm making my way through all the big sites and the story seems pretty consistent so far. Very strong rasterization performance, great power efficiency and noise levels. Slower but competitive at 4K and significantly behind on RT.

I don't watch that channel regularly, only recall an impression from content that's been shared. I could be wrong. Regardless of HUB's objectivity I think Gamers Nexus' review is just plain better.
 
In Linux and macOS you can see that we have 4 Shader Engines with 2 Shader Arrays each, and each Shader Array has 1 scan converter. So this means a total of 8 rasterizers, but why is AMD drawing only 4 in its architecture slide? Are the driver values wrong?
https://www.techpowerup.com/review/amd-radeon-rx-6800-xt/images/arch1.jpg
Actually you could argue they list 16 on the slide, but regardless AMD counts two RB+'s in that diagram to be one RB+.
Here's a better slide for the RB's
upload_2020-11-18_20-16-36.png
 
I'm impressed by the rasterization performance, the "audacity" to use a big cache, etc., given how Navi 1 was a letdown (to me). But I have the feeling that they are still too late, or one generation behind again. The RT performance is problematic imo, and the lack of a DLSS-like solution doesn't help. And if one arrives some day, will it run on the already "overflown" shader units once RT is done? Meehh... I'm in the market for a new CPU/MB/GPU combo, but I can see the 6800 XT as a good solution.

Now if you don't care about RT, it's a reeeaaaally impressive GPU. If you can find it.

Still, I like the deep reviews when they come, and all the exchanges here : )
 
How sure are you about those shaders being overflown? I don't remember people complaining about DLSS performance when it was running on CUDA cores in a few versions during the Turing era.
 
How sure are you about those shaders being overflown? I don't remember people complaining about DLSS performance when it was running on CUDA cores in a few versions during the Turing era.

I'm not, so I used a "?". But since the RT "core" in RDNA2 offloads less than the one in the Nvidia arch, I guess the shaders are used more (while on Ampere some of those calculations are still done on the RT cores).
 
In Linux and macOS you can see that we have 4 Shader Engines with 2 Shader Arrays each, and each Shader Array has 1 scan converter. So this means a total of 8 rasterizers, but why is AMD drawing only 4 in its architecture slide? Are the driver values wrong?
https://www.pcgamer.com/a-linux-update-may-have-let-slip-amd-big-navis-mammoth-core/


arch1.jpg

https://www.techpowerup.com/review/amd-radeon-rx-6800-xt/images/arch1.jpg

The slide matches the officially stated primitive rate so I'm inclined to believe it. It's curious that the XSX is stated as having one primitive unit per SA though. If the PS5 is the same that would mean the XSX has higher primitive throughput than the 6800 and the PS5 matches the 6900XT!
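
For anyone who wants to check that comparison, a back-of-the-envelope sketch follows. It assumes one primitive per clock per primitive unit, one unit per shader array on the consoles (4 each), 4 units on the 6900 XT as drawn on the slide, and 3 units on the 6800. The unit counts are assumptions, and the clocks are the advertised peak/boost figures as I recall them, so treat the output as illustrative only.

```python
def primitive_rate_gprims(primitive_units, clock_ghz, prims_per_clock=1.0):
    """Peak primitive rate in billions of primitives per second."""
    return primitive_units * prims_per_clock * clock_ghz


# (assumed primitive units, advertised peak clock in GHz); all hypothetical inputs.
CONFIGS = {
    "XSX": (4, 1.825),
    "PS5": (4, 2.23),
    "RX 6800": (3, 2.105),
    "RX 6900 XT": (4, 2.25),
}

rates = {name: primitive_rate_gprims(units, clock)
         for name, (units, clock) in CONFIGS.items()}

for name, rate in rates.items():
    print(f"{name:11s} ~{rate:.2f} Gprims/s")
```

Under these assumptions the XSX lands above the 6800 and the PS5 lands within about 1% of the 6900 XT, which is the oddity being pointed out.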
 
BTW, here's a video with more of the super tessellated models in Crysis 2:



https://www.resetera.com/threads/cr...-2020-and-february-2021.317704/#post-50113078

Wireframe view removes LOD and occlusion culling. What you see in wireframe view is not what it renders in ... normal ... view.
 
Super Resolution Technology in development in partnership with Console and game development partners and not exclusive development inside AMD - Scott Herkelman
Is FidelityFX SR being developed with MS or Sony? MS, I would bet. They have been cooperating a lot lately.
Just yesterday AMD announced it will use the MS Azure hybrid cloud to host its cloud EDA workflows, and several hours ago AMD announced it will integrate the MS Pluton security block in future Ryzen chips.
 

Is FidelityFX SR being developed with MS or Sony? MS, I would bet. They have been cooperating a lot lately.
Just yesterday AMD announced it will use the MS Azure hybrid cloud to host its cloud EDA workflows, and several hours ago AMD announced it will integrate the MS Pluton security block in future Ryzen chips.

Yes it is probably MS.
 
Wireframe view removes LOD and occlusion culling. What you see in wireframe view is not what it renders in ... normal ... view.
LOD levels according to distance are unknown, and there's also no reason why the models on simple geometry like concrete slabs would ever need to have that ridiculously complex geometry in the first place.
I.e. if those tessellation levels and geometry complexity were never going to be used, why would they be there in the first place?

Also, I'm not expecting Crytek devs to ever say "well yeah, Nvidia pressured us to put unnecessary geometry on our models, so we did as they told us".
Nvidia finding ways to get excess geometry in games is something AMD themselves complained about for several years, until Polaris and then Vega came up with higher triangle culling performance. Hairworks is probably the gravest example of subpixel triangles finding their way into games.



This has nothing to do with RDNA2 of course, the point is we should count on the same happening with RT.
 
The reason I speculated 96MB for N22 was because of the rumored die size of ~340 mm2. Since it has roughly the same configuration as Navi 10, the extra area almost certainly points to a large cache. It could certainly be 64 MB as well but I'd still bet on 96 MB.

I can definitely see it helping massively in SoCs and mobile GPUs, although I'd expect at least 32 MB for it to make a significant impact (and that is the rumor for Rembrandt). N23 has been confirmed to have a 128-bit GDDR interface by the macOS leaks btw, so no HBM. While on the topic of mobile GPUs though, I wanted to highlight this:

arch2.jpg

AMD claims less than 0.5X power of RDNA 1 at the same clocks, which is extremely impressive. We should see AMD as the undisputed leader of the mobile GPU segment next year.
 
