AMD: Navi Speculation, Rumours and Discussion [2019-2020]

If a 5700 XT with 40 CUs and a 1.755 GHz game clock has a TDP of 225W (https://www.techpowerup.com/gpu-specs/radeon-rx-5700-xt.c3339), then a 72-CU part with a 50% perf/W improvement (at game clock?) would have a 270W TDP. But we're talking about a ~35% higher game clock here... so perhaps the 255W figure covers the main chip only?
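Here's that arithmetic spelled out as a quick sketch (the 72 CUs, +50% perf/W and ~2.4 GHz game clock are all rumoured figures, not confirmed specs):

```python
# Back-of-the-envelope scaling: grow board power linearly with CU count,
# then apply the claimed 50% perf/W improvement.
navi10_power_w = 225.0      # RX 5700 XT board power per TechPowerUp
navi10_cus = 40
rumoured_cus = 72           # rumoured Navi 21 CU count, not confirmed
perf_per_watt_gain = 1.50   # AMD's claimed +50% perf/W for RDNA 2

scaled_w = navi10_power_w * (rumoured_cus / navi10_cus) / perf_per_watt_gain
print(f"Iso-clock estimate: {scaled_w:.0f} W")   # ~270 W

# The rumoured ~2.4 GHz game clock is ~1.37x the 5700 XT's 1.755 GHz
# (the "~35% higher" above), and power grows super-linearly with clock,
# so the real figure would be higher still unless 255W is chip-only
# rather than total board power.
print(f"Game clock ratio: {2.4 / 1.755:.2f}x")
```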

Those 225W are TBP, not TDP, according to AMD: https://www.amd.com/en/products/graphics/amd-radeon-rx-5700-xt
(Scroll all the way down for the specs.)
 
There are really no rumors/insiders suggesting 160 CUs...

Plus, 4x Navi 10's CUs at 2.1 GHz wouldn't just compete with a 3080/3090; it'd be considerably faster.
 
I'm really curious how AMD will market RDNA2 on PC. If rasterisation performance is on par with the 3080 but RT is two steps behind, do you sell it at the same price as the 3080? How do you communicate about RT in that situation? In 2021, I don't see myself buying a card which doesn't let me play with RT a little (Cyberpunk 2077 and the remaster of W3 are the two main upcoming games for me).

Or, in a good scenario: RT is not at Ampere level, but sits between Turing and Ampere?
 
There are really no rumors/insiders suggesting 160 CUs...
True. AMD's counter-espionage has hit truly spectacular levels, though. NVidia has done much the same, but it seems AMD has won this round.

My die analyses of the Navi variants are trivial things for NVidia to have done too (and NVidia has had a very long time to do them). I believe this is why NVidia has marketed "value" so heavily: it doesn't take a rocket scientist to see that GPU compute is uber-cheap now.

Honestly, I'm gobsmacked by the weak compute of the two consoles (16% of the XSX die is for CUs). I see this as a major fail, or as built-in extreme obsolescence. Perhaps the next console gen will consist of two refreshes?

Plus, 4x Navi 10's CUs at 2.1 GHz wouldn't just compete with a 3080/3090; it'd be considerably faster.
Why would a 6900 XTX with ~41 TF (160 CUs at 2 GHz) be on a completely different level from a 3090 at ~36 TF? Is only NVidia now allowed to have huge amounts of compute?
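For reference, the ~41 TF figure falls straight out of the usual FP32 formula; a quick sketch (the 160 CUs at 2 GHz is the speculation here, the 3090 numbers are NVidia's published specs):

```python
def fp32_tflops(cus: int, clock_ghz: float, lanes_per_cu: int = 64) -> float:
    # FP32 FLOPS = ALU lanes x 2 ops/clock (FMA) x clock
    return cus * lanes_per_cu * 2 * clock_ghz / 1000

print(fp32_tflops(160, 2.0))    # ~41.0 TF for the speculated 160-CU part
print(fp32_tflops(40, 1.755))   # ~9.0 TF for Navi 10 at game clock
print(10496 * 2 * 1.695 / 1e3)  # ~35.6 TF for the 3090 (10496 lanes, 1.695 GHz boost)
```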

160 CUs or more SIMDs per CU?
I've contemplated more SIMDs per CU. ALU:TEX could double, sure, but I wonder about LDS space and LDS versus VGPR/lane mechanics too. To be honest, I have no theory for or against.

If both XSX and PS5 have RDNA 1 CUs, then yes, PC Navi could be a 4-SIMD-per-CU "monster", being the only "RDNA 2" GPU that has RDNA 2 CUs.

Also, I think the patent that talks about CUs sharing an L1 encourages lots of CUs per L1. One thing I haven't been able to work out is whether the L1 is per shader engine or per shader array.

Because an L0 (and LDS) is shared by two CUs in a WGP, the patent should probably be read with WGPs in mind, not CUs. The WGP is the real unit of compute in RDNA, not a CU.

Additionally, there are vague rumours saying that RDNA 2 is "real" RDNA, not the GCN/RDNA hybrid seen in Navi 1x. This could be interpreted to mean that any rumours that talk about Navi 2x CUs should be re-read with WGP replacing CU.

Both would be monstrous, but I don't see how the former is possible unless they've hit theoretical peak density numbers on 7nm.
I don't understand what you mean by theoretical peak density numbers and why hitting them is relevant.

A Navi 10, 14 or XSX CU is ~2mm². A 5xxmm² die with ~150mm² of "cache" doesn't make sense to me. No matter how exciting the idea of a solid 128MB lump of last-level cache is, I can't take it seriously. I believe that a cache is a cache precisely because it's a small, efficient block of memory.
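Taking the numbers in that post at face value, the area budget shows why the combination is hard to swallow; a sketch (the ~2mm²/CU comes from the die analyses above, the 5xxmm² die and ~150mm² of cache are the rumour):

```python
die_mm2 = 505.0     # "5xx mm2" rumour; exact value unknown
cache_mm2 = 150.0   # rumoured last-level-cache area
cu_mm2 = 2.0        # Navi 10/14/XSX CU size from die analysis

for cus in (80, 160):
    uncore = die_mm2 - cache_mm2 - cus * cu_mm2
    print(f"{cus} CUs: {cus * cu_mm2:.0f} mm2 of CUs, "
          f"{uncore:.0f} mm2 left for everything else")
# 80 CUs leaves ~195 mm2 for front end, memory controllers, multimedia
# and I/O; 160 CUs leaves ~35 mm2, which is clearly not enough.
```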
160 CUs, 2.4 GHz clocks.

Silly season is in full force.
We've seen a die that's considerably larger than 500mm². So take your pick:
  1. massive last level cache
  2. lots of CUs, about the same FLOPS as GA102
  3. HBM 4096-bit bus plus 512-bit GDDR6 bus
  4. some combination of these
Navi 21 scaled up from Navi 10 to 80 CUs and 4 shader engines would only be about 360mm².

I would expect the full die to run at lower clocks.
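One hedged way to reconstruct the ~360mm² figure: assume roughly 110mm² of Navi 10's 251mm² die is the two shader engines and only that part doubles (the 110mm² split is my assumption for illustration, not a measured number):

```python
navi10_die_mm2 = 251.0      # Navi 10 die size on TSMC N7
shader_engines_mm2 = 110.0  # assumed area of Navi 10's two shader engines

# Double the shader engines (40 -> 80 CUs, 2 -> 4 SEs), keep the rest fixed.
uncore_mm2 = navi10_die_mm2 - shader_engines_mm2
navi21_estimate = uncore_mm2 + 2 * shader_engines_mm2
print(f"Scaled-up estimate: {navi21_estimate:.0f} mm2")  # ~360 mm2

# A die "considerably larger than 500 mm2" therefore implies one of the
# options above: a big cache, far more CUs, or a much wider memory system.
```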
 
I think it would be 160 HCUs. Not CUs. Not DCUs. ...
 
180W for Navi 10? If you start there you get 215W for the chip alone, without considering the much higher clock (if true) and everything else on the board. I'm not saying it's impossible, but perhaps the numbers we've got so far are a bit off...

OTOH, if the 255W is for Navi 21 "only", then it starts making more sense.

I was referring to the leak that said the 225W was the TGP. IIRC the 5700 XT has a 225W TGP with a 180W TDP.
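Same scaling as earlier in the thread, but from the 180W chip-only baseline instead of the 225W board figure (72 CUs and +50% perf/W are still the rumoured values, and the clock uplift is deliberately ignored):

```python
chip_power_w = 180.0  # 5700 XT chip-only power, per the leak's terminology
estimate_w = chip_power_w * (72 / 40) / 1.5
print(f"Chip-only estimate: {estimate_w:.0f} W")  # ~216 W

# If the rumoured 255 W is chip-only rather than board power, that leaves
# roughly 40 W of headroom for the much higher clocks, which is at least
# plausible; as board power it looks too low.
```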

TBH, if we believed all these leaks, we'd expect Navi to have a fully functional AI chip that can upscale images from 240p to 4K at 99% quality while consuming 150W...

I'll keep myself a little skeptical until I see the final product. I don't care about the top of the line, so for me the price/performance in the mid-range is more important, and I know AMD will at least have that.


I'm really curious how AMD will market RDNA2 on PC. If rasterisation performance is on par with the 3080 but RT is two steps behind, do you sell it at the same price as the 3080? How do you communicate about RT in that situation? In 2021, I don't see myself buying a card which doesn't let me play with RT a little (Cyberpunk 2077 and the remaster of W3 are the two main upcoming games for me).

Or, in a good scenario: RT is not at Ampere level, but sits between Turing and Ampere?


I don't think AMD will price Navi based on RT. They don't have the "brand" to do so, it will be slower than NV, and only a handful of people care that much about it. So far AMD has marketed its RT implementation as "functional", i.e. not losing 50% of performance when enabling it, so I expect their marketing will go in that direction: something like "less brute force, but more intelligent and useful". We will see. I'm also waiting for CP2077 with RT.
 
I'm really curious how AMD will market RDNA2 on PC. If rasterisation performance is on par with the 3080 but RT is two steps behind, do you sell it at the same price as the 3080? How do you communicate about RT in that situation? In 2021, I don't see myself buying a card which doesn't let me play with RT a little (Cyberpunk 2077 and the remaster of W3 are the two main upcoming games for me).

Or, in a good scenario: RT is not at Ampere level, but sits between Turing and Ampere?
IIRC someone calculated that a theoretical 80 CUs @ 2.x GHz (can't remember the exact number) would be clearly above anything NVIDIA has in ray-box intersections but would be slower than a 3080 (or was it a 2080 Ti? a 2080?) in ray-triangle intersections.
 
I think for raytracing the reality is going to be that it doesn't depend on the raytracing hardware at all; rather, the raytracing HW on both sides is probably going to be fast enough that the memory/cache hierarchy and divergence become the bottleneck. Those see stark differences between raytracing and "traditional" workloads in ways that are much harder to optimize for raytracing.

E.g. walking the BVH for raytracing is going to be way more latency-sensitive than most GPU workloads we've seen, and both the tree walk and the rays scattering around are horrible for shader divergence.
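A toy illustration of the divergence point, with made-up traversal depths just to show the effect: in a SIMD wave every lane pays for the longest BVH walk in the group, so incoherent rays waste lanes.

```python
import random

WAVE_SIZE = 32  # lanes executing in lockstep

def lane_utilisation(depths):
    # The wave loops until the slowest lane's ray exits the BVH, so
    # utilisation is the average traversal depth over the wave's maximum.
    return sum(depths) / (len(depths) * max(depths))

random.seed(0)
coherent = [20 + random.randint(0, 2) for _ in range(WAVE_SIZE)]  # primary rays
scattered = [random.randint(5, 60) for _ in range(WAVE_SIZE)]     # bounced rays

print(f"Coherent rays:  {lane_utilisation(coherent):.0%}")
print(f"Scattered rays: {lane_utilisation(scattered):.0%}")
# Every node visit is also a dependent memory load, which is where the
# latency sensitivity and cache-hierarchy pressure come from.
```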
 
IIRC someone calculated that a theoretical 80 CUs @ 2.x GHz (can't remember the exact number) would be clearly above anything NVIDIA has in ray-box intersections but would be slower than a 3080 (or was it a 2080 Ti? a 2080?) in ray-triangle intersections.

How would you even begin to calculate that? AMD's patent suggests each intersection engine can do 4 boxes or 1 triangle per clock. Even if you assume that's true, what numbers would you use for Turing and Ampere?
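To make the question concrete, the AMD half of such a calculation under the patent's numbers would look like this (assuming one intersection engine per CU, which is itself an interpretation; there are no comparable published figures for Turing or Ampere to fill in the other column):

```python
def intersection_rate_gs(cus: int, clock_ghz: float, tests_per_clock: int) -> float:
    # Giga-intersection-tests per second, one engine per CU assumed
    return cus * clock_ghz * tests_per_clock

cus, clock_ghz = 80, 2.0  # the speculated configuration
print(f"Ray-box:      {intersection_rate_gs(cus, clock_ghz, 4):.0f} G tests/s")  # 640
print(f"Ray-triangle: {intersection_rate_gs(cus, clock_ghz, 1):.0f} G tests/s")  # 160
```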
 
How would you even begin to calculate that? AMD's patent suggests each intersection engine can do 4 boxes or 1 triangle per clock. Even if you assume that's true, what numbers would you use for Turing and Ampere?
Couldn't find the image; I think it was explained at the time where they got their numbers for the GeForces. I'll try to dig it up later today if I have the time, if someone else doesn't have it at hand (I'm pretty sure it was in this particular thread).
 
Couldn't find the image; I think it was explained at the time where they got their numbers for the GeForces. I'll try to dig it up later today if I have the time, if someone else doesn't have it at hand (I'm pretty sure it was in this particular thread).

If it's the image I think you're talking about, I thought that was made-up nonsense. Nvidia hasn't shared any details about its RT units.
 
I'm really curious how AMD will market RDNA2 on PC. If rasterisation performance is on par with the 3080 but RT is two steps behind, do you sell it at the same price as the 3080? How do you communicate about RT in that situation? In 2021, I don't see myself buying a card which doesn't let me play with RT a little (Cyberpunk 2077 and the remaster of W3 are the two main upcoming games for me).

Or, in a good scenario: RT is not at Ampere level, but sits between Turing and Ampere?
RT perf is irrelevant. AMD simply has to be cheaper considering their weaker brand recognition and, mainly, their lacking SW: the utterly horrid last-gen driver experience, possibly broken OpenCL, lack of DLSS, CUDA, "driver utils" like Ansel, etc.
 