Speculation: GPU Performance Comparisons of 2020 *Spawn*

Sweclockers reports that it still has many things that it carries over from GCN
Well, no shit; so does RDNA2, because the GCN ISA and most of the hardware bits not lifted directly from the previous VLIW design weren't designed by morons.
This meme has to die.
AMD shifting all of their optimizations for gaming
Weird way to say "we've sent our physical-design manpower into reeducation camps and they came out of there as cyborgs".
 
Alright, math time.

Ampere:

384-bit bus, 21 Gbps GDDR6X. Linear scaling from Turing for gaming IPC (other than ray tracing). Samsung 8nm LPU.
24 teraflops: ~50% increase in performance over the RTX Titan. Probable RTX Titan II (or similar name). 250-300 watts(?) board power, 24 GB RAM, $2000+.
(3090 Ti salvage with a 352-bit bus next year?)

320-bit bus, 21 Gbps GDDR6X. 20 teraflops: ~50% increase in performance over the RTX 2080 Ti. Probable RTX 3090.
200-250 watts(?), 10/20 GB of RAM (different AIB configs allowed?), $1000+(?)

256-bit bus, 21 Gbps GDDR6X. 15 teraflops: ~50% increase over the 2080. Probable RTX 3080, 16 GB RAM, $800(?)

256-bit bus (salvage from above?), 16 (probable) to 18 Gbps GDDR6. 11.4 (probable) to 12.9 teraflops: 14-28% increase over the 2080.
16 GB / 8 GB RAM (AIB choice?), $500.

Lower tiers? 192-bit bus??

RDNA 2:

512-bit bus, 14-18 Gbps GDDR6. IPC/delta-compression gains over RDNA1 unknown. TSMC 7nm (6nm, 7nm+?).
19.5-25 teraflops; the limit might be bandwidth but is probably TDP. 300 watts board power, 16 GB RAM, $1000+.

Salvage of the above? 16-20 teraflops(?), 14-18 Gbps, 512- to 448-bit bus(?). 14-16 GB RAM, $800(?)

256-bit bus, 18 Gbps GDDR6, 12.5 teraflops: ~28% increase over the RX 5700 XT. 16 GB RAM, $400-500.

256-bit bus, 14 Gbps GDDR6, salvage from the above, 10 teraflops. 8-16 GB RAM, $329-400.

Lower tiers? 128-bit bus??

There, predictions for this year recorded.
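Since the tiers above are basically scaled off memory bandwidth, here is a minimal sketch of that arithmetic in Python. The bus widths and data rates are the guesses above; the Titan RTX's 672 GB/s baseline is from its public spec sheet.

```python
# Peak memory bandwidth for the configs guessed above.
# GDDR6/6X rates are per pin, so bandwidth (GB/s) = bus_width_bits / 8 * Gbps.
def bandwidth_GBps(bus_bits: int, gbps: float) -> float:
    return bus_bits / 8 * gbps

configs = {
    "Ampere 384-bit @ 21 Gbps": (384, 21),
    "Ampere 320-bit @ 21 Gbps": (320, 21),
    "Ampere 256-bit @ 21 Gbps": (256, 21),
    "RDNA2 512-bit @ 18 Gbps":  (512, 18),
    "RDNA2 256-bit @ 18 Gbps":  (256, 18),
}

for name, (bus, rate) in configs.items():
    print(f"{name}: {bandwidth_GBps(bus, rate):.0f} GB/s")

# 384-bit @ 21 Gbps works out to 1008 GB/s, i.e. 1.5x the Titan RTX's 672 GB/s,
# which is where the "~50% increase" framing comes from.
```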
 
24 teraflops
With 84 Turing SMs this would require a 2.23 GHz clock speed. The latest leaks point to 1.75 GHz boost clocks for GA102.
So there are two possibilities: it'll be 19 tflops, around +30% over TU102.
Or they've doubled the FP32 SIMD width again, which would result in a 37 tflops peak for GA102.
 
With 84 Turing SMs this would require a 2.23 GHz clock speed. The latest leaks point to 1.75 GHz boost clocks for GA102.
So there are two possibilities: it'll be 19 tflops, around +30% over TU102.
Or they've doubled the FP32 SIMD width again, which would result in a 37 tflops peak for GA102.

Or, as discussed in the Ampere thread, there are 2 FP32 + 1 INT SIMDs fighting for 2 execution slots. If that leads to FP making up 2/3rds of the execution on average: 37 * 0.66 = 24.42.
It could also be that we'd be dealing with a cut-down 82 SM chip and a slightly higher FP ratio, different clocks, or many other combinations, but IMO most combinations would be centered around those effective 24 TFLOPs, just with bigger or smaller deviations.
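A minimal sketch of that SM arithmetic, assuming 64 FP32 lanes per Turing-style SM and 2 FLOPs per lane per clock (FMA); the SM counts, clocks, and the 2/3 FP share are the rumored figures discussed above, not confirmed specs:

```python
# Peak FP32 throughput from SM count, per-SM FP32 width, and clock.
def sm_tflops(sms: int, fp32_lanes_per_sm: int, clock_ghz: float) -> float:
    return sms * fp32_lanes_per_sm * 2 * clock_ghz / 1000  # 2 FLOPs/lane/clock (FMA)

print(sm_tflops(84, 64, 2.23))         # ~24.0 TF: what 84 Turing-style SMs would need (~2.23 GHz)
print(sm_tflops(84, 64, 1.75))         # ~18.8 TF: same SMs at the leaked ~1.75 GHz boost
print(sm_tflops(84, 128, 1.75))        # ~37.6 TF: doubled FP32 SIMD width per SM
print(sm_tflops(84, 128, 1.75) * 2/3)  # ~25.1 TF: FP32 winning ~2/3 of the shared issue slots
                                       # (the post's 37 * 0.66 = 24.42 rounds the peak down first)
```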
 
Where do the 24 TF rumors come from? I doubt we will see such high performance numbers; from Pascal to Turing (1080 Ti to 2080 Ti) it was about a 4 TF increase? Between 18 and 20 TF maybe?
 
Where do the 24 TF rumors come from? I doubt we will see such high performance numbers; from Pascal to Turing (1080 Ti to 2080 Ti) it was about a 4 TF increase? Between 18 and 20 TF maybe?

Came from the same 2-3 guys leaking everything else, afaik.
As for the increase, it doesn't work just like that. Different architecture changes may focus on different things, but most importantly, the jump from Pascal to Turing was bigger than it seems, thanks to the separate INT units freeing up the FP units. Depending on the INT load, a 10 tflop Pascal could easily end up being comparable to as low as a 6 tflop Turing card.
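A rough illustration of that point; the 40% INT share here is just an assumed example picked to reproduce the 10 TF vs 6 TF comparison above, not a measured figure:

```python
# On Pascal, INT instructions occupy the same ALUs as FP32, so whatever
# fraction of the instruction stream is INT eats into the usable FP peak.
def effective_fp_tflops(peak_fp_tflops: float, int_fraction: float) -> float:
    return peak_fp_tflops * (1 - int_fraction)

print(effective_fp_tflops(10.0, 0.4))  # 6.0 TF of FP32 actually left on a "10 TF" Pascal
# Turing's separate INT pipes keep (nearly) all of its FP peak available for FP
# work, which is how a ~6 TF Turing card can end up comparable.
```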
 
Came from the same 2-3 guys leaking everything else, afaik.
As for the increase, it doesn't work just like that. Different architecture changes may focus on different things, but most importantly, the jump from Pascal to Turing was bigger than it seems, thanks to the separate INT units freeing up the FP units. Depending on the INT load, a 10 tflop Pascal could easily end up being comparable to as low as a 6 tflop Turing card.

Yes, Turing was an absolute beast: it added ray tracing, tensor/DLSS AI, mesh shading, and monstrous rasterization performance, besides other improvements like the VRAM setup etc. Just getting Turing on 7nm would be very interesting; Ampere must be a very nice upgrade over Turing, to say the least. Only the RAM quantities don't increase so much, but 12 GB of VRAM alone should suffice, I assume, especially if they can achieve 1 TB/s with their GDDR6X.
 
Yes, Turing was an absolute beast: it added ray tracing, tensor/DLSS AI, mesh shading, and monstrous rasterization performance, besides other improvements like the VRAM setup etc.
True dat, although all those features negatively impacted the die size. For the Ti dies it was 754mm2 vs 470mm2 (Turing is 160% of Pascal) despite being on TSMC 12nm vs 16nm. Turing is very large and thus $$$. Looking forward to seeing Ampere's 7/8nm die sizes.
 
True dat, although all those features negatively impacted the die size. For the Ti dies it was 754mm2 vs 470mm2 (Turing is 160% of Pascal) despite being on TSMC 12nm vs 16nm. Turing is very large and thus $$$. Looking forward to seeing Ampere's 7/8nm die sizes.

12nm and 16nm are basically the same node; the differences are so small that not even TSMC seems to care to differentiate anymore: https://www.tsmc.com/english/dedicatedFoundry/technology/16nm.htm

So it makes more sense to compare based on performance, which makes it 545mm2 (TU104) vs 470mm2 (GP102). And the RTX 2080 is typically faster enough in "standard" rasterization to make up for the difference, I'd say. On top of that, for what die-shot analyses put at <10% of the SM area (<5% of the actual die size), the Tensor cores enable DLSS 2.0 to provide a 50%-80% performance increase with generally equal or better image quality (subjective to a point, of course). Plus RT, mesh shaders, VRS, etc., which still aren't really influencing the performance of most games yet, but eventually will.

EDIT: In fact, according to the numbers, the transistor density of GP102 is higher (25.4 Mt/mm2) than that of TU104 (24.9 Mt/mm2) and TU102 (24.6 Mt/mm2).
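For reference, a quick sketch of where those density figures come from; the transistor counts and die areas below are the commonly published numbers for these GPUs and should be treated as approximate:

```python
# Transistor density = transistor count / die area.
dies = {
    "GP102": (12.0e9, 471),  # (transistors, die area in mm^2), approximate public figures
    "TU104": (13.6e9, 545),
    "TU102": (18.6e9, 754),
}

for name, (transistors, area_mm2) in dies.items():
    density_mt_per_mm2 = transistors / 1e6 / area_mm2
    print(f"{name}: {density_mt_per_mm2:.1f} Mt/mm^2")

# GP102 comes out slightly denser than the Turing dies despite "16nm" vs "12nm",
# roughly reproducing the figures quoted above and consistent with the two
# processes being essentially the same node.
```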
 
Where do the 24 TF rumors come from? I doubt we will see such high performance numbers; from Pascal to Turing (1080 Ti to 2080 Ti) it was about a 4 TF increase? Between 18 and 20 TF maybe?

No rumors in specific were used for the compute figures, just math going from maximum bandwidth-to-compute ratios for both. The confirmed 21 Gbps GDDR6X would, with a 384-bit bus and the same ratio of compute to bandwidth, give 24 teraflops maxed out. Though hells, maybe they'll stick it in one of their 400 watt sockets and charge $3k+ for it.
I'm also assuming whatever happens with compute/bandwidth in RDNA2 evens itself out.

For Ampere, I suspect it's less "doubling FP throughput" and more "halving INT throughput", as they've found INT doesn't bottleneck most titles much compared to how much die space they can save. Which would explain why the 3090 benchmark would at times hit a 40% increase over a 2080 Ti rather than 50%: anything INT-heavy could lower performance.

For RDNA2, I feel like I've got nothing. We know some details of both the PS5 and XSX, but as for what the power-draw-to-compute ratio is, what IPC and/or compression improvements have been made, etc., well, there doesn't feel like there's anything. Obviously a handful of the Nvidia leaks have proved true, but there have been almost no leaks concerning AMD, so it's just a shot in the dark from the one supposed leak that sounded even close to true.
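A minimal sketch of that extrapolation; the Titan RTX baseline (~16.3 TF FP32 over 672 GB/s) is from its public spec sheet, and the 1008 GB/s figure is the 384-bit @ 21 Gbps guess from earlier in the thread:

```python
# "Same compute-to-bandwidth ratio" extrapolation.
def scale_tflops_by_bandwidth(base_tflops: float, base_bw_GBps: float, new_bw_GBps: float) -> float:
    return base_tflops * (new_bw_GBps / base_bw_GBps)

new_bw = 384 / 8 * 21                                # 1008 GB/s for a 384-bit bus at 21 Gbps
print(scale_tflops_by_bandwidth(16.3, 672, new_bw))  # ~24.5 TF, i.e. the ~24 TF guess
```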
 
For RDNA2, I feel like I've got nothing. We know some details of both the PS5 and XSX, but as for what the power-draw-to-compute ratio is, what IPC and/or compression improvements have been made, etc., well, there doesn't feel like there's anything. Obviously a handful of the Nvidia leaks have proved true, but there have been almost no leaks concerning AMD, so it's just a shot in the dark from the one supposed leak that sounded even close to true.

I thought this was a funny comment (which I agree with).
But if you look on other forums it's:
Oh, AMD is up to their old hype tricks, don't they ever learn!
AMD said Big Navi is going to be their Nvidia Killer!
AMD has been over-hyping again like every generation.
Blah
Blah
Blah

I really can't believe that people see some random website say something and attribute it to the company that is making the product like it's a fact.
 