Bondrewd
Veteran
"What really bothers me is the very small gain from the process transition itself."
Welcome to the end of CMOS as we knew it!
Process gains are now harder to come by and even harder to realize, particularly on larger dies.
"Is a big Navi (64 CU or more) really coming? They are already at 225 W with the 40 CU version on 7 nm; I don't see how they can build a bigger chip that doesn't draw around 300 W..."
More hardware, lower clocks and transistor-level optimizations, and yes.
Infinity Fabric?
Big edit: going over this again is slightly confusing. The block diagram clearly shows 2 shader engines, each with 10 "dual compute units".
Are these dual compute units the mixed-wavefront thing they're doing? That is, is 1 dual compute unit either two 32-thread (stream processor) units or one 64 SP unit? If so, the 5700 would need two of the block diagrams shown to match the given numbers.
Or is each dual compute unit two 64 SP units, so that the block diagram represents a complete 5700? I'm honestly not sure which it is; the terminology doesn't match up here.
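To make the two readings concrete, here is a rough sketch of the arithmetic. The 2 shader engines and 10 dual compute units per engine come from the block diagram described above; the target of 40 CUs x 64 SPs = 2560 SPs is my assumption for "the given numbers" (based on the 40 CU figure mentioned earlier in the thread), and the names are made up for illustration.

```python
# Figures from the post above: 2 shader engines, 10 dual compute units each.
SHADER_ENGINES = 2
DUAL_CUS_PER_ENGINE = 10

# Assumption (not stated in the post): the full 40 CU chip exposes
# 40 * 64 = 2560 stream processors.
TARGET_SPS = 40 * 64


def total_sps(sps_per_dual_cu: int) -> int:
    """Stream processors covered by one block diagram for a given dual-CU width."""
    return SHADER_ENGINES * DUAL_CUS_PER_ENGINE * sps_per_dual_cu


# Reading A: a dual CU is two 32-wide units (or one 64 SP unit).
reading_a = total_sps(2 * 32)
# Reading B: a dual CU is two 64 SP units.
reading_b = total_sps(2 * 64)

for name, sps in (("A (2 x 32)", reading_a), ("B (2 x 64)", reading_b)):
    diagrams_needed = TARGET_SPS / sps
    print(f"Reading {name}: {sps} SPs per diagram, "
          f"{diagrams_needed:g} diagram(s) needed for {TARGET_SPS} SPs")
```

Under that assumption, reading A needs two copies of the diagram and reading B needs exactly one, which is the ambiguity described in the post.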
"251 mm^2 is a large die?"
Given that phone SoCs and CCDs are <80 mm^2, Y E S.
"Given that phone SoCs and CCDs are <80 mm^2, Y E S."
And phone SoCs also draw far less power, which helps a lot at these nodes.
"Navi 10 seems, like always, 1 or 2 years late..."
Yeah, that is my first impression too. GCN finally got significant changes to its architecture. Some of those should have been in Fiji or Vega already.
"A positive observation comes from another slide, though, where they claim 2.3 times the performance per unit area."
I mean, N5/N3 also offer plenty of area gains, just with even more marginal, harder-to-realize perf/power uplifts.
Fascinating that work can be issued in 32-work-item hardware threads. I was expecting 64 and 128.
"I mean, N5/N3 also offer plenty of area gains, just with even more marginal, harder-to-realize perf/power uplifts."
I don't really know enough about the properties of the 3 nm options (I only know that AMD's chief of server products said they'll use it; beyond that the crystal ball gets really murky).
"But since it seems to offer a potentially significant gain in density, products like GPUs can improve their performance per watt by compromising on frequency and dialing back power draw, and still gain in absolute performance as well as in performance/W and performance/mm^2."
Cost per yielded mm^2 is also going up.
Navi 10 has a similar transistor count and similar performance to TU106, but it's not faster. It seems that raytracing and tensor cores don't need much space.
"Navi 10 has a similar transistor count and similar performance to TU106, but it's not faster. It seems that raytracing and tensor cores don't need much space."
I don't think you can really compare the transistor usage that closely.
"Navi 10 has a similar transistor count and similar performance to TU106, but it's not faster. It seems that raytracing and tensor cores don't need much space."
Pascal had 7.2B transistors for 2560 SP. Turing has 10.8B for 2304 SP. That's about 67% more transistors per SP. Obviously there are other architectural changes, but RT and tensor cores clearly have a large footprint collectively.
"Yeah, that is my first impression too. GCN finally got significant changes to its architecture. Some of those should have been in Fiji or Vega already."
To be fair, some of it was in Vega, just disabled for various reasons. I'm curious how close those features were to working, and whether a silicon respin or revision would have been enough to enable them.
"Pascal had 7.2B transistors for 2560 SP. Turing has 10.8B for 2304 SP. That's about 67% more transistors per SP. Obviously there are other architectural changes, but RT and tensor cores clearly have a large footprint collectively."
Pascal or Volta? Both have to be considered, especially since quite a bit of Turing seems to be Volta-derived.
"Pascal or Volta? Both have to be considered, especially since quite a bit of Turing seems to be Volta-derived."
Volta is actually higher than Turing. 21.1B for 5120 SP.
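For anyone who wants to check the per-SP numbers being thrown around, here is a quick sketch using only the transistor and SP counts quoted in the posts above. Nothing here is measured; the dictionary and variable names are just for illustration.

```python
# (transistors, stream processors) exactly as quoted in the posts above.
chips = {
    "Pascal": (7.2e9, 2560),
    "Turing": (10.8e9, 2304),
    "Volta": (21.1e9, 5120),
}

# Transistors per SP for each chip.
per_sp = {name: transistors / sps for name, (transistors, sps) in chips.items()}

for name, value in per_sp.items():
    print(f"{name}: {value / 1e6:.2f}M transistors per SP")

# Relative increase over Pascal.
turing_vs_pascal = per_sp["Turing"] / per_sp["Pascal"] - 1
volta_vs_pascal = per_sp["Volta"] / per_sp["Pascal"] - 1
print(f"Turing vs Pascal: {turing_vs_pascal:+.1%}")
print(f"Volta vs Pascal: {volta_vs_pascal:+.1%}")
```

On these figures Turing comes out roughly two thirds higher than Pascal per SP, and Volta lands between the two: more transistors than Turing in total, but fewer per SP. Whole-chip ratios like this also fold in caches, memory controllers and everything else, so they only loosely bound what RT and tensor cores cost.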
OK guys, tell me if I'm crazy here: is Navi using a chiplet setup?
Navi uses Infinity Fabric:
https://pics.computerbase.de/8/8/1/1/7/16-1080.9ce6ffcb.jpg
Ryzen 3000 chiplets:
http://www.comptoir-hardware.com/images/stories/_cpu/7nm_amd/ryzen3000-package.jpg