Nvidia Ampere Discussion [2020-05-14]

The relatively high transistor density of the A100 might be a result of switching to the latest N7+ process.
Nah, it's due to all the MAC piles required for GEMM going brrrrr at alarming speeds.
See any other ML part; they all hit very nice density figures for their respective nodes.
Edit: I'm probably wrong about this as N7+ seems to be high density and not high performance.
Nah N7+ has all the same options as vanilla N7.
 
Nah, it's due to all the MAC piles required for GEMM going brrrrr at alarming speeds.
See any other ML part; they all hit very nice density figures for their respective nodes.
Tensor cores take only about 10% of the die in Turing; even if that were 20% for Ampere, it cannot explain the high transistor density.

Nah N7+ has all the same options as vanilla N7.
If that is the case, N7+ is a more plausible explanation for the high density compared to vanilla N7.
 
Just read about this new Graphcore AI processor, eclipsing the A100, with 59.4 billion transistors on 823 mm2.
That is a transistor density of 72 MT/mm2.
As this is more than the max of 66.7 MT/mm2 for vanilla 7nm, this must also be using N7+.
The high density is a result of the 900 MB of on-chip memory this Graphcore has.
1 bit of SRAM requires 6 transistors, so the 900 MB translates to 43.2 B transistors...
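For anyone who wants to check the arithmetic above, here is a minimal sketch. It assumes 1 MB = 10^6 bytes and counts only the 6 transistors of each bit cell (no decoders, sense amps or other periphery); the variable names are just illustrative.

```python
# Figures quoted in the post above.
total_transistors = 59.4e9   # whole chip
die_area_mm2 = 823
sram_bytes = 900e6           # assumption: 1 MB = 10**6 bytes
transistors_per_bit = 6      # 6T SRAM cell, array only

# Overall density in millions of transistors per mm2.
density_mt_mm2 = total_transistors / die_area_mm2 / 1e6

# Transistors spent on the on-chip SRAM bit cells.
sram_transistors = sram_bytes * 8 * transistors_per_bit

print(f"overall density: {density_mt_mm2:.1f} MT/mm2")
print(f"SRAM transistors: {sram_transistors / 1e9:.1f} B")
```

With these inputs the density comes out at about 72 MT/mm2 and the SRAM at about 43.2 B transistors, matching the numbers in the post.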
 
It doesn't say it's using HP cells, so doesn't need to be N7+ necessarily.
 
If it used HD cells at 91.2 MT/mm2 (which are only suitable for mobile SoCs AFAIK), the 900 MB would fit in 473 mm2, leaving the remaining 16.2 B non-SRAM transistors on 350 mm2, for a density of 46.3 MT/mm2. I don't think so.
And in case you still doubt it, SRAM is the densest way you can pack transistors.
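The split described above can be reproduced with a few lines. This is only a sketch under the post's own assumptions: all SRAM transistors are packed at the quoted 91.2 MT/mm2 HD figure, and everything else shares whatever area is left.

```python
# Figures quoted in the thread.
total_transistors = 59.4e9
sram_transistors = 43.2e9      # 900 MB * 8 bits * 6 transistors
die_area_mm2 = 823
hd_density_per_mm2 = 91.2e6    # quoted N7 HD density, transistors per mm2

# Area the SRAM would need if packed at the HD density.
sram_area_mm2 = sram_transistors / hd_density_per_mm2

# Whatever is left has to hold all non-SRAM transistors.
logic_area_mm2 = die_area_mm2 - sram_area_mm2
logic_transistors = total_transistors - sram_transistors
logic_density_mt_mm2 = logic_transistors / logic_area_mm2 / 1e6

print(f"SRAM area:     {sram_area_mm2:.0f} mm2")
print(f"logic area:    {logic_area_mm2:.0f} mm2")
print(f"logic density: {logic_density_mt_mm2:.1f} MT/mm2")
```

Without intermediate rounding the logic density lands at roughly 46 MT/mm2, in line with the figure in the post, which rounds the remaining area to 350 mm2.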
 
Or else TSMC's transistor density estimates aren't based on SRAM cells alone, but assume a certain proportion of memory vs. logic vs. PHYs and whatnot.
For what it's worth, Hexus says it's N7:

https://hexus.net/tech/news/cpu/144154-graphcore-ipu-machine-m2000-1u-blade-capable-1petaflop/
 
No, that would be ridiculous. The first thing tried on a new process is always some SRAM, and the density is specified based on that.
Graphcore says: "...Graphcore Colossus™ Mk2 GC200 IPU. Developed using TSMC’s latest 7nm process"
Though they don't say anywhere that it is N7+, it doesn't take a genius to figure that out.
If it proves not to be N7+, I'll buy you a beer, or maybe something stronger :)
 
I'm not an expert on this at any level, but let's see:
https://en.wikichip.org/wiki/7_nm_lithography_process#Industry
N7 with HD cells is supposed to offer around 91-92 MTrans/mm^2. We also know that with HD cells one SRAM cell is 0.027 µm^2, which fits into 1 mm^2 37 million times.
Since 1 SRAM cell is actually 6 transistors, that 37 million times turns into 222 million transistors.
So if the density was reported on just SRAM cells, it would be 222 MTrans/mm^2, not 91-92 MTrans/mm^2 for the HD cells, right?
(For what it's worth, wikichip did feel the need to point out that N7 can offer really dense SRAM cells)
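The cell-size arithmetic above, as a quick sketch. It assumes ideal tiling of bit cells with no periphery at all, which is exactly the simplification being questioned in this thread.

```python
# Quoted N7 HD SRAM bit cell size.
cell_area_um2 = 0.027
um2_per_mm2 = 1_000_000        # 1 mm2 = 10**6 um2
transistors_per_cell = 6       # 6T SRAM cell

# How many cells tile 1 mm2 if nothing else is on the die.
cells_per_mm2 = um2_per_mm2 / cell_area_um2
transistors_per_mm2 = cells_per_mm2 * transistors_per_cell

print(f"cells per mm2:       {cells_per_mm2 / 1e6:.0f} M")
print(f"transistors per mm2: {transistors_per_mm2 / 1e6:.0f} M")
```

That gives about 37 M cells and about 222 MT per mm2, well above the quoted 91-92 MT/mm2 for HD cells.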
 
Apparently you can't just compute density from the SRAM cell size.
Like in this table, a 14 nm SRAM cell is 0.05 um2, which would be 6x20 MT/mm2, but as you can see in the table it is actually only 6x13.7 MT/mm2.
In practice it must be even less: for the 7nm Samsung chip with a 0.026 um2 cell size on that same page, the 256 Mbit SRAM is 69.3 mm2, which equates to a mere 22 MT/mm2. If anybody has access to this paper, there might be more information there.
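To put the gap between cell-size density and measured density into numbers, here is a small sketch using the Samsung figures quoted above. It assumes 1 Mbit = 10^6 bits (which reproduces the ~22 MT/mm2 in the post) and counts only the 6 transistors per bit cell.

```python
# Quoted figures for the 7nm Samsung SRAM.
cell_area_um2 = 0.026
sram_bits = 256e6              # assumption: 1 Mbit = 10**6 bits
sram_area_mm2 = 69.3
transistors_per_cell = 6

# Density if the area were nothing but tightly tiled bit cells.
ideal_mt_mm2 = 1e6 / cell_area_um2 * transistors_per_cell / 1e6

# Density actually implied by the quoted capacity and area.
effective_mt_mm2 = sram_bits * transistors_per_cell / sram_area_mm2 / 1e6

print(f"ideal (cells only): {ideal_mt_mm2:.0f} MT/mm2")
print(f"effective:          {effective_mt_mm2:.0f} MT/mm2")
print(f"ratio:              {effective_mt_mm2 / ideal_mt_mm2:.2f}")
```

On these numbers the effective density is only about a tenth of what the bare cell size would suggest.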
 
The OctaneBench results compare the A100 (no RT cores available) against the best Turing result (with RT cores).
 