NVIDIA Tegra Architecture

The A57/A53 variant taped out quite some time ago on 20SoC. If a second Erista variant exists, it can't hide; we'll hear of its tape-out fairly soon.

So I've heard some more news. Apparently there is no Denver-based Erista variant in the works. I was told we may see something Denver-based late this year at the earliest. Now, it wasn't explicitly mentioned that it was an SoC. Maybe I'm reading too much into it, but could it be a GPU?
So Anand folks what happened to Duke Nexus Forever? :devilish:

Haha... and it's 2015 already! I'm curious as to what they have to say about it as well!
 
Delayed due to reasons. I'm hopeful for after CES...

Erinyes is pretty much spot on about Erista and Denver. We'll see tomorrow.

Sorry folks, but after CES most will be trying to analyze NV's Erista-based marketing rubbish and will care less about the Nexus 9.

Erinyes is usually pretty spot on about things, but it seems only you and I notice ;)

http://www.hardwarezone.com.sg/feature-preview-nvidia-tegra-x1-benchmark-results

256 Maxwell cores, no Denver (A57/A53 big.LITTLE like everyone else) and a 10 W power envelope, which K1 was apparently specced to already.

Great performance; on 16nm I'm sure this could end up in phones.

Erista is on TSMC 20SoC from what I've heard, and I haven't seen anything at all planned for 16FF in 2015 yet. As for the rather interesting power measurements in that link, are they with or without throttling?
 
Anandtech's take on Tegra X1:
http://www.anandtech.com/show/8811/nvidia-tegra-x1-preview

The use of the ARM Cortex A57 and A53, as NVIDIA tells it, was a time-to-market decision: NVIDIA could bring an off-the-shelf Cortex-based SoC to market sooner than it could another Denver SoC.


EDIT: It seems that in Erista's Maxwell, each core is capable of doing either one FP32 or two FP16 operations as long as the same operation (multiply/add/madd) is being done.
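To put the one-FP32-or-two-FP16 rule into numbers, here is a rough back-of-the-envelope sketch. The 256-core count is the figure quoted earlier in the thread; the clock is NOT from the thread and is only a placeholder assumption, and `peak_gflops` is a made-up helper name. An FMA is counted as two floating-point operations, as is conventional for peak figures.

```python
# Rough peak-throughput arithmetic for the "one FP32 or two FP16 per core" rule.
# ASSUMED_CLOCK_GHZ is a placeholder, not a confirmed Tegra X1 specification.

ASSUMED_CLOCK_GHZ = 1.0   # hypothetical clock, chosen only to make the math easy
CORES = 256               # Maxwell core count quoted in the thread


def peak_gflops(cores, clock_ghz, ops_per_core_per_clock):
    """Peak GFLOPS, counting each fused multiply-add as 2 floating-point ops."""
    flops_per_fma = 2
    return cores * clock_ghz * ops_per_core_per_clock * flops_per_fma


fp32 = peak_gflops(CORES, ASSUMED_CLOCK_GHZ, 1)  # one FP32 FMA per core per clock
fp16 = peak_gflops(CORES, ASSUMED_CLOCK_GHZ, 2)  # two FP16 FMAs per core per clock

print(f"FP32 peak: {fp32:.0f} GFLOPS, FP16 peak: {fp16:.0f} GFLOPS")
```

Whatever the real clock turns out to be, the FP16 figure is exactly double the FP32 one, and only when both halves of the pair perform the same operation.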
 

I need time to study that article, since the CPU integration especially sounds quite interesting. As for FP16-related optimisations, let's hear what the usual suspects have to say NOW about it. :rolleyes:
 
The full notes from NVIDIA's whitepaper, for anyone curious:

while Tegra X1 also includes native support for FP16 Fused Multiply-Add (FMA) operations in addition to FP32 and FP64. To provide double rate FP16 throughput, Tegra X1 supports 2-wide vector FP16 operations, for example a 2-wide vector FMA instruction would read three 32-bit source registers A, B and C, each containing one 16b element in the upper half of the register and a second in the lower half, and then compute two 16b results (A*B+C), pack the results into the high and low halves of a 32 bit output which is then written into a 32-bit output register. In addition to vector FMA, vector ADD and MUL are also supported.
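For anyone who wants to see the packing scheme concretely, here is a minimal software emulation of the 2-wide FP16 FMA described in the whitepaper excerpt. It is only a sketch: `pack_half2`, `unpack_half2` and `hfma2` are hypothetical helper names, not NVIDIA's API, and the math is done in Python's double precision and rounded to FP16 once on packing, which approximates (but is not guaranteed to match bit-for-bit) a hardware FMA's single rounding.

```python
import struct


def pack_half2(lo, hi):
    """Pack two floats into one 32-bit word: 'hi' in the upper 16 bits, 'lo' in the lower."""
    (lo_bits,) = struct.unpack("<H", struct.pack("<e", lo))  # 'e' = IEEE 754 half precision
    (hi_bits,) = struct.unpack("<H", struct.pack("<e", hi))
    return (hi_bits << 16) | lo_bits


def unpack_half2(word):
    """Split a 32-bit word back into its (lo, hi) FP16 elements as Python floats."""
    (lo,) = struct.unpack("<e", struct.pack("<H", word & 0xFFFF))
    (hi,) = struct.unpack("<e", struct.pack("<H", (word >> 16) & 0xFFFF))
    return lo, hi


def hfma2(a, b, c):
    """Emulated 2-wide FMA: computes A*B+C on each 16-bit half independently."""
    a_lo, a_hi = unpack_half2(a)
    b_lo, b_hi = unpack_half2(b)
    c_lo, c_hi = unpack_half2(c)
    # Each lane is evaluated in double precision, then rounded to FP16 when packed.
    return pack_half2(a_lo * b_lo + c_lo, a_hi * b_hi + c_hi)


# Three 32-bit "registers", each holding two FP16 values.
A = pack_half2(1.5, 2.0)
B = pack_half2(2.0, 0.5)
C = pack_half2(0.25, 1.0)

result = hfma2(A, B, C)
print(f"0x{result:08X}", unpack_half2(result))
```

The point of the layout is that a single 32-bit register read feeds two lanes, which is where the double-rate FP16 throughput comes from; both lanes have to execute the same operation.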
 
Thanks for the article Ryan.
I am glad to see that first AMD and now Nvidia have joined the ranks. Now everybody has FP16 ALU support in their forthcoming chips. This is good news for post-processing performance (and in general you need fewer lookup tables for complex math, meaning savings in memory BW). ALU shouldn't be that big a bottleneck anymore for complex kernels (for example, modern screen-space AO techniques are quite ALU heavy).
 
Looking at the reference board picture, it looks like the X1's heat spreader is close to 100mm^2 (compared to the micro-SD slot), so the chip itself should be even smaller.
 

Eventually someone will have a die shot and we'll find out. Without knowing the exact transistor density, though, die area is only half the useful information. If they've used a density comparable to the A8X (an estimated 24 million/mm^2) and if your estimate is true, it could be a healthy bit over 2 billion transistors.
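The arithmetic behind that guess, as a quick sketch. Both inputs are the thread's own estimates (a roughly 100mm^2 heat spreader and an A8X-like density), not measured values, and `estimate_transistors` is just an illustrative helper name. Since the die should be smaller than the heat spreader, the result is best read as a ceiling.

```python
def estimate_transistors(die_area_mm2, density_millions_per_mm2):
    """Transistor count implied by a die area and a density given in millions per mm^2."""
    return die_area_mm2 * density_millions_per_mm2 * 1e6


# Thread estimates only: ~100 mm^2 heat spreader, ~24 million transistors per mm^2.
ceiling = estimate_transistors(100, 24)
print(f"about {ceiling / 1e9:.1f} billion transistors at most")
```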
 
You could drink shots of coloured paints and then hover over a sheet of paper and wait for nature to take its course, and you'd end up with a more accurate die shot than what's in NV Tegra marketing.
 
The first page had a marketing die shot... Don't know if it helps any.
You shouldn't mention it in Rys' presence. Last time he managed to make me look like a complete fool with the K1 mockup die shot LOL :D
 
I am a software engineer so I am quite limited in my die shot reading skills. I can (barely) distinguish caches from other logic :D

I would be perfectly happy with any photoshopped die shot :D
 