NVIDIA Tegra Architecture

The 64-bit Geekbench 3 data is based on a preliminary [alpha] build of the software for Android. This is what Ars had to say:

"This is the first major 64-bit Android device on the market, so the team at Geekbench sent us a preliminary 64-bit build of its benchmarking software. There were some lower-than-expected results in a few workloads; Primate Labs' John Poole noted that those numbers may increase in the near future as the team refines the 64-bit version of Geekbench for Android. Given the improvements inherent to the new ARMv8 instruction set, performance across the spectrum should increase as 64-bit Android devices become more commonplace."
 
Well, that's progress. Software updates can't fix the build and design problems, though.
You mean the Nexus 9, right? If so I agree; even without 64-bit support, I find the SHIELD Tablet and the Xiaomi MiPad more attractive than the Nexus 9.
There are also plenty of options not based on NVIDIA hardware that provide significantly more bang for the buck: the latest Asus tablet, the latest Lenovo Yoga Tab, the Samsung Galaxy Tab S 8.4, etc.

The Nexus 9 is nothing like the steal that the two previous Nexus 7s were.
I think this generation of Nexus devices is the least interesting one ever.
 
GK20A vs GX6650 perf/w

Now that the Anandtech iPad Air 2 review is out, I thought it'd be interesting to compare Kepler's performance per watt to the GX6650 in the iPad Air 2. It's never possible to get a fair comparison, since they're on different processes, have different costs and different displays, but still, it's worth taking a look. The Time and FPS numbers come from the GFXBench 3.0 battery test in Anand's review.
Code:
              | SHIELD Tablet | iPad Air 2
Battery (Wh)  | 19.75         | 27.62
Time (h)      | 2.24          | 3.84
Perf (FPS)    | 46.13         | 49.49
Pixels        | 2304000       | 3145728
Perf (Mpix/s) | 106.28        | 155.68
Watts         | 8.82          | 7.19
Perf/W        | 12.05         | 21.64

Looks like the iPad Air manages almost 2x the perf/w of the SHIELD Tablet.
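For anyone who wants to rerun those numbers for other devices, here's a minimal sketch of the arithmetic behind the table, assuming average power is simply battery energy divided by runtime and throughput is sustained FPS times rendered pixel count (1920x1200 for the SHIELD Tablet, 2048x1536 for the iPad Air 2):
Code:
# Rough perf/W estimate from GFXBench 3.0 battery-test figures.
# Assumption: average power ~= battery energy / runtime,
#             throughput    ~= sustained FPS * rendered pixels.

def perf_per_watt(battery_wh, runtime_h, fps, width, height):
    watts = battery_wh / runtime_h        # average power draw in W
    mpix_s = fps * width * height / 1e6   # throughput in Mpix/s
    return mpix_s / watts

print(perf_per_watt(19.75, 2.24, 46.13, 1920, 1200))  # SHIELD Tablet, ~12.05
print(perf_per_watt(27.62, 3.84, 49.49, 2048, 1536))  # iPad Air 2,   ~21.64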
 
Looks like the iPad Air manages almost 2x the perf/w of the SHIELD Tablet.

It'll get easier to decipher if you consider two things as a start:

1. It's not a GX6650 but most likely a GX6850, which means 16 TMUs and 256 FP32 SPs at ~500MHz.
2. It's a DX10.0 GPU versus a DX11.0 GPU.
 

Anand seems to think it's a GX6650 - what makes you think otherwise? I suppose if it's a GX6850, Apple has traded off perf/mm^2 to improve perf/w - and a theoretical GK20A that had 2 SMs but ran slower might do better. I don't think DX10 vs 11 should impact perf/w that much - GFXBench 3.0 doesn't use any of the new features, and I don't think DX11 GPUs require that much more abstraction than DX10.

Makes me curious for the upcoming Erista comparisons. =)
 
Anand seems to think it's a GX6650 - what makes you think otherwise?

Joshua seems to think so for lack of any other data, not Anand himself (just to avoid misunderstandings).

http://gfxbench.com/device.jsp?benchmark=gfx30&os=iOS&api=gl&D=Apple iPad Air 2

Kishonti lists a fillrate of 7625 MTexels/s; with 6 clusters you have 12 TMUs, and with 8 clusters 16 TMUs.

7625/12= 635
7625/16= 476

Considering that no GPU has 100% fillrate efficiency, the actual frequencies should be a healthy bit higher than those values. A roughly 500MHz frequency for an 8-cluster Rogue sounds more like Apple to me.
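To make that back-of-the-envelope step explicit, here's a quick sketch; the one-texel-per-TMU-per-clock assumption and the ~90% efficiency figure are mine, not anything published by Kishonti:
Code:
# Back out an implied GPU clock from a measured fillrate and a TMU count,
# assuming each TMU delivers one texel per clock at the given efficiency.
def implied_clock_mhz(fillrate_mtex_s, tmus, efficiency=1.0):
    return fillrate_mtex_s / (tmus * efficiency)

for tmus in (12, 16):  # 6-cluster vs. 8-cluster Rogue
    print(tmus, round(implied_clock_mhz(7625, tmus)))       # 635 / 477 MHz at 100%
    print(tmus, round(implied_clock_mhz(7625, tmus, 0.9)))  # 706 / 530 MHz at ~90%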

I suppose if it's a GX6850, Apple has traded off perf/mm^2 to improve perf/w - and a theoretical GK20A that had 2 SMs but ran slower might do better.

Die area and power consumption would rise significantly on a hypothetical GK20A with 2 SMXs, unless you mean 2 SMXs at ~425MHz, i.e. half the peak frequency of the current GK20A in the K1.

I don't think DX10 vs 11 should impact perf/w that much - GFXBench 3.0 doesn't use any of the new features, and I don't think DX11 GPUs require that much more abstraction than DX10.

The Apple A8 has a 4-cluster GX6450 clocked at an estimated =/>520MHz (6 Plus), which occupies 19.1mm² on TSMC 20SoC according to AnandTech's read of Chipworks' die shots. Now, die area won't scale linearly with the number of clusters, since not everything scales, but assume the A8X GPU weighs in at 30-35mm² on 20SoC with its DX10-level feature set; with all the DX11.x bells and whistles you'd land in the 45-50mm² region, and skipping those raises your perf/mm² and perf/W ratios.
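As a rough illustration of that area guess, here's a sketch; the split between fixed overhead and per-cluster area and the DX11 "feature tax" factor are pure assumptions chosen to land near the figures above, not measured values:
Code:
# Very rough GPU area model: fixed front-end/back-end area plus per-cluster area,
# optionally inflated by a DX11.x feature tax. All parameters are assumptions
# picked to roughly match the 19.1mm^2 GX6450 figure, not measurements.
def gpu_area_mm2(clusters, fixed_mm2=5.0, per_cluster_mm2=3.5, dx11_tax=1.0):
    return (fixed_mm2 + clusters * per_cluster_mm2) * dx11_tax

print(gpu_area_mm2(4))                 # ~19 mm^2, calibrated to the GX6450
print(gpu_area_mm2(8))                 # ~33 mm^2, in the 30-35 mm^2 ballpark
print(gpu_area_mm2(8, dx11_tax=1.4))   # ~46 mm^2, the 45-50 mm^2 region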

Makes me curious for the upcoming Erista comparisons. =)

Which will again land somewhere in between the A8X and whatever A9/tablet SoC Apple does next, since NV and Apple update on completely different timings within the year. Besides, Erista will be on TSMC 20SoC, while Apple's next tablet SoC will be on either Samsung 14nm or TSMC 16FF, wherever they can get decent volume.
 
Looks like the iPad Air manages almost 2x the perf/w of the SHIELD Tablet.
Well, there are a few holes in such a statement:
It is only about the GPUs, not the whole SoCs.
Bottlenecks: the A8X benefits from a 128-bit bus, while Kepler's performance could be held back by a bandwidth bottleneck.

Taking bandwidth into account and looking at the GPU alone, I believe it is easier to compare the A8 and the Tegra K1 SoCs. Too bad there's no A8 in a tablet :LOL:

I expect Apple to come out on top anyway: they have a process advantage, and looking at the jump from SMX to SMM in NVIDIA's mainline GPUs, there looks to be room for significant improvement on NV's side.
 
Actually Ryan wrote the SoC page, to be correct.

I stand corrected (although if there's a credit for it I missed it, since I'm used to looking at the author's name in the upper left corner). It doesn't change the point though: for the A8, AnandTech initially pointed at two clusters more (GX6650) when it turned out to be a GX6450, and in this case for the A8X they again point at the GX6650, most likely two clusters too few. In the end either way it's 12 clusters altogether (whether 6/6 or 4/8) ROFL :LOL:
 
Indeed that is a great showing for the first iteration of a new, sort of alien, CPU architecture.
For reference, here are AnandTech's results from the iPad Air 2 review; they give the SHIELD Tablet at ~135 minutes:
http://www.anandtech.com/show/8666/the-apple-ipad-air-2-review/5

The EDP was already encouraging (27 watts is a big number, but it was still better than the 30+ watts for the A15 version of the Tegra K1).

I'm also eagerly waiting for a serious review of that chip; it would be great if NVIDIA managed to bring even more competition to the field, and to do so they need proper CPU cores.
 
What would be really nice is a devkit based on TK1-Denver, so that we could install a standard Linux distribution on it and run a whole bunch of real benchmarks. Those might not be terribly representative of the sort of workload a tablet will face, but they would tell us a lot about Denver.
 

I think we're already getting an idea of the strengths and weaknesses of Denver. It will excel in native apps with hard-working kernels, where it can get the most from the runtime optimizations and the wide internals. It will suck at everything JavaScript, where code is initially interpreted until the JIT optimizations kick in.

Some JavaScript engines have up to three levels of JIT optimization (Nitro/Safari). The interpreter is control-dependency heavy, which is not a good match for an in-order VLIW. Newly JITted JavaScript code will run in single-instruction mode on Denver until the internal code optimization kicks in. Then the JavaScript code may be re-JITted with more aggressive optimizations, and Denver has to start over.
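To illustrate why that interpreter tier is such a poor match, here's a toy bytecode dispatch loop (purely illustrative, not code from any real JavaScript engine): the branch taken on every iteration depends on the opcode that was just fetched, so there is little straight-line code for a static scheduler or Denver's optimizer to work with until the JIT replaces the loop with native code.
Code:
# Toy bytecode interpreter (illustrative only, not from a real JS engine).
# The branch taken each iteration depends on the opcode just fetched, so the
# control flow is data-dependent, which is exactly the pattern that starves
# an in-order VLIW until the JIT emits straight-line native code instead.
def interpret(bytecode):
    stack, pc = [], 0
    while pc < len(bytecode):
        op, arg = bytecode[pc]
        if op == "PUSH":
            stack.append(arg)
        elif op == "ADD":
            stack.append(stack.pop() + stack.pop())
        elif op == "JUMP_IF_ZERO":
            pc = arg if stack.pop() == 0 else pc + 1
            continue
        pc += 1
    return stack

print(interpret([("PUSH", 2), ("PUSH", 3), ("ADD", None)]))  # -> [5]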

This interaction between the external (JavaScript JIT) and internal code optimization engines won't be an easy thing to solve.

Cheers
 
What would be really nice is a devkit based on TK1-Denver, so that we could install a standard Linux distribution on it and run a whole bunch of real benchmarks. Those might not be terribly representative of the sort of workload a tablet will face, but they would tell us a lot about Denver.
Not that it has any relevance, but looking at the current Nexus line I keep thinking that the Tegra K1 should have ended up in the STB and the Atom should have ended up in the tablet.
 