NVIDIA Tegra Architecture

The 64-bit Geekbench 3 data is based on a preliminary [alpha] build of the software for Android. This is what Ars had to say:

"This is the first major 64-bit Android device on the market, so the team at Geekbench sent us a preliminary 64-bit build of its benchmarking software. There were some lower-than-expected results in a few workloads; Primate Labs' John Poole noted that those numbers may increase in the near future as the team refines the 64-bit version of Geekbench for Android. Given the improvements inherent to the new ARMv8 instruction set, performance across the spectrum should increase as 64-bit Android devices become more commonplace."
 
Well, that's progress. Software updates can't fix the build and design problems, though.
You mean the Nexus 9, right? If so I agree; even without 64-bit support, I find the SHIELD Tablet and the Xiaomi MiPad more attractive than the Nexus 9.
There are also plenty of options not based on NVIDIA hardware that provide significantly more bang for the buck: the latest Asus tablet, the latest Lenovo Yoga Tab, the Samsung Galaxy Tab S 8.4, etc.

The Nexus 9 is nothing like the steal that the two previous Nexus 7s were.
I think this generation of Nexus devices is the least interesting one ever.
 
GK20A vs GX6650 perf/w

Now that the Anandtech iPad Air 2 review is out, I thought it'd be interesting to compare Kepler's performance per watt to the GX6650 in the iPad Air 2. It's never possible to get a fair comparison, since they're on different processes, have different costs and different displays, but still, it's worth taking a look. The Time and FPS numbers come from the GFXBench 3.0 battery test in Anand's review.
Code:
              | SHIELD Tablet | iPad Air 2
Battery (Wh)  | 19.75         | 27.62
Time (h)      | 2.24          | 3.84
Perf (FPS)    | 46.13         | 49.49
Pixels        | 2304000       | 3145728
Perf (Mpix/s) | 106.28        | 155.68
Watts         | 8.82          | 7.19
Perf/W        | 12.05         | 21.64

Looks like the iPad Air manages almost 2x the perf/w of the SHIELD Tablet.
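For anyone who wants to rerun those numbers for other devices, here's a minimal sketch of the arithmetic behind the table, assuming average power is simply battery energy divided by runtime and throughput is sustained FPS times rendered pixel count (1920x1200 for the SHIELD Tablet, 2048x1536 for the iPad Air 2):
Code:
# Rough perf/W estimate from GFXBench 3.0 battery-test figures.
# Assumption: average power ~= battery energy / runtime,
#             throughput    ~= sustained FPS * rendered pixels.

def perf_per_watt(battery_wh, runtime_h, fps, width, height):
    watts = battery_wh / runtime_h        # average power draw in W
    mpix_s = fps * width * height / 1e6   # throughput in Mpix/s
    return mpix_s / watts

print(perf_per_watt(19.75, 2.24, 46.13, 1920, 1200))  # SHIELD Tablet, ~12.05
print(perf_per_watt(27.62, 3.84, 49.49, 2048, 1536))  # iPad Air 2,   ~21.64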
 
Looks like the iPad Air manages almost 2x the perf/w of the SHIELD Tablet.

It'll get easier to decipher if you consider two things as a start:

1. It's not a GX6650 but most likely a GX6850, which means 16 TMUs and 256 FP32 SPs at ~500MHz.
2. It's a DX10.0 GPU versus a DX11.0 GPU.
 

Anand seems to think it's a GX6650 - what makes you think otherwise? I suppose if it's a GX6850, Apple has traded off perf/mm^2 to improve perf/w - and a theoretical GK20A that had 2 SMs but ran slower might do better. I don't think DX10 vs 11 should impact perf/w that much - GFXBench 3.0 doesn't use any of the new features, and I don't think DX11 GPUs require that much more abstraction than DX10.

Makes me curious for the upcoming Erista comparisons. =)
 
Anand seems to think it's a GX6650 - what makes you think otherwise?

Joshua seems to think so for lack of any other data, not Anand himself (just to avoid misunderstandings).

http://gfxbench.com/device.jsp?benchmark=gfx30&os=iOS&api=gl&D=Apple iPad Air 2

Kishonti lists a fillrate of 7625 MTexels/s; with 6 clusters you have 12 TMUs, and with 8 clusters 16 TMUs.

7625/12= 635
7625/16= 476

Considering that no GPU has 100% fillrate efficiency, the actual frequencies should be a healthy bit higher than those values. A roughly 500MHz frequency for an 8-cluster Rogue sounds more like Apple to me.
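To make that back-of-the-envelope step explicit, here's a quick sketch; the one-texel-per-TMU-per-clock assumption and the ~90% efficiency figure are mine, not anything published by Kishonti:
Code:
# Back out an implied GPU clock from a measured fillrate and a TMU count,
# assuming each TMU delivers one texel per clock at the given efficiency.
def implied_clock_mhz(fillrate_mtex_s, tmus, efficiency=1.0):
    return fillrate_mtex_s / (tmus * efficiency)

for tmus in (12, 16):  # 6-cluster vs. 8-cluster Rogue
    print(tmus, round(implied_clock_mhz(7625, tmus)))       # 635 / 477 MHz at 100%
    print(tmus, round(implied_clock_mhz(7625, tmus, 0.9)))  # 706 / 530 MHz at ~90%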

I suppose if it's a GX6850, Apple has traded off perf/mm^2 to improve perf/w - and a theoretical GK20A that had 2 SMs but ran slower might do better.

Die area and power consumption would rise significantly on a hypothetical GK20A with 2 SMXs, unless you mean 2 SMXs at ~425MHz, i.e. half the peak frequency of the current GK20A in the K1.

I don't think DX10 vs 11 should impact perf/w that much - GFXBench 3.0 doesn't use any of the new features, and I don't think DX11 GPUs require that much more abstraction than DX10.

The Apple A8 has a 4-cluster GX6450 clocked at an estimated =/>520MHz (6 Plus), which occupies 19.1mm² on TSMC 20SoC according to AnandTech's read of Chipworks' die shots. Now, die area won't scale linearly with the number of clusters, since not everything scales, but assume the A8X GPU weighs in at 30-35mm² on 20SoC with its DX10-level feature set; with all the DX11.x bells and whistles you'd land in the 45-50mm² region, and skipping those raises your perf/mm² and perf/W ratios.
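As a rough illustration of that area guess, here's a sketch; the split between fixed overhead and per-cluster area and the DX11 "feature tax" factor are pure assumptions chosen to land near the figures above, not measured values:
Code:
# Very rough GPU area model: fixed front-end/back-end area plus per-cluster area,
# optionally inflated by a DX11.x feature tax. All parameters are assumptions
# picked to roughly match the 19.1mm^2 GX6450 figure, not measurements.
def gpu_area_mm2(clusters, fixed_mm2=5.0, per_cluster_mm2=3.5, dx11_tax=1.0):
    return (fixed_mm2 + clusters * per_cluster_mm2) * dx11_tax

print(gpu_area_mm2(4))                 # ~19 mm^2, calibrated to the GX6450
print(gpu_area_mm2(8))                 # ~33 mm^2, in the 30-35 mm^2 ballpark
print(gpu_area_mm2(8, dx11_tax=1.4))   # ~46 mm^2, the 45-50 mm^2 region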

Makes me curious for the upcoming Erista comparisons. =)

Which will again land somewhere in between the A8X and whatever A9/tablet SoC Apple does next, since NV and Apple update on completely different timings within the year. Besides, Erista will be on TSMC 20SoC, while Apple's next tablet SoC will be on either Samsung 14nm or TSMC 16FF, wherever they can get decent volume.
 
Looks like the iPad Air manages almost 2x the perf/w of the SHIELD Tablet.
Well, there are a few holes in such a statement:
It is only about the GPUs, not the whole SoCs.
Bottlenecks: the A8X benefits from a 128-bit bus, while Kepler's performance could be held back by a bandwidth bottleneck.

Taking bandwidth into account and looking at the GPU alone, I believe it is easier to compare the A8 and the Tegra K1 SoCs. Too bad there's no A8 in a tablet :LOL:

I expect Apple to come out on top anyway: they have a process advantage, and looking at the jump from SMX to SMM in NVIDIA's mainline GPUs, there looks to be room for significant improvement on NV's side.
 
Actually Ryan wrote the SoC page, to be correct.

I stand corrected (although if there's a credit for it I missed it, since I'm used to looking at the author's name in the upper left corner). It doesn't change the point though: for the A8, AnandTech initially pointed at two clusters more (GX6650) when it turned out to be a GX6450, and in this case for the A8X they again point at the GX6650, most likely two clusters too few. In the end either way it's 12 clusters altogether (whether 6/6 or 4/8) ROFL :LOL:
 
Indeed that is a great showing for the first iteration of a new, sort of alien, CPU architecture.
For reference, here are AnandTech's results from the iPad Air 2 review; they give the SHIELD Tablet at ~135 minutes:
http://www.anandtech.com/show/8666/the-apple-ipad-air-2-review/5

The EDP was already encouraging (27 watts is a big number, but it was still better than the 30+ watts for the A15 version of the Tegra K1).

I'm also eagerly waiting for a serious review of that chip; it would be great if NVIDIA managed to bring even more competition to the field, and to do so they need proper CPU cores.
 
What would be really nice is a devkit based on TK1-Denver, so that we could install a standard Linux distribution on it and run a whole bunch of real benchmarks. Those might not be terribly representative of the sort of workload a tablet will face, but they would tell us a lot about Denver.
 

I think we're already getting an idea of the strengths and weaknesses of Denver. It will excel in native apps with hard-working kernels, where it can get the most from the runtime optimizations and the wide internals. It will suck at everything JavaScript, where code is initially interpreted until the JIT optimizations kick in.

Some JavaScript engines have up to three levels of JIT optimization (Nitro/Safari). The interpreter is control-dependency heavy, which is not a good match for an in-order VLIW. Newly JITted JavaScript code will run in single-instruction mode on Denver until the internal code optimization kicks in. Then the JavaScript code may be re-JITted with more aggressive optimizations, and Denver has to start over.
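To illustrate why that interpreter tier is such a poor match, here's a toy bytecode dispatch loop (purely illustrative, not code from any real JavaScript engine): the branch taken on every iteration depends on the opcode that was just fetched, so there is little straight-line code for a static scheduler or Denver's optimizer to work with until the JIT replaces the loop with native code.
Code:
# Toy bytecode interpreter (illustrative only, not from a real JS engine).
# The branch taken each iteration depends on the opcode just fetched, so the
# control flow is data-dependent, which is exactly the pattern that starves
# an in-order VLIW until the JIT emits straight-line native code instead.
def interpret(bytecode):
    stack, pc = [], 0
    while pc < len(bytecode):
        op, arg = bytecode[pc]
        if op == "PUSH":
            stack.append(arg)
        elif op == "ADD":
            stack.append(stack.pop() + stack.pop())
        elif op == "JUMP_IF_ZERO":
            pc = arg if stack.pop() == 0 else pc + 1
            continue
        pc += 1
    return stack

print(interpret([("PUSH", 2), ("PUSH", 3), ("ADD", None)]))  # -> [5]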

This interaction between the external (JavaScript JIT) and internal code optimization engines won't be an easy thing to solve.

Cheers
 
What would be really nice is a devkit based on TK1-Denver, so that we could install a standard Linux distribution on it and run a whole bunch of real benchmarks. Those might not be terribly representative of the sort of workload a tablet will face, but they would tell us a lot about Denver.
Not that it has any relevance, but looking at the current Nexus line I keep thinking that the Tegra K1 should have ended up in the STB and the Atom should have ended up in the tablet.
 