NVIDIA Tegra Architecture

Ailuros · Jan 7, 2014

DSC said:
More advanced than desktop Kepler too, I guess Maxwell will have ASTC support in the texture units?

K1 is DX11.2. What would any Maxwell standalone GPU (outside the future Parker ULP GPU) have ASTC for?

ninelven · Jan 7, 2014

Because there is a world outside of DX?

hkultala · Jan 7, 2014

DSC said:
It's obvious Nvidia will make announcements at Mobile World Congress in February about T4i, wait until then.

Though I wished Nvidia would replace the A9 with the A53, if this table is correct, it's faster than A9 and has 64bit ARMv8 support too.

http://www.anandtech.com/show/7573/...ed-on-64bit-arm-cortex-a53-and-adreno-306-gpu

It's only faster in 64-bit mode. When executing old 32-bit code, it's slower.

Exophase · Jan 7, 2014

hkultala said:
It's only faster in 64-bit mode. When executing old 32-bit code, it's slower.

"Below are some core-level performance numbers, all taken in AArch32 mode, comparing the Cortex A53 to its A5/A7 competitors:"

Ailuros · Jan 7, 2014

ninelven said:
Because there is a world outside of DX?

For a desktop GPU? I must be missing something essential here, and yes, ASTC would eulogy for Windows, but I can't imagine Microsoft adopting it.

ninelven · Jan 7, 2014

OpenGL gaming on Linux/Android/SteamOS?

And then there are technical merits...

mczak · Jan 7, 2014

DSC said:
More advanced than desktop Kepler too, I guess Maxwell will have ASTC support in the texture units?

In SOCs for smartphones/tablets yes, others remain to be seen.
(For the record, just about the only visible feature change from Ivy Bridge Graphics to Bay Trail graphics is the latter supports ETC, something Haswell graphics still does not.)

DSC · Jan 7, 2014

http://nvidianews.nvidia.com/Releases/Audi-and-NVIDIA-Expand-Visual-Computing-in-the-Car-a90.aspx

http://blogs.nvidia.com/blog/2014/0...egra-k1-to-power-piloted-driving-initiatives/

Congrats to Nvidia on the design win.

Ailuros · Jan 7, 2014

ninelven said:
OpenGL gaming on Linux/Android/SteamOS?

And then there are technical merits...

Of course there are a lot of advantages, but OGL gaming in niche use cases (at least until the landscape changes) doesn't sound all too convincing for NV to implement ASTC in something like a desktop GPU.

mczak · Jan 7, 2014

Ailuros said:
I'm afraid most of you are missing viable points if it comes to GLB2.7. The specific benchmark has a ton of alpha test based foliage.

Well I'm not entirely sure it's really completely bottlenecked by that, but regardless it's the only datapoint we have right now. And even if z tests got a proportionally larger increase compared to Tegra 4 than some other parts of the chips (which I don't know as it's very difficult to keep track of these usually undisclosed things), shader gflops increased big time too, so I don't think it's unreasonable to expect it to be a lot faster in general. I agree though certainly other benchmarks need to be used too to judge the worthiness of the competitors.

Despite that nobody missed me (which is nice honestly) and since I am a man of my word: I must admit in public that I have to eat my words and they truly seem to have integrated a full Kepler cluster into GK20A; I don't care if it's almost "full" either since I'm generous enough to stand by my mistakes.

Well the SMX seems to be a really full Kepler SMX (as in GK208, so half the TMUs of older ones), though we don't know yet if GPC etc. survived more or less the same. They might be redesigned though maybe just dropping all the stuff needed for scaling things up to multiple GPCs (which we know nvidia did) is all that's really changed.

Ailuros · Jan 8, 2014

mczak said:
Well I'm not entirely sure it's really completely bottlenecked by that, but regardless it's the only datapoint we have right now. And even if z tests got a proportionally larger increase compared to Tegra 4 than some other parts of the chips (which I don't know as it's very difficult to keep track of these usually undisclosed things), shader gflops increased big time too, so I don't think it's unreasonable to expect it to be a lot faster in general. I agree though certainly other benchmarks need to be used too to judge the worthiness of the competitors.

It doesn't take much to see what the bottleneck is if you look at it. On a side note their GPU slide might reveal early GLB3.0 results. G6430 seems to be at 11fps and Adreno330 at 9fps. GFXbench 3.0 seems to be close & the public will forget about the 2.x benchmarks fairly quickly.

Well the SMX seems to be a really full Kepler SMX (as in GK208, so half the TMUs of older ones), though we don't know yet if GPC etc. survived more or less the same. They might be redesigned though maybe just dropping all the stuff needed for scaling things up to multiple GPCs (which we know nvidia did) is all that's really changed.

It's DX11.2 & supports ASTC. If you're asking whether the raster supports 8 pixels/clock I'm not sure it really matters.

GLX · Jan 8, 2014

Ailuros said:
It doesn't take much to see what the bottleneck is if you look at it. On a side note their GPU slide might reveal early GLB3.0 results. G6430 seems to be at 11fps and Adreno330 at 9fps. GFXbench 3.0 seems to be close & the public will forget about the 2.x benchmarks fairly quickly.

You could look at the slide in the NV keynote and try to guess a figure for the K1 GFXBench 3.0 Manhattan score .. go on, you know you want to

Ailuros · Jan 8, 2014

GLX said:
You could look at the slide in the NV keynote and try to guess a figure for the K1 GFXBench 3.0 Manhattan score .. go on, you know you want to

I'm patient enough to wait for final device perf and, more importantly, perf/mW ratios, and then I'll see how it compares to the high end smartphones of the time (since that's what the specific slide is aiming to compare with), if K1 ever makes it into one, that is.

Rys · Jan 8, 2014

Agreed. I think it's quite clear that the high-end performance demonstrated for T124 definitely requires active cooling and sits outside the design limits for modern smartphones and tablets by a fairly big margin. What it's capable of at ~1.5W and ~5W will be much more interesting.

Deleted member 13524 · Jan 8, 2014

Ailuros said:
Despite that nobody missed me (which is nice honestly) and since I am a man of my word: I must admit in public that I have to eat my words and they truly seem to have integrated a full Kepler cluster into GK20A; I don't care if it's almost "full" either since I'm generous enough to stand by my mistakes.

As I said at the time, I don't care about apologies but I'm proudly wearing my party hat.

Even though it's not a 2-SMX GPU running at ~350MHz but a 1-SMX part running at ~900MHz, the performance figures are even better than I could ever anticipate.

Rys said:
Agreed. I think it's quite clear that the high-end performance demonstrated for T124 definitely requires active cooling and sits outside the design limits for modern smartphones and tablets by a fairly big margin.

Why not tablets?

AFAIK, there were 7" tablets with K1 in the show floor for hands-on from the press:

We're seeing it working on 7-inch tablets that are relatively thin and compact, but would the same 2GHz+ power be available inside your next Android smartphone?

Rys · Jan 8, 2014

ToTTenTranz said:
Why not tablets?

AFAIK, there were 7" tablets with K1 in the show floor for hands-on from the press:

Because (I believe) the power at 852 MHz+ is too much, without active cooling. If there were tablets, it's hard to see how they were running the clocks needed to meet the performance claims of almost 3x iPhone 5s in GFXBench 3's new test.

Ailuros · Jan 8, 2014

ToTTenTranz said:
As I said at the time, I don't care about apologies but I'm proudly wearing my party hat.

Even though it's not a 2-SMX GPU running at ~350MHz but a 1-SMX part running at ~900MHz, the performance figures are even better than I could ever anticipate.

Which performance figures? You're sure you want to keep that hat on your head?

For the record it was a looooong time ago that Exophase agreed with me here that Tegra5 in its upper boundaries would be an excellent candidate to fight against ULP Haswell configurations. Food for thought perhaps? Unless of course you're naive enough to believe that you can cram a 951MHz GPU / 2.3GHz 4+1 A15 into a smartphone; well, you can, but it won't be mobile anymore.

ams · Jan 8, 2014

Rys, TK1 was closer to 2.6x faster than the A7 GPU (rather than 3x).

The perf. per watt seems to improve as GPU power consumption goes above 1.5w.

Assuming that perf. per watt increases at a linear rate when moving from 1.5w to 3.5w GPU power consumed (which is unreasonable, but easy for illustrative purposes), then for TK1 to fit into a phone at similar power consumption levels as the A7 in the iPhone 5s, it would have just slightly below 2x more performance in GFXBench 3.0.

It would only need ~ 1w more GPU power consumption beyond that to match the claimed 2.6x improvement.

If that is truly the case, then I don't think that TK1 should have any problem meeting these performance claims in a thin fanless tablet (and will work in a high end smartphone too at slightly reduced frequencies). An actively cooled variant in Shield 2 would likely have even greater performance than what was claimed in the slide.

Deleted member 13524 · Jan 8, 2014

I bet we'll see the TK1 in a 6" phablet, at least.

In fact, using an aluminum body that is thermally connected to the SoC (like the HTC One), I bet we could see a slightly underclocked TK1 in a 5" smartphone and get a nice battery life using a 3200mAh battery to boot.

Exophase · Jan 8, 2014

ams said:
The perf. per watt seems to improve as GPU power consumption goes above 1.5w.

What is this comment based on? At 1.5W they're probably already vastly dominated by dynamic power over static; I don't think they'd have such strong performance vs their competitors otherwise. From there, getting more performance means increasing clock speed (linear), which necessitates increasing voltage (quadratic), which results in a superlinear increase in power. Not sublinear.
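To put rough numbers on that, here is a minimal sketch using the textbook CMOS switching-power model, P = C * V^2 * f. The voltage/frequency curve in `voltage_for` and all of its constants are made up for illustration; they are not measured Tegra K1 figures, and performance is simply assumed to scale with clock speed.

```python
def dynamic_power(freq_ghz, volts, cap=1.0):
    """Textbook switching-power model: P = C * V^2 * f (arbitrary units)."""
    return cap * volts ** 2 * freq_ghz


def voltage_for(freq_ghz, v_base=0.8, f_base=0.5, slope=0.4):
    """Hypothetical linear voltage/frequency curve (illustrative constants only)."""
    return v_base + slope * (freq_ghz - f_base)


def perf_per_watt(freq_ghz):
    # Assumption: performance scales linearly with clock speed.
    return freq_ghz / dynamic_power(freq_ghz, voltage_for(freq_ghz))


if __name__ == "__main__":
    for f in (0.5, 0.7, 0.9):
        p = dynamic_power(f, voltage_for(f))
        print(f"{f:.1f} GHz: power = {p:.3f}, perf/W = {perf_per_watt(f):.3f}")
```

Under these assumptions power grows faster than clock speed, so perf/W falls as frequency rises. Static power would shift the curve at the low end, but it would not turn the scaling sublinear at higher clocks.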
