NVIDIA Tegra Architecture

Logan will not be out until 2014.

Logan SoC will be announced by December 2013/January 2014.

On the plus side, it will probably feature Denver cores. It also won't be 50x faster than T2, more like 10-20x.

Based on the NVIDIA Tegra Roadmap, Logan was shown to be 50x faster than Tegra 2: http://www.technobuffalo.com/wp-content/uploads/2011/11/tegra-roadmap.jpg . Of course, I'm not sure how exactly NVIDIA came to the conclusion that Tegra 3 is 5x faster than Tegra 2, so depending on the performance metric used you could be right. :D
 
They could announce it in 2013, but I doubt they will. 20nm won't ramp until 2014 (if TSMC hits its target...), so there is no reason to undermine existing product sales.
 
They could announce it in 2013, but I doubt they will. 20nm won't ramp until 2014 (if TSMC hits its target...), so there is no reason to undermine existing product sales.

Agreed, availability in 2014 with announcement and hopefully some details provided in January 2014 at CES.
 
Compared to upcoming SGX544MP2, Adreno 320, or anything else with A5X-like performance, yes that would be enough raw performance to be competitive. Compared to upcoming Mali 6xx/Rogue 6 series/etc., it may be a different story. But that is what Batman is for, right? :smile:

I think you mean Wolverine. Batman is Tegra 4.
 
As said earlier... NVIDIA usually likes to stretch the performance characteristics of its upcoming products and tech... Tegra 3 "12 core", anyone!?

With that in mind... it would be very, very hard to switch to a brand new architecture... with dual-channel LPDDR3... and a full node better manufacturing process... and end up ONLY 2x more powerful... not to mention the A15s, which NVIDIA has typically used to pad its previous performance claims.

Like I said... if they had said 3-4x Tegra 3 performance OR mentioned Halti... then I could believe it...
 
I think you mean Wolverine. Batman is Tegra 4.

What I meant is that, if previous rumors are to be believed, then there will be two SoC variants in 2013 with [Bruce] Wayne/Tegra 4: Batman and Robin. Batman will be the higher-performance variant, and Robin will be the lower-performance variant. Whether this is really the case is anyone's guess, but a 2x GPU performance improvement over Tegra 3 throughout 2013 is not going to cut it against the most powerful GPUs used in SoCs from Apple and Samsung in 2013.
 
For modern mobile GPUs, I would think everyone would be wary of using flat multipliers like "2x" anymore and expecting them to be anywhere near reasonable.

Even Adreno 320, in some tests, ends up being "only 2x" that of Adreno 225.
 
Sure, those absolute performance multipliers have to be taken with a grain of salt. That said, in some instances it is certainly justifiable to say up to 2x GPU performance increase, such as A5X vs. A5, where the execution units doubled. So if the upcoming Tegra 4 variant hypothetically has double the GPU execution units vs. Tegra 3, running at the same GPU operating frequency as Tegra 3 (i.e. up to a 2x GPU performance increase), then that will not be good enough to directly compete against the GPU performance of the top SoCs next year from Apple and Samsung.
 
When has Tegra ever been comparable to its counterpart generations from Samsung and Apple in GPU performance? 2x Tegra 3 would be enough to compete with Adreno 320, which is pretty much the only competitor they need to worry about when it comes to getting BOM slots.

But I wouldn't expect it to be 2x across the board. We are moving to a generation of benchmarks that use more complex shaders.
 
Don't get me wrong, 2x performance increase over Tegra 3 is substantial and plenty good enough to get many design wins for NVIDIA at year end. But as you can see from the Tegra roadmap, the goal is to go leaps and bounds above that within the next 1-2 years.
 
Looks like recycled info from one of their previous rumours on the 29th of March: http://news.mydrivers.com/1/223/223107.htm

That story had so many things wrong it's hard to count, really (Adreno 225 in APQ8064 instead of 320, saying OMAP5's GPU is 300MHz and implying it's much slower than the PS Vita, quoting BSN, etc...) - it looks more like uninformed speculation to me. And 520MHz*64*2 is only ~67 GFLOPS anyway; I personally wouldn't be very impressed by that...
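A quick back-of-the-envelope check on that figure (a rough sketch only; the 520MHz clock and 64 lanes come from the rumour, while the 2 FLOPs per lane per clock for a MAD is my own assumption):

Code:
# Rough peak-GFLOPS estimate for the rumoured Wayne GPU configuration.
# Assumes each ALU lane issues one MAD (2 FLOPs) per clock -- an assumption,
# not a confirmed architectural detail.
def peak_gflops(clock_mhz, alu_lanes, flops_per_lane_per_clock=2):
    return clock_mhz * alu_lanes * flops_per_lane_per_clock / 1000.0

print(peak_gflops(520, 64))  # ~66.6, i.e. the "only ~67 GFLOPS" above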

Actually the story did say (or at least has been updated to say) Adreno 225 and not Adreno 320. And the OMAP5 they were talking about may have actually had a GPU operating frequency of 300MHz (i.e. not final production clock speeds): http://www.anandtech.com/show/5406/ti-shows-off-omap-5-arm-cortex-a15-at-ces . So what exactly is so outrageous or unbelievable about these claims regarding pure pixel shader performance? And why all the fuss about GFLOP throughput? Differences in GFLOP throughput between GPU architectures have never correlated well with differences in real-world gaming performance.
 
When has Tegra ever been comparable to its counterpart generations from Samsung and Apple in GPU performance? 2x Tegra 3 would be enough to compete with Adreno 320, which is pretty much the only competitor they need to worry about when it comes to getting BOM slots.

But I wouldn't expect it to be 2x across the board. We are moving to a generation of benchmarks that use more complex shaders.

The problem is knowing where, in NV's roadmap and that recent public statement, a performance increase is an "up to" figure and where it is an average.

The older roadmaps showed a 5x difference (SoC level/theoretical maximum) between T3 and T2 and a 2x difference between T4 and T3. It can hardly be an "up to" figure in the first case and an "average" in the second.

Either NV is just blowing smoke to surprise its competition, or the performance increase for Wayne is far more modest than many of us expected.

From what I recall, the 5x for T3 broke down as:

2*A9@1GHz vs. 4*A9@1.5GHz = 2.5x
8 GPU ALU lanes@333MHz vs. 12 GPU ALU lanes@520MHz = 2+x
and the remainder needed to reach 5x probably attributed to the fifth CPU companion core (rough math below)
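Plugging those numbers in (a rough sketch only; it simply multiplies unit count by clock and assumes perfectly linear scaling, which real workloads never achieve, hence figures like the 2.5x above coming in below the ideal ratio):

Code:
# Naive throughput ratios behind the "5x" T2 -> T3 roadmap figure.
# Treat the results as theoretical upper bounds, not measurements.
def scaling(units_new, mhz_new, units_old, mhz_old):
    return (units_new * mhz_new) / (units_old * mhz_old)

cpu = scaling(4, 1500, 2, 1000)  # quad A9 @ 1.5GHz vs. dual A9 @ 1GHz -> 3.0x ideal
gpu = scaling(12, 520, 8, 333)   # 12 ALU lanes @ 520MHz vs. 8 @ 333MHz -> ~2.3x ideal
print(f"CPU: {cpu:.1f}x, GPU: {gpu:.1f}x")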

The point isn't how realistic any of the above is; we all know how marketing works in that regard, and those funky "up to 5x or more" figures can be found nearly everywhere, not just at NV. What raises a question mark is the 2x SoC performance claim for T4 vs. T3, if it should follow the same reasoning. As I said, either it's a nasty trick to fool everyone and it's actually an average increase, or it's merely a shrunk T3 at higher frequencies. I still consider the latter scenario unlikely, but the question mark remains with that kind of vague marketing parlance.
 
Actually the story did say (or at least has been updated to say) Adreno 225 and not Adreno 320. And the OMAP5 they were talking about may have actually had a GPU operating frequency of 300MHz (i.e. not final production clock speeds): http://www.anandtech.com/show/5406/ti-shows-off-omap-5-arm-cortex-a15-at-ces . So what exactly is so outrageous or unbelievable about these claims regarding pure pixel shader performance? And why all the fuss about GFLOP throughput? Differences in GFLOP throughput between GPU architectures have never correlated well with differences in real-world gaming performance.

It was updated for a reason, don't you think? They even slightly lowered the supposed Wayne GPU frequency to 500MHz, because something probably wasn't adding up.

GLBenchmark 2.5 sounds very ALU-intensive. If you look at the results, the 543MP4@250MHz in the iPad 3 currently scores 2739 frames with 32 GFLOPs, and the ULP GeForce in T3 scores 1200 frames with 12.5 GFLOPs. A 2.3x score difference for a 2.5x GFLOPs difference is hardly a coincidence.
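The same comparison in plain arithmetic (using the scores and peak-GFLOPS figures quoted above; nothing beyond simple ratios is implied):

Code:
# GLBenchmark 2.5 score vs. peak GFLOPS, per the figures quoted above.
ipad3_frames, ipad3_gflops = 2739, 32.0  # SGX543MP4 @ 250MHz (iPad 3)
t3_frames, t3_gflops = 1200, 12.5        # Tegra 3 ULP GeForce
print(f"score ratio:  {ipad3_frames / t3_frames:.2f}x")   # ~2.28x
print(f"GFLOPS ratio: {ipad3_gflops / t3_gflops:.2f}x")   # ~2.56x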

Now if the top-dog Wayne GPU should truly sport 64 GFLOPs, that's a 5x increase in GFLOPs vs. the T30 GPU, which again makes for quite a contradiction with the claimed 2x performance increase between Wayne and T30.

Besides that, keep in mind that Wayne is supposed to range from clamshells down to superphones, with quite a difference in power consumption per device. 64 GFLOPs is anything but impressive in that light.
 
2x would not cut it for a new architecture moving to unified shaders, which would multiply the number of ALUs/FLOPs by more than 2x.

I don't think this is a new uarch... again, taking into consideration that Tegra 3 on 40nm has a pretty average die size... is OK power-consumption-wise... and, at least in the LPDDR2 versions, is memory-bandwidth starved at 1080p resolutions...
...so moving to 28nm, dual-channel LPDDR3, and some Cortex-A15s to feed it is going to increase performance anyway... hell, you could probably raise the clocks and increase the CPU cache and get near 2x without any more execution units.

If NVIDIA really is using a new uarch... and they are telling the truth about 2x... then the GPU die-space allocation compared to the CPU portion is going to be minuscule.

For that reason I'm plumping for a revised Tegra 3 GPU... with double the execution units... slightly more efficient and maybe a lower clock speed... then NVIDIA can use the "24 core" slogan and say they kept the 2x promise.

There may even be a high-end tablet version... like ams said, called Robin, with double that... for a "48 core" GPU... lower clocked.
 
2x Tegra 3 GPU would be plenty competitive with Nvidia's main competition, Texas Instruments and Qualcomm.
 
2x Tegra 3 GPU would be plenty competitive with Nvidia's main competition, Texas Instruments and Qualcomm.

No it wouldn't, if that 2x value is a peak for clamshell devices and GPU performance shrinks under the lower power envelopes of "superphones", which are TI's and Qualcomm's main targets.

Adreno 320 still smells to me like 4*SIMD16 / 4 TMUs @ >=400MHz. What Qualcomm probably needs is better drivers/compiler. A year from now they could easily increase the unit count or the GPU frequencies, or both.
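For what it's worth, that guessed configuration would translate to roughly the following peak throughput (my arithmetic on top of the speculation above, again assuming one MAD, i.e. 2 FLOPs, per lane per clock):

Code:
# Hypothetical peak throughput for the speculated Adreno 320 layout above:
# 4 SIMD16 arrays (64 lanes) at >= 400MHz, 2 FLOPs per lane per clock assumed.
lanes = 4 * 16
clock_mhz = 400  # lower bound of the ">= 400MHz" guess
print(lanes * clock_mhz * 2 / 1000.0, "GFLOPS peak")  # 51.2 at the lower bound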

As for TI, who says that OMAP5 is exclusively Series5XT?
 
No it wouldn't, if that 2x value is a peak for clamshell devices and GPU performance shrinks under the lower power envelopes of "superphones", which are TI's and Qualcomm's main targets.

True; since we don't know how Nvidia came up with the 2x Tegra 3 performance figure or how they are calculating it, it's all speculation.


As for TI, who says that OMAP5 is exclusively Series5XT?

TI does, since nothing else has been announced. They do have a Series 6 license, but there is no evidence to suggest it will be used with the OMAP 5 series. Given that TI is using a Series 5 GPU in late 2012/early 2013, I think it's more likely that OMAP + Rogue is a late-2013 product that will be competing with Tegra 5.
 
2x Tegra 3 would not be competitive.

Why not? Look at GLBenchmark 2.5; it's a much more realistic bench than 2.1 ever was, and it places Tegra 3 quite well compared to the other SoCs.


Although the software optimizations / exclusivity could in theory close the gap.

Close the gap?
Performance leader or not, Tegra 3 is the current platform of choice for anything 3D on Android. OUYA and that tablet with a gamepad are using Tegra 3, not a Snapdragon S4 or Exynos 4412.

Even the streaming apps have exclusive versions for Tegra 3 with more functionality, it's crazy.
 