NVIDIA Tegra Architecture

Sure, but you don't really think that 6x number involves compute tasks, do you? How much compute do you really think will be happening on phones and tablets even in that timeframe?

Aren't a lot of the use-cases (gesture recognition, facial feature detection, etc.) compute-intensive tasks that can be made available via APIs and accelerated via GPU/CPU compute?
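
Just to make concrete the kind of per-pixel work I mean, here's a minimal sketch of offloading one such stage (a gradient-magnitude pass of the sort feature detectors use) through OpenCL via pyopencl; the kernel and every name in it are made up purely for illustration:

```python
# Minimal sketch (assumed setup): offload a per-pixel gradient-magnitude pass,
# a typical building block of face/gesture feature detection, to the GPU via
# OpenCL. All names here are illustrative; this is not any vendor's actual API
# beyond standard OpenCL/pyopencl calls.
import numpy as np
import pyopencl as cl

KERNEL_SRC = """
__kernel void grad_mag(__global const float *lum,
                       __global float *mag,
                       const int w, const int h)
{
    int x = get_global_id(0);
    int y = get_global_id(1);
    if (x <= 0 || y <= 0 || x >= w - 1 || y >= h - 1)
        return;                       // leave the border pixels untouched
    float gx = lum[y * w + (x + 1)] - lum[y * w + (x - 1)];
    float gy = lum[(y + 1) * w + x] - lum[(y - 1) * w + x];
    mag[y * w + x] = sqrt(gx * gx + gy * gy);
}
"""

w, h = 640, 480
lum = np.random.rand(h, w).astype(np.float32)   # stand-in for a camera frame
mag = np.zeros_like(lum)

ctx = cl.create_some_context()
queue = cl.CommandQueue(ctx)
prg = cl.Program(ctx, KERNEL_SRC).build()

mf = cl.mem_flags
lum_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=lum)
mag_buf = cl.Buffer(ctx, mf.WRITE_ONLY | mf.COPY_HOST_PTR, hostbuf=mag)

# One work-item per pixel; the GPU runs them in parallel.
prg.grad_mag(queue, (w, h), None, lum_buf, mag_buf, np.int32(w), np.int32(h))
cl.enqueue_copy(queue, mag, mag_buf)
queue.finish()
```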
 
Sure, but you don't really think that 6x number involves compute tasks, do you? How much compute do you really think will be happening on phones and tablets even in that timeframe?

I think you're taking my comments out of context. I'm not giving my opinion on what is good for mobiles; I'm merely trying to put some rationale behind that mythical 6x number.
How else can it get to 6x if not through compute?

And besides, we are past the days when phones only did dumb tasks; my phone operates as my home computer :)
 
Aren't a lot of the use-cases (gesture recognition, facial feature detection, etc.) compute-intensive tasks that can be made available via APIs and accelerated via GPU/CPU compute?

Can they be? Maybe. Will they be any time soon? I doubt it. Especially not given the vast assortment of GPU types and quality levels compared to CPU types, and especially not if there's a continued trend towards having 4 cores and nothing to do on them.

Look at how long it took to even use the GPU for accelerating graphics properly, on Android..

On iOS the story may be different, but neither of these GPUs will be used on iOS.
 
Sure, but you don't really think that 6x number involves compute tasks, do you? How much compute do you really think will be happening on phones and tablets even in that timeframe?

On phones, I can't see why you'd need compute. For tablets, it's entirely possible. It all depends on when compute finally takes off for consumer-level applications. Presumably, if it takes off in x86 desktop Metro then it would also potentially take off in Windows RT Metro.

That said, compute has yet to make any significant inroads in consumer applications on desktop Windows. So, who knows when/if it'll happen.

Regards,
SB
 
Could something like compiler efficiency pull the scores up? Maybe some cache coherency with the CPU, a la the T604?
Also, NVIDIA does have the best drivers... probably the only reason why Tegra pulls ahead of the Adreno 225?

Again, given the original source that rumor was based on, it's nothing serious to go by.

I haven't got a clue, to be honest; if it is real then it's going to be some obscure NVIDIA in-house compute benchmark.

As I've said more than once before, Wayne will most likely put NV into a completely different GPU performance league, yet it still doesn't change any of the above.
 
Can they be? Maybe. Will they be any time soon? I doubt it. Especially not given the vast assortment of GPU types and quality levels compared to CPU types, and especially not if there's a continued trend towards having 4 cores and nothing to do on them.

Look at how long it took to even use the GPU for accelerating graphics properly, on Android..

On iOS the story may be different, but neither of these GPUs will be used on iOS.

As Silent Buddha said above, compute has yet to make a significant breakthrough in the desktop space, yet I'd say that you need the hw to exist in order to create the relevant sw for it.

IMHO it might be even more of a priority for the small-form-factor markets than for the desktop, since due to power consumption restrictions neither the number of CPU cores nor frequencies can scale forever. Scaling GPU ALUs at far lower frequencies is much more affordable here, and that's the reason why I personally believe that GPUs will start to take up a significantly larger proportion of die area in small-form-factor SoCs. Besides, it's not really a coincidence either that several major players have formed the HSA Foundation to concentrate more and more in that direction: http://www.eetindia.co.in/ART_88006...140,8629866839,2012-06-13,EEIOL,ARTICLE_ALERT
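
As a toy illustration of that power argument (made-up numbers, dynamic power only, using the usual P ~ C·V²·f relation):

```python
# Toy comparison of "few fast" vs "many slow" ALUs at equal throughput,
# using dynamic power P ~ C * V^2 * f. All numbers are invented for illustration.

def dyn_power(units, volts, freq_ghz, cap_per_unit=1.0):
    return units * cap_per_unit * volts ** 2 * freq_ghz

def throughput_gflops(units, freq_ghz, flops_per_clock=2):  # one MAD = 2 FLOPs
    return units * flops_per_clock * freq_ghz

# Option A: 4 ALUs at 1.0 GHz, assumed to need ~1.1 V to hit that clock.
# Option B: 8 ALUs at 0.5 GHz, assumed to get away with ~0.9 V.
options = {"few fast": (4, 1.1, 1.0), "many slow": (8, 0.9, 0.5)}

for name, (units, volts, freq) in options.items():
    print(name, throughput_gflops(units, freq), "GFLOPS at",
          round(dyn_power(units, volts, freq), 2), "arbitrary power units")

# Both options deliver 8 GFLOPS, but "few fast" burns ~4.84 units vs ~3.24 for
# "many slow" -- i.e. the wide/slow configuration saves roughly a third of the
# dynamic power for the same throughput.
```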

Adding to metafor's points about gesture recognition and whatnot, here's TI's marketing fluff video for OMAP5, obviously meant for a smartphone platform amongst others:

http://www.youtube.com/watch?v=RuyOb6W4bas

...obvious marketing and related exaggerations aside, I'd be very surprised if the first OMAP5 variant gets much more than, say, 30 GFLOPs of floating-point power out of its GPU. The generation past that (Halti/=/<28nm) from all GPU vendors sounds like a huge increase in terms of FLOPs.
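
For what it's worth, the napkin math behind a GFLOP figure like that looks roughly as follows (the lane count and clock below are assumed round numbers, not actual OMAP5 specs):

```python
# Napkin math for a GPU's peak FLOP rating:
#   lanes * FLOPs per lane per clock * clock
# The lane count and clock are assumed round numbers, not a real part's specs.
alu_lanes = 32        # assumed total scalar ALU lanes across the GPU
flops_per_lane = 2    # a fused multiply-add counted as 2 FLOPs
clock_ghz = 0.5       # assumed 500 MHz GPU clock

peak_gflops = alu_lanes * flops_per_lane * clock_ghz
print(peak_gflops, "GFLOPS")  # -> 32.0, i.e. in the ballpark of ~30 GFLOPs
```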
 
Ahem, did anyone notice that Tegra 2 uses a "ninja" (aka companion) core and was the first commercial implementation of big.LITTLE?
 
Ahem, did anyone notice that Tegra 2 uses a "ninja" (aka companion) core and was the first commercial implementation of big.LITTLE?

No it wasn't; there has not been any implementation of big.LITTLE yet. big.LITTLE is not just a different core that switches on when the main cores switch off; the cores can actually combine to crunch numbers together if needed, as they carry the same ISA. Only Cortex-A15s and Cortex-A7s can be used for big.LITTLE, not A9s and ARM11s.
 
Sure, NVIDIA's implementation of the A7 + A9s in Tegra 2 is not identical to ARM's implementation of A7 + A15 in what is marketed as "big.LITTLE", but the general concept is still somewhat similar (note that the term "big.LITTLE" actually refers to a high-power core plus a low-power companion core, but I should have avoided using that exact terminology because it refers to ARM's implementation and not NVIDIA's). In big.LITTLE, the A15 can be utilized for heavy workloads, while the A7 can take over for lighter workloads. With Tegra 2, the A9s can be utilized for heavy workloads, while the A7 can take over for lighter workloads. So again, the same general concept is in play here. The goal with big.LITTLE is not to have the A15s and A7s all number-crunching at the same time; the goal is to use the A15 for heavy workloads for high performance, and the A7 for lighter workloads for improved battery life.

Anyway, it was Tegra 2 (and not Tegra 3) where the companion core was first commercially implemented, but NVIDIA didn't disclose that fact until after Tegra 3 was released.
 
And just to clarify, when A15 and A7 are both executing code at the same time, ARM refers to this as "big.LITTLE MP" for multi-processing mode. So I take it that "big.LITTLE" refers to use of either high power core or low power core, but not both.
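
A purely illustrative way to picture that difference (this is just the scheduling idea, not how ARM's switcher is actually implemented; the 0.5 "LITTLE capacity" is an invented number):

```python
# Toy model of the difference: classic big.LITTLE migrates work between
# clusters (only one active at a time), while big.LITTLE MP can keep both
# clusters busy at once. Loads and capacities are invented for illustration.

def classic_big_little(load):
    """Cluster migration: either the big or the LITTLE cluster runs, not both."""
    if load > 0.5:
        return {"big": load, "LITTLE": 0.0}
    return {"big": 0.0, "LITTLE": load}

def big_little_mp(load):
    """MP mode: fill the LITTLE cluster first, spill the rest onto big."""
    little = min(load, 0.5)          # assumed LITTLE-cluster capacity
    big = max(0.0, load - little)
    return {"big": big, "LITTLE": little}

for load in (0.2, 0.8):
    print(load, classic_big_little(load), big_little_mp(load))
```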
 
Anyway, it was Tegra 2 (and not Tegra 3) where the companion core was first commercially implemented, but NVIDIA didn't disclose that fact until after Tegra 3 was released.
A common misconception. People have been using the concept of a 'companion core' in that general sense for decades. Before SoC integration became widespread, it was common for many chips to have their own general-purpose (Turing-complete) processor integrated (usually a microcontroller, but possibly a DSP or even a CPU). When those chips got integrated, that processor was often kept separate for system architecture, power efficiency, and simplicity reasons. One simple and modern example is Bluetooth(/WiFi/GPS) chips, which have their own microcontroller (usually a Cortex-M3 nowadays), and the few companies that integrate them into single-chip SoCs with a baseband/application processor always keep the microcontroller separate.

The fact both cores are ARM-based is irrelevant; they could be different ISAs and it would change nothing in this case as Tegra 2 would never move processing between the ARM7 and the Cortex-A9s dynamically; specific tasks would always happen on the same core. The unique thing about big.LITTLE is that any workload can be done on any of the cores, and even on all the cores in parallel with big.LITTLE MP. This is also the case with Tegra 3 which therefore has a true companion core even if the implementation details are extremely different.

Also I should point out that it's actually Tegra 1 which first used an ARM11+ARM7 design, but both Tegra 1 and Tegra 2 only used the ARM7 for multimedia tasks. Meanwhile Imagination's PowerVR MSVDX 1080p video decode core integrated a dedicated META processor (as do all new VXD/VXE cores) and it is public knowledge that an IMG video decode core was used in Intel's Menlow platform (original Atom for MIDs) which shipped in devices even before Tegra 1 (I shouldn't comment about power efficiency since that is specific to Intel, but I remember a fun argument I had about this very subject with NVIDIA's Mike Rayfield when I first met him in person ;))
 
Sure, NVIDIA's implementation of the A7 + A9s in Tegra 2 is not identical to ARM's implementation of A7 + A15 in what is marketed as "big.LITTLE", but the general concept is still somewhat similar (note that the term "big.LITTLE" actually refers to a high-power core plus a low-power companion core, but I should have avoided using that exact terminology because it refers to ARM's implementation and not NVIDIA's). In big.LITTLE, the A15 can be utilized for heavy workloads, while the A7 can take over for lighter workloads. With Tegra 2, the A9s can be utilized for heavy workloads, while the A7 can take over for lighter workloads. So again, the same general concept is in play here. The goal with big.LITTLE is not to have the A15s and A7s all number-crunching at the same time; the goal is to use the A15 for heavy workloads for high performance, and the A7 for lighter workloads for improved battery life.

Anyway, it was Tegra 2 (and not Tegra 3) where the companion core was first commercially implemented, but NVIDIA didn't disclose that fact until after Tegra 3 was released.
I think you're getting a little mixed up, like I did.
The Cortex-A7 has not been released yet, so Tegra 1, 2 and 3 did not contain this core.

Whilst you're right that the whole concept is similar, as in using a low-power core for light tasks and a heavy core for heavy tasks, big.LITTLE is a little different.

Tegra 1 uses ARM11 in single- or dual-core form, with a little ARM7 for light tasks.

Tegra 2 uses dual Cortex-A9s and an ARM7 (I think), but NVIDIA used a completely different idea with Tegra 3, where they used 5 Cortex-A9s which share the same cache (I think), with the SoC and companion core built on an LP process and the 4 big cores using a higher-performance process. This is totally different to Tegra 1 & 2, which in turn is different to ARM's big.LITTLE.

None of the NVIDIA setups can combine resources for one task like big.LITTLE can; also, big.LITTLE is all built on the same process.
 
Thank you both for the clarification! That helps to explain why NVIDIA never bothered to market the third [ARM7, not A7 :oops:] core that was used in Tegra 2 (and Tegra 1).
 
So is this Tegra 3+? http://phandroid.com/2012/07/12/new...ting-sporting-1-7ghz-tegra-3-processor-rumor/

An HTC One (X+?) phone with a Tegra 3 variant that is about 20-30% faster than the HTC One X (international phone) in terms of CPU/GPU operating frequencies should be very competitive with even the quad-core Samsung Galaxy S III (international phone). Of course, the rumor is that this upcoming HTC phone will be available on the AT&T network, so that means a domestic phone with 4G LTE. So if the rumor is true, it will be interesting to see how this variant is marketed and positioned relative to the HTC One X domestic phone.
 
So is this Tegra 3+? http://phandroid.com/2012/07/12/new...ting-sporting-1-7ghz-tegra-3-processor-rumor/

An HTC One (X+?) phone with a Tegra 3 variant that is about 20-30% faster than the HTC One X (international phone) in terms of CPU/GPU operating frequencies should be very competitive with even the quad-core Samsung Galaxy S III (international phone). Of course, the rumor is that this upcoming HTC phone will be available on the AT&T network, so that means a domestic phone with 4G LTE. So if the rumor is true, it will be interesting to see how this variant is marketed and positioned relative to the HTC One X domestic phone.

Two words...battery life. :/
 
Two words...battery life. :/

I'm not sure I follow you. Are you trying to say that battery life would be severely compromised in Tegra 3+? I doubt that. I would expect Tegra 3+ to be a die-shrunk version of Tegra 3, fabricated on a 28nm process rather than a 40nm process. Also, on the HTC One X international phone, the software had much room for improvement. With software updates from HTC, NVIDIA, and Google, newer phones running on Jelly Bean should be just fine.
 
It's got the same manufacturing process, just higher clocks and, at least in the tablet version, DDR3L...

28nm is as rare as rocking horse poo :)
 
I think you are thinking of T33, which is ~ 20-30% faster in terms of CPU/GPU clocks compared to T30L, but fabricated on the same 40nm process. The link at phandroid is referring to something newer, T37.

Transitioning from 40nm to 28nm during 2H 2012 would seem logical for Tegra 3+, for two reasons. One is that most competitors have already made the transition. Two is that 28nm supply constraints will be eased by 4Q 2012.
 