NVIDIA Tegra Architecture

It seems OUYA is set for a yearly refresh:
http://www.slashgear.com/ouya-to-re...feed&utm_campaign=Feed:+slashgear+(SlashGear)

At this rate, though, the OUYA will always be using hardware from the previous year.

OTOH, I could see 2018's OUYA actually being more powerful than the next-gen consoles from Microsoft and Sony. Ouch?

I think you're overestimating this. I want to say grossly overestimating this.

In the last few years we've seen a huge increase in performance of phone and tablet GPUs. We've also seen a huge increase in power consumption. This was enabled by form factor changes (to tablets and bigger phones) allowing larger batteries and more heat dissipation, and because SoCs got better at power management so this was only used as needed.

So I expect that both SoC CPU and GPU performance improvements will slow down, so long as they're only targeting handheld devices. Until shown otherwise I don't believe that more power hungry devices will command enough volume to warrant the design of SoCs only for them, and I don't expect a company outside Intel to start toppling desktops and laptops with them.

I also expect development of new manufacturing nodes to continue to slow down and adoption costs to continue to rise.

At $99, OUYA hardware is going to remain fairly trailing edge, just like the initial release will be. So the 2018 model (if they can indeed keep this up that long and aren't displaced by other things or a lack of interest overall) is probably going to use a 2017 or even 2016 SoC.

Consider that the highest end SoCs still haven't quite caught up with the ancient XBox 360 and PS3 hardware, so you'd need a gap of over 8 years to shrink to just 4 years, assuming that the new consoles draw a similar amount of power to what the old ones did at launch.

The only way I could see this happening is if it's using a chip that nVidia designed for its high end compute line, that marries a conventional Project Denver based SoC with a bunch of GPU power. But it'd need to have the video output stuff that nVidia doesn't like putting on these cards (and probably other peripherals that don't make sense here) and it'd need to be sold at OUYA pricing. Hard to see this happening unless nVidia is very heavily investing in this concept.
 
Eh, Tegra 5 should already have comparable CPU power to Durango/Orbis, and probably a GPU with similar performance to PS360. That would be 2015/2016 for OUYA (assuming a 2014 launch @ 20nm). But who knows, Nvidia might decide to cash in themselves.

The CPU budget isn't likely to increase too much, so most of the transistors will probably go to the GPU. If something like HMC is affordable/common by then, that could provide the bandwidth while also reducing power consumption.

There is also the possibility of using a lower end desktop/laptop chip like GK106/7 assuming it is also an SoC. Given that the market for such a chip will largely be mobile by that time, it would make sense.
 
Eh, Tegra 5 should already have comparable CPU power to Durango/Orbis, and probably a GPU with similar performance to PS360. That would be 2015/2016 for OUYA (assuming a 2014 launch @ 20nm). But who knows, Nvidia might decide to cash in themselves.

The CPU budget isn't likely to increase too much, so most of the transistors will probably go to the GPU. If something like HMC is affordable/common by then, that could provide the bandwidth while also reducing power consumption.

There is also the possibility of using a lower end desktop/laptop chip like GK106/7 assuming it is also an SoC. Given that the market for such a chip will largely be mobile by that time, it would make sense.

You do realize you're comparing what could be a 32-bit quad core ARM design drawing no more than 4W to a 64-bit x86 octo-core 25W+ chip?
 
It could be... It could also be a quad core A57 or Denver. And given that the chip is going into a console, I wouldn't be surprised to see higher clocks than in tablets and phones, say 2.5 to 3 GHz.

So how would a quad A57 at 3 GHz compare to an 8 core Jaguar at 1.6 GHz?
 
It could be... It could also be a quad core A57 or Denver. And given that the chip is going into a console, I wouldn't be surprised to see higher clocks than in tablets and phones, say 2.5 to 3 GHz.

So how would a quad A57 at 3 GHz compare to an 8 core Jaguar at 1.6 GHz?

There's a limit to how high you'll be able to clock it after greatly relaxing thermal limits. The CPU design, broader SoC design, manufacturing and packaging are all going to be optimized for some clock range with an upper limit to how high they can clock before they stop working. The SoC will also impose limits on how much current can be drawn and how high of a voltage it can take, and the PMIC will have its own limits on how much current it can supply (and something like OUYA probably won't use a custom power solution).

So I doubt Tegra 5 will have something like an A57 you can clock at 3GHz by slapping a fan on it.

But yes, if both Orbis and Durango are using octa-core 1.6GHz Jaguars then nVidia SoCs will catch up in CPU a lot faster than they catch up in GPU.
 
It could be... It could also be a quad core A57 or Denver. And given that the chip is going into a console, I wouldn't be surprised to see higher clocks than in tablets and phones, say 2.5 to 3 GHz.

So how would a quad A57 at 3 GHz compare to an 8 core Jaguar at 1.6 GHz?

It's hard to tell. ARM claims 4.1 DMIPS/MHz compared to 3.5 for the A15, but it's also a significant jump from 32 to 64 bits for the ISA. I would bet the Jaguar would still have a healthy margin that varies depending on the task.
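
For a rough feel of how those per-MHz claims multiply out across cores and clocks, here's a back-of-the-envelope sketch. ARM's 4.1 DMIPS/MHz is a vendor claim, the Jaguar per-MHz figure is purely an assumed placeholder, and perfect multi-core scaling is assumed, so treat the output as illustrative only.

[code]
# Back-of-the-envelope aggregate DMIPS comparison.
# ARM's claimed 4.1 DMIPS/MHz for the A57 is a vendor figure; the Jaguar
# per-MHz value below is an assumed placeholder, not a measured number.

def aggregate_dmips(cores, mhz, dmips_per_mhz):
    """Total DMIPS assuming perfect scaling across all cores (optimistic)."""
    return cores * mhz * dmips_per_mhz

a57 = aggregate_dmips(cores=4, mhz=3000, dmips_per_mhz=4.1)  # quad A57 @ 3 GHz
jag = aggregate_dmips(cores=8, mhz=1600, dmips_per_mhz=3.8)  # octa Jaguar @ 1.6 GHz (assumed figure)

print(f"Quad A57 @ 3 GHz     : {a57:8.0f} DMIPS")
print(f"Octa Jaguar @ 1.6 GHz: {jag:8.0f} DMIPS")
print(f"Ratio (A57/Jaguar)   : {a57 / jag:.2f}x")
[/code]

With those assumptions the two land in the same ballpark, which is really the only point: aggregate throughput is cores x clock x per-MHz efficiency, and any one of those numbers can swing the result.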
 
Well, I am actually expecting Nvidia to use Denver in Tegra 5... and guessing that Denver will be slightly faster per clock than A57. So maybe Denver at 2.5 GHz would be more realistic... Unfortunately, we don't have any idea of what actual performance will be like. But I think my guess of "comparable" is not terribly inaccurate.
 
Honest question: how big would you estimate the upcoming console SoCs would be, assuming 28nm (which isn't a given, but it's just a theoretical exercise)? And once you have an estimate, how big would you expect those very same SoCs to be, under realistic scenarios (not perfect shrinks), two process nodes down the line?
 
How about a comparison then between the original XBox and the Tegra 4 GPU?

Sure, that's a better comparison for current ARM-based SoCs, but then again, how long ago was that? Quite a bit longer than the 4-5 years ToTTenTranz was guessing for ARM SoCs to match the next-gen consoles.

As Exophase has noted, ARM has gotten most of the low hanging fruit in ramping up speeds. They're already in the process of hitting the power wall. Looking at Exynos 5, the GPU and/or CPU often has to be throttled quite a bit if both are being used simultaneously. And that's in a tablet power envelope.

Going forward I don't see as much room for great leaps and strides in performance while keeping power under check.

Regards,
SB
 
Sure, that's a better comparison for current ARM-based SoCs, but then again, how long ago was that? Quite a bit longer than the 4-5 years ToTTenTranz was guessing for ARM SoCs to match the next-gen consoles.

I said that if you stretch the timeframe to more than 4-5 years there's nothing against it. While I know what ToTTenTranz meant, and we are admittedly in a Tegra thread, I don't concentrate just on NV's GPU architectures. The Dreamcast shipped in 1998/9, and I think it was somewhere around 2005/6 when the SFF mobile MBX+VGP surpassed 100MHz in frequency and ended up more efficient than the Dreamcast GPU, with more functionality. The 554MP4@280MHz should easily level with the XBox1 GPU, and by 2014 at the latest their GPUs in tablets will level with the XBox 360 GPU.

The difference in processes between the iPad1 and iPad4 is merely 45 to 32nm, and while the scaling pace will slow down at some point, it affects GPUs far less than CPUs, exactly because GPU performance can scale roughly linearly with added cores or clusters, and because GPUs don't necessarily need exotic frequencies given their focus on high parallelism. If I just take an iPad1 and compare it to an iPad4 in GL2.5, what kind of performance difference do I get exactly?

NV is still a corner case in the SFF mobile market; they're a SoC manufacturer, and up to now they haven't had the luxury of more than one SoC per year. Only now with T4 will they go for Grey, a mainstream SoC, but they still don't have the luxury of designing and selling a higher number of SoCs on a yearly cadence. GPU IP providers are a completely different story, as any of their partners can choose from a wide portfolio of performance levels. Especially someone like Apple, which doesn't necessarily mind having big SoCs that exceed the 160mm2 level for a tablet.

In cliff notes: taking NV as a paradigm for the entire SFF mobile market is more than a little odd, since trends usually follow where the real volume is.

As Exophase has noted, ARM has gotten most of the low hanging fruit in ramping up speeds. They're already in the process of hitting the power wall. Looking at Exynos 5, the GPU and/or CPU often has to be throttled quite a bit if both are being used simultaneously. And that's in a tablet power envelope.
And I don't disagree; but that still shouldn't mean that the bigger semiconductor players don't have other options if they can afford it and really want to. As for the Exynos 5250, don't tell me; we spent multiple pages trying to convince folks here that the 5250 is just not a good fit for a smartphone platform by today's standards. Samsung's upcoming octa-core Exynos doesn't use a T6xx for a good reason, and no, the result will not necessarily end up slower in terms of GPU performance.

Going forward I don't see as much room for great leaps and strides in performance while keeping power under check.

Regards,
SB
I already asked a "trick" question above about how big some of you would expect the upcoming console SoCs to be after two process node shrinks. I'm obviously aiming for just a speculative guess, but things aren't as radical on the GPU side as I'm reading here.

Besides, I'd go as far as to say that the better the console manufacturers can shrink their upcoming console SoCs on future processes, the easier it will be for them to fight off any alternative generic home entertainment device that might appear in the meantime, more so for Sony, which also designs and manufactures TV sets.
 
The PowerVR SGX554MP4 in the A6X chip is already far stronger than the GPU of the Xbox 1 (just check the tech specifications; there's not even a competition). In my opinion the next A7X chip will easily outperform the Xbox 360 (and we will probably find it in the next iPad).
Hard times for other SoC producers trying to rival Apple's chips in terms of graphics power.
 
And how much more peak power does the iPad 4's GPU draw vs the iPad 1's? I bet it's at least 5x, but probably much more. Not to mention the extra power from driving twice the memory channels.

There just isn't enough room to keep scaling the power budget that much higher in tablets, and there isn't enough market for cheap SoCs that target heavier devices than phones or tablets.
 
The PowerVR SGX554MP4 in the A6X chip is already far stronger than the GPU of the Xbox 1 (just check the tech specifications; there's not even a competition).

For one, my statements were deliberately pretty vague; I'm not going too deep into such details on purpose, since it's not easy to compare the two with real-time measurements. And as for the specifications, what makes you think that I'm not aware of them?

In my opinion the next A7X chip will easily outperform the Xbox 360 (and we will probably find it in the next iPad).

Well then let's hear why. I'm sure you also have a full specification list of Apple's next generation SoC GPU :rolleyes:
 
And how much more peak power does the iPad 4's GPU draw vs the iPad 1's? I bet it's at least 5x, but probably much more. Not to mention the extra power from driving twice the memory channels.

I'm not sure anymore, but I think it's somewhat over a factor of 5x; I'd gladly stand corrected if it's more.

http://www.glbenchmark.com/compare.jsp?D1=Apple+iPad+4&D2=Apple+iPad&D3=Apple+iPad+2&cols=3

Unfortunately there's no 2.5 score for the iPad1 available. But for GL2.1 the difference is at >24x at 1080p. iPad4 compared to iPad2 in GL2.5 is at a factor of 3.5x.

On a pure theoretical hardware level, between a SGX535@200MHz and a SGX554MP4@280MHz the highest factor increase is arithmetic at >=45x and the lowest is fillrate at 5.6x.
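
If it helps to see where those factors come from, here's a minimal sketch of the multiplication. The per-clock throughput numbers are assumptions based on commonly quoted specs rather than official figures; the point is only that each factor is the width ratio times the clock ratio.

[code]
# Rough reconstruction of the theoretical factors quoted above.
# Per-clock throughputs are assumed from commonly quoted specs, not official.

sgx535   = {"mhz": 200, "flops_per_clk": 8,   "pix_per_clk": 2}  # iPad 1 GPU (assumed)
sgx554m4 = {"mhz": 280, "flops_per_clk": 256, "pix_per_clk": 8}  # iPad 4 GPU, 4 cores (assumed)

def factor(metric):
    """Ratio of (per-clock width x clock) between the two GPUs."""
    return (sgx554m4[metric] * sgx554m4["mhz"]) / (sgx535[metric] * sgx535["mhz"])

print(f"Arithmetic factor: {factor('flops_per_clk'):.1f}x")  # ~44.8x, i.e. the >=45x above
print(f"Fillrate factor  : {factor('pix_per_clk'):.1f}x")    # ~5.6x
[/code]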

There just isn't enough room to keep scaling the power budget that much higher in tablets, and there isn't enough market for cheap SoCs that target heavier devices than phones or tablets.
We'll see.
 
I agree with Exophase. Architectures have been iterating fast too. But we're going to hit a power envelope and the architectures are going to peak because there's no more blood to squeeze from the stone. There's been a gold rush in mobile but things are going to slow down simply because of physics and battery life. We'll get to the point where the only time we see big leaps is when we get a node jump.
 
I agree with Exophase. Architectures have been iterating fast too. But we're going to hit a power envelope and the architectures are going to peak because there's no more blood to squeeze from the stone. There's been a gold rush in mobile but things are going to slow down simply because of physics and battery life. We'll get to the point where the only time we see big leaps is when we get a node jump.

What's so bad about a case where SFF mobile GPUs scale by a factor of 3x with every other node jump, say every 2-3 years?

I asked on purpose if anyone dares to predict how big the upcoming console SoCs could be two node shrinks down the road, but no one seems willing to bite the bullet.
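
To put that 3x-every-other-node pace into perspective, here's a quick compounding sketch. The starting gap is a made-up assumption purely for illustration; the only takeaway is how fast a 3x cadence eats into any fixed lead.

[code]
# How quickly a "3x every 2-3 years" cadence closes a fixed performance gap.
# The 20x starting gap is an arbitrary assumption for illustration only.

starting_gap   = 20.0  # assumed initial console lead over SFF mobile GPUs
per_step       = 3.0   # 3x per "every other node jump"
years_per_step = 2.5   # midpoint of the 2-3 year estimate

gap, years = starting_gap, 0.0
while gap > 1.0:
    gap /= per_step
    years += years_per_step
    print(f"after {years:4.1f} years: remaining lead ~{gap:.2f}x")
[/code]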
 
I said 2018, which means >5 years from now, post-Project Denver and on matured 14nm or smaller nodes.

It won't take as much time for mobile SoCs to catch up with home consoles this time as it did with the previous gen. The 2005 consoles were as powerful as top-end PCs when they were released. The next-gen consoles are (theoretically) as powerful as a mid-range desktop.

Eventually, there should be a slowdown in the huge year-on-year performance increase on mobile GPUs, but we still don't know when that'll be and official roadmaps have yet to show such a thing.

Tegra 4's iGPU already seems to be very close to the RSX, and I have no doubt the next Tegra iteration will be more powerful (except in memory bandwidth limited scenarios, unless nVidia also uses stacked memory with a wide bus like the Vita).
 
What's so bad about a case where SFF mobile GPUs scale by a factor of 3x with every other node jump, say every 2-3 years?

I asked on purpose if anyone dares to predict how big the upcoming console SoCs could be two node shrinks down the road, but no one seems willing to bite the bullet.

Because the money in the market has saturated the improvement capacity? I imagine we'll get to the point, like desktop GPUs, where each node shrink and/or architecture change brings a 20 to 30% improvement.

For the second question, you can assume a ~75% area scale factor for each full node drop. Let's say Durango's APU is 300 mm^2 at 28nm: 28nm -> 20nm -> 14nm gives about 168 mm^2, still bigger than nearly all mobile SoCs. 11nm would get you to about 126 mm^2.
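
A tiny sketch of that scaling, taking the 75% per-node area factor and the 300 mm^2 starting point above as given (both are rough assumptions, and real shrinks are rarely this clean):

[code]
# Die-area scaling at an assumed 75% area factor per full node shrink.
# The 300 mm^2 Durango APU figure at 28nm is a guess, not a known spec.

area  = 300.0  # assumed starting area in mm^2 at 28nm
scale = 0.75   # assumed area factor per full node

for node in ("20nm", "14nm", "11nm"):
    area *= scale
    print(f"{node}: ~{area:.1f} mm^2")  # 225.0, 168.8, 126.6
[/code]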

It won't take as much time for mobile SoCs to catch up with home consoles this time as it did with the previous gen. The 2005 consoles were as powerful as top-end PCs when they were released. The next-gen consoles are (theoretically) as powerful as a mid-range desktop.

That's because top desktop TDPs have ballooned too. The Radeon X1900 XTX was 135W; the 7970 is 250W.
 