NVIDIA Tegra Architecture

For the Parker SoC, I wonder if there will be a companion SoC (like Grey) or a reference design (like Phoenix) called Watson (see who can guess this reference, though it's really easy anyway).

For this current Tegra generation, they have three items of interest: Tegra 4 SoC, i500 baseband modem, and Tegra 4i SoC (with integrated i500 baseband modem). They will certainly need to keep these items up to date for future Tegra generations.
 
They're also pointing out that Logan will be more power efficient; I'd word it as more power conscious in this case, but that's hair splitting. Logan obviously won't have a cooling fan and a heatsink like Kayla ;)

It will have a lot less bandwidth, too. And a shared memory controller.
I wonder what we can expect Logan to use: DDR3-1866 on a 64-bit bus? Probably with a one-SMX GPU.
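
For a rough sense of scale, here is a back-of-the-envelope sketch of the theoretical peak bandwidth for that guessed configuration. The function name is made up for illustration, and the DDR3-1866 / 64-bit figures are only the speculation above, not a confirmed Logan spec.

```python
def peak_bandwidth_gb_s(transfers_mt_s: float, bus_width_bits: int) -> float:
    """Theoretical peak memory bandwidth in GB/s (1 GB = 1e9 bytes).

    transfers_mt_s: data rate in mega-transfers per second (1866 for DDR3-1866)
    bus_width_bits: total memory bus width in bits
    """
    bytes_per_transfer = bus_width_bits / 8
    return transfers_mt_s * 1e6 * bytes_per_transfer / 1e9


# Speculated configuration from the post above (an assumption, not a spec).
print(f"{peak_bandwidth_gb_s(1866, 64):.1f} GB/s")
```

That is a theoretical ceiling only, and on an SoC it would be shared between the CPU cores and the GPU.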
 
Did I imagine the slide DSC just posted then?

I hadn't seen that info, thanks for pointing it out. Still, I don't think it's necessary for Logan to be the same as Kayla. Kayla wasn't presented as a performance proxy for Logan, but rather a vehicle to prove and demonstrate CUDA on ARM infrastructure.
 
According to NVIDIA's CEO at 2013 GTC, the GPU used in Logan will likely be even faster than Kayla, while using less power too. Of course, while we know the number of CUDA cores in Kayla, we don't know anything about the operating frequency.
 
Wow, that seems hard to believe. I'll believe it when I see it, in other words. Though, now that I think about it, I don't think we'll get to see many, if any, direct performance comparisons between Kayla and Logan.
 
Other tidbits: 9.4W top TDP for Tegra 4, which will likely never reach those 1.9GHz clocks on all 4 cores in any real situation.

They estimate 5W for a 2.3GHz T4i, again something they say will never be reached in reality; 1.8GHz would be the more realistic target.

Now, take this with a grain of salt, but NVIDIA claimed at GTC that Tegra 4 has a TDP of 5W while Tegra 4i has a TDP of 1W. With respect to Tegra 4, this may be a reasonable claim if a 38 watt-hour Tegra 4-powered [Shield] device can get 5-10 hours of battery life during heavy gaming (which would imply somewhere between 3.8W and 7.6W of power consumption in this heavy gaming scenario). What is somewhat surprising is just how much lower the TDP is for Tegra 4i compared to Tegra 4. Does anyone know what NVIDIA's TDP rating for Tegra 3 was?
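
The arithmetic behind that range, as a minimal sketch. The helper name is hypothetical; the 38 Wh and 5-10 hour figures are the ones quoted above. Note that the result is average draw for the whole device (screen, radios and so on), so it is only an upper bound on what the SoC itself uses.

```python
def average_power_w(battery_wh: float, runtime_h: float) -> float:
    """Average power draw in watts if a battery of battery_wh lasts runtime_h hours."""
    if runtime_h <= 0:
        raise ValueError("runtime_h must be positive")
    return battery_wh / runtime_h


# Figures quoted in the post: 38 Wh battery, 5 to 10 hours of heavy gaming.
for hours in (10, 5):
    print(f"{hours} h -> {average_power_w(38, hours):.1f} W")
```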
 
Tegra 4 at 5W sounds plausible (not with all four cores at 1.9GHz though, I'm sure that'll be capped). But I don't believe for a minute that Tegra 4i can be kept under 1W while still providing even the same peak performance a Tegra 3 could, much less dramatically more. nVidia themselves have tacitly admitted that Tegra 3 with all four cores running at 1.6GHz would use over 3W from the CPU part alone, something that Anandtech's power consumption tests confirm. The GPU also used around 1.8W while running a game. I don't see how they could have dropped that so much while making it so much more powerful.
 
Yeah, I don't quite get that either. Sadly we will probably have to wait quite a few months before the Tegra 4i platform is tested and reviewed.
 
According to NVIDIA's CEO at 2013 GTC, the GPU used in Logan will likely be even faster than Kayla, while using less power too. Of course, while we know the number of CUDA cores in Kayla, we don't know anything about the operating frequency.

Can you find a quote for that? This seems very unlikely.
 
Another interesting comment from NVIDIA at GTC is that the company is willing to license their GPU technology. Now we know why Bob Feldstein came to NVIDIA after leaving AMD...
 
According to NVIDIA's CEO at 2013 GTC, the GPU used in Logan will likely be even faster than Kayla, while using less power too. Of course, while we know the number of CUDA cores in Kayla, we don't know anything about the operating frequency.

Considering what they're demonstrating with Kayla in the video, it could easily be the case, but mostly for GFLOPs and not necessarily for all other aspects of a Kepler SMX, like TMUs for instance. Even worse, there's no necessity to go for 384SPs either. It could just as easily be, for example, half the units but with twice the MADDs per ALU lane.

I hope it at least NOW shows which direction SFF GPUs will be going in the foreseeable future, and whether or not GFLOPs will be a metric that IHVs trim their GPUs for. Last but not least, ALUs are obviously relatively cheap compared to any other unit.

***edit: DO NOT click on the following link if you don't have a sense of humor: http://translate.google.gr/?hl=en&tab=wT#el/en/ΚΑΥΛΑ
 
So very similar pipeline organization.

btw, do the ALUs include the "pre-shifter", or are "pre-shifted" ops split into 2 ops?

I don't think that A57 diagram is more than an A15 diagram with a few new details added to it; that nothing else changed is more or less an assumption. It's implied that it'll be a grossly similar uarch since ARM showed pipeline pictures that were the same as A15's, but they too could have just been doing it for simplification or because they don't want to reveal more information yet. Or never; ARM documents less and less every generation. :|

I'm not sure the A57 diagram is that up to date either. For instance, it's confirmed that it has a 1024-entry L2 TLB, not 512. Also, it's known that the ITLB is up to 48 entries from 32. I think he actually confused ITLB and L1 icache size, because I don't think 48KB has been specified anywhere (but if it is, growing to 3-way set associative is almost a given).
 
On Anandtech's latest podcast, they made two eyebrow-raising guesstimates regarding Tegra 5. Firstly, Logan will approach 400 GFLOPs, and secondly, it will still be made on the 28nm process.

The performance estimate came from Nvidia's assertion that Kayla will preview the capability of Logan. To my simple mind, that doesn't imply performance equivalence, more like API compatibility.

AMS believes / knows that Kayla is based on the new GT 735M, which supports OpenGL 4.3, so it is a logical choice. But how will Logan, even with 1 SMX and lower clocks, even think about reaching a low enough power draw to partner 4+1 A15s in a mobile SoC, unless 20nm is part of the equation?
 
What's so impossible about 400GFLOPs? 192SPs at "just" 1GHz and you're almost there. :LOL:

Jokes aside, if memory serves, for synthesis alone you need about 0.01mm2 on 28nm for an FP32 unit at 1GHz.
 
Well, 192SPs @ 500MHz and ~200 GFLOPs would still be an impressive achievement.
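
For anyone who wants to play with the numbers in this exchange, here is a small sketch of the usual peak-FLOPS arithmetic. It assumes each ALU retires one fused multiply-add (counted as 2 FLOPs) per clock; the function name and the 96-ALU variant are illustrative only, and none of these are confirmed Logan configurations.

```python
def peak_gflops(alus: int, clock_ghz: float, flops_per_alu_per_clock: int = 2) -> float:
    """Theoretical peak single-precision GFLOPS.

    Assumes every ALU issues flops_per_alu_per_clock FLOPs each cycle
    (2 by default, i.e. one multiply-add per clock).
    """
    return alus * clock_ghz * flops_per_alu_per_clock


# 192 ALUs (one Kepler SMX) at the two clocks mentioned in the thread.
print(peak_gflops(192, 1.0))   # 1 GHz case
print(peak_gflops(192, 0.5))   # 500 MHz case

# Hypothetical: half the ALUs, but twice the MADDs per lane, at 1 GHz.
print(peak_gflops(96, 1.0, flops_per_alu_per_clock=4))
```

So 192 ALUs at 1GHz works out to 384 GFLOPs and at 500MHz to 192 GFLOPs, and the "half the units, twice the MADDs" layout lands on the same 384 figure. These are paper numbers only; they say nothing about TMUs, bandwidth or sustained clocks.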
 