NVIDIA Tegra Architecture

Rys · Jan 20, 2015

Yes, there's a lot more to a modern SoC memory subsystem than just making requests on ports of a certain width. There's internal buffering, transaction priorities and outstanding transaction queues, burst behaviour tuning, aggressive last level caching, etc. The major clients of the memory controller in a modern SoC all behave differently in their requests, too. Some are read only, some are write heavy, some are bursty, some are heavy streamers, some need as low latency as possible to satisfy internal block "QoS". The GPU is a strange one in particular because it puts different loads on the memory subsystem depending on where in your render you are.

Tuning a memory controller and connected ports is therefore a balancing act, and one that makes "peak bandwidth" incredibly difficult to provide to any one requester, especially in a modern consumer device where there's always non-negligible bandwidth needed to serve the display at least.
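To put a rough number on the display's share, here is a minimal sketch of the scanout arithmetic. It assumes a single uncompressed 32-bit layer read once per refresh, with no composition or overfetch; the function names and the 14.9 GB/s peak figure are illustrative assumptions, not values from the posts above.

```python
def scanout_bandwidth_gbs(width, height, bytes_per_pixel=4, refresh_hz=60):
    """GB/s (1e9 bytes) needed to read one uncompressed layer every refresh."""
    return width * height * bytes_per_pixel * refresh_hz / 1e9


def share_of_peak(required_gbs, peak_gbs):
    """Fraction of the theoretical peak consumed by a single client."""
    return required_gbs / peak_gbs


if __name__ == "__main__":
    # Hypothetical device: 1080p panel at 60 Hz, assumed 14.9 GB/s peak.
    display = scanout_bandwidth_gbs(1920, 1080)
    print(f"display scanout: {display:.3f} GB/s")
    print(f"share of assumed peak: {share_of_peak(display, 14.9):.1%}")
```

A real display pipeline usually reads several layers and may use framebuffer compression, so treat this as a floor on the order of magnitude rather than a measurement.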

RecessionCone · Jan 20, 2015

Entropy said:
Have to say I don't quite agree with this. I've seen large discrepancies between (speed x width) and ultimately achievable bandwidth. Since different usage scenarios result in different bus utilization efficiency, ideally you would test using different (and transparent for interpretation) methods. Which of course was originally the reason why John McCalpin's STREAM didn't just use one test but four different ones.

Yup. Anyone ever run STREAM on Xeon Phi? KNF gets somewhere around 50% of peak.
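For reference, the (speed x width) figure and the way STREAM counts traffic for its four kernels can be sketched as below. This is only the bookkeeping, not a benchmark: the helper names are made up for illustration, the byte counts assume 8-byte doubles, and the example numbers in the usage section are hypothetical.

```python
# Bytes moved per array element by each STREAM kernel, for 8-byte doubles:
#   copy : a[i] = b[i]            -> 1 read + 1 write
#   scale: a[i] = q * b[i]        -> 1 read + 1 write
#   add  : a[i] = b[i] + c[i]     -> 2 reads + 1 write
#   triad: a[i] = b[i] + q * c[i] -> 2 reads + 1 write
STREAM_BYTES_PER_ELEMENT = {"copy": 16, "scale": 16, "add": 24, "triad": 24}


def peak_bandwidth_gbs(transfers_per_sec, bus_width_bits):
    """Theoretical peak (speed x width) in GB/s, with 1 GB = 1e9 bytes."""
    return transfers_per_sec * bus_width_bits / 8 / 1e9


def stream_bandwidth_gbs(kernel, n_elements, seconds):
    """Bandwidth a STREAM-style kernel reports for one timed pass."""
    return STREAM_BYTES_PER_ELEMENT[kernel] * n_elements / seconds / 1e9


def efficiency(achieved_gbs, peak_gbs):
    """Achieved bandwidth as a fraction of the theoretical peak."""
    return achieved_gbs / peak_gbs


if __name__ == "__main__":
    # Hypothetical inputs: 64-bit interface at 1866 MT/s, and a triad pass
    # over 10 million elements that took 30 ms.
    peak = peak_bandwidth_gbs(1866e6, 64)
    triad = stream_bandwidth_gbs("triad", 10_000_000, 0.030)
    print(f"peak {peak:.2f} GB/s, triad {triad:.2f} GB/s, "
          f"efficiency {efficiency(triad, peak):.0%}")
```

Reporting the four kernels separately is what makes the read/write mix visible, which is the point about transparent interpretation in the quote above.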

Ailuros · Jan 22, 2015

I'm still waiting for that Xiaomi thingy (and yes I've written off Duke Nexus Forever...); in the meantime feast your eyes with that one: http://gfxbench.com/device.jsp?benchmark=gfx30&os=Android&api=gl&D=NVIDIA Tegra Note 8&testgroup=info

A Tegra Note 8, 7.9" 1080p and for some weird reason with OGL_ES3.0 drivers and not 3.1 like the other Tegra K1 devices.

***edit: thanks to a 3dc forum member I didn't notice that that's a quite old entry from April 2014. No idea why it re-appeared now in the database. It's probably one and the same with the Shield tablet before it was launched.

Erinyes · Jan 27, 2015

Ailuros said:
I've written off Duke Nexus Forever...)

In other news..new Nvidia Shield based on X1 on its way (no surprise there). Guess we'll find out more at MWC.

BadTB25 · Jan 27, 2015

Erinyes said:
In other news..new Nvidia Shield based on X1 on its way (no surprise there). Guess we'll find out more at MWC.

Link please. I was holding off on getting the Shield tablet hoping some of the issues (battery, cracking case, etc) would get resolved.

Erinyes · Jan 29, 2015

BadTB25 said:
Link please. I was holding off on getting the Shield tablet hoping some of the issues (battery, cracking case, etc) would get resolved.

Sorry no link..this is what I have heard from a source close to NV. Anyway..like I said..it's not really a surprise..it was expected.

BadTB25 · Jan 29, 2015

Got it. My wait continues.

Ailuros · Feb 4, 2015

Hallelujah!!!

Ladies and gentlemen: The Duke Nexus Forevah

http://www.anandtech.com/show/8701/the-google-nexus-9-review

Ailuros · Feb 5, 2015

Sorry boys but I expected Anandtech to dig way deeper in this one, unless the other followup Denver article does exactly that.

In practice, I didn't really notice any issues with the Nexus 9's performance, although there were odd moments during intense multitasking where I experienced extended pauses/freezes that were likely due to the DCO getting stuck somewhere in execution, seeing as how the DCO can often have unexpected bugs such as repeated FP64 multiplication causing crashes. In general, I noticed that the device tended to also get hot even on relatively simple tasks, which doesn't bode well for battery life. This is localized to the top of the tablet, which should help with user comfort although this comes at the cost of worse sustained performance.

Outside of the theoretical mumbo jumbo about the architecture's technical details and synthetic benchmark results (how often can someone really read those in all related writeups?), right at the spot above, when things are getting warmer, I lost connection.

mboeller · Feb 5, 2015

overall the review seems to be quite negative regarding Denver:

Unfortunately, while the design of the CPU is academically interesting it doesn’t seem that this produces real-world benefits. The Nexus 9 has one of the fastest SoCs we’ve seen to date, but this comes at the cost of worse power efficiency than the Cortex A15 version of the Tegra K1.

Unfortunately, it seems that Kepler’s desktop-first design results in worse power efficiency than what we see on competing solutions such as the “GXA6850” found in competing SoCs.

One question:

Is SpecInt 2000 single-thread or multi-thread?
http://www.anandtech.com/show/8701/the-google-nexus-9-review/5

If it is "single-threaded" then the performance of Denver would be roughly the same as an ARM A57. I had expected something way better. Therefore I suspect that they compare the 4 A15 cores of the K1 with the two cores of the Denver SoC, but I'm not sure.

Erinyes · Feb 5, 2015

Ailuros said:
Sorry boys but I expected Anandtech to dig way deeper in this one, unless the other followup Denver article does exactly that.

Outside of the theoretical mumbo jumbo about the architecture's technical details and synthetic benchmark results (how often can someone really read those in all related writeups?), right at the spot above, when things are getting warmer, I lost connection.

Indeed..just like Duke Nukem Forever, I was a bit underwhelmed. The article didn't really tell us anything we didn't already know. This was pretty much a standard launch day type review. Given the delay..I guess we were expecting a lot more.

Nebuchadnezzar said:
Hopefully both will be out next few days... (Actually all 3 articles, depending on scheduling)

Hopefully one of the remaining two (Exynos 5433/5430 analysis and Meizu MX4 Pro review) will give us some good fodder for discussion (that quote was from 27th January btw).

Nebuchadnezzar · Feb 5, 2015

Erinyes said:
Hopefully one of the remaining two (Exynos 5433/5430 analysis and Meizu MX4 Pro review) will give us some good fodder for discussion (that quote was from 27th January btw).

Sadly I can't control the timing on these things and Ryan is very swamped. Next is the 5433 Note 4 piece which includes my testing and hopefully more exciting content, hopefully Monday. After that is a new product briefing and performance preview, and then the MX4Pro within the same week.

Erinyes · Feb 5, 2015

Nebuchadnezzar said:
Sadly I can't control the timing on these things and Ryan is very swamped. Next is the 5433 Note 4 piece which includes my testing and hopefully more exciting content, hopefully Monday. After that is a new product briefing and performance preview, and then the MX4Pro within the same week.

That's alright..I know you guys are trying your best to get them out. Waiting for Monday then!

liolio · Feb 5, 2015

Android remains Android; no matter the AnandTech team's efforts, it is tough to extensively review any processor (or GPU).
Results are all over the place, from medium to great.
One thing is clear: Lollipop was released at a really early stage, it seems. There are oddities in how the Shield tablet performs, for example.
As there are few devices running it out there, it is tough to appreciate the impact of mass storage encryption on perf, for example.

I think that Denver is in fact quite decent; on a newer process it should be able to compete against the A57 (it mostly does already). Sadly, it might not be worth the R&D expense: ARM has the A53, the A57, the A72, and I guess they might have an ARMv8 version of the A17 coming.
Actually, Denver would already have hindered Nvidia's ability to deploy new products if it were not for ARM's designs.

IMO if Nvidia wants to remain in that market (SoCs) they will have to focus on their strengths (GPU/software) and iterate their designs faster; launching a comprehensive product line could also be a good idea.
There are more and more SoC manufacturers, design wins are getting harder and harder to come by, and the automotive niche is already challenged by Qualcomm and Apple.

With Windows 10 I wonder if Nvidia should take the risk of giving up on Android altogether (for their product line) and try to become MSFT's primary choice for anything not x86.
In any case, to do that they need to come down to earth with regard to their SoCs. They need to target phones first; in that regard the Tegra K1 is a step in the wrong direction. More than that, they need to design SoC(s) meant to run existing software, not to win dick contests.
Apple went for years with a pretty sucky CPU, and now they go with a bigger GPU than they really need; the point is that it is not what they are selling to their customers.

liolio · Feb 5, 2015

Nebuchadnezzar said:
Sadly I can't control the timing on these things and Ryan is very swamped. Next is the 5433 Note 4 piece which includes my testing and hopefully more exciting content, hopefully Monday. After that is a new product briefing and performance preview, and then the MX4Pro within the same week.

Sorry for the OT but do you plan to test the Asus X205 eeebook?

Nebuchadnezzar · Feb 5, 2015

liolio said:
Sorry for the OT but do you plan to test the Asus X205 eeebook?

You'd have to ask Brett or Jarred.

Ailuros · Feb 5, 2015

liolio said:
Android remains Android; no matter the AnandTech team's efforts, it is tough to extensively review any processor (or GPU).
Results are all over the place, from medium to great.
One thing is clear: Lollipop was released at a really early stage, it seems. There are oddities in how the Shield tablet performs, for example.
As there are few devices running it out there, it is tough to appreciate the impact of mass storage encryption on perf, for example.

Android is perfectly fine for what it's called on to do, especially considering that it has to serve such a multitude of different hw and sw configurations. As for Lollipop, there is always a price for early adopters, and it's not like Apple's new OS versions are always trouble-free, and certainly not in Windows land either. No idea where you're trying to get to, but if you want to blame Android, even indirectly, for the state Denver is in or how it behaves, all you have to do is compare it in real time against the 32-bit Tegra K1 in either Android Lollipop or KitKat, and then it'll be easier to find the real scapegoat.

With Windows 10 I wonder if Nvidia should take the risk of giving up on Android altogether (for their product line) and try to become MSFT's primary choice for anything not x86.
In any case, to do that they need to come down to earth with regard to their SoCs. They need to target phones first; in that regard the Tegra K1 is a step in the wrong direction. More than that, they need to design SoC(s) meant to run existing software, not to win dick contests.
Apple went for years with a pretty sucky CPU, and now they go with a bigger GPU than they really need; the point is that it is not what they are selling to their customers.

How can they give up Android, swing to Windows 10 and re-address the smartphone SoC market at the same time? It's not impossible at all, but I'd say that NV would have to have quite suicidal tendencies to go for Windows Phone right now.

Apple has its own OS, so R&D can go hand in hand for both hardware and software, and NONE of the others out there has that kind of luxury. At the very least Cyclone is quite a bit more efficient at its frequency, doesn't burn a hole in their devices while overheating, and doesn't cause any occasional stutters either. As for their GPUs, yes, Apple likes them big and with as high as possible sustainable performance. For the prices they're asking they SHOULD deliver the best that is possible for any given timeframe.

Finally, on the point that NVIDIA should have had mainstream SoCs for smartphones: that's something I've been saying for a very long time (and have been attacked for....). It should have happened years ago though; now I'm afraid there's only one Mediatek the market can have, especially for Chinese white box deals.

RecessionCone · Feb 5, 2015

Nebuchadnezzar said:
Sadly I can't control the timing on these things and Ryan is very swamped. Next is the 5433 Note 4 piece which includes my testing and hopefully more exciting content, hopefully Monday. After that is a new product briefing and performance preview, and then the MX4Pro within the same week.

Looking forward to your 5433 review. TechReport's testing was very interesting - as I predicted, ARM's HMP big.LITTLE doesn't seem to provide any measurable performance boost.
http://techreport.com/review/27539/samsung-galaxy-note-4-with-the-exynos-5433-processor/3

Maybe your testing will show otherwise...

Ailuros · Feb 5, 2015

RecessionCone said:
Looking forward to your 5433 review. TechReport's testing was very interesting - as I predicted, ARM's HMP big.LITTLE doesn't seem to provide any measurable performance boost.
http://techreport.com/review/27539/samsung-galaxy-note-4-with-the-exynos-5433-processor/3

Maybe your testing will show otherwise...

I doubt there are any synthetic tests for it, but I'd be more interested in big.LITTLE HMP power savings than performance boosts. For that I guess there should be a way to turn off the LITTLE quad and operate entirely on A57 cores and then bounce back for the same test scenarios to big.LITTLE (or LITTLE.big to be more precise).

On a sidenote: holy shit....30.9mm2 at 20nm Samsung for the T760MP6? I can't translate that into transistors obviously, but it sounds like a whole damn LOT of waste of die area especially if you estimate what GK20A weighs under 28HPm or the upcoming X1 GPU at 20SoC and consider that the 5433 GPU gets barely 17+ fps in Manhattan offscreen....errrr uhmmm *cough* yeah

Nebuchadnezzar · Feb 5, 2015

RecessionCone said:
Looking forward to your 5433 review. TechReport's testing was very interesting - as I predicted, ARM's HMP big.LITTLE doesn't seem to provide any measurable performance boost.
http://techreport.com/review/27539/samsung-galaxy-note-4-with-the-exynos-5433-processor/3

Maybe your testing will show otherwise...

big.LITTLE isn't meant for boosting performance. Any performance boost comes from reduced load latency: migrating to the big cores is faster than the DVFS mechanism clocking up the little cores. But that's it; it has no effect on constant loads such as those benchmarks.

Ailuros said:
I doubt there are any synthetic tests for it, but I'd be more interested in big.LITTLE HMP power savings than performance boosts. For that I guess there should be a way to turn off the LITTLE quad and operate entirely on A57 cores and then bounce back for the same test scenarios to big.LITTLE (or LITTLE.big to be more precise).

I have a bunch of such test scenarios and more...
