NVIDIA Tegra Architecture

So you admit it has more to do with the thermal capacities of the tablet itself than the efficiency of the SoC? Thanks for agreeing with me.

Pretty high claim to make while misrepresenting your power argument from just two posts ago.
you are really something :devilish:
the magnesium shield is just a spreader that helps dissipate heat. Put one in a Snapdragon tablet and that SoC won't throttle either. It proves nothing about TK1, except that Nvidia wanted a product that can be used for a very long time at maximum performance...
 
you are really something :devilish:
the magnesium shield is just a spreader that helps dissipate heat. Put one in a Snapdragon tablet and that SoC won't throttle either. It proves nothing about TK1, except that Nvidia wanted a product that can be used for a very long time at maximum performance...
Still, the SHIELD Tablet DESTROYS anything QC has to offer in the tablet form factor (more than twice the performance of the 805 within the same power envelope, i.e. a bit less than 10W). Even better, it does so without any long-term throttling, as your test shows (more than 110 iterations of GFXBench before dipping slightly). So the end user will never see this 30W figure, as it doesn't correlate to real-world usage...
You are directly comparing the Shield tablet to the Snapdragon 805. Even though your 10W figure for the Snapdragon is very far off (it's more like 6W), you directly compare the ST to a SoC. Or are you comparing the chassis between tablets now? Since you were comparing it to the S805 and not the MDP itself, I suppose you weren't talking about the chassis. But your last post makes it seem like Nvidia should be praised for its great heat shield.

Decide what you're on about, because your arguments are all over the place.
 
I wrote that it is not what it was hyped up to be, and if you were following this thread over the last 8 months or so it was a notion in the ballpark of 2-3x the perf/W over competing products

Now you are just making $h#t up to suit your agenda. No one here ever said that perf. per watt of TK1 is 2-3x better than competing products. The performance of TK1 in a tablet is 2-3x better than competing products currently on the market (is it really that hard to understand the difference between performance and perf. per watt?). NVIDIA measured the GPU perf. per watt of TK1 [in the GFXBench 3.0 Manhattan 1080p Offscreen test] to be 1.5x better than the A7 and S800 at the same performance level, and we all knew this more than six months ago.

The fact that the K1 has such a high upper end in its dynamic range bodes badly for perf/W, because the high-frequency points are fed *extremely* high voltages for a 28nm HPm process.

This is just your negative spin on it. The reality is that TK1 has much more GPU performance headroom than most other SoCs currently on the market, and can reach GPU performance levels that most other SoCs can only dream about. I would much rather have that extra performance headroom than not have it at all, and if I were concerned about maximizing battery life, I would set the framerate target to 30fps, which the Shield tablet will allow me to do.

With a magnesium heat shield the size of the tablet itself and with burning hot temperatures at the end of the test. Please.

Now this is a classic case of trolling. Since when is having 2x better heat dissipation vs. other thin fanless tablets a bad thing? Since when is having virtually no thermal throttling at GPU performance levels that are ~ 2.4x higher than the best thin fanless tablets available today a bad thing? What a nonsensical statement.

FWIW, here is what Hardware Canucks had to say about Shield tablet thermals:

"The Tegra K1 is one of today’s fastest mobile processors and it will spend the vast majority of its life running demanding applications. Meanwhile, the SHIELD Tablet’s external skin is coated in a soft-touch finish rather than aluminum which can act as a quasi external heatsink. By all sane thinking, this SHIELD should run extremely hot….but it doesn’t. Nor does the K1’s performance throttle, even when it’s being used in Max Performance mode. After two hours of continual gaming in Trine 2 (which is arguably one of the most demanding games available right now) the SHIELD Tablet remains relatively cool to the touch. Granted, there is a hot spot over the K1 itself but we can see that NVIDIA’s internal heatsink design is able to disperse the heat across several square inches".
 
I somewhat doubt those numbers; 2GHz is running at 1.22V and 2.3GHz at 1.36V nominal voltages before dynamic binning. That's insanely high for 28HPm. Qualcomm barely reaches or goes over 1V in the 801/805.

Yeah, that's really high. Do you have the V/MHz table for Tegra 4?
 
The CPU perf. per watt for TK1 should be quite competitive. Here are some measurements from NVIDIA:

slides03.jpg


Looking at pure performance, the R3 Cortex A15 in Tegra K1 @ 2.2GHz peak frequency is often way ahead of Krait 450 in Snapdragon 805 @ 2.7GHz peak frequency. The CPU/browser performance advantage for Shield tablet vs. S805 MDP/T tablet is as follows:

Google Octane v2: +88%
Kraken 1.1: +57%
Sunspider 1.0.2: +26%
WebXPRT: +94%
 
Now you are just making $h#t up to suit your agenda. No one here ever said that perf. per watt of TK1 is 2-3x better than competing products.
Are you for real? Xpea just tried to claim that it has 2.5x the performance at the same power versus the A7. Do I need to quote the post in a larger font size? This is exactly what people were saying.

FWIW, here is what Hardware Canucks had to say about Shield tablet thermals:

".....After two hours of continual gaming in Trine 2 (which is arguably one of the most demanding games available right now) the SHIELD Tablet remains relatively cool to the touch. Granted, there is a hot spot over the K1 itself but we can see that NVIDIA’s internal heatsink design is able to disperse the heat across several square inches"...

You are suffering from cognitive bias.

Even the AnandTech article mentioned how hot it got - maybe a little too downplayed, because Josh told me it was the hottest mobile device he has ever tested, and the only thing which came near was the Galaxy S2 (and those could run ridiculously hot).


Yeah, that's really high. Do you have the V/MHz table for Tegra 4?
Only the nominal table:

Code:
		.cvb_table = {
			/*f       dfll: c0,     c1,   c2  pll:  c0,   c1,    c2 */
			{204000,	{1112619, -29295, 402}, {800000, 0, 0}},
			{306000,	{1150460, -30585, 402}, {800000, 0, 0}},
			{408000,	{1190122, -31865, 402}, {800000, 0, 0}},
			{510000,	{1231606, -33155, 402}, {800000, 0, 0}},
			{612000,	{1274912, -34435, 402}, {800000, 0, 0}},
			{714000,	{1320040, -35725, 402}, {800000, 0, 0}},
			{816000,	{1366990, -37005, 402}, {820000, 0, 0}},
			{918000,	{1415762, -38295, 402}, {840000, 0, 0}},
			{1020000,	{1466355, -39575, 402}, {880000, 0, 0}},
			{1122000,	{1518771, -40865, 402}, {900000, 0, 0}},
			{1224000,	{1573009, -42145, 402}, {930000, 0, 0}},
			{1326000,	{1629068, -43435, 402}, {960000, 0, 0}},
			{1428000,	{1686950, -44715, 402}, {990000, 0, 0}},
			{1530000,	{1746653, -46005, 402}, {1020000, 0, 0}},
			{1632000,	{1808179, -47285, 402}, {1070000, 0, 0}},
			{1734000,	{1871526, -48575, 402}, {1100000, 0, 0}},
			{1836000,	{1936696, -49855, 402}, {1140000, 0, 0}},
			{1938000,	{2003687, -51145, 402}, {1180000, 0, 0}},
			{2014500,	{2054787, -52095, 402}, {1220000, 0, 0}},
			{2116500,	{2124957, -53385, 402}, {1260000, 0, 0}},
			{2218500,	{2196950, -54665, 402}, {1310000, 0, 0}},
			{2320500,	{2270765, -55955, 402}, {1360000, 0, 0}},
			{2422500,	{2346401, -57235, 402}, {1400000, 0, 0}},
			{2524500,	{2437299, -58535, 402}, {1400000, 0, 0}},
			{      0 , 	{      0,      0,   0}, {      0, 0, 0}},
		},
First column is frequency, third-to-last is voltage. However, these are the nominal voltages before binning the chip. Nvidia applies dynamic binning on a per-chip basis and uses those other values as coefficients; I haven't reverse engineered the formula yet. The problem with this stuff is that you need to root the device to get deeper access to the components, which we don't have / can't do. However, the difference shouldn't be too big, and you get a ballpark figure of where the voltages are.

Edit: That's the K1 table, I misread your post saying T4, here's one of the T4 chips:

Code:
		.cvb_table = {
			/*f       dfll:  c0,      c1,    c2  pll:   c0,   c1,    c2 */
			{ 306000, { 2190643, -141851, 3576}, {  900000,    0,    0} },
			{ 408000, { 2250968, -144331, 3576}, {  950000,    0,    0} },
			{ 510000, { 2313333, -146811, 3576}, {  970000,    0,    0} },
			{ 612000, { 2377738, -149291, 3576}, { 1000000,    0,    0} },
			{ 714000, { 2444183, -151771, 3576}, { 1020000,    0,    0} },
			{ 816000, { 2512669, -154251, 3576}, { 1020000,    0,    0} },
			{ 918000, { 2583194, -156731, 3576}, { 1030000,    0,    0} },
			{1020000, { 2655759, -159211, 3576}, { 1030000,    0,    0} },
			{1122000, { 2730365, -161691, 3576}, { 1090000,    0,    0} },
			{1224000, { 2807010, -164171, 3576}, { 1090000,    0,    0} },
			{1326000, { 2885696, -166651, 3576}, { 1120000,    0,    0} },
			{1428000, { 2966422, -169131, 3576}, { 1400000,    0,    0} },
			{1530000, { 3049183, -171601, 3576}, { 1400000,    0,    0} },
			{1606500, { 3112179, -173451, 3576}, { 1400000,    0,    0} },
			{1708500, { 3198504, -175931, 3576}, { 1400000,    0,    0} },
			{1810500, { 3304747, -179126, 3576}, { 1400000,    0,    0} },
			{1912500, { 3395401, -181606, 3576}, { 1400000,    0,    0} },
			{      0, {       0,       0,    0}, {       0,    0,    0} },
		},
And yes, those 1.4V values are really odd, but it has been verified on the original Shield that the regulator actually went that high. That voltage jump between 1.3GHz and 1.4GHz makes little sense to me too.
 
Is that the table for Tegra 4 or K1? It seems to match the numbers you gave earlier and it goes up to a much higher frequency than I'd expect to even be populated for Tegra 4, so I assume K1.
 
Is that the table for Tegra 4 or K1? It seems to match the numbers you gave earlier and it goes up to a much higher frequency than I'd expect to even be populated for Tegra 4, so I assume K1.
Yes, sorry, I misread your post; edited above. In any case, the voltage drop is what you would expect from going from LP to HPm -> a ~125mV decrease.
 
Please do not blatantly misquote what I wrote. I wrote that it is not what it was hyped up to be, and if you were following this thread over the last 8 months or so it was a notion in the ballpark of 2-3x the perf/W over competing products,
As AMS already noted, Nvidia has not claimed those 2-3x perf/W gains; the perf/W advantage claimed by Nvidia is ~1.5x at ISO performance in the Manhattan offscreen test. Obviously it could be smaller or bigger depending on the test itself, though ISO performance isn't the best perf/watt case for a high-frequency design like Kepler; if you were to overvolt and clock something like an Adreno 420 to GK20A performance levels, you would probably get a 2-3x perf/W advantage for K1
 
As AMS already noted, Nvidia has not claimed those 2-3x perf/W gains; the perf/W advantage claimed by Nvidia is ~1.5x at ISO performance in the Manhattan offscreen test. Obviously it could be smaller or bigger depending on the test itself, though ISO performance isn't the best perf/watt case for a high-frequency design like Kepler; if you were to overvolt and clock something like an Adreno 420 to GK20A performance levels, you would probably get a 2-3x perf/W advantage for K1

as Nebuchadnezzar noted:

Are you for real? Xpea just tried to claim that it has 2.5x the performance at the same power versus the A7. Do I need to quote the post in a larger font size? This is exactly what people were saying.

Seriously, there is this magical thing called context...
 
I for one am not in the least put off as a consumer by the last two pages regarding the future HTC Volantis; then again, I don't expect Google to be as aggressive with the DVFS settings either.

IMHO both the Shield tablet and the MiPad are a bit too aggressive on the power settings side; a wee bit less wouldn't have hurt, and the device would still have been screamingly fast and damn hard for any Android competitor to catch up to.

Quite a long time ago I found it very hard to believe that NVIDIA managed to squeeze the entire Kepler featureset "as is" into the ULP mobile SoC form factor. When it became bleedingly obvious that they did, I stood corrected in public, while at the same time I received a "thank you, but I still want to see power consumption" reply from one of the members here.

We now have both performance and power consumption figures. Frankly, there are no "told you so's" for essentially anyone here, and I really don't see what the fuss is about.
 
Are you for real? Xpea just tried to claim that it has 2.5x the performance at the same power versus the A7. Do I need to quote the post in larger font size? This exactly what people were saying.

LOL, can you not even read your own posts? You clearly said "if you were following this thread over the last 8 months or so it was a notion in the ballpark of 2-3x the perf/W over competing products". That is utter nonsense. We have known since January 2014 that GPU perf. per watt at iso-performance level vs. A7 and S800 is ~ 50% better (and TK1 is fabricated on the same process node as S800 too). xpea made a comment one or two days ago (where he actually said "similar" power and not "same" power compared to A7), which is not necessarily correct, because NVIDIA never talked about power consumption other than at iso-performance.


You are suffering from cognitive bias

Right. I just directly quoted a paragraph from a reputable third party reviewer who actually played Trine 2 for two straight hours (rather than doing something unrealistic like looping a benchmark 150 times), and he clearly stated that the tablet was "relatively cool to the touch", presumably in the areas where his hands actually hold the tablet! Obviously the SoC itself will get quite warm to the touch (and the reviewer noted that there is a hot spot), but it is positioned in an area of the tablet that one would normally not hold with handheld gaming.
 
IMHO both the Shield tablet and the MiPad are a bit too aggressive on the power settings side; a wee bit less wouldn't have hurt, and the device would still have been screamingly fast and damn hard for any Android competitor to catch up to.

Why would anyone want a neutered device where the max performance of the SoC could never be obtained?

That capability may be needed by a future game, and leaving performance lower than it could be is downright dumb on a gaming tablet.

If you want less aggressive power, then enable the Nvidia software that caps performance.
 
Yes sorry I misread your post, edited above. In any case the voltage drop is what you would expect from going from LP to HPm -> ~125mV decrease.

Okay, so at least it's true that there should be a big perf/W improvement over Tegra 4; if all else were kept equal, K1 would use only ~70% of Tegra 4's power at 1.9GHz. Then, adding the uarch and layout improvements, it goes down even further. That's a pretty huge deal.

I think it's fair to say K1 just has a broader dynamic range, even for CPU perf; it's kind of hard to judge it based on how much power it consumes above 2GHz when none of the Android competition can really match performance at that level. I'd love to see some independent third-party measurements similar to nVidia's, measuring power consumption over broad performance curves.

I always thought Tegra 4 was using HPL, not LP, so I'm actually surprised to see the improvement is so high.
 
LOL, can you not even read your own posts? You clearly said "if you were following this thread over the last 8 months or so it was a notion in the ballpark of 2-3x the perf/W over competing products". That is utter nonsense. We have known since January 2014 that GPU perf. per watt at iso-performance level vs. A7 and S800 is ~ 50% better (and TK1 is fabricated on the same process node as S800 too). xpea made a comment one or two days ago (where he actually said "similar" power and not "same" power compared to A7), which is not necessarily correct, because NVIDIA never talked about power consumption other than at iso-performance.

Yup, and as mentioned earlier, the Jetson TK1 dev kit is not optimized for mobile platforms, and as a result consumes ~40% more power than a similarly performing TK1 SKU optimized specifically for mobile platforms. So a ~326 GFLOPS GPU throughput variant of TK1 (at a ~850MHz GPU clock) optimized for a mobile platform would have AP+mem power consumption of ~5W at max GFXBench 3.0 performance, which matches almost perfectly the 5W TDP that NVIDIA specified for Tegra K1.

Estimating TK1 perf. per watt on a mobile platform, and estimating AP+mem power consumption on different mobile form factors, will result in a reasonably good estimate of performance on said platforms.

At ~2.6W AP+mem power consumption (i.e. equivalent to the A7 in the iPhone 5S), TK1 in an optimized mobile platform is ~1.45x faster than the A7 and ~1.7x faster than the S800. => ~19fps

At closer to ~4W AP+mem power consumption, TK1 in an optimized mobile platform should be ~2x faster than the A7 and ~2.3x faster than the S800. => ~26fps

At closer to ~5W AP+mem power consumption, TK1 in an optimized mobile platform should be ~2.5x faster than the A7 and ~2.9x faster than the S800. => ~32fps

The above estimate of 2.5x vs. A7 is coincidentally very close to the number NVIDIA showed on a chart a few months ago. Also note that NVIDIA is using iOS 7.1 results, so they are comparing to the latest and greatest drivers for that platform.

Note that, as impressive as Tegra K1 appears to be, Tegra M1 in Erista will be ~1.6x faster than Tegra K1 at all power consumption levels!

There are various SKUs of Tegra K1, some of which are lower performance than Jetson TK1 (presumably for smartphones and small, thin tablets), and some of which are higher performance than Jetson TK1 (presumably for micro game consoles that can make use of active cooling). Note that the 365 GFLOPS throughput number was really only mentioned in comparison to other consoles such as the PS3 and Xbox 360.

TK1 as implemented in the Shield tablet turns out to be slightly below 32 FPS in Manhattan and well above 5W. Performance per watt is therefore not what you said it would be.
 
Why would anyone want a neutered device where the max performance of the SoC could never be obtained?


Because attaining max performance would require unacceptable compromises on other very important things like power consumption, temperature and battery life. And even then you still may not get to max given inherent limitations of the form factor.

You can argue for giving the user software knobs to crank performance to the limits of the cooling system but that's a very bad idea for a mass market device like a tablet.
 
TK1 as implemented in the Shield tablet turns out to be slightly below 32 FPS in Manhattan and well above 5W. Performance per watt is therefore not what you said it would be.

I was talking about the power consumption of the application processor + memory measured at the voltage rails! As far as I can tell, no reviewer has [publicly] measured the Shield tablet at the voltage rails, but the average power consumption of the AP + mem. in a benchmark such as GFXBench 3.0 Manhattan Offscreen should be below ~5W. According to NVIDIA, the average AP + mem. power consumption measured at the voltage rails is well below 2W at iso-performance levels of the A7 and S800.
 
Why would anyone want a neutered device where the max performance of the SoC could never be obtained?

That capability may be needed by a future game, and leaving performance lower than it could be is downright dumb on a gaming tablet.

If you want less aggressive power, then enable the Nvidia software that caps performance.

Since you obviously aren't a spokesperson for either NVIDIA or Google, I'd suggest you wait for the relevant Google device to ship and then we'll see. As for my own preferences, they aren't negotiable. I know what I want and for what reason, thank you.
 
A1xLLcqAgt0qc2RyMz0y said:
Why would anyone want a neutered device where the max performance of the SoC could never be obtained?
Well, in the CPU industry, Intel and AMD provide these kinds of chips to go into such obscure devices as "laptops" and "ultrabooks".

A1xLLcqAgt0qc2RyMz0y said:
leaving performance lower than it could be is downright dumb on a gaming tablet
Not when you don't have any competition.
 