nVidia Project Th-.. Shield (Tegra4)

According to a Heise news article about Project Shield, the console needs between 4 and 8 watts when running a game, so not that far from the 9 W figure.

At GTC 2013, NVIDIA claimed that Tegra 4 has a TDP of 5 W (while Tegra 4i has a TDP of 1 W), but it is unclear exactly how those numbers were determined.
 
Yeah, but I'm also willing to bet most consumers wouldn't buy a Shield anyway :)

Maybe so, but at least those who do buy it will have a better gaming experience than those who game on a smartphone (especially if they also have a Kepler-equipped PC to stream from), without actually draining their smartphone battery in the process.
 
It doesn't matter what frequency plane the cores are on if the cores are off. If mobile games don't know what to do with > 2 cores then those last two cores will be idle most of the time and can be power gated for a majority of that time. The fact that Cortex-A15 uses a shared inclusive L2 means that turning the cores on and off should be relatively low latency since they just have to save internal state and push L1 to L2. Those other cores aren't going to need high frequency on and off switching. They'll probably tend to go to sleep and wake up once per frame, so can handle latency even as high as a few ms.
You're under the impression that things actually work like that; in reality, the CPUIdle drivers are much dumber than that, and unless Nvidia invested some engineering effort on the kernel front, things aren't that optimal. Power gating is still done via hot-plugging on most implementations, and that is nowhere near a 20 ms latency. Power collapse on the 5250 via CPUIdle, for example, is also only done by synchronized shutdown of all cores at "runtime", i.e. not hot-plugging.

I don't have any info on the newer chips such as Tegra 4, but if they're anywhere near as backwards as before, then reality is very far from what you describe here.

The scheduler is also extremely dumb, so it wouldn't know to keep a single core up at, say, 1600 MHz versus two cores at 800 MHz, that is, if the thread load even allowed the threads to be spread around evenly. All you need is one heavy thread loading a single core and a few mediocre ones keeping the runqueue depth high, and you're already in the situation I described earlier.
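
To make the hot-plugging point concrete, here's a minimal userspace sketch of what the hotplug path involves on a stock Linux kernel: writing to the standard CPU hotplug sysfs node and timing the round trip. The core number and the idea of timing it from userspace are just illustrative; the point is that offlining a core goes through the full hotplug machinery (migrating tasks, tearing down per-CPU state, notifier chains), which is why the latency ends up in the tens-of-milliseconds range rather than something you can do per idle period.

Code:
    /* Offline and re-online a secondary core via the standard Linux CPU
     * hotplug sysfs interface and time the round trip.  Illustrative only:
     * the core number is arbitrary and this needs root. */
    #include <stdio.h>
    #include <time.h>

    static int set_cpu_online(int cpu, int online)
    {
        char path[64];
        snprintf(path, sizeof(path), "/sys/devices/system/cpu/cpu%d/online", cpu);
        FILE *f = fopen(path, "w");
        if (!f)
            return -1;
        fprintf(f, "%d\n", online);
        return fclose(f);
    }

    int main(void)
    {
        struct timespec t0, t1;
        clock_gettime(CLOCK_MONOTONIC, &t0);
        if (set_cpu_online(3, 0) || set_cpu_online(3, 1)) {
            perror("cpu hotplug");
            return 1;
        }
        clock_gettime(CLOCK_MONOTONIC, &t1);
        double ms = (t1.tv_sec - t0.tv_sec) * 1e3 +
                    (t1.tv_nsec - t0.tv_nsec) / 1e6;
        printf("offline+online of cpu3 took %.1f ms\n", ms);
        return 0;
    }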
 
You're right; I was under the impression that the kernel didn't suck this much at this, sorry if I'm mistaken :( With single-threaded performance being pushed as hard as it is versus four cores, this really needs to be improved. I wonder how well Apple handles it on iOS.
 
No, you don't try to peg four 1.9 GHz Cortex-A15 cores just because they're there, regardless of how much nVidia is paying you. Most "Shield-specific" optimizations will be about raising the bar on graphics features, not driving CPU utilization to the max. If they're really developing for Shield, making a game that doesn't kill your battery is a consideration.

One thing to consider (which Linley did not) is that I'm pretty sure Tegra 4 won't even let you run all four cores at 1.9 GHz. I don't recall if the actual limit has been mentioned, but this is standard for nVidia.

Tegra 4 makes separate design decisions to handle separate use cases. There are Cortex-A15 cores that clock up to 1.9 GHz to address latency-sensitive uses where you need something to complete ASAP. There are four cores to balance independent thread execution and leverage use cases that are inherently parallel but not as latency-sensitive (like decompressing a large file). And there's a big GPU for games.

But just because all of these things are there doesn't mean that there was an intention to max out all of these resources simultaneously.
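
To illustrate the "inherently parallel but not latency-sensitive" case above, here's a minimal pthreads sketch of fanning independent chunks of work out across a handful of cores. decompress_chunk() is a purely hypothetical placeholder (this assumes the data is stored as independently compressed chunks), and the chunk/thread counts are arbitrary; the point is only that this kind of work doesn't care which core finishes when, so it can tolerate cores coming and going.

Code:
    /* Spread independent chunks of work across a few worker threads.
     * decompress_chunk() is a hypothetical stand-in for the real per-chunk
     * work; chunk and thread counts are arbitrary. */
    #include <pthread.h>
    #include <stdio.h>

    #define NUM_CHUNKS  64
    #define NUM_THREADS 4

    static void decompress_chunk(int chunk)
    {
        (void)chunk;   /* real code would inflate this chunk's data here */
    }

    static void *worker(void *arg)
    {
        int tid = (int)(long)arg;
        /* static partitioning: thread t handles chunks t, t+NUM_THREADS, ... */
        for (int c = tid; c < NUM_CHUNKS; c += NUM_THREADS)
            decompress_chunk(c);
        return NULL;
    }

    int main(void)
    {
        pthread_t threads[NUM_THREADS];
        for (long t = 0; t < NUM_THREADS; t++)
            pthread_create(&threads[t], NULL, worker, (void *)t);
        for (int t = 0; t < NUM_THREADS; t++)
            pthread_join(threads[t], NULL);
        printf("all %d chunks done\n", NUM_CHUNKS);
        return 0;
    }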

Actually, 9 W at peak for Tegra 4 would be pretty good considering the Exynos 5250 peaks at 8 W under combined CPU & GPU load before throttling is triggered, and that's with only two A15 cores at 1.7 GHz: http://www.anandtech.com/show/6536/arm-vs-x86-the-real-showdown/13

My very crude guess is you could see all 4 cores peak at 1.9 GHz for short bursts, but not as sustained performance over a long period of time. And like you already said, most games are GPU-bound, not CPU-bound.
 
No one said 9 W at theoretical peak; french toast implied 9 W while running real games.

Anandtech's scenario was running a game and then running Coremark in the background. That's not simulating a real load. That's pathological. Their power consumption numbers show that the game isn't coming anywhere close to heavily loading the CPU. This won't change with Tegra 4. Even if the OS scheduling is bad at dealing with powering off cores, clock gating will still bring down a large amount of the power consumption, even if there's a "heavy" thread driving the clocks up decently.

I doubt you'll be physically able to use all four cores at a peak of 1.9 GHz, because that's how nVidia did Tegra 3. These aren't Intel or AMD CPUs that have automatic turbo capabilities regulated by on-chip logic. They're not going to trust the OS to reliably regulate it for short bursts only.
 
No one said 9 W at theoretical peak; french toast implied 9 W while running real games.

Anandtech's scenario was running a game and then running Coremark in the background. That's not simulating a real load. That's pathological. Their power consumption numbers show that the game isn't coming anywhere close to heavily loading the CPU. This won't change with Tegra 4. Even if the OS scheduling is bad at dealing with powering off cores, clock gating will still bring down a large amount of the power consumption, even if there's a "heavy" thread driving the clocks up decently.

I doubt you'll be physically able to use all four cores at a peak of 1.9 GHz, because that's how nVidia did Tegra 3. These aren't Intel or AMD CPUs that have automatic turbo capabilities regulated by on-chip logic. They're not going to trust the OS to reliably regulate it for short bursts only.

Actually, it's not as pathological as you may think. It's throttling the CPU as well when only running the game. It's not quite on the level of the synthetic test they did with the multitasking situation, but if you look at only the game run...

http://www.anandtech.com/show/6536/arm-vs-x86-the-real-showdown/12

You'll see that the CPU spikes up to 2.5 W but then throttles back to 0.5-1.0 W whenever the GPU requires 3.5-4.0 W. Basically the same behavior as you have with the synthetic multitasking case. It's not as extreme in this case, but it does show that the CPU is generally throttled in favor of the GPU in order to maintain a 4-5 W TDP, which is what the synthetic test was showing.

But you also see this in the synthetic test...

http://www.anandtech.com/show/6536/arm-vs-x86-the-real-showdown/13

If you look only at the Modern Combat section (with the yellow bar), you'll notice that the CPU spikes up to 4 W but is rapidly throttled when the GPU ramps up to 4 W. Any time the CPU spikes up, the GPU is simultaneously reduced. Or it could be that any time GPU load is reduced, the CPU is allowed to spike up, as it is no longer as heavily throttled.

So, it's true that the hardware itself has a max TDP of about 8 W, but you will almost never see that, as the CPU and/or GPU are generally heavily throttled whenever one or the other approaches the throttle limit.
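
As a rough sketch of the kind of budget-based arbitration that behavior implies, here's a toy governor that keeps CPU power plus GPU power under a fixed cap and gives the GPU priority, taking the headroom away from the CPU. All of the numbers, the function name, and the priority choice are assumptions made for illustration; the real driver/firmware logic isn't public.

Code:
    /* Toy power-budget governor: clamp requested CPU and GPU power so the
     * total stays under a fixed cap, giving the GPU priority.  All numbers
     * are illustrative guesses; real throttling logic is vendor-specific. */
    #include <stdio.h>

    #define TDP_W     5.0   /* sustained budget the trace suggests (~4-5 W) */
    #define CPU_MIN_W 0.5   /* floor so the CPU is never fully starved */

    static void apportion(double cpu_req, double gpu_req,
                          double *cpu_out, double *gpu_out)
    {
        double gpu = gpu_req;
        if (gpu > TDP_W - CPU_MIN_W)
            gpu = TDP_W - CPU_MIN_W;   /* GPU can't take the whole budget */
        double cpu = cpu_req;
        if (cpu > TDP_W - gpu)
            cpu = TDP_W - gpu;         /* CPU gets whatever is left over */
        *cpu_out = cpu;
        *gpu_out = gpu;
    }

    int main(void)
    {
        double cpu, gpu;
        apportion(2.5, 4.0, &cpu, &gpu);   /* the kind of spike seen in the trace */
        printf("CPU allowed %.1f W, GPU allowed %.1f W\n", cpu, gpu);
        return 0;
    }

With those made-up numbers, a 2.5 W CPU request gets cut back to 1.0 W the moment the GPU asks for 4 W, which is roughly the shape of the trace above.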

So it's not that the game doesn't need more CPU power than it uses; rather, it isn't allowed to use as much CPU power as a developer might want if they also want to push the GPU.

Hence, you'll never really see AAA-level games on mobile devices that push the CPU while simultaneously pushing the GPU.

Something like Crysis 3 or StarCraft 2, for example, could never exist on something like this, no matter how theoretically powerful it is, because both push the CPU and GPU fairly equally. In a case like this, I'd imagine either the CPU or the GPU would get throttled down, or both would get throttled to half their theoretical performance. Imagine what would happen with these games if the CPU were throttled every time the GPU was pushed. It'd be a disaster.

Which then makes you realize why there are no good benchmarks. If you bench the CPU and then bench the GPU separately, that gives no indication of what the hardware will actually do, because if you push both, the combined CPU and GPU performance won't be able to reach the theoretical limits. And if you bench both together, then it comes down to: which hardware throttles the CPU and/or GPU preferentially? How quickly does it throttle? Does it deal gracefully with having to throttle both? Etc.

If hardware from IHV X throttles the GPU more heavily than the CPU while hardware from IHV Y does the opposite, what do the benchmarks even mean when the two look theoretically equal if you bench the CPU and GPU separately? Does it even reliably equate to application performance? If App C uses the GPU slightly more than the CPU while App D uses the CPU slightly more than the GPU, they'll perform completely differently on IHV X hardware versus IHV Y hardware.
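
One way to make that concrete would be to bench the CPU alone, the GPU alone, and then both together, and report how much of the standalone scores survive under combined load. The "retention" metric and the sample scores below are entirely made up for illustration; no standard benchmark does this, which is rather the point.

Code:
    /* Hypothetical "combined-load retention" metric: what fraction of the
     * standalone CPU and GPU scores a SoC keeps when both are loaded at
     * once.  The sample scores are invented for illustration. */
    #include <stdio.h>

    static double retention(double cpu_alone, double gpu_alone,
                            double cpu_combined, double gpu_combined)
    {
        /* average fraction of standalone performance kept under combined load */
        return 0.5 * (cpu_combined / cpu_alone + gpu_combined / gpu_alone);
    }

    int main(void)
    {
        /* IHV X throttles the GPU harder, IHV Y throttles the CPU harder;
           both look identical in the separate benchmarks. */
        printf("IHV X: %.0f%% retained\n", 100.0 * retention(100, 100, 90, 60));
        printf("IHV Y: %.0f%% retained\n", 100.0 * retention(100, 100, 60, 90));
        return 0;
    }

Both come out at 75% here, yet an app that leans on the GPU would run much better on IHV Y's part and vice versa, which is exactly why separate CPU and GPU numbers (or even a single combined number) don't tell you much.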

Things like that don't happen for PCs unless the hardware is just really bad (CPU/GPU throttling) or people are overclocking way too much. Hence benchmarks on PC can reliably be used to gauge performance across different types of hardware.

Regards,
SB
 
Meh. Played with it at PAX East. It looks nice, but it's a weird product.
It looks like crap, TBH. The most toilet-like gaming product I've seen since the Konix Multisystem, and that was back in like 1988...

I sure won't touch one, ever. I couldn't bring myself to, even if Jen-Hsun himself paid me real money.
 
Actually, it's not as pathological as you may think. It's throttling the CPU as well when only running the game. It's not quite on the level of the synthetic test they did with the multitasking situation, but if you look at only the game run...

http://www.anandtech.com/show/6536/arm-vs-x86-the-real-showdown/12

You'll see that the CPU spikes up to 2.5 W but then throttles back to 0.5-1.0 W whenever the GPU requires 3.5-4.0 W. Basically the same behavior as you have with the synthetic multitasking case. It's not as extreme in this case, but it does show that the CPU is generally throttled in favor of the GPU in order to maintain a 4-5 W TDP, which is what the synthetic test was showing.

Why do you think that the CPU is being throttled in this case? A more likely scenario is that the big spike at the start is because the CPU is actually doing constant work to load the game, while it doesn't yet have much to give the GPU. If it were throttling later on, you'd see it continually ramp down every few seconds, like you do when he started running Coremark. Instead we see one spike up to 2.5 W lasting two seconds and then relatively flat periods for long durations. When Coremark is loaded, the CPU is constantly ramping up to 4 W and then being pushed down every 5 seconds.

Your argument desperately needs evidence that the game was both CPU- and GPU-limited at the same time. If that were really the case, loading Coremark wouldn't cause the SoC to draw power much more aggressively; in fact, it'd result in lower power consumption, because fewer cycles would be allotted to the game, which would then drop its frame rate and no longer be GPU-limited.
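
For what it's worth, the distinction I'm drawing can be put in mechanical terms: a throttled trace shows repeated ramp-up/forced-ramp-down cycles, while a load transient shows one rise and fall. Here's a quick sketch of counting such events in a sampled power trace; the sample values and thresholds are invented and only meant to show the shape of the test, not Anandtech's actual data.

Code:
    /* Count ramp-down events in a sampled power trace: the number of times
     * power falls by more than DROP_W between samples after having been
     * above HIGH_W.  Sample values and thresholds are invented. */
    #include <stdio.h>

    #define HIGH_W 2.0
    #define DROP_W 1.0

    static int count_rampdowns(const double *w, int n)
    {
        int events = 0;
        for (int i = 1; i < n; i++)
            if (w[i - 1] > HIGH_W && w[i - 1] - w[i] > DROP_W)
                events++;
        return events;
    }

    int main(void)
    {
        /* one startup spike, then flat: looks like a load transient */
        double game_only[]     = { 0.6, 2.5, 2.4, 0.8, 0.7, 0.8, 0.7, 0.8 };
        /* repeated spike-and-cut pattern: looks like active throttling */
        double with_coremark[] = { 1.0, 3.8, 1.2, 3.9, 1.1, 4.0, 1.0, 3.9 };

        printf("game only:     %d ramp-downs\n", count_rampdowns(game_only, 8));
        printf("with coremark: %d ramp-downs\n", count_rampdowns(with_coremark, 8));
        return 0;
    }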
 
Why do you think that the CPU is being throttled in this case? A more likely scenario is that the big spike at the start is because the CPU is actually doing constant work to load the game, while it doesn't yet have much to give the GPU. If it were throttling later on, you'd see it continually ramp down every few seconds, like you do when he started running Coremark. Instead we see one spike up to 2.5 W lasting two seconds and then relatively flat periods for long durations. When Coremark is loaded, the CPU is constantly ramping up to 4 W and then being pushed down every 5 seconds.

Your argument desperately needs evidence that the game was both CPU- and GPU-limited at the same time. If that were really the case, loading Coremark wouldn't cause the SoC to draw power much more aggressively; in fact, it'd result in lower power consumption, because fewer cycles would be allotted to the game, which would then drop its frame rate and no longer be GPU-limited.

That isn't the only time the CPU spikes up, however.

Looking at the game-only portion of the test on the next page is even more telling. Any time the CPU sees an increase in power consumption, the GPU sees an almost equal reduction. The stretch from 23.4 seconds to ~58 seconds shows that quite well. We can assume that all assets are fully loaded at this point. Perhaps it's just a coincidence.

Either way, it still doesn't address the fact that a well-implemented game can push both CPU and GPU relatively equally, although some games (MMOs, for instance) tend to push the CPU much harder, while others (shooters with very little physics and little to no enemy AI) push the GPU much harder.

If Project Shield is expected to natively run AAA-style games, then either it has to allow both GPU and CPU to be potentially maxed out simultaneously, or it just won't be able to run certain AAA-style games.

For merely streaming a game that's running on another device, that obviously won't be a problem.

Regards,
SB
 
What I said about GPU and CPU switching priorities applies to more than just the startup. It isn't a coincidence. Any time the test switches areas or anything like that, you're liable to see a transition from GPU usage to CPU usage.

The power profile looks NOTHING like the profile where it's clearly throttling. There's no reason it'd look so different if it were throttling in both cases. There's no reason it'd throttle before letting it hit nearly as high a threshold. It makes no sense.

Games almost never push CPU and GPU almost equally. It has nothing to do with being well implemented or not.
 
Games almost never push CPU and GPU almost equally. It has nothing to do with being well implemented or not.

In the PC and especially the console space it's a lot more common than you think.

For PC it's understandable that a Core i7 may not have its cores pegged, since the game also has to run on a Core i3 or Pentium, nor can developers necessarily depend on a CPU having more than 2 cores able to handle 2 threads at a time. For console games, however, there is no need to factor in lower-power CPUs, and hence a game designed for consoles is likely to push the CPU and GPU equally.

Even with the above, you still see games that push both relatively equally on PC, although with most games it's going to depend on what CPU you have.

Project Shield is being designed as a console with fixed hardware. Assuming games are developed to run natively on it, then it's far more similar to the console space. Either a game will be developed to push both equally and hence offer AAA levels of gameplay and graphics, or it'll be designed to mobile SoC specifications, which means sacrifices to gameplay or graphics, as you can't push both the CPU and GPU simultaneously without having one or the other throttled. Or Project Shield will allow both the CPU and GPU to be maxed out, which means a high TDP. The last situation is what I would expect if Nvidia is serious about this competing as a portable console.

Regards,
SB
 
It might be designed that way, but expectations of that actually occurring seem low. It's an Android player until proven otherwise, IMO.
 
No matter how Project Shield is designed, it's not going to get a bunch of games developed specifically for it. Some enhancements, maybe, but not a hefty development effort that maximizes its resources. And regardless of what nVidia claims, they really aren't serious about competing as a portable console.

And to make this clear: even if something is CPU- and GPU-limited simultaneously (hard to do; it's not just a matter of making the best of your hardware but of very carefully balancing the two), it doesn't mean that every core is going to be pegged. That balance is even harder to strike. There's a huge difference between one core at full load and four, even if the others aren't power gated and the frequency is kept at max. Clock gating still makes a huge difference.
 
No matter how Project Shield is designed, it's not going to get a bunch of games developed specifically for it. Some enhancements, maybe, but not a hefty development effort that maximizes its resources. And regardless of what nVidia claims, they really aren't serious about competing as a portable console.

In which case I see even less chance of Project Shield being even a minor success, and even more chance that it's just a marketing avenue for Tegra 4.

Regards,
SB
 
In which case I see even less chance of Project Shield being even a minor success, and even more chance that it's just a marketing avenue for Tegra 4.

Regards,
SB

This.
I can't see Shield as something to worry Sony; with no dedicated games it's no different from a smartphone plus a MOGA-type controller... at considerable cost.

It also looks incredibly ugly and uninnovative... like a bulky Xbox 360 controller with a screen tacked on (how many buttons!?)... an opinion shared by more than just me, I expect.
 
It also looks incredibly ugly and uninnovative... like a bulky Xbox 360 controller with a screen tacked on (how many buttons!?)... an opinion shared by more than just me, I expect.
It's the ugliest gaming device I've ever seen, I think. It's huge, yet has a tiny screen compared to the total size of the device. The design is incredibly busy, a total mess of curves and angles and lines in a complete jumble. Also, being a clamshell means it's thick and unwieldy. Clamshell devices always bulk up extra in thickness; you'd need a bag to put this bastard in when you're out and about. Totally impractical. A Nintendo (3)DS(i) goes in your pocket. An iPhone goes in your pocket.

This thing's gonna bomb so hard if it's ever actually launched. It's totally, completely dead on arrival. Shit, before arrival.
 