Hardware utilization: PC vs console question

bgassassin · Jul 9, 2012

I hope this is the right board for this, but since it ties in with consoles I figured why not as it will get moved or closed if it's not. This comes from a discussion with a fellow B3D member elsewhere. :smile:

So is it really possible to estimate how much of a PC's hardware is utilized vs a console's? This is based in part on a tweet from Carmack that essentially says a PC's hardware is only used at about half of its power. He makes the comparison assuming a PC and a console with the same hardware. This is obviously down to things like APIs, the inability to focus on one hardware spec, the PC OS, and whatever else I'm forgetting to list. First, is it really possible to estimate how much of a PC's hardware power is utilized for a game? And if so, is it really 50%? That sounds like a lot.

tongue_of_colicab · Jul 9, 2012

Obviously a PC is using 100% of the hardware available as well, only some power gets "wasted" on the OS, APIs etc. So it's a question of which platform is utilizing its hardware more efficiently. That's something a PC just cannot win because it's an open box, and besides the OS, APIs etc., devs also have to keep in mind that their games will likely be running on at least 3 generations of hardware.

As Carmack wrote, most of it probably comes from single platform focus. I don't believe PC overhead vs consoles means a PC could do 50% more of whatever it's doing just by using more efficient/streamlined APIs etc. Most of it would come from having one single spec.

Kb-Smoker · Jul 9, 2012

We had a thread kinda about this already. There's a good post in that thread that goes with the Carmack quote.

http://forum.beyond3d.com/showthread.php?t=61841

Thread got locked btw

Originally Posted by egoless How many X should be removed from PC for the API overhead, inefficiency, and unoptimized references you mentioned?

All of those factors apply in different ways so you can't just add them up. API overhead would be fairly universal depending on which API you're using. Just a wild guess but maybe the PC loses ~30% performance on average thanks to the API. Possibly less with DX11.

Inefficiency and unoptimised are more or less the same thing. The level of optimisation that's performed for a particular hardware set determines how efficiently that hardware is used. If you want to add it all up then the highest estimate of relative console efficiency I've heard from a reliable source is 2x from Carmack. That no doubt takes into account the API overhead as well as the massive performance gains you can achieve from optimising your code for a specific hardware set.

From personal experience I'd say 2x is about right for reasonable ports later in a console's life. Obviously there will always be exceptions on both sides that break that ratio.

antwan · Jul 9, 2012

Also lack of optimization. Not because of PC not being "a single spec", but in general. I believe it has to do with the market/climate: why should developers work hard when the majority of PC gamers pirate anyway? Meanwhile the ones who do pay for games probably also buy extra GPUs, RAM, faster HDDs, ...
Not a single PC game out now feels like it really makes use of all the extra hardware. To me that is lack of optimization. They just port a console game for a little while and then call it a day; that's how PC development feels to me at the moment.

So Carmack is right in a way, but I strongly believe there is also a severe lack of optimization that Carmack is afraid to speak of.

Squilliam · Jul 9, 2012

Why optimise towards a continually improving target? If developers want a better game on a console they have to optimise, if they want a better game on the PC they just increase the min requirements.

bgassassin · Jul 9, 2012

To help add some perspective on this with a rough scenario, here is how it sounds to me. Let's say both the console and the PC have a 1 TFLOP GPU. Going by Carmack's tweet, it would seem that at best the console's GPU would have a max of, say, 900 GFLOPs used (it still deals with the same things, only much leaner) while the PC GPU only has 450 GFLOPs used. I'm not saying this is the case, before anyone believes that. I'm just trying to give an idea of how I'm seeing this to get a better answer.

Side note: I remember seeing that thread, but even saying "kinda" is pushing it IMO.
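For anyone who wants the scenario above spelled out, here is a minimal Python sketch of the arithmetic. The 90% and 45% utilization figures are only the illustrative numbers from the post, not measurements, and `effective_gflops` is a hypothetical helper name.

```python
def effective_gflops(peak_gflops: float, utilization: float) -> float:
    """Return the throughput actually used, given a peak and a 0..1 utilization."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    return peak_gflops * utilization


# Both machines are assumed to have the same 1 TFLOP (1000 GFLOP) GPU.
PEAK_GFLOPS = 1000.0

# Illustrative utilization figures taken from the post, not measured values.
console_used = effective_gflops(PEAK_GFLOPS, 0.90)
pc_used = effective_gflops(PEAK_GFLOPS, 0.45)

# The "2x" in the tweet corresponds to the ratio between the two.
ratio = console_used / pc_used

print(f"console: {console_used:.0f} GFLOPs used")
print(f"pc:      {pc_used:.0f} GFLOPs used")
print(f"ratio:   {ratio:.1f}x")
```

Note that neither figure can exceed the 1000 GFLOP peak; the sketch only expresses how much of the same peak each side is assumed to reach.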

antwan · Jul 9, 2012

Squilliam, that kind of attitude is the reason why PC is lagging behind when you compare the specs to the things that devs do with it.

GraphicsCodeMonkey · Jul 9, 2012

I think 'power' is a poor term, hardware utilisation would be a better term.

No game could possibly hope to achieve 100% utilisation, ever, fact. There will always be occasions when the CPU is waiting on memory, memory is loaded into the cache which isn't used (wasting bandwidth), SIMD units are unused, branch prediction goes wrong and we get a roll back, etc. Keeping all CPU pipes 100% utilised is near impossible even for the most optimal hand-coded assembly loops. We might get near for some very specific cases but this is extremely rare in practice.

Similarly on the GPU it is impossible to keep texture units, render target bandwidth and all the cores perfectly utilised all the time.

On consoles we have a much thinner API so less 'fat' between the game code and the metal and very little or no contention for the CPU/GPU resources.

Context switching CPU cores to perform other work is incredibly painful.

So without even trying, just by running on a console, your title is getting more out of the machine.

Then of course there is optimisation, we can optimise for the consoles because they are fixed specifications, because we get to know what the hardware is doing and because we get good tools to help with performance tuning. The real killer on PC's is having to spend time coding scalability into your title (particularly when this means different data) rather than spending that time optimising.

ERP · Jul 9, 2012

On the technical side trying to support a wide variety of hardware is really hard and introduces a lot of overhead, you spend your optimization dollars on your min-spec machine, not the just released mega machine. CPU/GPU balance is all over the board, often lower end machines have "fast" CPU's and terrible GPU's, and even if you were just targeting the high end, it's impossible to answer simple questions like "what is a high end PC".

But to my mind the real reason PCs fall short is money. With a few notable exceptions there just isn't the $ in the PC market to justify a $70M spend on a title aimed at high end PCs.

Kb-Smoker · Jul 9, 2012

Does anyone else think it's crazy Carmack can even post on Twitter?

To get him to limit himself to 140 characters is crazy..

Anyway, some good quotes from Carmack about this topic.

http://www.pcper.com/reviews/Editor...-Graphics-Ray-Tracing-Voxels-and-more/Intervi

Ryan Shrout: Focusing back on the hardware side of things, in previous years’ Quakecons we've had debates about what GPU was better for certain game engines, certain titles and what features AMD and NVIDIA do better. You've said previously that CPUs now, you don't worry about what features they have as they do what you want them to do. Are we at that point with GPUs? Is the hardware race over (or almost over)?

John Carmack: I don't worry about the GPU hardware at all. I worry about the drivers a lot because there is a huge difference between what the hardware can do and what we can actually get out of it if we have to control it at a fine grain level. That's really been driven home by this past project by working at a very low level of the hardware on consoles and comparing that to these PCs that are true orders of magnitude more powerful than the PS3 or something, but struggle in many cases to keep up the same minimum latency. They have tons of bandwidth, they can render at many more multi-samples, multiple megapixels per screen, but to be able to go through the cycle and get feedback... “fence here, update this here, and draw them there...” it struggles to get that done in 16ms, and that is frustrating.

Ryan Shrout: That's an API issue, API software overhead. Have you seen any improvements in that with DX 11 and multi-threaded drivers? Are those improving that or is it still not keeping up?

John Carmack: So we don't work directly with DX 11 but from the people that I talk with that are working with that, they (say) it might [have] some improvements, but it is still quite a thick layer of stuff between you and the hardware. NVIDIA has done some direct hardware address implementations where you can bypass most of the OpenGL overhead, and other ways to bypass some of the hidden state of OpenGL. Those things are good and useful, but what I most want to see is direct surfacing of the memory. It’s all memory there at some point, and the worst thing that kills Rage on the PC is texture updates. Where on the consoles we just say “we are going to update this one pixel here,” we just store it there as a pointer. On the PC it has to go through the massive texture update routine, and it takes tens of thousands of times [longer] if you just want to update one little piece. You start to advertise that overhead when you start to update larger blocks of textures, and AMD actually went and implemented a multi-texture update specifically for id Tech 5 so you can bash up and eliminate some of the overhead by saying “I need to update these 50 small things here,” but still it’s very inefficient. So I’m hoping that as we look forward, especially with Intel integrated graphics [where] it is the main memory, there is no reason we shouldn't be looking at that. With AMD and NVIDIA there's still issues of different memory banking arrangements and complicated things that they hide in their drivers, but we are moving towards integrated memory on a lot of things. I hope we wind up being able to say “give me a pointer, give me a pitch, give me a swizzle format,” and let me do things managing it with fences myself and we'll be able to do a better job.
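A rough way to picture the texture update point above is a cost model with a fixed overhead per API call plus a cost per byte. The sketch below is only an illustration in Python: the overhead and per-byte numbers are made up for the example, and `update_cost_us` is a hypothetical helper, not a real driver or API call.

```python
def update_cost_us(num_calls: int, bytes_per_call: int,
                   per_call_overhead_us: float, per_byte_us: float) -> float:
    """Total cost in microseconds: each call pays a fixed overhead plus a per-byte cost."""
    return num_calls * (per_call_overhead_us + bytes_per_call * per_byte_us)


# Hypothetical costs, chosen only to show the shape of the problem.
PER_CALL_OVERHEAD_US = 20.0   # assumed fixed API/driver cost per update call
PER_BYTE_US = 0.001           # assumed cost of actually moving one byte

# 50 tiny updates (one 4-byte pixel each), issued as 50 separate calls.
separate_us = update_cost_us(50, 4, PER_CALL_OVERHEAD_US, PER_BYTE_US)

# The same 200 bytes submitted through one batched call.
batched_us = update_cost_us(1, 50 * 4, PER_CALL_OVERHEAD_US, PER_BYTE_US)

print(f"separate calls: {separate_us:.1f} us")
print(f"one batch:      {batched_us:.1f} us")
```

With a large fixed cost per call, small updates are dominated by overhead, which is why batching many small updates into one call helps and why writing straight through a pointer would help even more.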

bgassassin said:
To help add some perspective on this with a rough scenario, here is how it sounds to me. Let's say both the console and the PC have a 1 TFLOP GPU. Going by Carmack's tweet, it would seem that at best the console's GPU would have a max of, say, 900 GFLOPs used (it still deals with the same things, only much leaner) while the PC GPU only has 450 GFLOPs used. I'm not saying this is the case, before anyone believes that. I'm just trying to give an idea of how I'm seeing this to get a better answer.

Side note: I remember seeing that thread, but even saying "kinda" is pushing it IMO.

I look at it as the console having 2 TFLOP of "pc power" and pc having 1 Tflop.

Again flops are a very bad measure of "power."

Yeah, that thread wasn't just about this, but by page 3 it was talking about this topic. That is where I got that quote from.

bgassassin · Jul 9, 2012

Thanks for the responses so far, but the answers are covering what I already have a decent understanding of. I have a good enough grasp of the "why". I'm trying to find out whether it's possible to estimate how poorly utilized PC hardware is vs a console. Because when I see Carmack's comment, I'm pretty much left to believe that, in a perfect scenario, the PS4 with its target GPU and no bottlenecks is almost on par with a PC using a 7970 with no bottlenecks to the GPU, because of how underutilized the PC's hardware would be.

Kb-Smoker said:
I look at it as the console having 2 TFLOP of "pc power" and pc having 1 Tflop. Again flops are a very bad measure of "power."

Yeah, that thread wasn't just about this, but by page 3 it was talking about this topic. That is where I got that quote from.

For those wondering, this is the person I had the discussion with. Still, you can't look at it as the GPU surpassing its theoretical target. That's not logical. It's just better utilized.

pjbliverpool · Jul 9, 2012

It's worth noting that when Carmack says 2x he's talking about DX9. DX11 will reduce that somewhat.

Also, that level of optimisation will only apply to games at least a couple of years into the console lifecycle because of the time it takes developers to optimise for console hardware. So I wouldn't expect a 1.8 TFLOP GPU in PS4 to be matching the 7970 on day 1. Two years down the line in newer games it might, but of course by then the 7970 will be mainstream level performance.

Finally, when we say PC's have half the efficiency of consoles that would only be at the console level graphics. i.e. it would take double RSX performance to achieve PS3 level visuals in a modern game. Once you start scaling the graphics up I expect PC games get far less efficient than that due to the lack of optimisation given over to graphics beyond the console level.

ERP · Jul 9, 2012

No, it's not possible, because it's not utilization of resources the way you're thinking about it.

Carmack is talking about a very specific use case, updating textures. In that one case there is an enormous overhead resulting in a speed penalty of several orders of magnitude; this is especially true if you intend to update only a portion of the texture.
The same is true to a lesser extent of most GPU level resources (index buffers/vertex buffers etc.).

In the more general case, it's not that simple, just because you're running on a PC your pixel or vertex shaders don't magically run at 1/2 speed.
The only thing that the API/driver overhead can do to hurt performance is to starve the GPU, if you are dynamically updating GPU resources this can certainly happen because of fences inserted by the driver in order to respect locks.

In practice, however, if you understand the restrictions the environment places on you, you can get similar utilization to consoles for the general "submit triangles and render them" use cases. You have to limit your batch counts and you have to be careful with resource locking, but unless you are trying to do something overly clever, % utilization can be similar.
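One simplified way to see the batch count point is to treat a frame as taking whichever is longer: the CPU time spent submitting draw calls or the GPU time spent rendering. The Python sketch below uses made-up per-draw costs for a "thin" and a "thick" API purely as assumptions, and `gpu_utilization` is a hypothetical helper.

```python
def gpu_utilization(draw_calls: int, cpu_us_per_draw: float,
                    gpu_frame_us: float) -> float:
    """Fraction of the frame the GPU is busy when CPU submission and GPU work overlap."""
    cpu_frame_us = draw_calls * cpu_us_per_draw
    # The frame cannot finish faster than the slower of the two sides.
    frame_us = max(cpu_frame_us, gpu_frame_us)
    return gpu_frame_us / frame_us


GPU_FRAME_US = 16000.0   # assumed GPU work per frame (about 16 ms)
THIN_API_US = 2.0        # assumed CPU cost per draw on a thin console-style API
THICK_API_US = 10.0      # assumed CPU cost per draw on a thicker PC-style API

for draws in (1000, 5000):
    thin = gpu_utilization(draws, THIN_API_US, GPU_FRAME_US)
    thick = gpu_utilization(draws, THICK_API_US, GPU_FRAME_US)
    print(f"{draws} draws: thin API {thin:.0%}, thick API {thick:.0%}")
```

Under these assumptions, both APIs keep the GPU fully busy at 1000 draws per frame, and only the thicker one starves the GPU at 5000. That is the sense in which keeping batch counts down lets utilization stay similar.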

What you can't do is tailor your art/design to a known quantity, and that is a huge disadvantage, but you can't quantify it in flops or as a percentage.

FWIW the last time I sat through an MS conference some 360 games did still optimize shaders by hand, which will buy you something, and it's something you wouldn't see on a PC. But outside of pathological cases where the compiler generates stupid code, it's not going to be a huge saving.

Kb-Smoker · Jul 9, 2012

pjbliverpool said:
It's worth noting that when Carmack says 2x he's talking about DX9. DX11 will reduce that somewhat.

He answered just that in my quote:

Ryan Shrout: That's an API issue, API software overhead. Have you seen any improvements in that with DX 11 and multi-threaded drivers? Are those improving that or is it still not keeping up?

John Carmack: So we don't work directly with DX 11 but from the people that I talk with that are working with that, they (say) it might [have] some improvements, but it is still quite a thick layer of stuff between you and the hardware. NVIDIA has done some direct hardware address implementations where you can bypass most of the OpenGL overhead, and other ways to bypass some of the hidden state of OpenGL. Those things are good and useful, but what I most want to see is direct surfacing of the memory. It’s all memory there at some point, and the worst thing that kills Rage on the PC is texture updates. Where on the consoles we just say “we are going to update this one pixel here,” we just store it there as a pointer. On the PC it has to go through the massive texture update routine, and it takes tens of thousands of times [longer] if you just want to update one little piece. You start to advertise that overhead when you start to update larger blocks of textures, and AMD actually went and implemented a multi-texture update specifically for id Tech 5 so you can bash up and eliminate some of the overhead by saying “I need to update these 50 small things here,” but still it’s very inefficient. So I’m hoping that as we look forward, especially with Intel integrated graphics [where] it is the main memory, there is no reason we shouldn't be looking at that. With AMD and NVIDIA there's still issues of different memory banking arrangements and complicated things that they hide in their drivers, but we are moving towards integrated memory on a lot of things. I hope we wind up being able to say “give me a pointer, give me a pitch, give me a swizzle format,” and let me do things managing it with fences myself and we'll be able to do a better job.

pjbliverpool said:
Also, that level of optimisation will only apply to games at least a couple of years into the console lifecycle because of the time it takes developers to optimise for console hardware. So I wouldn't expect a 1.8 TFLOP GPU in PS4 to be matching the 7970 on day 1. Two years down the line in newer games it might, but of course by then the 7970 will be mainstream level performance.

Finally, when we say PC's have half the efficiency of consoles that would only be at the console level graphics. i.e. it would take double RSX performance to achieve PS3 level visuals in a modern game. Once you start scaling the graphics up I expect PC games get far less efficient than that due to the lack of optimisation given over to graphics beyond the console level.

I agree 100%.

bgassassin said:
For those wondering, this is the person I had the discussion with. Still, you can't look at it as the GPU surpassing its theoretical target. That's not logical. It's just better utilized.

I never said it would "surpass its theoretical target."

First off, the whole debate was that the "PS4/nextbox would not be able to handle the Square Enix demo, UE4 demo and Star Wars 1313 because those games were running on a "high end PC." Even if they do, they will never look like that because these games were running on a 680 series card which the console could not match."
Which I said was untrue, because the given specs of the PS4 would be able to handle any PC game running on a GTX 680 and look about the same at console resolutions. I gave many reasons why, and look, they have been repeated many times in this thread....

So the real question was: given the PS4 specs, could it match a GTX 680 running a game?

bgassassin · Jul 9, 2012

ERP said:
No, it's not possible, because it's not utilization of resources the way you're thinking about it.

Carmack is talking about a very specific use case, updating textures. In that one case there is an enormous overhead resulting in a speed penalty of several orders of magnitude; this is especially true if you intend to update only a portion of the texture.
The same is true to a lesser extent of most GPU level resources (index buffers/vertex buffers etc.).

In the more general case, it's not that simple, just because you're running on a PC your pixel or vertex shaders don't magically run at 1/2 speed.
The only thing that the API/driver overhead can do to hurt performance is to starve the GPU, if you are dynamically updating GPU resources this can certainly happen because of fences inserted by the driver in order to respect locks.

In practice, however, if you understand the restrictions the environment places on you, you can get similar utilization to consoles for the general "submit triangles and render them" use cases. You have to limit your batch counts and you have to be careful with resource locking, but unless you are trying to do something overly clever, % utilization can be similar.

What you can't do is tailor your art/design to a known quantity, and that is a huge disadvantage, but you can't quantify it in flops or as a percentage.

FWIW the last time I sat through an MS conference some 360 games did still optimize shaders by hand, which will buy you something, and it's something you wouldn't see on a PC. But outside of pathological cases where the compiler generates stupid code, it's not going to be a huge saving.

Well see this is kb's fault because that perspective started with him.

And that's why I'm trying to get the proper understanding. He just posted roughly how the original debate started, but here is the original post, so you can see why I'm asking that question from that perspective.

http://www.neogaf.com/forum/showpost.php?p=39484743&postcount=5967

We have the specs for the PS4. It has a 1.86 TFLOP GPU. The highest end PC card out there is the Radeon HD 7970 at 3.79 TFLOPs.

In a closed box, John Carmack said you can double the performance of a GPU compared to a PC. So you have the best GPU out against the PS4 at 3.72 TFLOPs [2 x 1.86 TFLOPs]. This is from the API software overhead that you do not get on a console.
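For reference, the arithmetic in that quoted post can be reproduced in a few lines of Python. This only restates the post's own numbers; the 2x multiplier is the claim under debate in this thread, not an established figure, and the variable names are made up for the example.

```python
# Figures as given in the quoted post.
PS4_TFLOPS = 1.86
HD7970_TFLOPS = 3.79

# The disputed "closed box" multiplier attributed to Carmack's tweet.
CLAIMED_MULTIPLIER = 2.0

# "PC-equivalent" figure the post arrives at for the PS4 GPU.
ps4_pc_equivalent = PS4_TFLOPS * CLAIMED_MULTIPLIER

# Remaining gap to the HD 7970 under that assumption.
gap = HD7970_TFLOPS - ps4_pc_equivalent

print(f"PS4 'PC-equivalent': {ps4_pc_equivalent:.2f} TFLOPs")
print(f"gap to HD 7970:      {gap:.2f} TFLOPs")
```

The hardware itself still peaks at 1.86 TFLOPs; the multiplied figure is just a way of expressing the claimed efficiency difference.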

The ensuing debate led to them changing the thread title.

And with the last sentence I didn't feel like there was a dramatic benefit so you're confirming what I felt, but wasn't sure about due to no personal experience.

Kb-Smoker said:
I never said it would "surpass its theoretical target."

I didn't say you did. I'm saying you can't look at it from that perspective like when you said the GPU in my scenario was "2 TFLOPs" in your view.

Kb-Smoker · Jul 9, 2012

bgassassin said:
I didn't say you did. I'm saying you can't look at it from that perspective like when you said the GPU in my scenario was "2 TFLOPs" in your view.

I don't know what is hard to understand. We are talking about running console games vs PC games.

The console gains performance; the PC doesn't somehow lose power. For your example you have a 1 TFLOP PC GPU and a 1 TFLOP PS4. The PS4 would get around double the performance running a game built for it compared to a PC game. The PC cannot lose performance. Like he said, the PC hardware doesn't start running at 50%.

Think of it this way. You have 2 stock Mustangs, now you take one and dyno tune it. You still have the same engine but this improves the performance. The stock Mustang doesn't lose performance. Now consoles take this one step further: they design the system just to run games. Using the Mustang again, you remove the seats, radio, A/C and improve performance by reducing weight.

There is no debate that there are performance improvements in consoles. The only debate is how much, but even then there is no one answer. Not sure why you are so focused on this when I was talking about the next gen demos running at E3. I was using John Carmack as an example of how it was possible, not saying it's some golden rule.

Sonic · Jul 9, 2012

Kb-Smoker said:
I don't know what is hard to understand. We are talking about running console games vs PC games.

The console gains performance; the PC doesn't somehow lose power. For your example you have a 1 TFLOP PC GPU and a 1 TFLOP PS4. The PS4 would get around double the performance running a game built for it compared to a PC game. The PC cannot lose performance. Like he said, the PC hardware doesn't start running at 50%.

Think of it this way. You have 2 stock Mustangs, now you take one and dyno tune it. You still have the same engine but this improves the performance. The stock Mustang doesn't lose performance. Now consoles take this one step further: they design the system just to run games. Using the Mustang again, you remove the seats, radio, A/C and improve performance by reducing weight.

There is no debate that there are performance improvements in consoles. The only debate is how much, but even then there is no one answer. Not sure why you are so focused on this when I was talking about the next gen demos running at E3. I was using John Carmack as an example of how it was possible, not saying it's some golden rule.

I quite dislike your Mustang analogy. Why bother ripping out the AC, seats, and other weight adders like power windows when you can keep all this shit in there and still go faster with a little investment? I'd much rather pull up in a fast and functional Mustang than one that has taken comfort and shoved it out the window. It's pointless to me to tune a stock Mustang when I could spend an extra couple hundred bucks on performance upgrades and get the tune for free. And if the PS4 is the Mustang, then that would make the PC a freaking tank that is pure brute force and is faster than a Mustang. So the Mustang might be more efficient, but the tank makes up for it in brute power and ends up faster in any case. It's not a stock Mustang vs. a tuned Mustang argument... it's a stock tuned Mustang vs. a loaded tank with speed. Of course most PCs aren't like that, and will be like Matchbox cars compared to the Mustang at PS4 launch. Still, why do an apples to apples comparison when we can do an apples to oranges comparison?

Kb-Smoker · Jul 10, 2012

Sonic said:
I quite dislike your Mustang analogy. Why bother ripping out the AC, seats, and other weight adders like power windows when you can keep all this shit in there and still go faster with a little investment? I'd much rather pull up in a fast and functional Mustang than one that has taken comfort and shoved it out the window. It's pointless to me to tune a stock Mustang when I could spend an extra couple hundred bucks on performance upgrades and get the tune for free. And if the PS4 is the Mustang, then that would make the PC a freaking tank that is pure brute force and is faster than a Mustang. So the Mustang might be more efficient, but the tank makes up for it in brute power and ends up faster in any case. It's not a stock Mustang vs. a tuned Mustang argument... it's a stock tuned Mustang vs. a loaded tank with speed. Of course most PCs aren't like that, and will be like Matchbox cars compared to the Mustang at PS4 launch. Still, why do an apples to apples comparison when we can do an apples to oranges comparison?

Sure, if you're comparing a high end PC to a console, but I was comparing equal hardware. Like RSX vs 7800 GT running games. This is hard to do with benchmarks because the 7800 GT cannot even run modern games like BF3.

Again this was really about PS4 running the next gen demo at E3, it got twisted into this debate.

bgassassin · Jul 10, 2012

Kb-Smoker said:
Sure, if you're comparing a high end PC to a console, but I was comparing equal hardware. Like RSX vs 7800 GT running games. This is hard to do with benchmarks because the 7800 GT cannot even run modern games like BF3.

Again this was really about PS4 running the next gen demo at E3, it got twisted into this debate.

It wasn't twisted into this debate. You started it that way. You tried to compare different powered hardware (PS4 and demos on a PC using GTX 680), and to back it up used a tweet from Carmack comparing similar PC and console hardware.

Kb-Smoker said:
I don't know what is hard to understand. We are talking about running console games vs PC games.

The console gains performance; the PC doesn't somehow lose power. For your example you have a 1 TFLOP PC GPU and a 1 TFLOP PS4. The PS4 would get around double the performance running a game built for it compared to a PC game. The PC cannot lose performance. Like he said, the PC hardware doesn't start running at 50%.

Think of it this way. You have 2 stock Mustangs, now you take one and dyno tune it. You still have the same engine but this improves the performance. The stock Mustang doesn't lose performance. Now consoles take this one step further: they design the system just to run games. Using the Mustang again, you remove the seats, radio, A/C and improve performance by reducing weight.

There is no debate that there are performance improvements in consoles. The only debate is how much, but even then there is no one answer. Not sure why you are so focused on this when I was talking about the next gen demos running at E3. I was using John Carmack as an example of how it was possible, not saying it's some golden rule.

It's very easy to understand, but your explanations aren't logical. Your explanations try to make the console environment sound like it can exceed its capability. You're even trying to twist ERP's post to justify what you are saying, which really goes against the analogy you just made. I also agree with Sonic's analogy.

Andrew Lauritzen · Jul 10, 2012

Kb-Smoker said:
Think of it this way. You have 2 stock mustangs, now you take one and dyno tune it.

More like you take a Prius and tune it... then race it against the stock Mustang that no one bothered to tune because it's way faster than the Prius is ever going to get.

The whole topic is a bit silly TBH - outside of very specific cases like Carmack mentioning updating textures (where there's an abstraction penalty precisely because guess what, there's actually implementation differences!) you can't draw any general conclusions. Furthermore since almost no one bothers to optimize for PC (since frankly it's just a lot faster in the places that you'd typically optimize, even the day the new consoles come out), it's hard to compare the "speed of light" in both cases.

I've actually gotten a bit cynical about this entire argument lately since there's so many unsubstantiated comments flying around one way or another that are just outdated or untrue. Hell a lot of people on my twitter feed are just discovering DX11 (presumably finally moving to new console development) so I'm gonna go ahead and claim that the vast majority of game developers are not really even qualified to make a comment on this... again, excepting very specific use cases like Carmack's, but even he admitted to not having tried an API that has been out for years now.

There was talk of some of this a few months back and claims of how many draw calls or state changes could be done in one place or another, most of which turned out to be nonsense when Humus and I and a few others put them to the test on PC. Thus you can understand my cynicism to this entire discussion.

Let's just get to the heart of this - what exactly are you trying to do/figure out here? Because the question is ill-formed, and it makes it sound like you have some sort of agenda that you're just trying to justify with cherry-picked "facts". If that's not true, great, but please enlighten me to the end goal here.
