Does PS4 have excess graphics power intended for compute? *spawn*

I'm not saying that what you're saying can't be true, but if it is, it raises some questions about the design decisions behind Hawaii and, in fact, almost every other GCN GPU out there, since they are all generally more CU-heavy relative to other resources than the PS4 is. Far from being CU-heavy, as this thread claims, the PS4 is actually one of the lightest-CU GCN designs. Is it really reasonable to expect that Pitcairn, launched over two years ago, was more compute-focused than the PS4?

Isn't one of the easiest solutions just wondering if it's CPU limited?

I was looking at AnandTech's Beema/Mullins review, which features the Puma cores, the successor to Jaguar. This 4-core, ~4.5 W part that goes in tablets can easily outpace 4 Jaguar cores in single-threaded loads (partly because it can clock up to 2.2 GHz, versus a lowly 1.6 GHz for the PS4 CPU).

Basically, we probably forget just how lightweight these consoles' CPUs are.

[AnandTech single-threaded CPU benchmark chart]


It would have been nice to get that Puma part in consoles just for the higher clockspeed.
 
How much performance increase does a 290X give you over a GPU equivalent to the PS4's, running PC benchmarks?
Slightly above 2x the fps. In BF4 using Mantle and High settings (the same as the PS4), which puts more strain on the CPU, you could get close to 3x the fps.

However, getting more frames is actually harder than pushing more graphics/pixels; it is not linear. It takes more than double the hardware resources to produce double the frame rate, and high-end GPUs need good CPUs too.

It might be that the console CPUs are so slow that they bottleneck the GPU utilization rate. I was hoping to hear an actual developer's take on this from an actual game (someone like sebbi), to see if they have trouble with the CPU restricting the GPU's potential.
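A back-of-the-envelope sketch of why doubling GPU resources doesn't double the frame rate (the per-frame millisecond figures below are made-up assumptions for illustration, not measurements from any game):

```python
# Toy frame-time model: each frame needs some CPU work and some GPU work.
# Speeding up only the GPU leaves the CPU portion untouched, so fps scales
# sub-linearly (assumes no CPU/GPU overlap, which real engines do have).

def fps(cpu_ms, gpu_ms, gpu_speedup=1.0):
    frame_ms = cpu_ms + gpu_ms / gpu_speedup
    return 1000.0 / frame_ms

cpu_ms, gpu_ms = 10.0, 23.0               # hypothetical ~30 fps console frame
print(round(fps(cpu_ms, gpu_ms), 1))      # ~30 fps baseline
print(round(fps(cpu_ms, gpu_ms, 2.0), 1)) # 2x the GPU -> only ~47 fps
print(round(fps(cpu_ms, gpu_ms, 3.0), 1)) # 3x the GPU -> ~57 fps; the CPU is the wall
```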
 
Isn't one of the easiest solutions just wondering if it's CPU limited? [...] Basically, we probably forget just how lightweight these consoles' CPUs are.

The CPU is the main bottleneck, and the main reason behind the "CU balance" or "extra CUs" topic.

Another aspect of this is the reason why Infamous SS (which has incredible graphics and IQ) presents one of the emptiest, most lifeless sandboxes in years (PS2-era games like Vice City were miles ahead) and is, after all, also a much more constricted open-world game.
 
But as to where the 14+4 comes from and its meaning, what I have said is definite fact.

As I understand it, one of the devcon slides showed a graph of performance vs ALU utilisation (or similar) with a knee in the performance curve, beyond which there is a significant drop-off in the value of additional ALU resources for graphics...
Only when using that (the tested) approach to graphics. By using compute and different graphical algorithms, the workloads and requirements shift. So we end up repeating ourselves - there's no such thing as hardware balance when it comes to software, as software will always adapt to fit the hardware. It is impossible to provide developers with programmable hardware and have them not use it for graphics if they so choose, and the nature of compute means it'll be very effective for lots of workloads, so it's not as if those 4 CUs will go unused for graphics on the grounds that it's highly inefficient.

If devs really wanted, they could create a noughts-and-crosses game with raytraced visuals, using pretty much all of the CPU and GPU for graphics. And if they chose, they could create a super-complex simulation with simple 2D visuals and turn the CPU and GPU almost entirely to calculating AI and physics.

Cerny's remark is basically irrelevant and doesn't need interpreting. PS4 has 18 CUs, 8 (6 available) CPU cores, 8 GB (4.5-5 available for games) of 176 GB/s RAM, an audio encoder/decoder chip, and whatever else. That's the hardware for devs to use however they want. There are few hard-wired limits (I'm guessing the audio is pretty hard-wired for audio decompression, but a clever dev could still probably pack compressed graphics data in there somehow if they wanted!).
 
[...] That's the hardware for devs to use however they want. There are few hard-wired limits (I'm guessing the audio is pretty hard-wired for audio decompression, but a clever dev could still probably pack compressed graphics data in there somehow if they wanted!).

Is it really possible? This could be an interesting idea for both PS4 and XB1.

But is it possible for third-party developers to go this deep on specific hardware? If, for a specific PS4 game, a developer wants to use 80% of the ALUs for traditional graphics rendering and the remaining 20% for compute (or 70/30, or 90/10 - those are just examples), what happens on other platforms? Is it possible to have a lead platform and still do this kind of optimization for each platform, especially the PC version?
 
[...] Another aspect of this is the reason why Infamous SS presents one of the emptiest, most lifeless sandboxes in years (PS2-era games like Vice City were miles ahead) and is, after all, also a much more constricted open-world game.

Surely there's a limit to your hyperbolic statements?
 
Is it really possible? This could be an interesting idea for both PS4 and XB1.
Not really. The audio hardware is there to provide the necessary audio aspect of a game. Repurposing it for graphics would require doing less audio, and there's plenty of capability in the rest of the HW for the visuals.

But is it possible for third-party developers to go this deep on specific hardware? If, for a specific PS4 game, a developer wants to use 80% of the ALUs for traditional graphics rendering and the remaining 20% for compute (or 70/30, or 90/10), what happens on other platforms?
CUs don't work that way. You can't partition some of them off to do specific work. You just send work to the GPU and it spreads the workload across the available CUs. For different platforms, you scale the amount of work you want doing, so on weaker hardware you build the game with simpler graphics. For PC, with its flexible HW, you either scale the code dynamically or let the user decide what level of work to use. Note that any PC can scale the graphics up to Super-Ultra level and the work will, as ever, be dealt with by the GPU as best it can. It'll just be rather slow on some GPUs. ;)
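To illustrate that point, here is a deliberately simplified sketch of how a dispatch gets spread over whatever CUs exist, with nothing addressing a particular CU (purely illustrative; real GCN wavefront scheduling is far more involved and is handled by the hardware, not the developer):

```python
# Toy model: a dispatch is split into wavefronts and handed to whichever CU
# has the least work queued; the developer never targets CU 0..17 directly.

def run_dispatch(num_wavefronts, num_cus):
    cu_load = [0] * num_cus                 # wavefronts assigned per CU
    for _ in range(num_wavefronts):
        idx = cu_load.index(min(cu_load))   # next CU with spare capacity
        cu_load[idx] += 1
    return cu_load

print(run_dispatch(90, 18))  # PS4-like: the work spreads itself over 18 CUs
print(run_dispatch(90, 12))  # weaker GPU: same workload, each CU just does more
```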

For a cross-platform game, design may well favour the lowest common denominator, so the game will be tailored to XB1 and then the simpler aspects scaled up for PS4 and even further for PC. A game that targets a taxing level of graphics on PS4, with a portion of GPU time given to non-graphics compute, will have trouble fitting on XB1: either the graphics will have to be reduced to free up the same compute time, or the compute aspect will be reduced, or some compromise of the two.
 
How much performance increase does a 290X give you over a GPU equivalent to the PS4's, running PC benchmarks?

Thanks for your input btw. I'm not denying what you're saying, but I'd just like a clearer idea of what's causing this CU bottleneck. If it were the CPU then that would make a lot of sense (although it seems a bit too easy to me), while if it's other parts of the GPU then it calls into question AMD's design decisions for the rest of the product range - although the ease of adding CUs vs expanding other parts of the GPU may be a valid explanation for that.

With regards to your question above, it would depend on where the bottleneck lies in the game. If it's primarily bandwidth or ROP limited then you'd be looking at something around 2-2.5x the performance, while if it's CU limited you could be looking at up to 3x more performance.
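For rough context, the public specs line up with those ranges. A quick sketch using theoretical peak figures only (which of course ignore real-world efficiency and clocks under load):

```python
# Theoretical peak ratios: R9 290X vs the PS4 GPU, from public specs.
ps4   = dict(cus=18, clock=0.8, rops=32, bw=176)   # clock in GHz, bw in GB/s
r290x = dict(cus=44, clock=1.0, rops=64, bw=320)

alu  = (r290x["cus"]  * r290x["clock"]) / (ps4["cus"]  * ps4["clock"])   # ALU throughput
fill = (r290x["rops"] * r290x["clock"]) / (ps4["rops"] * ps4["clock"])   # pixel fill rate
bw   = r290x["bw"] / ps4["bw"]                                           # memory bandwidth

print(round(alu, 2), round(fill, 2), round(bw, 2))   # ~3.06x ALU, 2.5x fill, ~1.82x bandwidth
```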

But that's my point: if typical high-end workloads were not CU limited, why do most GCN GPUs actually have a higher ratio of CUs to other GPU resources than the PS4, which is apparently already past the knee of the curve in terms of optimal CU allocation?

I'd be interested to hear Dave B's take on this, although I'm guessing this isn't something he can speak about publicly.
 
With my limited technical knowledge, the bottleneck has got to be the CPU, and that's why they want devs to use 20% of the GPU for compute: to bolster the weak CPU.
 
CUs don't work that way. You can't partition some of them off to do specific work. You just send work to the GPU and it spreads the workload across the available CUs. [...]

Thanks for the explanations. What about soft-partitioning the CUs? 3dilettante talks about it in another thread.

http://forum.beyond3d.com/showpost.php?p=1843602&postcount=4285
http://forum.beyond3d.com/showpost.php?p=1843607&postcount=4287
 
With my limited technical knowledge, the bottleneck has got to be the CPU, and that's why they want devs to use 20% of the GPU for compute: to bolster the weak CPU.

I think compute is definitely there to assist the CPU with AI, physics, and other non-graphics tasks. The Jaguar CPUs are definitely more of a bottleneck for the PS4 than the XB1, given the PS4's GPU is beefier. But that hasn't stopped Sony or its developers in the past from finding ways of maximizing the hardware. Anyhow, the 14+4 scenario is just that... an example; developers are not limited by something that "some" have labeled as the truth of how the PS4 architecture works.
 
The CPU is the main bottleneck, and the main reason behind the "CU balance" or "extra CUs" topic.
Being CPU limited is a different area of analysis than whether fixed-function graphics loads experience diminishing returns at 14 CUs.
Compute kernels have their own dependence on inputs from the CPU, so the "extra" CUs would still experience problems if the CPU is limiting the GPU in a general fashion.

The CUs are part of the GPU, and shuffling allocations of a throttled whole doesn't change things much.
It is also not that difficult to flip things around by upping some graphical feature that significantly expands the burden on the GPU or a specific portion of it with little effect on the CPU.
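One way to picture that (a toy utilization model with made-up numbers; it just illustrates the dependence on CPU-fed inputs, not any real profile):

```python
# If the CPU can only prepare/submit so much work per frame, adding or
# reallocating CUs changes little: the GPU, graphics and compute alike,
# ends up idle waiting on CPU-side inputs.

def gpu_utilization(cpu_feed, gpu_capacity):
    # both measured in "work units per frame"
    return min(1.0, cpu_feed / gpu_capacity)

cpu_feed = 12.0                               # hypothetical CPU-limited feed rate
for cus in (14, 18):
    capacity = cus * 1.0                      # pretend each CU consumes 1 unit/frame
    print(cus, round(gpu_utilization(cpu_feed, capacity), 2))
# 14 CUs -> 0.86 utilization, 18 CUs -> 0.67: the same CPU ceiling, spread thinner
```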
 
Surely there's a limit to your hyperbolic statements?

Mmm... too hyperbolic, eh? Probably...

In any case, the wonderful but empty streets of I:SS are due to the CPU limits, but also to an unbalanced use of all the available CUs.

I am quite confident that SS2 will be quite similar to SS1 in the graphics department (in other words, it will not have a graphical jump like the one from Infamous 1 to Infamous 2), but it will probably finally have a next-gen city: bigger, more populated and alive.
 
In any case, the wonderful but empty streets of I:SS are due to the CPU limits.

Seriously, what game are you playing?

ISS's twist on Seattle has citizens and vehicles roaming about, just like any other open-world game. What are you expecting... goats and llamas as well? Well, you're going to be seriously disappointed in The Division then. Lifeless as a graveyard. But then again, it's not that type of game. And ISS isn't GTA... two different styles of gaming.
 
But that's my point: if typical high-end workloads were not CU limited, why do most GCN GPUs actually have a higher ratio of CUs to other GPU resources than the PS4, which is apparently already past the knee of the curve in terms of optimal CU allocation?
I don't know how strong the knee is, but it's only a knee, not a brick wall. I wonder what a typical game benchmark would look like between:
A. Current PS4 specs with 18CUs
B. 14CUs but using the die area difference to increase other GPU resources
C. 14CUs and using the die area difference to add CPU cores

If we follow what Cerny said, the system is fully utilized when the game engine is written to mesh its compute tasks along with the graphics pipeline. He expects GPU compute to really shine in the next 2 to 3 years. That means 14 CUs might have looked like a good balance now, but future games would then be severely ALU starved. That seems to be his gamble: that the PS4 needs 18 CUs to be balanced for a modern engine.
 
Anyhow, the 14+4 scenario

...again?? What does Sony have to do, spray-paint the Great Wall of China, to make this nonsense stop?

TBH, I think one of the biggest issues this gen will be the ability to issue and retire CG in a timely fashion, for time-sensitive tasks like audio and such.
 
I feel the problem here is that people are thinking about the GPU part of the APU in isolation. I think the whole 14+4 balance thing relates more to the APU as a whole.
By using 20% of the GPU part of the APU to augment the CPU part, you have a much better balanced APU.
That's my theory and I'm sticking with it :)
 
That's my theory and I'm sticking with it :)

My point is, if you read the "Radeon Southern Islands Acceleration" PDF you can find on the internet, you might notice there is no bitmask in the published PM4 packets for selecting the CUs your draw packets go to, or anything like that.
Of course, private AMD documentation may state otherwise, but at least the public documentation, which we can actually access and discuss, looks this way.
 
I agree, the so-called 4 CUs aren't any different from the other CUs, and that's why I prefer to say they will use 20% of the GPU for compute, or that that's the plan further down the line.
 