Stupid question on PS4Pro

Why exactly do PS4 games need a patch or boost mode enabled to get more performance out of the PS4 Pro? I mean, the API should abstract away things like the number of CUs, the CUs in the Pro being a superset (FP16) of those in the vanilla PS4 shouldn't matter, and I've never heard of any GPU technique that depends on clock speed or cycle counting... so why would you need either a patch or something enabled to get a performance boost? It really doesn't make sense to me.
 
I mean, the API should abstract away things like the number of CUs, the CUs in the Pro being a superset (FP16) of those in the vanilla PS4 shouldn't matter, and I've never heard of any GPU technique that depends on clock speed or cycle counting...

What API?
 
Sony played it safe by downclocking the 4Pro to 4Base speeds. In the end I think boost mode just works fine (it's a system-wide function).

Games only really need a patch to utilize the additional shader engines & accompanying CUs, just like Scorpio. Can't remember if the speculation pointed towards scheduling issues as a potential source of problems between 2 & 4 engines or something something something async compute.
 
Purely speculation and I'm probably wrong. :p But my impression was that Sony allows coding much closer to the metal, so to speak, while MS allows that to an extent but more so through a structured API.

More importantly though, Microsoft's decades of experience on Windows with scaling 3D graphics over a wide range of disparate graphics architectures probably also plays a part in making it easier for them to allow XBO games to run mostly without issue on XBO-X.

Sony doesn't exactly have much experience in that area. For them, it was likely far easier to just adjust the PS4-P to match the PS4 as closely as possible when running PS4 games in the general case.

Regards,
SB
 
I thought Scorpio gave a performance boost to all Xbox One games...? And I thought it automatically added 16xAF by default, or something like that?

In fact according to this: https://www.eurogamer.net/articles/...-five-ways-your-existing-games-will-be-better
it says all CUs and the full clock speed will be brought to bear on all Xbox One games.
Unpatched games can only see the two shader engines. With Scorpio, each shader engine has gone from 7 CUs to 11, with one disabled per engine - so 6 active to 10 active. Couple that with an increase in clock speed from 853MHz to 1172MHz, and there you've got 3TF vs 1.31TF (or 1.4TF for the One S). It's seemingly enough overhead for 8-16xAF in the majority of existing games.

For 4Pro, they're still at 9 active CUs per shader engine, so the only boost for unpatched games is the 911MHz vs 800MHz clock.
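For anyone checking the arithmetic, those TF figures fall out of the usual GCN back-of-envelope formula: active CUs × 64 lanes × 2 FLOPs per cycle × clock. A quick sketch of that math (the One S clock of ~914MHz is my assumption, since only its 1.4TF figure is quoted above):

```cpp
#include <cstdio>

// Back-of-envelope GCN FP32 throughput: CUs * 64 lanes * 2 FLOPs/cycle * clock.
static double gcn_tflops(int active_cus, double clock_ghz) {
    return active_cus * 64 * 2 * clock_ghz / 1000.0;
}

int main() {
    std::printf("Xbox One (12 CUs @ 853MHz)             : %.2f TF\n", gcn_tflops(12, 0.853));
    std::printf("Xbox One S (12 CUs @ ~914MHz)          : %.2f TF\n", gcn_tflops(12, 0.914));
    std::printf("Scorpio, 2 of 4 engines (20 CUs)       : %.2f TF\n", gcn_tflops(20, 1.172));
    std::printf("Scorpio, full GPU (40 CUs)             : %.2f TF\n", gcn_tflops(40, 1.172));
    std::printf("PS4 (18 CUs @ 800MHz)                  : %.2f TF\n", gcn_tflops(18, 0.800));
    std::printf("4Pro boost, unpatched (18 CUs @ 911MHz): %.2f TF\n", gcn_tflops(18, 0.911));
}
```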

---

DF didn't update the original article regarding the half-GPU resources for unpatched Xbox titles. That was determined later on:
https://www.eurogamer.net/articles/digitalfoundry-2017-microsofts-xbox-one-x-benchmarks-revealed
Well, it turns out that compatibility with older games isn't a walk in the park, so pre-existing Xbox One titles default to a different set-up. In effect, half of the render back-end hardware is disabled and pixel and vertex shaders are each hived off to half of the 40 available compute units. It's a somewhat gross generalisation, but you could say that older games effectively get access to 3TF of power compared to the 1.31TF in the older Xbox One, and compared further to the 6TF accessible via the July XDK.
 
Which ties back with the OP - why? What's the difference between implementations where PC can chuck in more CUs (even on the same GPU architecture) and use them, but consoles can't? What's going on with the job management/scheduling that all the GPU can't be used?
 
Which ties back with the OP - why? What's the difference between implementations where PC can chuck in more CUs (even on the same GPU architecture) and use them, but consoles can't? What's going on with the job management/scheduling that all the GPU can't be used?
There is likely a lot of 'manual' management of GPU resources in many game engines, and when you write a game for which the only hardware configuration is X of these, Y of those, and Z of them, your resource manager and scheduler are likely written to those fixed specifications.

Why would you do it any differently on a console if you don't know things will change? Maybe there isn't even an API to determine the number of accessible CUs or those specific capabilities. On a console you're writing to an API specification, not a hardware specification, and if more resources get unlocked in later firmware/API updates, it's implicit and a known quantity.
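As a purely hypothetical illustration of that kind of fixed-spec tuning (none of the names or numbers below come from a real console SDK), an engine's GPU budgeting often gets baked in at build time, so resources unlocked later never enter its math:

```cpp
#include <cstdio>

// Hypothetical engine-side GPU budgeting, sized at build time to the one
// hardware configuration that existed when the game shipped. None of these
// names come from a real console SDK.
constexpr int    kActiveCUs       = 18;             // the only config known
constexpr double kGpuClockHz      = 800e6;          // 800 MHz, hard-coded
constexpr double kFrameBudgetSec  = 1.0 / 60.0;     // ~16.6 ms GPU budget
constexpr int    kAsyncWaveBudget = kActiveCUs * 4; // async work tuned to that CU count

// Cycle budget per frame derived from the fixed clock: everything the
// scheduler packs into a frame assumes exactly this much headroom.
constexpr double kCyclesPerFrame = kGpuClockHz * kFrameBudgetSec;

int async_wave_limit() {
    // If later firmware exposes more CUs or a higher clock, this code never
    // notices: the budget was frozen when the game was built.
    return kAsyncWaveBudget;
}

int main() {
    std::printf("async wave budget: %d, GPU cycles/frame: %.0f\n",
                async_wave_limit(), kCyclesPerFrame);
}
```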
 
I didn't think there was though, not at the level of the GPU's internal operation. The whole point of the ACEs etc. is to divvy out the workloads to the resources - devs aren't manually choosing to run shader x on CUs 0 through 5 and shader y on the rest. It should be a case of sending work to the GPU and it using whatever resources are attached, and if there's a load more CUs attached, use them.

Of course, there isn't in the consoles because it's two GPUs slapped together and not one GPU with one set of controlling hardware (correct?). But the choice to go this route rather than double up CUs is confusing. Is it architectural limits in GCN?
 
Of course, there isn't in the consoles because it's two GPUs slapped together and not one GPU with one set of controlling hardware (correct?). But the choice to go this route rather than double up CUs is confusing. Is it architectural limits in GCN?

I thought I'd read, but can't find, references to scheduling. Mark Cerny made a big deal of more advanced scheduling in PS4 Pro - at least I'm reading it as advanced compared to PS4, rather than just scaled up to cope with more resources, so perhaps there's more to it.
 
Which ties back with the OP - why? What's the difference between implementations where PC can chuck in more CUs (even on the same GPU architecture) and use them, but consoles can't? What's going on with the job management/scheduling that all the GPU can't be used?
Exactly this. I doubt console GPUs do away with the Command Processor spawning GPU work, and the API communicates with the command processor, so what difference does it make to the game how many CUs or shader engines there are? Like AlNets mentioned, it could be async compute, but couldn't the ACEs/command processor/runtime/driver/API... whatever, just artificially limit the amount of async work to the same amount that would occur on the original PS4? Or maybe they could've upped the amount of L2 cache to try to keep the same cache-thrashing 'characteristics' of how it would work on the PS4?
I guess I'm saying there can't be that big a difference between how a PC and a console GPU operate.
I don't think all titles use async compute either, so why not give a bigger boost to those titles?

DF didn't update the original article regarding the half-GPU resources for unpatched Xbox titles. That was determined later on:
But is that for all BC titles? Or is that what they found with the few games they tested?

Of course, there isn't in the consoles because it's two GPUs slapped together and not one GPU with one set of controlling hardware (correct?). But the choice to go this route rather than double up CUs is confusing. Is it architectural limits in GCN?
I've read the quote you are referring to and never really understood it. The only two ways I can think of to "slap another GPU down next to" the other one are either A. Crossfire/SLI or B. doubling the shader engines, or whatever AMD calls their top-of-hierarchy module. I would think Crossfire would require a good bit of effort to get working for BC and might require a different code path for made-for titles. Multiple shader engines would be more transparent but might require a larger L2 cache to keep the same performance metrics. Again, async compute might exacerbate the L2 issues if a lot of async jobs are dispatched.

I don't know it just doesn't make sense to me.
 
Which ties back with the OP - why? What's the difference between implementations where PC can chuck in more CUs (even on the same GPU architecture) and use them, but consoles can't? What's going on with the job management/scheduling that all the GPU can't be used?

IIRC frametimes/framerates in consoles are often synched with physics, input, scripted A.I., etc.
That's why some early console emulators had a hard time providing decent gameplay, because PCs would often run parts of the game code too fast or too inconsistently.

If the hardware is running a pseudo "virtual machine" like the Xbone, then I guess that problem is easier to overcome. But for the PS4, which apparently gives closer access to the metal, letting the devs micro-manage latencies and cycles, I can see why giving access to significantly more compute resources could become a problem in some cases.

So to answer the OP, I don't think the API abstracts the number of CUs and their performance. At least not unless the devs tell it to.
 
Which ties back with the OP - why? What's the difference between implementations where PC can chuck in more CUs (even on the same GPU architecture) and use them, but consoles can't? What's going on with the job management/scheduling that all the GPU can't be used?
On PC there is a driver that's separate from the game. On console the user-mode driver is part of the game, so driver updates don't break shipping games.
 
IIRC frametimes/framerates in consoles are often synched with physics, input, scripted A.I., etc.
Not for many years. AI, physics, and input all run in parallel with the rendering. The game maintains a 'game state' and the rendering draws whatever that state happens to be whenever it's evaluated for a specific frame.
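For what it's worth, a minimal sketch of that pattern (fixed-rate simulation, free-running renderer drawing the latest state) looks something like this; the names are illustrative, not from any particular engine:

```cpp
#include <chrono>
#include <cstdio>

// Decoupled loop sketch: the simulation ticks at a fixed rate, the renderer
// runs as fast as the hardware allows and just draws the latest game state.
// Faster hardware means more frames, not faster gameplay.
struct GameState { double t = 0.0; };                // stand-in for AI/physics/input state

void update(GameState& s, double dt) { s.t += dt; }  // fixed-step simulation tick
void render(const GameState& s) { std::printf("t=%.2f\n", s.t); }  // draw current state

int main() {
    using clock = std::chrono::steady_clock;
    const double dt = 1.0 / 60.0;                    // 60 Hz simulation rate
    double accumulator = 0.0;
    GameState state;
    auto previous = clock::now();

    for (int frame = 0; frame < 600; ++frame) {      // bounded so the sketch terminates
        auto now = clock::now();
        accumulator += std::chrono::duration<double>(now - previous).count();
        previous = now;

        while (accumulator >= dt) {                  // catch the simulation up in fixed steps
            update(state, dt);
            accumulator -= dt;
        }
        render(state);                               // framerate is independent of the tick rate
    }
}
```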
 
You could never tell if the framerate never drops. But even then, they could poll the input and run the logic at 120 or 240 Hz, same as racing games, and keep controls even tighter than a locked, framerate-synced 60 fps experience could offer. Uncorroborated Google suggests the PS4 polls the DS4 at 250 Hz and the XB1 polls at 125 Hz.
 
Not for many years. AI, physics, and input are all run parallel to the rendering. The game maintains a 'game state' and the rendering draws whatever that state happens to be whenever it's evaluated for a specific frame.

How many years?

There are still quite a number of recent multiplatform games for the PC that have a hard 60FPS limit, and it's because of the reasons I mentioned above.
MGS V, for example.
 
There may be some examples of devs sticking to a fixed loop for some reason, as if multicore/multithreading isn't a thing, but it's far from common. As you can see in your MGS example, sticking the rendering loop at the end of the update loop and then limiting your update rate means you don't scale up framerates on capable hardware, and the game slows down when your framerate can't keep up. That's crap both ways, which is why development moved to the alternative standard, which was being talked about well over a decade ago.
 