Why do some console ports run significantly slower on equivalent PC hardware?

HZD in particular heavily leverages the fact that it's an APU and not a separate CPU/GPU. When the engine was architected, I suspect it was only ever intended to be PS4 only, and on that hardware, moving data between CPU and GPU is 'free' as they are the same memory pool.

Once it was ported to PC, it turned out to be a very strong outlier in terms of performance scaling with PCIe bandwidth. On a PC with a separate CPU and GPU, all of the data transfer the engine was architected around being 'free' and instant (thanks to the APU's shared memory pool) now burns through:

1) CPU memory bandwidth
2) PCIe memory bandwidth
3) GPU memory bandwidth

And probably a few compute cycles for each on top of that.
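As a rough illustration (a minimal D3D12-style sketch with made-up function and buffer names, not anything from Decima/HZD itself), this is the per-frame upload dance a discrete GPU forces on you; each step maps onto one of the costs above, whereas on a UMA APU the GPU simply reads the memory the CPU just wrote:

```cpp
// Minimal sketch of a per-frame CPU -> GPU upload on a discrete card (D3D12).
// Hypothetical names, not from any shipping engine. On a PS4-style UMA APU
// this whole path collapses into "write the data once, let the GPU read it".
#include <d3d12.h>
#include <cstring>

// Assumes `staging` is a pre-created D3D12_HEAP_TYPE_UPLOAD buffer
// (CPU-visible system RAM) and `vram` a D3D12_HEAP_TYPE_DEFAULT buffer
// (GPU-local memory), both at least `size` bytes and in suitable states.
void UploadPerFrameData(ID3D12GraphicsCommandList* cmdList,
                        ID3D12Resource* staging,
                        ID3D12Resource* vram,
                        const void* cpuData, UINT64 size)
{
    // 1) CPU memory bandwidth: copy the freshly generated data into the
    //    CPU-visible staging buffer.
    void* mapped = nullptr;
    D3D12_RANGE noRead{ 0, 0 };            // we won't read this back
    staging->Map(0, &noRead, &mapped);
    std::memcpy(mapped, cpuData, size);
    staging->Unmap(0, nullptr);

    // 2) PCIe bandwidth: when the command list executes, this copy pulls
    //    the staging data across the bus.
    // 3) GPU memory bandwidth: the copy engine writes it into VRAM, and the
    //    shaders that consume it read it back out again.
    cmdList->CopyBufferRegion(vram, 0, staging, 0, size);
}
```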
Death Stranding, being initially a PS4 exclusive, also scales strongly with PCIe bandwidth, presumably for the same reason.


I'll try to find the exact graphs for both that I'm remembering in a bit.

As I mentioned before, the games in question are mostly PlayStation 4 titles, designed natively for the PS first and foremost and only ported later. I remember MGS2 running much, much worse on the OG Xbox; the game was designed with the PS2 in mind and ported afterwards. I could go on forever about ports like this.
It seems, though, that Sony is (willing to) improve in that regard.
 
These apps are video apps; they are leveraging the expanded media engines in the M1 Max. All 3D benchmarks I have seen place the M1 Max firmly behind the mobile 3060.
By 3D benchmarks you mean things that are one or more of the following: ported over hastily, run on a lousy OpenGL (not even Vulkan) -> Metal wrapper, and/or run in Rosetta. And there's a good reason for this: which developer in their right mind would take 3D rendering seriously on the macOS platform?

There are some exceptions, obviously: in Aztec, which has a good Metal implementation, the M1 Max is trading blows with the 3080 Mobile.
 
By 3D benchmarks you mean things that are one or more of the following: ported over hastily, run on a lousy OpenGL (not even Vulkan) -> Metal wrapper, and/or run in Rosetta. And there's a good reason for this: which developer in their right mind would take 3D rendering seriously on the macOS platform?

There are some exceptions, obviously: in Aztec, which has a good Metal implementation, the M1 Max is trading blows with the 3080 Mobile.

It had better be trading blows with a 3080m, given the price of a 32-core M1 Max (or Pro) device built on bleeding-edge 5nm technology. Other than that, something that's optimized for the 3080m will probably outperform the 32-core M1 chip. Anyway, it's a gaming-related thread, which encompasses ports between consoles and PC. The M1 doesn't see any such games as discussed and probably never will, even if it were capable of running PS5/XSX/XSS/PC games, which it isn't, since ray tracing is completely absent.
 
In Aztec, which has good Metal implementation, the M1 Max is trading blows with the 3080 Mobile.
Sorry, but that one very old "mobile" benchmark is not enough to establish this grand, sweeping, ridiculous claim of 3080m equivalence. This benchmark specifically doesn't scale well with high-end GPUs at all. Worse, the Aztec test is one of several tests in a suite called GFXBench, all of which have dirt-simple graphics.

Here is the desktop 3080 Ti being barely faster than the mobile 3080 in the Manhattan test; worse yet, the desktop 3080 Ti is actually slower than the mobile 3080 in the T-Rex test!
https://gfxbench.com/compare.jsp?be...type2=dGPU&hwname2=NVIDIA+GeForce+RTX+3080+Ti

For comparing GPUs, the whole suite is as useless as it gets!

By 3D benchmarks you mean things that are one or more of the following: ported over hastily, run on a lousy OpenGL (not even Vulkan) -> Metal wrapper, and/or run in Rosetta
No, Geekbench (useless too, but consistent) and Shadow of the Tomb Raider, which has a Metal API implementation.
 
What is the reason for this? DX12? Bad optimizations? (Secret console sauce doesn't explain this large deficit).
My opinion on this is that consoles simply enjoy longer and better support from devs. I bet many things in recent shader models and DX12 are suboptimal for first-gen GCN.
In the case of past-gen consoles, such things will be avoided or refactored, but no dev will bother making another code path for 10-year-old hardware on PC.
I remember many devs struggling with replacing structured buffers with constant buffers for 10+% gains on Pascal.
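For anyone who hasn't run into that particular optimisation, here's a generic before/after illustration (HLSL embedded as C++ string constants; hypothetical names, not code from any specific title). Per-draw parameters that every thread reads uniformly get moved out of a StructuredBuffer SRV, which is fetched like ordinary buffer memory, and into a cbuffer, which Pascal-era hardware services through its dedicated constant path:

```cpp
// Generic illustration of the StructuredBuffer -> constant buffer change.
// "Before": uniform per-draw data lives in an SRV and goes through the
// generic memory path.
const char* kBeforeHLSL = R"(
struct PerDrawData { float4x4 worldViewProj; float4 tint; };
StructuredBuffer<PerDrawData> g_PerDraw : register(t0);   // SRV

float4 VSMain(float3 pos : POSITION, uint drawId : SV_InstanceID) : SV_Position
{
    return mul(float4(pos, 1.0), g_PerDraw[drawId].worldViewProj);
}
)";

// "After": the same data is bound as a constant buffer per draw, which
// Pascal reads through its dedicated constant/uniform path.
const char* kAfterHLSL = R"(
cbuffer PerDrawData : register(b0)                         // CBV
{
    float4x4 g_WorldViewProj;
    float4   g_Tint;
};

float4 VSMain(float3 pos : POSITION) : SV_Position
{
    return mul(float4(pos, 1.0), g_WorldViewProj);
}
)";
```

Presumably part of the "struggle" is that the cbuffer version has to be rebound (or its root CBV address updated) for every draw instead of simply indexing one big buffer, so it's extra CPU-side work rather than a free win.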
 
My opinion on this is that consoles simply enjoy longer and better support from devs. I bet many things in recent shader models and DX12 are suboptimal for first-gen GCN.
In the case of past-gen consoles, such things will be avoided or refactored, but no dev will bother making another code path for 10-year-old hardware on PC.
I remember many devs struggling with replacing structured buffers with constant buffers for 10+% gains on Pascal.

Yes, precisely this. All these examples are new(ish) games running on very old hardware that developers are surely spending little to no time actually optimising for. GCN 1.0 doesn't even factor into the minimum spec in most cases, which presumably means the game hasn't even been formally tested on that architecture, let alone optimised for it.

It's worth noting that if these kinds of performance deltas between equivalent specs were the norm between PCs and consoles, then the PS5 and XSX would be performing around 6900XT/3090 levels in current games, whereas all evidence points to them performing more in line with the 2080/S, as per their specs.
 
On consoles, they expose many more shader intrinsics and developers can bypass the shader compiler as much as they want. PC APIs cannot ever hope to come close to the shader optimization possibilities available on consoles. On PC, every developer is at the mercy of the driver's shader compiler, which is frequently a performance liability, whereas on consoles they don't have to trust the compiler's codegen at all.
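For context on what PC does offer: Shader Model 6.0 added a standard subset of wave intrinsics, which is roughly as far as the PC APIs go. A generic sketch is below (HLSL embedded as a C++ string, hypothetical buffer names); as the post above says, consoles expose far more than this and let developers work around the compiler entirely.

```cpp
// Generic example of the wave-op subset PC shaders gained with SM 6.0.
// Consoles expose considerably more (vendor-specific intrinsics and
// much tighter control over the code the compiler actually emits).
const char* kWaveReduceHLSL = R"(
StructuredBuffer<float>   g_Input  : register(t0);
RWStructuredBuffer<float> g_Output : register(u0);

[numthreads(64, 1, 1)]
void CSMain(uint tid : SV_DispatchThreadID)
{
    float v = g_Input[tid];

    // Sum across the wave with one cross-lane op: no groupshared memory,
    // no atomics, no extra passes.
    float waveTotal = WaveActiveSum(v);

    if (WaveIsFirstLane())                     // one write per wave
        g_Output[tid / WaveGetLaneCount()] = waveTotal;
}
)";
```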

There are games/content that can only feasibly run on consoles, no matter how much brute-force hardware you throw at them on PC ...
 
Yes, precisely this. All these examples are new(ish) games running on very old hardware that developers are surely spending little to no time actually optimising for. GCN 1.0 doesn't even factor into the minimum spec in most cases, which presumably means the game hasn't even been formally tested on that architecture, let alone optimised for it.

It's worth noting that if these kinds of performance deltas between equivalent specs were the norm between PCs and consoles, then the PS5 and XSX would be performing around 6900XT/3090 levels in current games, whereas all evidence points to them performing more in line with the 2080/S, as per their specs.

Also worth noting that the 7970 is about 1.5 years older than the PS4 and the One. The R9 290X was the successor to the 7970, which launched in spring 2012. The 7970 (or the 7870, etc.) shouldn't even be used for these comparisons, I think.
It's between the 2070/S and 2080/S for the PS5 and XSX in raw performance, or better put, somewhere between an RX 5700 XT and a 6700/XT. Since ray tracing is the norm now, it's better to leave Nvidia GPUs out of the comparisons.

There are games/content that can only feasibly run on consoles, no matter how much brute-force hardware you throw at them on PC ...

Like what?
 
On consoles, they expose many more shader intrinsics and developers can bypass the shader compiler as much as they want. PC APIs cannot ever hope to even come close to the possibilities of shader optimizations available on consoles. With PC, every developer is at the mercy of the driver shader compiler which is frequently a performance liability whereas they don't have to trust the codegen of shader compiler on consoles.

This is obviously a fair point, but I don't think these factors would account for the 2x (in some cases) delta we're seeing between consoles and older GPUs of equivalent specs if a game had been properly optimised on PC for that specific architecture.

What you highlight here, though, is how dependent game performance on PC is on driver-level optimisation, of which of course these older GPUs get absolutely none.

There are games/content that can only feasibly run on consoles, no matter how much brute-force hardware you throw at them on PC ...

Games? Do you have some examples?
 
Of course it's not just the media engines; it should have a crapload of raw computing power. The difference from Pro to Max is almost a full-blown Navi 21 in terms of transistors, and it's mostly GPU. But that doesn't mean it's well suited for games.
The original post that was quoted mentioned apps, though; in fact, it specifically addresses how it falls down on gaming performance, which is due to a myriad of reasons at the moment. I am merely saying its performance is not 'solely due to the media engines'. It does have considerable horsepower in FP/INT as well as on the GPU, but obviously the attention paid to the Mac as a gaming platform will not exactly show that in the best light. No one is arguing the Mac is a viable high-end gaming platform now just because of the M1X.
 
HZD in particular heavily leverages the fact that it's an APU and not a separate CPU/GPU. When the engine was architected, I suspect it was only ever intended to be PS4 only, and on that hardware, moving data between CPU and GPU is 'free' as they are the same memory pool.

Once it was ported to PC, it turned out to be a very strong outlier in terms of performance scaling with PCIe bandwidth. On a PC with a separate CPU and GPU, all of the data transfer the engine was architected around being 'free' and instant (thanks to the APU's shared memory pool) now burns through:
This was a narrative that was going around when HZD first launched (I saw it in Reddit threads, and NXGamer reiterated it some time later), but there wasn't much corroborating evidence for it, and it seemingly had a relatively minor impact in the end - the performance drop from 16x to 8x was very similar to other games: a slight dip at low resolutions and nearly invisible at higher ones.

If it were truly a PCIe bandwidth issue, then it stands to reason that at some point we would see a levelling off of performance on high-end cards, but that doesn't seem to be the case - the more powerful the GPU, the better your performance. There was some periodic stuttering even on high-end rigs, but with the latest patch that was discovered to be shader compilation (I noted at the time that even after letting the shader optimization stage fully complete, there would still be CPU spikes when traversing the world, indicating that some shaders were still being compiled in real time).

I don't doubt that the UMA of the PS4's APU played a big part in how it was architected, and it was no easy task (obviously) bringing that to the PC, but even with the problematic release there's little to indicate that the inherent performance deficit, relative to the hardware it was running on, was due to the PCIe bus being strained.

EDIT: I will say, though, that despite what HZD's VRAM budget indicator says, on 4/6GB cards the game simply doesn't load in higher-res texture assets beyond 1080p; you need an 8GB+ card to have textures display properly once you go past 1080p, and this hasn't changed with the latest patch. So maybe there is some issue here with texture streaming/UMA, as that's a ridiculous amount of memory needed to run it properly vs. the PS4 Pro's total ~5GB accessible (the game can routinely take 8+ GB to run, plus 5GB for textures). But then again, it could also be that this was a port farmed out to a studio that didn't necessarily have the skill set, or initial access to the Guerrilla devs, and that was always the core of the issue.

[Attached images: N8HyKHm.png, SMPQBT6.png]
 
This is obviously a fair point, but I don't think these factors would account for the 2x (in some cases) delta we're seeing between consoles and older GPUs of equivalent specs if a game had been properly optimised on PC for that specific architecture.

What you highlight here, though, is how dependent game performance on PC is on driver-level optimisation, of which of course these older GPUs get absolutely none.

It's not just the driver; virtually all of the GPU instruction set is exposed to developers. If performance parity means not using console features like UMA, then proper optimization on PC isn't possible regardless of how hard they try ...

Games? Do you have some examples?

Dreams by Media Molecule in particular would be very hard to get running on PC. Their point cloud renderer relies on global ordered append to get Hilbert ordering and self-intersection-free dual contouring, all essentially for free on consoles, and there's no good way to mimic these techniques on PC without killing your framerate, even on the highest-end hardware available there. Given the extremely high density of the point clouds, I would not be surprised to find out that they use ordered append to do culling as well ...
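To give a sense of why that's hard to replicate: the closest a portable PC compute shader gets is an unordered append through an atomic counter, as in the generic sketch below (hypothetical struct and buffer names, HLSL embedded as a C++ string). Elements land in whatever order the waves happen to run; the global ordered append mentioned above provides exactly the ordering guarantee this lacks.

```cpp
// Generic sketch of a plain *unordered* append on PC: an atomic counter hands
// out output slots, but nothing guarantees the order in which surviving
// elements are written. The global ordered append described in the post keeps
// wavefront order across the whole dispatch, which this cannot reproduce cheaply.
const char* kUnorderedAppendHLSL = R"(
struct SplatPoint { float3 position; uint packedColor; };

StructuredBuffer<SplatPoint>   g_InPoints  : register(t0);
RWStructuredBuffer<SplatPoint> g_OutPoints : register(u0);
RWByteAddressBuffer            g_Counter   : register(u1);

[numthreads(64, 1, 1)]
void CSMain(uint tid : SV_DispatchThreadID)
{
    SplatPoint p = g_InPoints[tid];
    if (p.packedColor == 0)                     // trivially culled
        return;

    uint writeIndex;
    g_Counter.InterlockedAdd(0, 1, writeIndex); // bump counter, no ordering
    g_OutPoints[writeIndex] = p;
}
)";
```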

You can create entire rendering pipelines that are only possible on consoles. Truly obscure stuff ...
 