It's worth remembering that the one-way bandwidth for PCIe2 x16 is only 8GB/s, which is barely faster than some Gen4 NVMe drives and so could be very easily saturated by heavy IO plus normal game data going back and forth.
PCIe3 doubles that to 16GB/s though, which gives far more headroom. If PCIe3 were still a bottleneck, and thus 100% saturated (which it would need to be for PCIe4 to give a further speed boost), I would expect us to be seeing a bigger speedup from the doubling of bandwidth between 2.0 and 3.0. I suspect 4.0 won't give much if any speedup over 3.0. That seems borne out by the 4060/Ti results here vs the 3060Ti: the 3060Ti is full speed PCIe4.0 and the 4060s are only half speed, yet neither seems impacted by that.
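For anyone who wants to check the arithmetic behind those figures, here's a minimal sketch. It assumes the standard per-lane signalling rates and line encodings for each generation and ignores packet/protocol overhead, so real-world throughput will be somewhat lower. The function and table names are just ones I made up for illustration.

```python
# Per-lane signalling rate (GT/s) and line-encoding efficiency per PCIe generation.
# Packet/protocol overhead is ignored, so these are upper bounds.
PCIE_GENERATIONS = {
    2: (5.0, 8 / 10),      # 8b/10b encoding
    3: (8.0, 128 / 130),   # 128b/130b encoding
    4: (16.0, 128 / 130),  # 128b/130b encoding
}


def one_way_bandwidth_gb_s(gen: int, lanes: int) -> float:
    """Theoretical one-way bandwidth in GB/s for a given generation and lane count."""
    rate_gt_s, efficiency = PCIE_GENERATIONS[gen]
    # GT/s * efficiency = usable Gbit/s per lane; divide by 8 for GB/s, then scale by lanes.
    return rate_gt_s * efficiency / 8 * lanes


for gen, lanes in [(2, 16), (3, 16), (4, 8), (4, 16)]:
    print(f"PCIe {gen}.0 x{lanes}: {one_way_bandwidth_gb_s(gen, lanes):.2f} GB/s")
```

Note that 4.0 x8 (the 4060s) works out to the same figure as 3.0 x16, which is why "half speed" PCIe4 is still a lot of headroom.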
Evidence that points towards decompression likely not being a bottleneck would be:
- We have other benchmarks showing vastly higher decompression throughput than R&C is using
- We see no obvious difference in load speeds between different GPUs of wildly different capabilities (albeit this is on different test systems so a proper test may show different results).
- We can see GPU usage is well below 100% on the 4090 during the portal transitions
None of that is conclusive, but I'm sure we will get conclusive evidence one way or the other soon enough as it's as simple as benchmarking the transition times on the same system with just the GPU swapped out.
Hell, anyone here with the game could do it right now by simply underclocking their GPU.
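If someone does try that, a rough sketch of how the comparison could be scored is below. The helper name is hypothetical and the timings in the usage lines are placeholders purely to show the call, not measurements from any system.

```python
from statistics import mean


def relative_slowdown(baseline_s, candidate_s):
    """Fractional change in mean transition time of candidate runs vs baseline runs."""
    base = mean(baseline_s)
    return (mean(candidate_s) - base) / base


# Placeholder timings in seconds, NOT real measurements: substitute your own runs.
stock_runs = [2.0, 2.0, 2.0]
underclocked_runs = [2.0, 2.0, 2.0]
print(f"Change in transition time: {relative_slowdown(stock_runs, underclocked_runs):+.1%}")
```

A result near zero with the GPU heavily underclocked would point away from GPU decompression being the limit; a clearly positive one would point towards it.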
This title does seem to be even more PCIe bandwidth constrained than one of the previous leaders, Hitman 3, which sees a pretty solid 20% performance decrease on the 4090 from going Gen3 x16 to Gen2 x16 in a similar fashion at 4K: https://www.techpowerup.com/review/nvidia-geforce-rtx-4090-pci-express-scaling/21.html
I've always speculated that engines architected (at least initially) as console exclusives would have a higher PCIe usage. Many things that will consume PCIe bandwidth on the PC don't on an APU, since CPU memory and GPU memory are one and the same, and don't require transfers across the bus; if in the early stages of development they didn't even have a future PC port in mind, then why not use that capability to its fullest?
That brings me to another interesting test, particularly since Rich already has an RX 570, although it'd be nicer if it were an 8GB one to eliminate VRAM pressure skewing the results...
RX 570 8GB versus... AMD Phoenix APU.
On average, TechPowerUp rates the Radeon 780M in the Phoenix APUs as identical speed-wise to the RX 570: https://www.techpowerup.com/gpu-specs/radeon-780m.c4020
It'd be neat to compare both the frame rates and the average PCIe bandwidth utilization of both, with the RX 570 in a system with a similar Zen4 CPU.
I know DF's got their hands busy with lots of other things, but a man can dream!