Sony PlayStation 5 Pro

Looking back at some of Pro's earlier rumors...



Jeff's sources, as we can see, got many things right well before MLID even reported such leaks this month.

Jeff even mentioned after the MLID leak that other things weren't being discussed to protect sources. Even MLID mentioned that he had to edit the Pro specs so as not to out his source.

If, and this is a big IF, Sony has some custom RT in place assisting AMD's compute unit solution, this would be an interesting discussion to have.
I think they could have removed the part talking about the boost mode on PS5 unpatched games. I was surprised not to see them even mention it. And about the custom RT, I wouldn't be surprised, and people shouldn't be. They have been doing this since PS4, then Pro, then PS5, each time increasing the number and potency of customizations.
 
The PS5 Pro GPU will likely be a carbon copy of the 7700XT.

Which has a 256bit bus and 48MB of infinity cache.

The 7700XT has a 192-bit bus with 3 memory controller dies, so 3 x 16MB = 48MB.

https://www.amd.com/en/products/graphics/amd-radeon-rx-7700-xt

The Navi 32 chip it's made from can use 4 memory controller dies for a 256-bit bus with 64MB of L3 (4 x 16MB). In this configuration it's the 7800XT.

https://www.amd.com/en/products/graphics/amd-radeon-rx-7800-xt

With a 256-bit bus any Infinity Cache on Pro would be 4 x something (most likely 4 x 16 MB).
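
As a rough sketch of that arithmetic, assuming each memory controller die contributes a 64-bit slice of the bus and 16MB of Infinity Cache (the per-die figures implied by the 7700XT and 7800XT numbers above; the function name is made up for illustration and says nothing about what the Pro actually ships with):

```python
def navi3x_memory_config(mcd_count, bus_bits_per_mcd=64, cache_mb_per_mcd=16):
    """Return (bus width in bits, Infinity Cache in MB) for a given MCD count.

    The per-MCD defaults are assumptions taken from the 7700XT/7800XT figures.
    """
    return mcd_count * bus_bits_per_mcd, mcd_count * cache_mb_per_mcd


print(navi3x_memory_config(3))  # 7700XT-style config: (192, 48)
print(navi3x_memory_config(4))  # 7800XT-style config: (256, 64)
```

So a 256-bit bus lines up with four slices, and any cache would be four times whatever each slice carries.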

Navi 32 has 3 shader engines, with 60 total CUs, and 54 with one redundant DCU per SE (60 - 3 x 2). Pro is supposed to be 60 active out of 64, so that couldn't be Navi 32, even though the performance is likely to be very similar.

Two redundant DCUs on PS5 Pro kind of makes it look like it has two shader engines, but that would result in crazy wide shader engines with 30 active CUs per SE, outdoing even Xbox Series X's 26 per SE. It could also possibly lead to only 64 ROPs, as RDNA has so far been limited to 32 ROPs per shader engine, AFAIK.
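
The CU counting above can be sketched like this, assuming one redundant DCU (two CUs) per shader engine, which is the pattern described for Navi 32. The two-shader-engine layout for the Pro is only the guess from this post, and the function name is hypothetical:

```python
def active_cus(total_cus, shader_engines, redundant_dcus_per_se=1, cus_per_dcu=2):
    """Active CUs left after disabling the redundant DCUs in every shader engine."""
    return total_cus - shader_engines * redundant_dcus_per_se * cus_per_dcu


navi32_active = active_cus(total_cus=60, shader_engines=3)  # 60 - 3 x 2
pro_active = active_cus(total_cus=64, shader_engines=2)     # 64 - 2 x 2 (assumed layout)
pro_per_se = pro_active // 2                                # active CUs per shader engine

print(navi32_active, pro_active, pro_per_se)  # 54 60 30
```

With two shader engines, 60 active CUs works out to 30 CUs (15 DCUs) per engine.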

If you look at this path tracing test data, the 7700XT is 2.2x faster than the 6700XT and 2.4x faster than the 6650XT.

People forget how terrible the raw ray tracing performance is on PS5, so it really isn't difficult to get to 2x (or more) performance.

And take away the Infinity Cache and yeah, it starts to show how limited current gen consoles are at RT.
 
The PS5 Pro GPU will likely be a carbon copy of the 7700XT.

Which has a 256bit bus and 48MB of infinity cache.

If you look at this path tracing test data, the 7700XT is 2.2x faster than the 6700XT and 2.4x faster than the 6650XT.

People forget how terrible the raw ray tracing performance is on PS5, so it really isn't difficult to get to 2x (or more) performance.

It is not, it's a new RDNA 3.5, almost 4, chip with RDNA4's new BVH structure. It also has a 256-bit bus and 16GB.
 
RDNA4's BVH structure? So not as custom as the current rumours suggest?

Yeah apparently, going by leakers on Twitter that finally seem to have things right (what with devs having kits and all). BVH8 for the structure, for whatever reason. It seems like PS5 Pro is almost but not quite RDNA4, just like PS5 is almost but not quite RDNA2.

Also devs get 13.7GB of RAM now, a bit extra.
 
RDNA4's BVH structure? So not as custom as the current rumours suggest?
Who knows indeed? We could never know until some future SDK leak...
7700XT satisfies all the qualifiers of PS5 Pro performance statements.
Even 4x higher framerate in some RT games compared to PS5? Which game(s)? That's a doc for developers so we must expect factual information, not PR marketing stuff.
 
Who knows indeed? We could never know until some future SDK leak...

Even 4x higher framerate in some RT games compared to PS5? Which game(s)? That's a doc for developers so we must expect factual information, not PR marketing stuff.
It’s not game performance, it’s RT performance. Up to 2x performance is a best case including said RT improvements as part of total frame rendering. 7700XT goes beyond that using PT, but that isn’t a realistic use case for consoles.
 
It’s not game performance, it’s RT performance. Up to 2x performance is a best case including said RT improvements as part of total frame rendering. 7700XT goes beyond that using PT, but that isn’t a realistic use case for consoles.
That's not what the leakers implied at all. This is the kind of framerate difference that we could (and can) see between the RDNA2 PS5 and a 4070 GPU using the same (console) settings.
 
That's not what the leakers implied at all. This is the kind of framerate difference that we could (and can) see between the RDNA2 PS5 and a 4070 GPU using the same (console) settings.
Think of it from a frametime perspective. If the non RT aspects only improved by 45% at best, how do you arrive at 4x net performance improvement? It just isn’t feasible. RT processing is not 80-90% of the frametime. Increased RT also requires an increase in most other aspects of GPU power. It just doesn’t math up here.
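
To put rough numbers on the frametime argument, here is a small sketch. It assumes the frame splits cleanly into an RT part that gets 4x faster and a non-RT part that gets 45% faster; real frames overlap work, so treat this as an illustration only, and the function name is made up:

```python
def net_speedup(rt_fraction, rt_speedup=4.0, other_speedup=1.45):
    """Overall frame speedup when the RT share and the rest speed up differently.

    rt_fraction is the share of the original frametime spent on RT work.
    """
    new_time = rt_fraction / rt_speedup + (1.0 - rt_fraction) / other_speedup
    return 1.0 / new_time


for share in (0.3, 0.5, 0.8, 0.9):
    print(f"RT share {share:.0%}: {net_speedup(share):.2f}x overall")
```

Under these assumptions the net gain is roughly 2.1x when RT is half the frame and about 3.4x even when RT is 90% of it. A full 4x only appears if RT is the entire frame.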

The only credible leaker is Tom Henderson. Where has he stated 4x performance in games? I recall him stating up to 4x RT performance.
 
Think of it from a frametime perspective. If the non RT aspects only improved by 45% at best, how do you arrive at 4x net performance improvement? It just isn’t feasible. RT processing is not 80-90% of the frametime. Increased RT also requires an increase in most other aspects of GPU power. It just doesn’t math up here.

The only credible leaker is Tom Henderson. Where has he stated 4x performance in games? I recall him stating up to 4x RT performance.
You answered your own question: RT != raster. First, it's 4x at best and more like 2x-3x otherwise. My take on this is simple. Take a PS5 game, add 3 RT effects and record the ~10fps framerate. Then optimize this game on PS5 Pro using the exact same settings (1080p, 3 RT effects) and record the improved (2x to 4x, depending on the scene) framerate.

As players we'll never witness this difference in actual PS5 games. But compared to PS5 hardware it's still meaningful, and this can be seen in some cases between the weakest RDNA2 GPUs (on PC, as no devs would dare use those settings on consoles) and, say, an Nvidia 4070.
 
As players we'll never witness this difference in actual PS5 games.

That's...literally the argument though. The actual real-world performance won't have that gap, because rendering performance, even in a heavy RT game, still depends on multiple facets of the traditional rendering pipeline to deliver the actual final framerate.

He's saying that you will only get that 4x gap in specifically contrived scenarios that are not reflective of shipping games, and you're responding with "Ah yes, but have you considered this contrived scenario?"
 
It is not, it's a new RDNA 3.5, almost 4, chip with RDNA4's new BVH structure. It also has a 256-bit bus and 16GB.
RDNA 3.5 is BVH4x1, like the 7700XT. PS5 Pro has double the ray intersection processing capability, with BVH8x2 (8 BVH box and 2 triangle intersection tests resolved per cycle) per RT unit. So RDNA 4 at least. In the leaked doc, the word custom (fully) is only mentioned for the ML hardware and its 300 TOPS.
 
RDNA 3.5 is BVH4, like the 7700XT. PS5 Pro has double the ray intersection processing capability, with BVH8 per RT unit. So RDNA 4 at least. In the leaked doc, the word custom (fully) is only mentioned for the ML hardware and its 300 TOPS.
RDNA3.5 has no traversal RT units. PS5 Pro is basically RDNA3.5 + RDNA4 traversal RT units (and AI TOPs units?). Sony will likely market this as "RDNA4 based GPU".
 
RDNA3.5 has no traversal RT units. PS5 Pro is basically RDNA3.5 + RDNA4 traversal RT units (and AI TOPs units?). Sony will likely market this as "RDNA4 based GPU".
AMD so far has no traversal hardware, only intersection resolve units. Traversal is software based, and that's why it is more cache sensitive than Nvidia's solution, but that's where Infinity Cache supposedly helps for RT.
The AMD RDNA 3.5 ray tracing unit, with its intersection resolve unit, can test per cycle whether a ray intersects 4 boxes of the BVH structure or 1 triangle. PS5 Pro does twice that. Once it is known which boxes are hit, the traversal software (a hardware unit in Nvidia's GPUs) tracks them and, for each hit box, goes deeper into the BVH tree to test for more possible hits, until it reaches the bottom, where the triangle hits are calculated. In Nvidia GPUs this BVH traversal management has a dedicated hardware unit, which avoids thrashing the caches. Until now Nvidia has had far more intersection processing capability, and this allowed them to use wider BVH tree structures with more boxes per level and fewer levels. This reduces the traversal calculations and the stress on cache access. With BVH8, AMD doubles its intersection processing and can also use wider BVH structures.
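
The "wider tree, fewer levels" point can be illustrated with a toy calculation. This assumes a perfectly balanced tree with one triangle per leaf, which real BVH builders do not produce, so the numbers only show the trend; the function name and the one-million-triangle scene are made up for the example:

```python
import math


def bvh_levels(triangle_count, branching_factor, triangles_per_leaf=1):
    """Minimum number of box levels a ray descends in an idealized balanced BVH."""
    leaves = math.ceil(triangle_count / triangles_per_leaf)
    levels = 0
    capacity = 1  # leaves reachable with the current number of levels
    while capacity < leaves:
        capacity *= branching_factor
        levels += 1
    return levels


triangles = 1_000_000
print("BVH4 levels:", bvh_levels(triangles, 4))  # 10
print("BVH8 levels:", bvh_levels(triangles, 8))  # 7
```

Fewer levels means fewer dependent trips through the software traversal loop per ray, at the cost of testing more boxes at each step. That only pays off if the hardware can test 8 boxes as quickly as it used to test 4.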
 
RDNA 3.5 is BVH4x1, like the 7700XT. PS5 Pro has double the ray intersection processing capability, with BVH8x2 (8 BVH box and 2 triangle intersection tests resolved per cycle) per RT unit.

Is it confirmed in the leaked docs that PS5 Pro has double the ray box and ray triangle intersection rate? Wouldn't this require doubling up on TMUs?

I imagine that a wider BVH structure could speed things up even if the processing rate was the same.
 
Is it confirmed in the leaked docs that PS5 Pro has double the ray box and ray triangle intersection rate? Wouldn't this require doubling up on TMUs?

I imagine that a wider BVH structure could speed things up even if the processing rate was the same.
It seems so, in Tom Henderson's leaked docs extended info article:

In addition: 30 WGPs running specialised BVH8 traversal shaders vs 18 WGPs running BVH4 traversal shaders on the standard PlayStation 5.

 
It seems so, in Tom Henderson's leaked docs extended info article:

In addition: 30 WGPs running specialised BVH8 traversal shaders vs 18 WGPs running BVH4 traversal shaders on the standard PlayStation 5.


A BVH8 structure doesn't necessarily mean it's got twice the RT hardware or performance per CU per clock though - it could just be a more efficient way of fetching and processing the data. And a "shader" would appear to be something running on the CUs, so that's unlikely to be twice as fast per CU per clock.

Any improvements coming to PS5 Pro and RDNA 4 would be great and good news for both console and PC, but hype has a tendency to turn into a rather more modest reality. RDNA3 -> RDNA4 doesn't need to be a huge jump to get PS5 Pro to 2~3x RT performance, so long as the Pro has a modest amount of infinity cache.
 
A BVH8 structure doesn't necessarily mean it's got twice the RT hardware or performance per CU per clock though - it could just be a more efficient way of fetching and processing the data. And a "shader" would appear to be something running on the CUs, so that's unlikely to be twice as fast per CU per clock.

Any improvements coming to PS5 Pro and RDNA 4 would be great and good news for both console and PC, but hype has a tendency to turn into a rather more modest reality. RDNA3 -> RDNA4 doesn't need to be a huge jump to get PS5 Pro to 2~3x RT performance, so long as the Pro has a modest amount of infinity cache.
FWIW:
 

That looks like it's maybe an assumption on their part and not something that they can say for sure. It would just seem (to me) to be an odd way to rebalance RDNA's DCUs.

Maybe you could use a full Dual Compute Unit to process BVH8: use both sets of TMUs and work on the same node in the shared LDS, and send the same instructions to both CUs. Probably wouldn't be twice as fast but might offer a worthwhile increase in efficiency in exchange for very little silicon..?

RDNA seems to be evolving along the line of adding as little extra specialised hardware as possible, maybe RDNA 4 will be the same and seek to make better use of the DCU to process BVH8 structures.
 