PS5 Pro *spawn

With the 2TB XSX listed at $599, do we still think the pro is <$600? I guess the chip size might be similar or slightly smaller than the xsx depending on the process used?
That model is probably sold at a profit, so I don't think we can make assumptions based on that.
 
With the 2TB XSX listed at $599, do we still think the pro is <$600? I guess the chip size might be similar or slightly smaller than the xsx depending on the process used?
Simple, shelving the idea that the upgraded system will somehow have AI HW for temporal upscaling and a proportionally much higher RT perf will go a LONG way to both getting the HW design complexity under control and meeting that possible price target ...
 
Simple, shelving the idea that the upgraded system will somehow have AI HW for temporal upscaling and a proportionally much higher RT perf will go a LONG way to both getting the HW design complexity under control and meeting that possible price target ...
Are you stating that an absence of features from newer versions of RDNA would inherently improve the price and/or complexity of a potential PS5Pro? If so, why would it?

Given that MS have announced a $600 XSX with 2TB of storage due this autumn, I'm taking that as industrial espionage proof that the PS5Pro will be $600 with 2TB of storage.
 
Are you stating that an absence of features from newer versions of RDNA would inherently improve the price and/or complexity of a potential PS5Pro? If so, why would it?

Given that MS have announced a $600 XSX with 2TB of storage due this autumn, I'm taking that as industrial espionage proof that the PS5Pro will be $600 with 2TB of storage.
Isn't the price more in line with APU die size, regardless of the actual graphics IPs used? Since the PS4, Sony has been in the habit of releasing cheap consoles. The PS5 was $400 digital, so I'm expecting $500-550 for the PS5 Pro digital in the US if they use at least TSMC 5nm. Price will be another matter in Europe/Japan, obviously, as there is some horrendous PlayStation tax there.
 
I wonder if the wording in the document about the machine learning capabilities ("fully custom design") means that Sony designed the units and AMD just implemented them in the design. The wording is curious because Sony and Cerny always use the word "semi-custom" to describe their collaborations with AMD.
 
Are you stating that an absence of features from newer versions of RDNA would inherently improve the price and/or complexity of a potential PS5Pro? If so, why would it?
I'll try to answer: if a chip has more features, a company will charge more for it, even if it costs no more to produce. Companies have even produced a chip, cut the connections to certain units with a laser (which cost them more), and charged less for it.
 
Isn't the price more in line with APU die size, regardless of the actual graphics IPs used? Since the PS4, Sony has been in the habit of releasing cheap consoles. The PS5 was $400 digital, so I'm expecting $500-550 for the PS5 Pro digital in the US if they use at least TSMC 5nm. Price will be another matter in Europe/Japan, obviously, as there is some horrendous PlayStation tax there.
Good point about the digital. Yeah, I think $550 is reasonable there, particularly if it's compatible with the PS5slimmish's drive. Which it should be, of course, but you never know ¯\_(ツ)_/¯

I'll try to answer: if a chip has more features, a company will charge more for it, even if it costs no more to produce. Companies have even produced a chip, cut the connections to certain units with a laser (which cost them more), and charged less for it.
That makes sense. I'm curious how true that is for Sony & MS though. When selling dies to different GPU vendors, your statement makes complete sense.

With semi-custom APUs, however, I'm not sure if they're ever used for anything else. If it's the case that Sony and MS just buy wafers and whatever doesn't work gets thrown on the scrapheap, I can only see die size and node affecting price.

And if that's the case, a 5nm PS5Pro strikes me as more likely than not to contain features from AMD's 5nm architecture(s).
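
To put rough numbers on the die-size argument, here's a back-of-envelope cost sketch. Every figure in it (wafer price, die area, defect density) is an illustrative assumption pulled out of the air, not anything Sony or TSMC has published:

```cpp
#include <cmath>
#include <cstdio>

// Back-of-envelope die cost model. All numbers are made-up
// placeholders for illustration, not real Sony/TSMC figures.
int main() {
    constexpr double kPi = 3.14159265358979323846;
    const double wafer_cost_usd = 16000.0; // assumed ~N5-class wafer price
    const double wafer_diameter = 300.0;   // mm
    const double die_area_mm2   = 280.0;   // assumed Pro-class APU size
    const double defect_density = 0.1;     // defects per cm^2, assumed

    // Standard gross-dies-per-wafer approximation (area minus edge loss).
    const double wafer_area = kPi * std::pow(wafer_diameter / 2.0, 2.0);
    const double gross_dies = wafer_area / die_area_mm2
        - kPi * wafer_diameter / std::sqrt(2.0 * die_area_mm2);

    // Classic Poisson yield model: yield = exp(-area * defect_density).
    const double yield = std::exp(-(die_area_mm2 / 100.0) * defect_density);

    const double good_dies = gross_dies * yield;
    std::printf("gross: %.0f dies, yield: %.1f%%, cost per good die: $%.0f\n",
                gross_dies, yield * 100.0, wafer_cost_usd / good_dies);
}
```

Cost scales with area twice over, once through dies-per-wafer and again through yield, which is why die size and node would dominate if the chip has no other customers.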
 
Are you stating that an absence of features from newer versions of RDNA would inherently improve the price and/or complexity of a potential PS5Pro? If so, why would it?
The addition of more features/more robust implementations has invariably increased HW complexity, and we have to remember that newer AMD architectures aren't necessarily backwards compatible ...

That being said, a possible path for the new system (if there is one) to hit the $599 USD MSRP would likely involve a custom RDNA2 implementation (maintaining BC) while backporting/cherry-picking features such as a more advanced NGG pipeline, VOPD, an updated MES block, etc. On RDNA3, there's an option for implementations to vary their register file sizes (N31/N32 vs N33), but to clamp down on costs I don't believe they'll increase their register file size this time around; they'll stick to a similar size as found on older/current lower-end architectures. With all these deliberations in mind, older process technologies like 6nm become a realistic option. For the memory architecture, they'll probably just overclock the memory some more and may consider moving to a 320-bit bus width, if even that ... (no Infinity Cache)

Some of these moves, like not upgrading the register file sizes (RT workloads are frequently occupancy-limited) or not including any large cache (memory space for efficiently spilling arguments), will inherently limit the new system's potential gains in RT perf ...
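
For what the 320-bit idea would buy them, a quick bandwidth check (the 18 Gbps GDDR6 data rate below is my assumption for illustration, not anything confirmed):

```cpp
#include <cstdio>

// Peak bandwidth = bus width (bits) * per-pin data rate (Gbps) / 8.
// The 18 Gbps figure is an assumed GDDR6 speed, not a confirmed spec.
int main() {
    const double bus_width_bits = 320.0;
    const double gbps_per_pin   = 18.0;
    const double bandwidth_gbs  = bus_width_bits * gbps_per_pin / 8.0;
    std::printf("%.0f-bit @ %.0f Gbps -> %.0f GB/s "
                "(base PS5: 256-bit @ 14 Gbps = 448 GB/s)\n",
                bus_width_bits, gbps_per_pin, bandwidth_gbs);
}
```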
 
More about what I said
The ATI Omega drivers come with a softmod that unlocks 4 dormant pixel pipelines on the 128 MB Radeon 9500. This mod turns it into a full Radeon 9700.
Also an HD 6950 could be turned into an HD 6970 by flashing the BIOS, and there was an AMD CPU which could be turned into a more expensive variant using a pencil.
There was also a GeForce that could be turned into a more expensive Quadro.
 
With the 2TB XSX listed at $599, do we still think the pro is <$600? I guess the chip size might be similar or slightly smaller than the xsx depending on the process used?
Microsoft are being ridiculous charging $600 for an otherwise standard XSX just with an extra 1TB of storage in 2024. I don't think it'll be any kind of point of reference for Sony, who is still seemingly interested in selling consoles.
 
More about what I said

Also an HD 6950 could be turned into an HD 6970 by flashing the BIOS, and there was an AMD CPU which could be turned into a more expensive variant using a pencil.
There was also a GeForce that could be turned into a more expensive Quadro.

And a 290 could be flashed to a 290X.

The Phenom II CPUs at the time could be unlocked to full quad/hex core by simply turning on a BIOS setting.
 
More about what I said

Also an HD 6950 could be turned into an HD 6970 by flashing the BIOS, and there was an AMD CPU which could be turned into a more expensive variant using a pencil.
There was also a GeForce that could be turned into a more expensive Quadro.

That AMD chip was the K6 - I had one of those and did the pencil overclock trick. Worked really well as I remember :)

*edit - just thinking about it, the Duron had the same trick I think.
 
It won't if they ditch the software traversal, and caches never were the main problem with RT.

RDNA4 seems to have a compressed, wide BVH structure. Possibly traversal acceleration as well? It's assumed PS5 Pro uses pretty much the same hardware there.

I do wonder if there's a more "wide"-friendly manner of tracing than rays through a BVH, or at least of tracing sparse rays through a BVH.
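
For reference, here's a rough sketch of what a compressed wide BVH node generally looks like, loosely in the spirit of published wide-BVH work (e.g. Ylitie et al. 2017). This is a generic illustration, not AMD's actual RDNA4 node layout:

```cpp
#include <cstdint>

// Illustrative 8-wide compressed BVH node: one full-precision parent
// anchor, children quantized to bytes. NOT AMD's real format, just
// the general idea behind "compressed, wide" nodes.
struct WideBVHNode8 {
    float    origin[3];      // parent AABB minimum corner, full precision
    uint8_t  exp[3];         // per-axis power-of-two dequantization scale
    uint8_t  child_count;    // number of valid children (<= 8)

    // Child AABBs at 8 bits per axis relative to origin/exp. Eight
    // quantized boxes fit in far fewer bytes than eight full AABBs,
    // so one node fetch feeds an 8-wide box test.
    uint8_t  child_min[8][3];
    uint8_t  child_max[8][3];

    uint32_t child_index[8]; // child-node or triangle-range offsets
};

// Dequantize one axis of one child bound back to world space.
inline float dequantize(float origin, uint8_t e, uint8_t q) {
    // Scale is a power of two taken from the stored exponent byte.
    const float scale = static_cast<float>(1u << (e & 31)) / 256.0f;
    return origin + static_cast<float>(q) * scale;
}
```

The win is that a whole node's worth of child boxes arrives in one or two cache lines, which is exactly what a wide, SIMD-style traversal wants.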
 
I do wonder if there's a more "wide"-friendly manner of tracing than rays through a BVH, or at least of tracing sparse rays through a BVH.

Tracing coherent rays is wide-friendly, which is pretty much what rasterization does. Any algo that requires divergence won't be SIMD- or cache-friendly. It's not an RT-specific thing.

Only real solution is narrow, fast ALUs backed by a massive cache.
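
One classic trick along those lines is binning rays by direction octant before tracing, so the lanes in a wave walk the BVH in roughly the same order. A minimal sketch of the binning step only; the tracer itself is assumed:

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

struct Ray { float o[3], d[3]; };

// Rays whose direction signs match tend to visit BVH nodes in a
// similar order, so lanes diverge less and share cache lines.
uint32_t octant(const Ray& r) {
    return (r.d[0] < 0 ? 1u : 0u)
         | (r.d[1] < 0 ? 2u : 0u)
         | (r.d[2] < 0 ? 4u : 0u);
}

// Sort a batch by octant before handing it to the (assumed) tracer.
void sortForCoherence(std::vector<Ray>& rays) {
    std::sort(rays.begin(), rays.end(),
              [](const Ray& a, const Ray& b) { return octant(a) < octant(b); });
}
```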
 
Tracing coherent rays is wide-friendly, which is pretty much what rasterization does. Any algo that requires divergence won't be SIMD- or cache-friendly. It's not an RT-specific thing.

Only real solution is narrow, fast ALUs backed by a massive cache.

Yeah, but how do you get a bunch of coherent rays to represent diffuse/reflections? It works with direct lighting, obviously. I'm thinking something like caching worldspace traces in what is basically a graph: this object traces to this object from this location/vector. Then when an object moves, you trace dense rays from that moving object. This gives you visibility to/from that object; as the object moves, any rays from it override the location/vector caches for their hit locations, since now that location/vector would trace towards the object (or at least towards the ray's origin on the object).

The problem comes in when you need to... as I'm typing this, no it doesn't. I was going to type that the problem comes in when you need to re-test invalidated rays that used to hit that object, which causes massive divergence. But if you've cached the worldspace location where the moved object was, you only need to trace a probe centered on the object's previous worldspace position, or maybe a few probes at most.

You trace out from these probes, and again you have a new bidirectional cache of visibility for all the invalidated areas. Along with worldspace probe ReSTIR (just share rays that haven't registered a hit yet if they pass close to other probes' centers; tricky implementation, but it works) to get an idea of volumetrics, you've got a relatively dense/coherent RT setup. You don't need to trace multiple bounces because it's just a graph from current hit to worldspace hit to worldspace hit, which should all be cached.
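
Something like this as a toy sketch of the cache itself; the cell size, key packing, and probe retrace step are all hand-waved assumptions:

```cpp
#include <cstdint>
#include <unordered_map>
#include <vector>

// Toy worldspace visibility cache: "from this cell, in this direction,
// you hit object X". When an object moves, only the entries that
// touched it get invalidated. All sizes/packing here are assumptions.

struct CachedHit { uint32_t object_id; float hit_pos[3]; };

// Pack a quantized worldspace cell plus a direction octant into a key.
uint64_t makeKey(int cx, int cy, int cz, uint32_t octant) {
    return (uint64_t(uint16_t(cx)) << 48) | (uint64_t(uint16_t(cy)) << 32) |
           (uint64_t(uint16_t(cz)) << 16) | uint64_t(octant);
}

struct VisibilityGraph {
    std::unordered_map<uint64_t, CachedHit> cache;
    // Reverse index: which entries land on each object, so a moving
    // object invalidates exactly the rays that used to hit it.
    std::unordered_map<uint32_t, std::vector<uint64_t>> hits_on_object;

    void insert(uint64_t key, const CachedHit& hit) {
        cache[key] = hit;
        hits_on_object[hit.object_id].push_back(key);
    }

    // On object move: drop its stale entries, then (not shown) retrace
    // dense rays from its new position plus a probe or two centered on
    // where it used to be, as described above.
    void invalidate(uint32_t object_id) {
        auto it = hits_on_object.find(object_id);
        if (it == hits_on_object.end()) return;
        for (uint64_t key : it->second) cache.erase(key);
        hits_on_object.erase(it);
    }
};
```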
 
Yeah, but how do you get a bunch of coherent rays to represent diffuse/reflections?

Instead of tracking relative object vectors (an exponential problem) and forcing rays to be coherent, you can try tackling the problem from the other direction. Just like OOO CPUs track a large number of instructions for opportunistic execution, you can do the same with rays.

If you can track a sufficiently large number of in-flight rays it should be relatively easy to find rays to schedule because the BVH nodes they need are readily available in cache. Those rays don’t even have to be from the same wavefront or workgroup. With this approach you reduce cache thrashing since it’s the cache that’s dictating which rays get scheduled together and not the other way around.
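
A toy sketch of that scheduling idea; the structures and batch handling are my assumptions, and the intersection itself is stubbed out:

```cpp
#include <cstdint>
#include <deque>
#include <unordered_map>

// Cache-driven ray scheduling sketch: keep many rays in flight, bin
// them by the BVH node each needs next, and dispatch the fullest bin
// so one node fetch is amortized over a whole batch of rays.

struct RayState { uint32_t ray_id; uint32_t next_node; };

struct RayScheduler {
    // next_node -> rays waiting on that node (from any wavefront).
    std::unordered_map<uint32_t, std::deque<RayState>> bins;

    void submit(const RayState& r) { bins[r.next_node].push_back(r); }

    // Dispatch the node with the most waiting rays: the cache ends up
    // dictating which rays run together, not the other way around.
    bool dispatchBest() {
        auto best = bins.end();
        for (auto it = bins.begin(); it != bins.end(); ++it)
            if (best == bins.end() || it->second.size() > best->second.size())
                best = it;
        if (best == bins.end()) return false;
        // intersectBatch(best->first, best->second); // stub: test the
        // batch against this node, then resubmit survivors with their
        // new next_node until they terminate.
        bins.erase(best);
        return true;
    }
};
```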
 
Most importantly, keep the price reasonable.
I guess so. The console is going to be a success anyway, but since this isn't a generational change where more powerful hardware makes games much more expensive to develop, the base model's price should drop to €300 or so.
 