Current Generation Hardware Speculation with a Technical Spin [post GDC 2020] [XBSX, PS5]

Closest comparison I can think of is maybe pitting an RX 590 (36 CU Polaris) against an RX 5700 (36 CU Navi), although the latter has double the ROPs, and the clocks would have to be normalized for throughput (core and memory).

Would have to have really in-depth analyses for something more specific, though...
You'd need Radeon GPU Analyzer or PIX or something to see the CU occupancy, I think.
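For a rough first pass you can at least normalize the paper specs. A back-of-envelope sketch (peak FP32 only; ignores ROPs, occupancy and memory, and the clocks are the reference boost figures):

```python
# Peak FP32 throughput for a GCN/RDNA-style GPU:
# 64 lanes per CU x 2 FLOPs per lane per clock (FMA).
def peak_tflops(cus: int, clock_ghz: float) -> float:
    return cus * 64 * 2 * clock_ghz / 1000.0

print(peak_tflops(36, 1.545))  # RX 590 (36 CU Polaris) -> ~7.1 TF
print(peak_tflops(36, 1.725))  # RX 5700 (36 CU Navi)   -> ~8.0 TF
```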
 
So back to where we were before: a faster SSD obviously streams things faster, affording more detail or assets on screen than one that's half as fast; that's obviously assuming you're using all the available RAM on top, of course. Do you not agree that's the PS5's power advantage?
This is probably the best use case here:

I loved this level. (Side note: best FPS campaign I've ever played, I think; I liked it much more than Doom.) And it does exactly that: instant loading, no downtime, and it's driven by user interaction.

It runs on a paltry XBO and PS4.

In fact the whole Titanfall series is sort of built on this, which funnily enough was largely overlooked in the SSD discussion.

So this is why I know that where there's a will, there's a way. Both SSDs will easily be 50x faster than the spinning drives of this gen, but the PS5's is only ~2x faster than the XSX's. This gen likely has limitations caused by the slow hard drive, but I don't think the Sony SSD is going to unlock that much more than what we'll have on the XSX. Just my 2 cents.
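For what it's worth, the rough numbers behind the "50x" and "2x" claims (the HDD figure is my assumption, ~100 MB/s sequential for a launch-era drive; random reads are far worse, which is why 50x is easy to hit in practice):

```python
# Quoted raw/compressed throughput figures (GB/s); HDD rate is an assumption.
hdd = 0.1
xsx_raw, xsx_comp = 2.4, 4.8
ps5_raw, ps5_comp = 5.5, 9.0

print(f"XSX compressed vs HDD: {xsx_comp / hdd:.0f}x")    # ~48x
print(f"PS5 compressed vs HDD: {ps5_comp / hdd:.0f}x")    # ~90x
print(f"PS5 vs XSX, raw:       {ps5_raw / xsx_raw:.1f}x") # ~2.3x
```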
 
Cerny is right even regarding available bandwidth, though I doubt the narrower + higher-clocked approach will fully make up for the 18% difference in compute and 25% in memory bandwidth.

That might have been true if the XSX were low-clocked, but it's not; in fact it's clocked rather high at 1800+ MHz. The gap is around 20% in raw GPU compute power, if the PS5 can keep its clocks at the advertised 2.23 GHz, that is.
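Those percentages do check out against the announced specs. A quick sanity check (assuming both consoles hold their quoted clocks):

```python
# XSX: 52 CUs @ 1.825 GHz; PS5: 36 CUs @ 2.23 GHz (announced figures).
xsx_tf = 52 * 64 * 2 * 1.825 / 1000   # ~12.15 TF
ps5_tf = 36 * 64 * 2 * 2.23 / 1000    # ~10.28 TF
print(f"Compute gap:   {xsx_tf / ps5_tf - 1:.0%}")   # ~18%
print(f"Bandwidth gap: {560 / 448 - 1:.0%}")         # 25% (XSX fast pool vs PS5)
```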

but the PS5's is only ~2x faster than the XSX's

MS said their compression HW equals five Zen cores, Sony's two. Still unknown, but the XSX might come very close.
 
This is probably the best use case here:

I loved this level. (Side note: best FPS campaign I've ever played, I think; I liked it much more than Doom.) And it does exactly that: instant loading, no downtime, and it's driven by user interaction.

It runs on a paltry XBO and PS4.

In fact the whole Titanfall series is sort of built on this, which funnily enough was largely overlooked in the SSD discussion.

So this is why I know that where there's a will, there's a way. Both SSDs will easily be 50x faster than the spinning drives of this gen, but the PS5's is only ~2x faster than the XSX's. This gen likely has limitations caused by the slow hard drive, but I don't think the Sony SSD is going to unlock that much more than what we'll have on the XSX. Just my 2 cents.

Exactly. Neither console is going to do anything that the other can't. They're both solidly engineered devices within spitting distance of each other's best specs. Some higher resolution this, higher framerate that. And each time, we're going to need Digital Foundry to point out the differences.

Heck, I'm playing Terraria at the moment, which almost feels like I'm taking the piss out of hardware capable of God of War.

My own prediction of the differences we'll see:
XSX - Higher resolution.
PS5 - More varied NPC faces.

No-one will care. Except for us, but we're not normal.
 
My own predictions of what we'll see:
XSX - Higher settings, a somewhat more stable fps, less dynamic resolution scaling (due to the more powerful GPU, 2070 vs 2080 Ti level, faster CPU, much higher BW).
PS5 - Better positional audio when using TV speakers, faster streaming (due to the Tempest CU; streaming because devs will forget about BCPack compression).
 
My own predictions of what we'll see:
XSX - Higher settings, a somewhat more stable fps, less dynamic resolution scaling (due to the more powerful GPU, 2070 vs 2080 Ti level, faster CPU, much higher BW).

I'm not convinced we'll generally see higher settings (assuming you're referring to the equivalent of PC settings "mid, high, ultra" kind of thing) because I think it's going to be more straightforward for devs to scale resolution. I suppose it depends which becomes the lead platform: if it's the PS5, I wouldn't be surprised to find devs targeting 4K on the PS5 and then dialling up a couple of settings on the XSX. But generally I expect pretty much identical settings with a 15% difference in resolution.

A somewhat more stable FPS seems likely on account of both the CPU and GPU being more powerful, meaning the odd hiccup that would cap out the PS5's hardware will have extra legroom on the XSX.

Less dynamic resolution scaling is hard to predict. Going by some of DF's videos, it's hard even to measure. I expect that a game either will or won't use dynamic resolution scaling though. And where we do see it, the maximum/minimum parameters will be adjusted by 15% in the XSX's favour.

PS5 - Better positional audio when using TV speakers, faster streaming (due to the Tempest CU; streaming because devs will forget about BCPack compression).

If the PS5 manages decent positional audio via TV speakers, I expect MS will implement it in some way, shape, or form.

And what does faster streaming amount to? It's an extra 3.2 to 4.2 GB/s of bandwidth. If I take the worst case and assume the PS5 manages 8 GB/s in a 60 fps game, that's 53 MB of extra data that can be pulled per frame, which I think will be spent on the likes of NPCs.
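The arithmetic, for what it's worth (8 GB/s is the low end of Sony's quoted compressed range; 4.8 GB/s is the XSX's typical compressed figure):

```python
# Extra streaming budget per frame at 60 fps (GB/s -> MB/frame).
fps = 60
ps5_comp_worst = 8.0   # GB/s, low end of the quoted 8-9 range
xsx_comp = 4.8         # GB/s, quoted typical compressed rate

extra_mb_per_frame = (ps5_comp_worst - xsx_comp) / fps * 1000
print(f"{extra_mb_per_frame:.0f} MB extra per frame")   # ~53 MB
```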
 
MS said their compression HW equals five Zen cores, Sony's two. Still unknown, but the XSX might come very close.

https://www.eurogamer.net/articles/...s-and-tech-that-deliver-sonys-next-gen-vision
"By the way, in terms of performance, that custom decompressor equates to nine of our Zen 2 cores, that's what it would take to decompress the Kraken stream with a conventional CPU," Cerny reveals.

A dedicated DMA controller (equivalent to one or two Zen 2 cores in performance terms) directs data to where it needs to be, while two dedicated, custom processors handle I/O and memory mapping. On top of that, coherency engines operate as housekeepers of sorts.
 
MS said their compression HW equals five Zen cores, Sony's two. Still unknown, but the XSX might come very close.
Even ignoring the convoluted logic, you're off by almost five times. Sony said the Kraken ASIC section is equivalent to "nine Zen 2 cores".

Kraken is a lot more compute-efficient than Zlib, which is a multiplying factor that makes your processing-power argument pointless. The I/O processor has a dedicated DMA controller equivalent to "another Zen 2 core or two" according to Cerny. There's another dedicated coprocessor to manage the SSD I/O and file abstraction, and another to manage the memory mapping. The Tempest silicon is also equivalent to another two Zen 2 cores. Coherency engine power is unknown.

There's a lot of stuff here helping to free up the CPU; it has the equivalent of 13 or 14 Zen 2 cores in the Tempest and I/O processors.
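Tallying those equivalences lands in the same ballpark (rough, since some are quoted as ranges and the two I/O coprocessors and coherency engines have no quoted figure):

```python
# Zen 2 core-equivalents as cited above (approximate).
kraken_asic = 9    # "nine of our Zen 2 cores" (Cerny)
dma = (1, 2)       # "another Zen 2 core or two" (Cerny)
tempest = 2        # figure cited in this thread

low, high = kraken_asic + dma[0] + tempest, kraken_asic + dma[1] + tempest
print(f"~{low}-{high} core-equivalents, before the two I/O coprocessors")
```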
 
I'm not convinced we'll generally see higher settings (assuming you're referring to the equivalent of PC settings "mid, high, ultra" kind of thing) because I think it's going to be more straightforward for devs to scale resolution. I suppose it depends which becomes the lead platform: if it's the PS5, I wouldn't be surprised to find devs targeting 4K on the PS5 and then dialling up a couple of settings on the XSX. But generally I expect pretty much identical settings with a 15% difference in resolution.

A somewhat more stable FPS seems likely on account of both the CPU and GPU being more powerful, meaning the odd hiccup that would cap out the PS5's hardware will have extra legroom on the XSX.

Less dynamic resolution scaling is hard to predict. Going by some of DF's videos, it's hard even to measure. I expect that a game either will or won't use dynamic resolution scaling though. And where we do see it, the maximum/minimum parameters will be adjusted by 15% in the XSX's favour.

Scaling has become much, much better these days, and that trend is going to continue. Devs can scale across hardware very well, taking advantage of everything between lower- and higher-end hardware.

@Tabris

https://wccftech.com/gears-dev-load...sampler-feedback-streaming-is-a-game-changer/

'Interestingly, at least on paper, the I/O capabilities of the PlayStation 5's SSD are superior with its 5.5 GB/s raw and 8-9 GB/s compressed I/O throughput, whereas the specs of the Xbox Series X SSD are 2.4 GB/s raw and 4.5 GB/s compressed.

However, Microsoft might have quite a few software tricks up their sleeves between the DirectStorage API, Sampler Feedback Streaming, and the new BCPack compression system tailored for GPU textures.'

https://www.windowscentral.com/xbox-series-x-way-more-powerful-ps5-heres-how-much-more

'The SSD on the PS5 is considerably faster than the one found in the Xbox Series X. However, Microsoft implemented other unique features that should mitigate the difference.'

'The biggest takeaway from the event was the fact that the Xbox Series X is considerably more powerful than the PS5. The Xbox Series X features a faster processor, clocked in at 3.8 GHz, and a better graphic processing unit (GPU) that's a minimum 1.875 teraflops (TFLOPs) more powerful than the chip found inside the PS5, when it's running at its maximum speed. For reference, that's the power of a standalone PlayStation 4, but since this is AMD's next-generation architecture, the difference is even greater, maybe even double the real-world performance due to better efficiency. It also seems like the PS5 doesn't feature variable-rate shading, so real-world graphics may take a noticeable hit on the console.'

 
Even ignoring the convoluted logic, you're off by almost five times. Sony said the Kraken ASIC section is equivalent to "nine Zen 2 cores".

Kraken is a lot more compute-efficient than Zlib, which is a multiplying factor that makes your processing-power argument pointless. The I/O processor has a dedicated DMA controller equivalent to "another Zen 2 core or two" according to Cerny. There's another dedicated coprocessor to manage the SSD I/O and file abstraction, and another to manage the memory mapping. The Tempest silicon is also equivalent to another two Zen 2 cores. Coherency engine power is unknown.

There's a lot of stuff here helping to free up the CPU; it has the equivalent of 13 or 14 Zen 2 cores in the Tempest and I/O processors.

It's all irrelevant. Zlib and Kraken are two different compression schemes, with Kraken offering way better decode performance on a typical CPU.

Plus we are not talking about the same rate of decompression. Sony's Kraken ASIC is built to handle way more compressed data than the SSD can supply, and maxes out somewhere in the 25-27 GB/s range.

How many CPU cores a decompression scheme needs at a certain rate is only relevant if you are doing the decompression on the CPU. And in that case, the less CPU you need the better, not more.
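To put the rate point in numbers: the "nine Zen 2 cores" figure is pinned to a specific job, decompressing the full 5.5 GB/s Kraken input stream. A rough sketch (assuming linear scaling across cores, which is my assumption, not a quoted figure):

```python
# Implied per-core Kraken decode rate from the quoted figures.
cores = 9
input_gbps, output_gbps = 5.5, 9.0   # compressed in, typical decompressed out

print(f"~{input_gbps / cores:.2f} GB/s in per core")    # ~0.61
print(f"~{output_gbps / cores:.2f} GB/s out per core")  # ~1.00
# A "five cores" vs "nine cores" comparison is meaningless unless both
# refer to the same codec running at the same data rate.
```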
 
When Cerny said that it was very challenging to achieve high occupancy, he may have been referring to GCN, as it's very difficult to keep saturated given the nature of how it issues work. But AMD specifically went and addressed this with RDNA.

He was referring to the natural limits of how widely you can parallelize some tasks; the example he gave was lots of small triangles. There is always going to be a point where more CUs are redundant, because you can't just give those CUs other work to do: that would hit the cache, which you might need entirely for the work the active CUs are doing.

You can say for certain that there will be scenarios where the PS5 will be much faster than the Series X: those where the optimum number of CUs is around 36 +/- 2, which the PS5 will do faster than the Series X because it's clocked higher. There will also be scenarios where the PS5 is much slower than the Series X, because even against a faster clock, going wider is better.

Whether any of these scenarios are significant enough to result in differences in game performance or graphics detail is debatable. It's too early to say. Both systems will present different problems early on, but as always, techniques will evolve over time.
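One way to picture the small-triangle case: if a draw only spawns a few wavefronts, utilization is quantized by CU count and the clock decides the rest. A toy model with hypothetical wave counts (ignores caches, scheduling, and async work entirely):

```python
import math

# Relative time to drain `waves` equal wavefronts on `cus` compute units
# at `clock_ghz`, one wave per CU per batch (toy model, not a simulator).
def relative_time(waves: int, cus: int, clock_ghz: float) -> float:
    return math.ceil(waves / cus) / clock_ghz

# Small draw (36 waves): one batch on either GPU, so the higher clock wins.
print(relative_time(36, 36, 2.23))     # ~0.45
print(relative_time(36, 52, 1.825))    # ~0.55

# Big dispatch (1040 waves): width wins despite the lower clock.
print(relative_time(1040, 36, 2.23))   # 29 batches -> ~13.0
print(relative_time(1040, 52, 1.825))  # 20 batches -> ~11.0
```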
 
He was referring to the natural limits of how widely you can parallelize some tasks; the example he gave was lots of small triangles. There is always going to be a point where more CUs are redundant, because you can't just give those CUs other work to do: that would hit the cache, which you might need entirely for the work the active CUs are doing.

You can say for certain that there will be scenarios where the PS5 will be much faster than the Series X: those where the optimum number of CUs is around 36 +/- 2, which the PS5 will do faster than the Series X because it's clocked higher. There will also be scenarios where the PS5 is much slower than the Series X, because even against a faster clock, going wider is better.

Whether any of these scenarios are significant enough to result in differences in game performance or graphics detail is debatable. It's too early to say. Both systems will present different problems early on, but as always, techniques will evolve over time.
What underlying basis and figures are these based on?
Current 4Pro games, future engines, and how much of the overall rendering is it referring to?

Otherwise it's just as simple to say 52 is the optimal number compared to 36 or 60, for the exact same reasons you gave.

Are there situations where having a higher frequency would be faster? Of course. But how often, even at the same TF?
Pretty sure DF even did a couple of benchmarks comparing frequency to CU count at the same TF. If memory serves, there was hardly anything in it; if anything, wider may have had the edge.
Sure, that's current games, but it's another data point.
 
It just depends on who's talking, from what perspective, etc. What Cerny said was true for the PS5. It doesn't mean he wouldn't have opted for 12 TF if their design had been different from the start.
Cerny was talking about 10 TF and the best way they felt to get there, not that it's necessarily better in any way than 12 TF.
No one is saying some simple inequality like "10TF > 12TF". We are talking about real-world performance.

1. Usage of ALU resources:

Yes, more ALU resource is better, but if 36 CUs have higher occupancy (let's say 5%), then the PS5 GPU can match the performance of other GPUs with more ALU resources.

(Note I am not saying the PS5 has more usable TFs than the Xbox. With 5% more occupancy, the PS5 can match a 10.8 TF, 48 CU GPU.)


2. Advantage of high frequency:

Cerny talked about improving GPU performance with higher frequency, even if there is no difference in ALU resource.

If there is a hypothetical GPU which has 12 TF but also 20% lower occupancy than the PS5, then its actual available ALU resource is less than the PS5's.

Or take another GPU with 12 TF but only 1.4~1.5 GHz: its overall GPU performance may not surpass the PS5 GPU's.
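In numbers (the 5% and 20% occupancy deltas are my hypotheticals, not measurements):

```python
def effective_tf(raw_tf: float, relative_occupancy: float) -> float:
    return raw_tf * relative_occupancy

print(effective_tf(10.28, 1.05))  # PS5 at +5% occupancy -> ~10.8 "usable" TF
print(effective_tf(12.00, 0.80))  # 12 TF GPU at -20% occupancy -> 9.6 < 10.28
```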
 
What underlying basis and figures are these based on?
Current 4Pro games, future engines, and how much of the overall rendering is it referring to?
He is the man with access to a lot of internal RDNA data, so he probably has some test results on occupancy versus the number of CUs.

Thus he can say 36 CUs are easier to fully utilize than 48 CUs.
 
He is the man with access to a lot of internal RDNA data, so he probably has some test results on occupancy versus the number of CUs.

Thus he can say 36 CUs are easier to fully utilize than 48 CUs.
To be fair, I can't remember the full details of what he actually said. I just know that people are misrepresenting what he actually said a lot.

So did he say that a frame would render 5% quicker? That it would be 2 fps faster in general? That better occupancy gives better efficiency, so SmartShift can lower power?
Or just that it's better, which can mean almost anything?

As I said (and I usually make sure to say it in every message), I don't dispute what he says or his findings based on their design. I just dispute the way people use it.
 