Current Generation Hardware Speculation with a Technical Spin [post GDC 2020] [XBSX, PS5]

https://www.resetera.com/threads/nx...eneration-is-born.176121/page-4#post-30070900

Devs will choose whether they want full power to the GPU or full power to the CPU, where one or the other underclocks below the listed spec. So it's on a game-to-game basis. I imagine most cross-gen games will prefer the higher-clocked GPU mode, as they will be GPU-bound even if the Zen cores are underclocked. Zen runs rings around Jaguar to the point that most cross-gen games are not going to worry about CPU time, especially 30 fps games.

So unless the devs push both sides to the max at the same time it should be fine.

Glad it's more official now that someone from DF has said it. :)

Called it way early when I said devs will work to a known spec and won't be relying on "boost".
 
That's.... a good thing :p We won't have to be lucky when buying one :)
Now i'm not sure how much sarcasm there is in the sentence.
But I'll give an example: the realtime GI stuff I'm currently working on actually cannot saturate the Vega GPU I'm using. The test scene is quite small, but to compensate I set all settings to 'ultra'. It takes 2 ms, which is fine. Then I look at the GPU clocks and see it keeps running at 166 MHz, totally bored.
The example is not very practical - in a game this task would run async with other work. But there are situations where max clocks aren't necessary, similar to how it rarely happens that all CPU cores run at 100%.
Why not make the other part faster when this happens? Dynamic clocks are no bad thing - they're an additional option.
 

I don't know how it works! Factor it in anyway!

From the Digital Foundry article, there is a BCPack decompressor block fed from the SSD that can produce a maximum of 6 GB/s.
This does seem like it will allow for more compact storage footprints and a higher average read bandwidth from the SSD.
The storage system in this case is the reverse of the PS5's GPU and DRAM bandwidth situation: here the BCPack block is competing against a PS5 raw-bandwidth advantage whose compressed output can still exceed that 6 GB/s cap. The platform with the lower capacity might feel the pinch of a compression scheme not tailored to game textures, though.

The thing is, BCPack IS factored in. We know the compression chip maxes its output at 6GB/s. The PS5 compression chip output maxes at 22GB/s. Both are best case figures. PS5's RAW throughput is near the XSX's best case output. There is no chance that BCPack provides an advantage that is being missed.
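
To put rough numbers on that, here is a back-of-the-envelope sketch. The 2.4 GB/s and 5.5 GB/s raw figures are the published specs; the compression ratios are hypothetical round numbers, since neither vendor has published typical ratios for real game data.

Code:
# Effective SSD read bandwidth = raw bandwidth x compression ratio,
# limited by the decompressor block's maximum output.

def effective_bandwidth(raw_gbps, compression_ratio, cap_gbps):
    return min(raw_gbps * compression_ratio, cap_gbps)

# Series X: 2.4 GB/s raw, BCPack decompressor output capped at 6 GB/s.
xsx = effective_bandwidth(2.4, 2.5, 6.0)   # 2.5:1 is the (hypothetical) ratio that saturates the cap
# PS5: 5.5 GB/s raw, Kraken decompressor output capped at 22 GB/s.
ps5 = effective_bandwidth(5.5, 2.0, 22.0)  # 2:1 is a hypothetical mid-range ratio

print(xsx, ps5)  # 6.0 11.0 -- PS5's raw 5.5 GB/s already sits near the XSX best case

So even granting BCPack a better ratio on textures, the 6 GB/s output cap is the ceiling, which is the point being made above.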

If the PS5 CPU runs with SMT at 3.5 GHz, it's going to consume more power than it would without SMT, which in turn means there's a far larger chance that the PS5 will have to downclock the GPU.

There's no indication at this point the PS5 can run with SMT disabled.

Considering one is advertised as a best-case "boost" and the other is locked in, we actually don't know the speed difference between the two.

That's not what it was advertised as.

Assuming it doesn't downclock under load.

Which we've been told is very rare.

https://www.resetera.com/threads/nx...eneration-is-born.176121/page-4#post-30070900

Devs will choose whether they want full power to the GPU or full power to the CPU, where one or the other underclocks below the listed spec. So it's on a game-to-game basis. I imagine most cross-gen games will prefer the higher-clocked GPU mode, as they will be GPU-bound even if the Zen cores are underclocked. Zen runs rings around Jaguar to the point that most cross-gen games are not going to worry about CPU time, especially 30 fps games.

So unless the devs push both sides to the max at the same time it should be fine.

Dictator is talking out his ass. This is explicitly the opposite of what Cerny says happens. DF are not an official source when they are going on about their pet theories.
 
Now i'm not sure how much sarcasm there is in the sentence.

Very much so :)

There is no chance that BCPack provides an advantage that is being missed.

You never know; we don't have all the technical info and plans yet, from either of them I think.

Which we've been told is very rare.

True, we could almost say there are no dynamic clocks; if it's that rare, it's nothing to worry about really.

Dictator is talking out his ass.

Source? Your ass? :p
 
The thing is, BCPack IS factored in. We know the compression chip maxes its output at 6GB/s. The PS5 compression chip output maxes at 22GB/s. Both are best case figures. PS5's RAW throughput is near the XSX's best case output. There is no chance that BCPack provides an advantage that is being missed.
That's what I mean by the SSD situation being the reverse of the DRAM situation for the PS5 and Xbox SX. The Xbox has a raw throughput advantage with a possible caveat in capacity/arrangement in DRAM, while the PS5 has that for its SSD and compressor.
 
That's what I mean by the SSD situation being the reverse of the DRAM situation for the PS5 and Xbox SX. The Xbox has a raw throughput advantage with a possible caveat in capacity/arrangement in DRAM, while the PS5 has that for its SSD and compressor.

What possible caveat for the PS5 SSD? Other than the total capacity being smaller, it is straight up faster in every way we know of.
 
It all depends on how low the CPU needs to clock for the GPU to max out, and vice-versa.

There seems to be general confusion on here about distinguishing between clock speed and current. Not saying it's you, just speaking in general.

The load you put in at a given frequency will determine the current drawn and thus the power draw.

Let's say a CPU doing 3 GHz and pulling 20 amps at 1 V is generating 20 W of power draw, and 20 W is its max budget. We won't go into VRM complications here.

Let's drop that down to 10 amps. There's no reason for the frequency to drop, as it's below the max power budget. So at a 10 amp load the CPU is happily singing along at 3 GHz (the set max), drawing 10 W.

Now a 25 amp load comes in. Because you're power throttling, your CPU frequency will drop as needed until "voltage x amps = max power draw." That's the "boost" part.

In this situation it's actually reverse boost, meaning it controls the frequency drop from the best-case scenario based on the load coming in and the power budget allocated.
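
To illustrate the mechanism being described - this is a toy model with a made-up linear voltage/frequency curve, not either console's actual power-management logic:

Code:
MAX_FREQ_GHZ = 3.0
MAX_POWER_W = 20.0
VOLTS_PER_GHZ = 1.0 / 3.0  # hypothetical: 1.0 V at 3.0 GHz, scaling linearly

def clamp_frequency(load_amps):
    """Highest frequency whose voltage x current stays within the budget.
    P = V * I with V = VOLTS_PER_GHZ * f, so f_max = P_budget / (VOLTS_PER_GHZ * I)."""
    unconstrained = MAX_POWER_W / (VOLTS_PER_GHZ * load_amps)
    return min(MAX_FREQ_GHZ, unconstrained)

print(clamp_frequency(10.0))  # 3.0 -- light load stays at full clock, drawing only 10 W
print(clamp_frequency(25.0))  # 2.4 -- heavy load: clock drops until 0.8 V x 25 A = 20 W budget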
 
In this situation it's actually reverse boost, meaning it controls the frequency drop from the best-case scenario based on the load coming in and the power budget allocated.

So is it correct to say the following?

In this situation the CPU and GPU are part of one single unit that has an upper power-draw limit, so if the GPU portion is mostly maxed out and drawing the majority of the power, you have less power to allocate to the CPU.

It would be useful if there were a chart or appendix of empirical measurements of CPU and GPU instruction mixes and their impact on timings. Since the frequencies will vary, the latencies of the instructions will vary too.

Maybe even a listing of instructions or instruction mixes and their power draw impacts.
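
As a small illustration of why such a chart would have to be frequency-aware: an instruction's cycle count is fixed, but its wall-clock cost depends on the current clock. The cycle counts below are invented for illustration, not measured values.

Code:
def latency_ns(cycles, freq_ghz):
    return cycles / freq_ghz  # 1 GHz = 1 cycle per nanosecond

# Hypothetical cycle counts, just to show the shape of such a table.
for op, cycles in [("fused multiply-add", 4), ("L2 hit", 12), ("divide", 20)]:
    print(f"{op}: {latency_ns(cycles, 3.5):.2f} ns at 3.5 GHz, "
          f"{latency_ns(cycles, 3.0):.2f} ns at 3.0 GHz")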
 
So I thought I should look into how well RDNA scales with clocks, and there is actually an interesting benchmark on YT comparing a 5700 (36 CUs) and a 5700 XT (40 CUs).

In this case, the 5700 is clocked at 2150 MHz (9.9 TF) and the 5700 XT at 1750 MHz (8.9 TF).


The results are surprising: the XT pulls ahead most of the time, and sometimes not by a little, even though it has a 400 MHz clock deficit and a 1 TF paper deficit. Not sure what to make of this really...
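
For reference, both TF figures check out against the standard formula (64 ALU lanes per RDNA CU, 2 FLOPs per lane per clock):

Code:
def tflops(cus, clock_mhz):
    return cus * 64 * 2 * clock_mhz * 1e6 / 1e12

print(tflops(36, 2150))  # ~9.91 -- 5700 overclocked
print(tflops(40, 1750))  # ~8.96 -- 5700 XT underclocked

One plausible reading of the XT winning despite the paper deficit: at the lower core clock its fixed memory bandwidth goes further per FLOP, and the extra CUs help, so the headline TF number doesn't tell the whole story.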
 
What possible caveat for the PS5 SSD? Other than the total capacity being smaller, it is straight up faster in every way we know of.
The caveat is primarily the smaller capacity, and if the BCPack algorithm has the stated advantage on game content there is an additional effective-capacity disparity on top of it. Beyond that, if BCPack really has that level of advantage, it would raise the average compression ratio of the assets being read off the SSD versus a system using zlib or Kraken.
That average also depends on how much of its data the Series X can run through BCPack versus zlib, which is what I likened to a slower/faster divide in the storage.

In both cases, I think the signs point to the platform with the raw capability having an edge despite having some disadvantages.
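
Here's a quick sketch of that mix argument. Only the 2.4 GB/s raw figure and the 6 GB/s cap come from the published specs; the per-asset-type ratios and the 60/40 split are hypothetical.

Code:
RAW_GBPS = 2.4
DECOMP_CAP_GBPS = 6.0

def average_effective(mix):
    # mix: list of (fraction_of_data, compression_ratio) pairs
    avg_ratio = sum(frac * ratio for frac, ratio in mix)
    return min(RAW_GBPS * avg_ratio, DECOMP_CAP_GBPS)

# e.g. 60% textures at a hypothetical 2.5:1 via BCPack, 40% other data at 1.5:1 via zlib
print(average_effective([(0.6, 2.5), (0.4, 1.5)]))  # 2.4 * 2.1 = 5.04 GB/s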
 
Yes, but nothing Cerny said contradicts what DF/Dictator explained.
It's ambiguous at best. Cerny said the GPU and CPU would run at max clocks, or close to them, the majority of the time. Saying devs will have to choose between CPU power and GPU power is in conflict with the presentation, unless "the majority of the time" is meant to include devs micro-managing every last percentage point.

Again, we are all talking out of our asses, since we don't have reasonable real-world examples of how often and by how much the frequency drops, or whether devs have to manage this themselves. HOWEVER, the presentation's wording argues against it, with "majority of the time" and the claim that the drops are very small, since a 2% clock reduction gives back 10% of the power.
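
For what it's worth, that claim can be sanity-checked against the textbook dynamic-power model, P proportional to f x V^2 with V scaling roughly with f, so P ~ f^3. This is a generic approximation, not anything from the actual presentation:

Code:
# Naive cubic model: a 2% frequency drop
print(1 - 0.98 ** 3)        # ~0.059 -> about 6% power saved

# Under the same model, a full 10% power saving needs about a 3.5% drop:
print(1 - 0.9 ** (1 / 3))   # ~0.035

So the quoted 2%-for-10% figure implies a curve a bit steeper than cubic, which is plausible if voltage margin can also be trimmed at lower clocks; either way, small clock drops buying large power savings is the right shape.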
 
How do both consoles handle lossy data?

We had LZ for lossless and JPG for lossy, but now we only hear about lossless - or did I miss a big part? Kraken seems to have both modes?
 
Yes, but nothing Cerny said contradicts what DF/Dictator explained.

Other than Cerny saying the GPU and CPU both run at their max clocks almost always? Dictator is claiming max clocks are an exclusive choice, one or the other. That's not how it works at all.
 