Current Generation Hardware Speculation with a Technical Spin [post GDC 2020] [XBSX, PS5]

This is my concern with those holding onto that 10.32 TF number. Frankly, I would have preferred the GPU to be a lower number and have the CPU fixed at 3.5GHz. I suppose developers have a choice on locking one or the other, so perhaps my concern is a nothing burger when it actually comes to delivery. But in the context of this discussion it seems slightly dishonest to say PS5 is running 10.32 TF. It's also dishonest to say they have faster rasterization and ROP power when they have less bandwidth to feed it.
If the clocks were variable but the theoretical performance was higher than Xbox Series X, it wouldn't be an issue. The situation Sony is in is that they have given best-case clock speeds and performance, and by their own description of how it works, max CPU and GPU clocks are not achievable at the same time. What we really need to know is what the range of the clocks will be. If the GPU can hit 2GHz but can also drop to 300MHz, that would be a performance issue.
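For a rough sense of what a given sustained clock means in compute terms, here's a back-of-envelope calculation, assuming the publicly stated 36 CUs and the usual 64 ALUs per CU doing one FMA (2 FLOPs) per clock:

```python
# Back-of-envelope: theoretical FP32 throughput vs. sustained GPU clock.
# Assumes PS5's stated 36 CUs; 64 ALUs/CU and 2 FLOPs/clock are the usual
# figures for RDNA-style CUs.
CUS = 36
FLOPS_PER_CU_PER_CLOCK = 64 * 2  # 64 ALUs, one FMA (2 FLOPs) each per clock

for clock_ghz in (2.23, 2.10, 2.00, 0.30):
    tflops = CUS * FLOPS_PER_CU_PER_CLOCK * clock_ghz / 1000.0
    print(f"{clock_ghz:.2f} GHz -> {tflops:.2f} TF")
# 2.23 -> 10.28 TF, 2.10 -> 9.68 TF, 2.00 -> 9.22 TF, 0.30 -> 1.38 TF
```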
 
Spinal Tap would like a word...

Makes it easier if you later on add additional priority levels. You expand it to have 10 priority levels, [1, 3, 5, 7, 9, 11, 13, 15, 17, 19], and now Priority Level 11 is no longer the highest priority. Or do you add negative levels for lower priorities, [-7, -5, -3, -1, 1, 3, 5, 7, 9, 11]? ;)
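A tiny hypothetical sketch of why leaving gaps (or just naming the levels) sidesteps the renumbering problem:

```python
# Hypothetical sketch: named priority levels with gapped values, so new levels
# can be slotted in later without renumbering or changing which one is highest.
PRIORITY = {
    "background": 10,
    "normal": 30,
    "high": 50,
    "critical": 70,
}

# Later: insert a level between "normal" and "high" without touching the rest.
PRIORITY["elevated"] = 40

highest = max(PRIORITY, key=PRIORITY.get)
print(highest)  # still "critical"
```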
 
... It's also dishonest to say they have faster rasterization and ROP power when they have less bandwidth to feed it.

....

Less for the 10GB, more for the 6GB. At least devs won't have to specify what goes where. Here, same bandwidth for all the RAM; I find it a better solution imo...
 
Less for the 10GB, more for the 6GB. At least devs won't have to specify what goes where. Here, same bandwidth for all the RAM; I find it a better solution imo...

That's because you are oversimplifying things. Giving the data requests that benefit from it significantly more bandwidth is easily worth the minimal extra complication. It should be very straightforward which data needs to go where and I'd be surprised if most operations don't route data automatically towards one pool or the other by default.
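A hypothetical sketch (not an actual SDK API) of what a sensible default routing policy could look like, with the ability to override it:

```python
# Hypothetical sketch of default routing for a split memory layout like XSX's
# 10GB "GPU optimal" / 6GB "standard" pools. Names and heuristics are
# illustrative only, not a real SDK API.
from enum import Enum, auto
from typing import Optional

class Pool(Enum):
    GPU_OPTIMAL = auto()  # 10GB @ 560 GB/s
    STANDARD = auto()     # 6GB  @ 336 GB/s

# Bandwidth-hungry GPU-read resources default to the fast pool; data the CPU
# touches frequently (game state, audio) defaults to the slower one.
DEFAULT_POOL = {
    "render_target": Pool.GPU_OPTIMAL,
    "texture": Pool.GPU_OPTIMAL,
    "vertex_buffer": Pool.GPU_OPTIMAL,
    "cpu_game_state": Pool.STANDARD,
    "audio": Pool.STANDARD,
}

def choose_pool(resource_kind: str, override: Optional[Pool] = None) -> Pool:
    # An explicit override wins; otherwise fall back to the default routing table.
    return override if override is not None else DEFAULT_POOL.get(resource_kind, Pool.STANDARD)
```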
 
Less for the 10GB, more for the 6GB. At least devs won't have to specify what goes where. Here, same bandwidth for all the RAM; I find it a better solution imo...
Fairly certain rasterization (please correct me) was done in the 32MB of ESRAM on Xbox One, writing the final values out to main memory when it was completed.
I believe the X1X pulls way in front of the 4Pro for the same reason; the 30% compute difference alone doesn't account for some of the resolution differences we see.
And the 4Pro has 2x the number of ROPs as the X1X: 64 vs 32.
I have also seen similar fill rate (rasterization/ROP) claims for 4Pro vs X1X.

Bandwidth is the limiter for performance here.

There's a detailed response here on XSX vs PS5 fill rate performance:
https://forum.beyond3d.com/posts/2111559/
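To put rough numbers on that (my own back-of-envelope, assuming 64 ROPs on each console and plain 4-byte colour writes with no compression or caching, i.e. a worst case):

```python
# Back-of-envelope: peak ROP write demand vs. available memory bandwidth.
# Assumes 64 ROPs on each console and raw 4-byte (RGBA8) writes with no
# delta-colour compression or cache hits -- purely illustrative worst case.
def rop_demand_gbs(rops: int, clock_ghz: float, bytes_per_pixel: int = 4) -> float:
    return rops * clock_ghz * bytes_per_pixel  # Gpixels/s * bytes = GB/s

print(rop_demand_gbs(64, 2.23))   # ~571 GB/s demand vs 448 GB/s available on PS5
print(rop_demand_gbs(64, 1.825))  # ~467 GB/s demand vs 560 GB/s (XSX fast pool)
```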
 
Yea,
the PS5 will have fixed power.
How much the clocks will fluctuate and how high the temperature rises as a result of that is unknown to us.
As I've stipulated repeatedly, my concern is the asymmetric difference in power consumption between CPU and GPU. With a fixed amount of power, under heavy load the GPU will have to draw significant resources from the CPU to sustain its frequency. This is my concern with those holding onto that 10.32 TF number. Frankly, I would have preferred the GPU to be a lower number and have the CPU fixed at 3.5GHz. I suppose developers have a choice on locking one or the other, so perhaps my concern is a nothing burger when it actually comes to delivery. But in the context of this discussion it seems slightly dishonest to say PS5 is running 10.32 TF. It's also dishonest to say they have faster rasterization and ROP power when they have less bandwidth to feed it.

The assumption made by that writer is that there is lots of waste, so not to worry, they will maintain their boost clocks. But we know that's not true: Sony games are notoriously good at saturating their hardware, or else their fans wouldn't be spinning up so loudly across all their titles. Thus, power draw. I'd also caveat that having a lot of wasted cycles at a high clock rate means you're not really doing much more. It's like having a 5GHz GPU with only 40Mb/s of bandwidth: your silicon is running fast and idle, twiddling its thumbs waiting for work to do. Even Mark Cerny mentions in his presentation that the high GPU clock rate combined with the slower RAM is an architectural imbalance.
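To illustrate the fixed-budget tradeoff I'm describing, here's a toy model; the wattages and the cubic power-vs-frequency approximation are my own made-up illustration, not anything Sony has published:

```python
# Toy model of a fixed total power budget shared by CPU and GPU, in the spirit
# of Cerny's description. All numbers are invented for illustration; only the
# shape of the tradeoff matters: whatever the CPU draws is unavailable to the GPU.
TOTAL_BUDGET_W = 200.0       # hypothetical SoC budget
GPU_MAX_CLOCK_GHZ = 2.23
GPU_POWER_AT_MAX_W = 160.0   # hypothetical GPU draw at max clock under load

def gpu_clock_for_budget(cpu_draw_w: float) -> float:
    gpu_budget = TOTAL_BUDGET_W - cpu_draw_w
    # Dynamic power scales roughly with f * V^2, and V tracks f near the limit,
    # so approximate power ~ f^3 and invert it.
    scale = max(0.0, min(1.0, gpu_budget / GPU_POWER_AT_MAX_W)) ** (1.0 / 3.0)
    return GPU_MAX_CLOCK_GHZ * scale

for cpu_w in (30.0, 45.0, 60.0):
    print(f"CPU {cpu_w:>4.0f} W -> GPU ~{gpu_clock_for_budget(cpu_w):.2f} GHz")
# 30 W -> 2.23 GHz, 45 W -> ~2.21 GHz, 60 W -> ~2.13 GHz
```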
Agree. We won't know for sure until we get DF face-offs.

Mark was very precise on the SSD and Tempest, but we have to keep in mind that as far as variable frequency goes, he described it in a few sentences without exact numbers or corner cases. "Expect" + "most of the time" + "at or close to" can really mean anything when we are talking about 0-10%.

Average frequency might fall to ~2100MHz, which would still easily fit the description and yet would bring the system into the mid 9TFs.

I also agree devs will rather run the CPU at 3.5GHz, especially since XSX is 3.66GHz, than at, say, 3.2GHz while constantly boosting the GPU. This is because it will be easier to turn an effect or two down for the GPU than to tinker with how to find 10-15% of CPU performance for game logic.

That, and CPU clocks will bring more bang for the buck (perf per watt) than OCing the GPU past its sweet spot, plus higher GPU perf will require more BW, which is a bit of a downer on next-gen consoles.
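Rough arithmetic on that tradeoff, assuming performance scales linearly with clock (optimistic for the GPU side given the bandwidth situation) and a hypothetical 2.1GHz sustained GPU clock:

```python
# Rough arithmetic on the CPU-vs-GPU clock tradeoff described above.
# Assumes perf scales linearly with clock (optimistic for a bandwidth-limited GPU).
cpu_locked, cpu_reduced = 3.5, 3.2     # GHz
gpu_boosted, gpu_reduced = 2.23, 2.10  # GHz, hypothetical sustained clocks

cpu_loss = 1 - cpu_reduced / cpu_locked  # ~8.6% CPU performance given up
gpu_gain = gpu_boosted / gpu_reduced - 1 # ~6.2% GPU clock gained at best
print(f"CPU loss {cpu_loss:.1%} vs GPU gain {gpu_gain:.1%}")
```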
 
Especially if we want 60FPS; we're finally getting more powerful CPUs, so having them downclocked isn't what we want. Then again, Doom Eternal does a stable 60fps on the premium machines, with graphics that match or better anything released so far, while on the base machines resolution can drop close to 720p.
And yes, higher clocks do help, even with Zen 2 and i9 CPUs. Some games, especially BFV in 64-player multiplayer, like higher CPU clocks.
 
One thing that I apparently got backwards is that the smaller bandwidth allocated to the slow memory pool would lower the time that the CPU was blocking the GPU's memory accesses. Based on the way I'm seeing it explained, I think @Betanumerical had the right of it and it will actually increase it relative to the PS5's uniform 256-bit bus. However, the converse would theoretically also be true. The GPU's 320-bit access to the fast pool will lower the degree that it is blocking the CPU's memory accesses. I've seen it stated that the CPU is, in general, more latency-sensitive than the GPU, so maybe that's a desired outcome?
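Here's a toy contention model of that effect; the penalty factor and the CPU traffic figure are made up, and real memory controllers arbitrate far more cleverly, so treat it only as a first-order illustration:

```python
# Toy model: CPU and GPU traffic compete for cycles on a shared GDDR6 bus.
# Real arbitration interleaves and prioritizes far more cleverly; this only
# illustrates the first-order "CPU traffic subtracts from GPU bandwidth" effect.
def gpu_effective_bw(total_bw_gbs: float, cpu_traffic_gbs: float,
                     contention_penalty: float = 1.2) -> float:
    # Mixing CPU and GPU access patterns costs more than the raw bytes moved
    # (page misses, bus turnaround), modeled here as a flat penalty factor.
    return max(0.0, total_bw_gbs - cpu_traffic_gbs * contention_penalty)

cpu_traffic = 40.0  # GB/s of CPU reads/writes, made-up figure
print(gpu_effective_bw(560.0, cpu_traffic))  # XSX fast pool while the CPU hits the slow pool
print(gpu_effective_bw(448.0, cpu_traffic))  # PS5 uniform pool
```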
 
That's because you are oversimplifying things. Giving the data requests that benefit from it significantly more bandwidth is easily worth the minimal extra complication. It should be very straightforward which data needs to go where and I'd be surprised if most operations don't route data automatically towards one pool or the other by default.

Hope you're right. I've got GTX 970-related nightmares.
 
Hope you're right. I've got GTX 970-related nightmares.

The issue there was caused by PC developers not being able to design for a specific hardware configuration. If they had had an incentive to create a 970-specific version and granular control over what went where in memory on the video card (or if the driver had a profile for every game to do it automatically), this could have been easily mitigated.
 
Agree. We won't know for sure until we get DF face-offs.

Mark was very precise on the SSD and Tempest, but we have to keep in mind that as far as variable frequency goes, he described it in a few sentences without exact numbers or corner cases. "Expect" + "most of the time" + "at or close to" can really mean anything when we are talking about 0-10%.

Average frequency might fall to ~2100MHz, which would still easily fit the description and yet would bring the system into the mid 9TFs.

I also agree devs will rather run the CPU at 3.5GHz, especially since XSX is 3.66GHz, than at, say, 3.2GHz while constantly boosting the GPU. This is because it will be easier to turn an effect or two down for the GPU than to tinker with how to find 10-15% of CPU performance for game logic.

That, and CPU clocks will bring more bang for the buck (perf per watt) than OCing the GPU past its sweet spot, plus higher GPU perf will require more BW, which is a bit of a downer on next-gen consoles.
Most games are GPU-limited, especially at 4K or anywhere near it, and devs will choose to reduce the GPU further?
 
Yes, I believe you're all right about this. But still, 6GB of "slower" RAM is not nothing. I guess not everything will be able to fit in the faster 10GB. Then what? It's a real question; I'm not saying perf will tank, viva PS5 and all that.
 
One thing that I apparently got backwards is that the smaller bandwidth allocated to the slow memory pool would lower the time that the CPU was blocking the GPU's memory accesses. Based on the way I'm seeing it explained, I think @Betanumerical had the right of it and it will actually increase it relative to the PS5's uniform 256-bit bus. However, the converse would theoretically also be true. The GPU's 320-bit access to the fast pool will lower the degree that it is blocking the CPU's memory accesses. I've seen it stated that the CPU is, in general, more latency-sensitive than the GPU, so maybe that's a desired outcome?

I don't think GDDR6 latency ever played a big part. I remember latency being touted as a big advantage for XB1 due to ESRAM + DDR3. It never materialized in the real world.
 
I don't think GDDR6 latency ever played a big part. I remember latency being touted as a big advantage for XB1 due to ESRAM + DDR3. It never materialized in the real world.

Not sure you read what I wrote right. Or you didn't understand the context.
 
Agree. We won't know for sure until we get DF face-offs.

Mark was very precise on the SSD and Tempest, but we have to keep in mind that as far as variable frequency goes, he described it in a few sentences without exact numbers or corner cases. "Expect" + "most of the time" + "at or close to" can really mean anything when we are talking about 0-10%.

Average frequency might fall to ~2100MHz, which would still easily fit the description and yet would bring the system into the mid 9TFs.

I also agree devs will rather run the CPU at 3.5GHz, especially since XSX is 3.66GHz, than at, say, 3.2GHz while constantly boosting the GPU. This is because it will be easier to turn an effect or two down for the GPU than to tinker with how to find 10-15% of CPU performance for game logic.

That, and CPU clocks will bring more bang for the buck (perf per watt) than OCing the GPU past its sweet spot, plus higher GPU perf will require more BW, which is a bit of a downer on next-gen consoles.


We should not trust Cerny about the GPU clocks any more than one should trust an MS exec when they tell you everything is great about a given Xbox architecture. He's an employee of Sony; it's literally his job to put a rosy face on things.

As for the CPU: these CPUs are so much more powerful than last gen that I can see developers not caring too much and following a lowest-common-denominator approach. If the PS5 CPU is down at a 3.0GHz clock for a given multiplatform title, then the Xbox version will be running that same code too, not taking advantage of its little bit of extra oomph.
 
Where did Sony say that?
Cerny stated it during his reveal. Here's a Eurogamer article with a quote:

"Rather than look at the actual temperature of the silicon die, we look at the activities that the GPU and CPU are performing and set the frequencies on that basis - which makes everything deterministic and repeatable," Cerny explains in his presentation. "While we're at it, we also use AMD's SmartShift technology and send any unused power from the CPU to the GPU so it can squeeze out a few more pixels."​

If power delivery is constant, clocks are known to be variable and not tied to thermals, and unused power from the CPU has to be shifted to the GPU to achieve the highest clocks, then the highest GPU and CPU clocks can't be achieved at the same time. Otherwise the clocks would already be at maximum and power wouldn't have to be shifted to the GPU to achieve those speeds.
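As I read that, the control loop looks something like the sketch below: frequency is picked from modeled activity against a fixed budget, never from a temperature reading, so identical workloads give identical clocks on every unit. The power model and all the numbers are my guesses, purely for illustration:

```python
# Sketch of a deterministic, activity-driven clock policy in the spirit of
# Cerny's description: no thermal sensor in the loop, so the same workload
# produces the same clocks on every unit. Power model and numbers are made up.
def estimated_power_w(activity: float, clock_ghz: float, max_clock_ghz: float,
                      max_power_w: float) -> float:
    # activity in [0, 1]: fraction of the chip's worst-case switching activity.
    return max_power_w * activity * (clock_ghz / max_clock_ghz) ** 3

def pick_gpu_clock(gpu_activity: float, gpu_budget_w: float,
                   max_clock_ghz: float = 2.23, max_power_w: float = 160.0) -> float:
    clock = max_clock_ghz
    # Step the clock down until the modeled draw fits inside the budget.
    while clock > 0.5 and estimated_power_w(gpu_activity, clock, max_clock_ghz, max_power_w) > gpu_budget_w:
        clock -= 0.01
    return round(clock, 2)

print(pick_gpu_clock(gpu_activity=0.7, gpu_budget_w=130.0))  # lighter load: stays at 2.23
print(pick_gpu_clock(gpu_activity=1.0, gpu_budget_w=130.0))  # worst case: drops to ~2.08
```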
 
Less for the 10GB, more for the 6GB. At least devs won't have to specify what goes where. Here, same bandwidth for all the RAM; I find it a better solution imo...


I don't know why MS did that; I wish they hadn't.

I'm going to hope their engineers knew what they were doing and made a good tradeoff: cost reduction for little performance hit.
 