Digital Foundry Article Technical Discussion [2023]

It's a pretty well-investigated tradeoff in the server/datacenter world where power is a significant running cost. For the same performance, wide and slow is dramatically more power efficient than narrow and fast, but costs more. So it poses a really interesting tradeoff between fixed Si cost vs. variable runtime cost, and from what I've seen DC customers are more than happy to pay for Si in order to offset power costs.

In the client space it's thermals rather than power that's the bigger issue, so instead of a fixed-vs-variable cost tradeoff it's a fixed (Si) vs. fixed (thermal design) cost tradeoff.
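A toy back-of-the-envelope of that fixed-vs-variable tradeoff (every figure below is invented purely for illustration, not a real part or price):

```python
# Toy model of the tradeoff described above: the wider, slower part costs
# more up front (more silicon) but burns less power for the same throughput.
# All numbers are made up for illustration.

HOURS_PER_YEAR = 24 * 365
ELECTRICITY_USD_PER_KWH = 0.12        # assumed datacenter energy price

def lifetime_cost(si_cost_usd, power_watts, years=4):
    """Fixed silicon cost plus variable energy cost over the deployment."""
    energy_kwh = power_watts / 1000 * HOURS_PER_YEAR * years
    return si_cost_usd + energy_kwh * ELECTRICITY_USD_PER_KWH

# Two hypothetical parts delivering the same throughput:
narrow_fast = lifetime_cost(si_cost_usd=300, power_watts=280)
wide_slow = lifetime_cost(si_cost_usd=450, power_watts=180)

print(f"narrow/fast: ${narrow_fast:,.0f}   wide/slow: ${wide_slow:,.0f}")
# -> the pricier wide/slow part ends up cheaper once energy is counted
```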
Yup, thank you for clarifying this for readers who may not understand why we moved away from high-clocked single cores to multi-core.

I would probably say that it's a well-investigated physical limitation which led to the introduction of multi-core computing.
It really just comes down to dynamic power growing roughly with the cube of clock speed (power scales with voltage squared times frequency, and the required voltage rises with frequency), so the faster you go, the more insanely high your power demands eventually get. But if you keep clock speeds down and increase the number of cores instead, you gain computational power without a significant increase in power draw and heat - of course, keeping both cores fed is its own challenge, so there is likely to be some efficiency loss.
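To put rough numbers on that cubic relationship, here is a first-order sketch (P ≈ C · V² · f, with the simplifying assumption that voltage scales linearly with clock; constants are arbitrary and only the ratio matters):

```python
# First-order dynamic power model: P ~ C * V^2 * f, with the (assumed)
# approximation that the required voltage scales linearly with clock speed.
# Units are arbitrary; only the ratio between the two configs matters.

def dynamic_power(freq_ghz, capacitance=1.0, volts_per_ghz=0.3):
    voltage = volts_per_ghz * freq_ghz      # crude linear V/f assumption
    return capacitance * voltage**2 * freq_ghz

one_fast_core = dynamic_power(5.0)          # one core at 5 GHz
two_slow_cores = 2 * dynamic_power(2.5)     # same total GHz across two cores

print(f"1 x 5.0 GHz: {one_fast_core:.2f}   2 x 2.5 GHz: {two_slow_cores:.2f}")
# -> the single fast core burns ~4x the power for the same nominal throughput
```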

But yes, you are correct, in the datacenter world this is actually a thing. I was speaking purely about consumer GPUs, where the power limitation challenge is that we can only clock so fast before cooling becomes impossible.
But you are right: when you're building extremely wide chips, you get very few dies per wafer, and with a high defect rate each chip costs nearly as much as the wafer itself. I guess we are discussing a design philosophy about keeping costs down; I don't think there's any inherent 'architecture design philosophy' where someone would purposely choose wide and slow for benefits outside of power and cooling.
 
Cerny this and Cerny that.
Isn't what "actually happened" just that he had a budget, and after the I/O system expense there was only enough money left for a GPU of that size, which then had to be clocked higher, and only later did he think "well, it's not all bad, there are some positives"?
Meaning, his "more effective" GPU was just a happy accident plus wishful thinking?
 
Been playing Star Wars Jedi Survivor recently. Is this game CPU-bound? I'm on a 10900K and 3080, but I can't seem to get a stable 60fps with RT on... Even 4K DLSS Performance mode will drop to 40fps in the first city scene, and turning down settings & internal res doesn't seem to affect the upper fps limit I get (~55fps).
 
It’s just one of the worst ports of the last few years.
 
Actually it is not a port. It is just poorly optimized, and the engine was maxed out in many ways.
There is always a point in an engine (let's assume it is well optimized) where you reach a certain limit for a feature that ran well while the scene had only a few objects, but then someone decides to add more, and on top of that other features get used as well.

E.g. the melons in Starfield are a great example. It works well for a few, even for thousands, but at a certain point there are just too many calculations to be done, and sooner or later you get limited by the hardware, because now you have half a million melons that each get shadows, physics, lighting, ...
And in that case it is just the same melon over and over again. Now think of many different objects, each with their own high-res textures, ...
It adds up quite fast. So adding a new effect/feature on its own might not decrease performance, but applying it to many other things plus other effects/features can eat up any system quite fast (see the sketch below).
Engines might be written against near-to-the-metal APIs, but that doesn't make them resistant to bottlenecks when a game is designed around the engine and heavily uses everything the engine can do.
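A toy sketch of how those per-object costs multiply (the per-feature costs are made up; only the scaling matters):

```python
# Toy illustration of the "it adds up" point: per-frame cost grows with
# objects x features, so features that are cheap in isolation still blow
# a fixed frame budget once the scene fills up. All costs are invented.

PER_OBJECT_COST_US = {"shadows": 3.0, "physics": 5.0, "lighting": 2.0}  # microseconds each, invented
FRAME_BUDGET_MS = 16.7                                                  # ~60 fps

def frame_cost_ms(num_objects, features=PER_OBJECT_COST_US):
    return num_objects * sum(features.values()) / 1000

for n in (100, 1_000, 10_000, 500_000):
    print(f"{n:>7} objects: {frame_cost_ms(n):9.1f} ms  (budget {FRAME_BUDGET_MS} ms)")
```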

I often see the same thing with the business logic in the services I write (for on-prem systems). One task is not much and the system won't do much, but then the customer comes in, connects one feature with another, iterates through all the files in a library, and also adds some logically endless loops that can only be broken once nothing needs an update anymore. But often those workflows loop a few times before everything is fine. I (who indirectly gives them that power) can see what they do with their systems and can optimize their workflows, but that costs them money, so they often live with the problems they created until their system is so slow that they can't throw more hardware at it. They could also optimize their processes on their own, but they often don't, because that would also cost money if one of their own service workers invests time in the workflows (especially until they are experienced enough).
Btw, that is one reason why in my company we avoid giving too much control over cloud services. In the cloud this could get really expensive quite fast.
 

0:00:00 Introduction
0:01:01 News 01: Sony announces updated PS5
0:23:29 News 02: Super Mario Bros. Wonder previewed!
0:33:16 News 03: Redfall updated with 60fps mode on Xbox
0:47:22 News 04: Robocop: Rogue City impressions
1:03:12 News 05: John’s gaming pickups!
1:11:34 Supporter Q1: Are more expensive CPUs becoming a requirement for 4K gaming?
1:19:31 Supporter Q2: Are consoles just going to get bigger and bigger as time goes on?
1:24:03 Supporter Q3: How does John decide between PC and consoles for digital games?
1:28:40 Supporter Q4: Should ReBAR be forced on for all games on Nvidia GPUs? Why isn’t it enabled more often?
1:32:27 Supporter Q5: Will we ever see an FPS Boost style feature on PlayStation?
1:37:31 Supporter Q6: If I usually use DLSS to hit 4K resolutions, should I look at native 1080p or 1440p benchmarks to indicate how well games will run?
1:42:42 Supporter Q7: What are some Activision Blizzard franchises you’d like to see revived under Microsoft ownership?
 
Miles (no pun intended) ahead of the State of Play reveal.
This feels like the optimal formula for this current generation of consoles. The asset streaming was the push: knowing the compute is not quite there for GI bounce lighting etc., just focus on reflections and let the asset streaming system handle everything else.

They've got a wonderful next-gen-looking game with a high amount of world complexity and features, without the computational bottlenecks, at a visual compromise that most people won't notice.
 
It's not the SSD that is really impactful here. It's the custom I/O hardware that completely bypasses the CPU, with super low latency.
I don't think there's anything terribly special about that, either. The 'custom I/O' hardware basically just comes down to a dedicated decompression block along with their bespoke 12-channel SSD controller (which I expect will get cut down in the Pro and eventual Slim). But even Xbox has this (pared down in proportion to its slower SSD), as well as the ability to bypass the CPU. It's only on PC that DirectStorage can't (yet) bypass the CPU.

This is just plain old good work on the developer's part here, or at least a real emphasis on showing off what can be done when it's prioritized.
 
The technical definition of this execution is "bloody amazing". That's something like 2 seconds to prep and launch into a completely different part of the entirety of New York with massive view distance, plus an additional second to load the local details in high fidelity as Miles is falling. Whether it's the I/O, the SSD, or software within the game engine, this points to geometry import being a 'solved problem' and highlights one of the great successes this generation has ushered in.
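For a rough sense of scale, using Sony's publicly quoted PS5 throughput figures rather than anything measured from the game itself:

```python
# Back-of-the-envelope on what a ~2 second transition could stream, using
# Sony's publicly quoted PS5 figures (~5.5 GB/s raw, ~8-9 GB/s typical with
# compression). Purely illustrative; not a measurement of Spider-Man 2.

RAW_GBPS = 5.5
TYPICAL_COMPRESSED_GBPS = (8.0, 9.0)
TRANSITION_SECONDS = 2.0

raw_budget = RAW_GBPS * TRANSITION_SECONDS
lo, hi = (r * TRANSITION_SECONDS for r in TYPICAL_COMPRESSED_GBPS)

print(f"~{raw_budget:.0f} GB raw, roughly {lo:.0f}-{hi:.0f} GB of source assets")
```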
 
The PS5 CPU is pretty good but less powerful than a PC CPU. The I/O Complex and Tempest Engine help free up more power for gameplay code, physics, and other CPU work.

And according to a Fabian Giesen tweet, he was surprised by the number of teams asking for a PS5 or Xbox Series software licence for Kraken, which means some developers don't use the I/O Complex at all, or the hardware decompressor on Xbox, because it is easier for multiplatform development: the same code everywhere.

Funny detail too: Dolby Atmos on PS5 only works fully with titles using the Tempest Engine. It works for Returnal but not Hogwarts Legacy.

It is good to see at least first-party and big third-party devs using the I/O Complex, the Xbox HW decompressor, and the Tempest Engine.
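A hypothetical sketch of that point: if decompression runs in software on the CPU instead of on the dedicated hardware block, a sustained asset stream ties up cores that could otherwise run gameplay code (the decode rate below is an assumed placeholder, not a benchmark of Kraken):

```python
# Hypothetical sketch of why decompressing in software (e.g. a software
# Kraken licence instead of the console's hardware decoder) eats CPU budget.
# The per-core decode rate is an assumed placeholder, not a measured figure.

ASSUMED_SW_DECODE_GBPS_PER_CORE = 1.0   # placeholder software decode throughput
CORES_EXPOSED_TO_GAMES = 7              # roughly what current consoles give titles

def cores_busy_decompressing(stream_gbps):
    """Cores fully occupied just keeping up with a sustained asset stream."""
    return stream_gbps / ASSUMED_SW_DECODE_GBPS_PER_CORE

for rate in (0.5, 2.0, 5.5):            # GB/s of compressed data off the SSD
    cores = cores_busy_decompressing(rate)
    share = cores / CORES_EXPOSED_TO_GAMES
    print(f"{rate:>3} GB/s stream -> ~{cores:.1f} cores ({share:.0%} of the game's CPU)")
```

With a hardware decompression block, that CPU cost is essentially zero, which is the whole argument for using it.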
 


A very impressive showing on the PS5 and amazing use of the SSD. The eventual PC version of this should be very interesting.

Well, we've probably got about 2 years for DirectStorage optimizations to take hold regardless. :)
 