Digital Foundry Article Technical Discussion [2023]

Status
Not open for further replies.
The discussion of VRAM kind of tells me that we are still in cross-gen. Developers are still not making use of some of the current-gen technologies that are supposed to help mitigate the need for 16 GB of VRAM. Many games still have a minimum spec from the Xbox One/PS4 generation.

Happy to hear they'll likely be doing some Hogwarts coverage. There's a lot of tech to analyse there.
 
Games are very latency sensitive, so they'll never be as CPU multi-threaded as people expect them to be.

There will always be tasks that carry too high a latency penalty when spread over multiple cores, and thus there will always be a need for higher clocks and improved IPC from CPUs.
 
Games are very latency sensitive, so they'll never be as CPU multi-threaded as people expect them to be.

There will always be tasks that carry too high a latency penalty when spread over multiple cores, and thus there will always be a need for higher clocks and improved IPC from CPUs.

ND multi-threaded its engine for TLOU Remastered, and going from 30 fps to 60 fps was enough to outweigh the latency penalty of doing so.


They work on three frames at the same time.
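The "three frames in flight" idea is a stage pipeline: while frame N is on the GPU, frame N+1 is building its command lists and frame N+2 is running game logic. A minimal toy model of that schedule (names and stage split are illustrative, not any engine's actual design):

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model of a 3-deep frame pipeline. Stage 0 = game logic,
// stage 1 = command-list build, stage 2 = GPU render. At any given
// "tick", up to three frames are in flight, one per stage.
std::vector<std::string> tick_schedule(int frames, int stages = 3) {
    std::vector<std::string> ticks;
    for (int t = 0; t < frames + stages - 1; ++t) {
        std::string active;
        for (int s = 0; s < stages; ++s) {
            int frame = t - s;  // frame occupying stage s at tick t
            if (frame >= 0 && frame < frames) {
                if (!active.empty()) active += " ";
                active += "F" + std::to_string(frame) + ":S" + std::to_string(s);
            }
        }
        ticks.push_back(active);
    }
    return ticks;
}
```

From tick 2 onward the pipeline is full: e.g. `tick_schedule(4)[2]` is `"F2:S0 F1:S1 F0:S2"`, i.e. three different frames being worked on simultaneously. The latency cost the thread above mentions is visible too: frame 0's result only appears `stages` ticks after its logic started.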
 
The discussion of VRAM kind of tells me that we are still in cross-gen. Developers are still not making use of some of the current-gen technologies that are supposed to help mitigate the need for 16 GB of VRAM. Many games still have a minimum spec from the Xbox One/PS4 generation.
It's not just the settings. Developers can optimise VRAM consumption a lot without any visible difference. I've read this from Crytek, id and CIG developers. When I look at Crysis 3 on the PlayStation 3 I find it amazing that it worked with so little RAM.

Nevertheless, this does not change the fact that high-end effects like raytracing or voxel effects need a lot of RAM. I'm just surprised that 12 GB causes problems in Dead Space while Cyberpunk 2077 runs great.
 
CPUs are not getting sufficiently faster in the near future, so here's what they should do: whenever they detect a game that is single-threaded (or lightly multi-threaded), massively downclock or even disable all the inactive cores and supercharge the loaded cores to ultra frequencies. I want to see an Intel 14900K/Ryzen 8950X with an 8 GHz or even 9 GHz single-core frequency (or ideally two cores at that frequency); that should offset the very low IPC gains we get with each new CPU.
 
CPUs are not getting sufficiently faster in the near future, so here's what they should do: whenever they detect a game that is single-threaded (or lightly multi-threaded), massively downclock or even disable all the inactive cores and supercharge the loaded cores to ultra frequencies. I want to see an Intel 14900K/Ryzen 8950X with an 8 GHz or even 9 GHz single-core frequency (or ideally two cores at that frequency); that should offset the very low IPC gains we get with each new CPU.

Wouldn't you get too much heat in a small area in that situation?
Maybe even thermal stress oddities resulting from maximum heat adjacent to inactive cores.
Although I suppose if that was their goal, they could adapt the design to make it function in that way.
 
Wouldn't you get too much heat in a small area in that situation?
Maybe even thermal stress oddities resulting from maximum heat adjacent to inactive cores.
Although I suppose if that was their goal, they could adapt the design to make it function in that way.
Theoretically if you disabled every other core, assuming the cores were laid out physically in a grid, you could create a checkerboard pattern of heat that would give more heatsink mass per active core to aid in cooling. How much it would matter is another story, of course, but spreading out the space between heat producing cores should make them easier to cool.
 
Games are very latency sensitive, so they'll never be as CPU multi-threaded as people expect them to be.

There will always be tasks that carry too high a latency penalty when spread over multiple cores, and thus there will always be a need for higher clocks and improved IPC from CPUs.

Gonna have to respectfully disagree here.
Even the most latency sensitive parts of game engine design (e.g. audio) are going multi-threaded.
The key to good multi-threading is breaking your work up into several smaller jobs and having a suitable job system that can scale from 1 to N cores.
And it's not just about scaling actual CPU usage: you might want to scale CPU usage differently from RAM usage, or optimize for latency or cache coherency.

There will always be a task that is the limiting factor, but claiming that a modern multi-core 3 GHz+ out-of-order CPU is fundamentally incapable of handling certain operations is just wrong. It might require some deep optimization, or even being designed from the start to be more suitable for multi-threading, but modern CPUs are sooooo fast, it's silly to claim they are not fast enough for multi-threading across multiple cores.

Of course we will always welcome more speed and IPC :) then I can just be a lazier programmer!
 
CPUs are not getting sufficiently faster in the near future, so here's what they should do: whenever they detect a game that is single-threaded (or lightly multi-threaded), massively downclock or even disable all the inactive cores and supercharge the loaded cores to ultra frequencies. I want to see an Intel 14900K/Ryzen 8950X with an 8 GHz or even 9 GHz single-core frequency (or ideally two cores at that frequency); that should offset the very low IPC gains we get with each new CPU.

Looking at single-core overclocking records doesn't offer much reason to be optimistic here. Even with LN2 the clocks don't get all that much higher than what we've already got, and even then we're only talking about a one-time speed bump, not something that would continue to scale in subsequent iterations. Single-thread performance is still very much in demand for low latency trading, so I'd imagine whatever obvious levers there are to be pulled have already been pulled (with respect to single-core boosting). Even if we hadn't hit the Dennard scaling power wall we probably wouldn't have been too far off from a signal propagation limit anyways, so we were bound to hit a wall sooner or later.
 
Plus 9 GHz only gets you so far when your RAM is still running at the same speed as it is now.

A lot of modern computing is limited mostly by data transfer rather than actual compute resources,
so you're better off putting your effort into designing better algorithms and smaller data structures to start with.
I know that's hard work, much harder than simply saying "I want a 9 GHz CPU!", but that's life.
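The "smaller data structures" point is often illustrated with array-of-structs vs struct-of-arrays layout: if a hot loop only touches positions, an SoA layout streams just those bytes through the cache instead of dragging every unrelated field along. A small sketch (the `Particle` fields are made up for illustration):

```cpp
#include <cstddef>
#include <vector>
#include <cassert>

// Array-of-structs: a position-update pass still pulls velocity and
// health through the cache, because they share cache lines with x/y/z.
struct ParticleAoS {
    float x, y, z;
    float vx, vy, vz;
    int health;
};

// Struct-of-arrays: the position pass touches only the arrays it needs,
// so far fewer bytes cross the memory bus per particle updated.
struct ParticlesSoA {
    std::vector<float> x, y, z, vx, vy, vz;
    std::vector<int> health;

    void integrate(float dt) {
        for (std::size_t i = 0; i < x.size(); ++i) {
            x[i] += vx[i] * dt;
            y[i] += vy[i] * dt;
            z[i] += vz[i] * dt;
        }
    }
};
```

Same algorithm, same results; only the memory layout changed. That is exactly the kind of win that no amount of extra clock speed buys you when the bottleneck is the memory bus rather than the ALUs.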
 
Even the most latency sensitive parts of game engine design (e.g. audio) are going multi-threaded.
But not everything can, and that's my point.
There will always be a task that is the limiting factor, but claiming that a modern multi-core 3 GHz+ out-of-order CPU is fundamentally incapable of handling certain operations is just wrong.
At no point did I state that. I simply stated that some things just don't scale across multiple cores all that well, or as well as people expect them to.

That is not the same as saying a CPU can't process certain operations.
it's silly to claim they are not fast enough for multi-threading across multiple cores.
Claiming a CPU is not fast enough to make effective use of multi-threading and saying certain workloads just don't scale across multiple cores all that well are, again, two very different things.
Of course we will always welcome more speed and IPC :) then I can just be a lazier programmer!
I would just use extra speed and IPC to play 2007 Crysis... getting very close to finally being able to lock it to 60 fps across the whole game now.
 
But not everything can, and that's my point.

At no point did I state that. I simply stated that some things just don't scale across multiple cores all that well, or as well as people expect them to.

That is not the same as saying a CPU can't process certain operations.

Claiming a CPU is not fast enough to make effective use of multi-threading and saying certain workloads just don't scale across multiple cores all that well are, again, two very different things.

I would just use extra speed and IPC to play 2007 Crysis... getting very close to finally being able to lock it to 60 fps across the whole game now.

What workloads don't scale across multiple cores? Physics scales across multiple cores, for example. If development of the engine revolves around data, gameplay code can scale too.

The best example of a game scaling very well is Doom Eternal. It scales well up to 16 cores/32 threads, and the same goes for the Decima physics engine.
 
The best example of a game scaling very well is Doom Eternal. It scales well up to 16 cores/32 threads, and the same goes for the Decima physics engine.

Doom does very little CPU-wise, so it's not the best example, as all its CPU has to do is set up frames for the GPU to render.

Its levels are small, the AI is dumb and its game world is physically dead.

BVH updating is also something that doesn't scale infinitely across cores. I'll have to check the ray tracing thread again for the link to the paper I posted, but I believe 3 cores are optimal for it, as using more cores starts to hurt performance.
 
Doom does very little CPU-wise, so it's not the best example, as all its CPU has to do is set up frames for the GPU to render.

Its levels are small, the AI is dumb and its game world is physically dead.

Again, give me an example of things that don't scale well. This is not a 2023 debate but a 2013 or 2014 debate. ND and GG succeeded in massively multi-threading their engines. The Decima physics engine and PhysX too; the first one scales to 16 cores. It would be better to have only one CPU core at 40 GHz, but that's impossible.




Mostly since the PS4 generation of consoles, which have 8 very weak cores, architectures have evolved towards trying to make sure all the cores are working on something and doing something useful. A lot of game engines have moved to a task-based system for that purpose. There, you don't dedicate one thread to do one thing; instead you split your work into small sections and then have multiple threads work on those sections on their own, merging the results after the tasks finish.

Unlike the fork-join approach of having one dedicated thread ship off work to helpers, you do everything on the helpers for the most part. Your main timeline of operations is created as a graph of tasks to do, and then those are distributed across cores. A task can't start until all of its predecessor tasks are finished. If a task system is used well, it grants really good scalability, as everything automatically distributes to however many cores are available.

A great example of this is Doom Eternal, where you can see it smoothly scaling from PCs with 4 cores to PCs with 16 cores. Some great talks from GDC about it are Naughty Dog's "Parallelizing the Naughty Dog Engine Using Fibers" and the two Destiny engine talks.
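The "graph of tasks, where a task can't start until its predecessors finish" part of that description can be sketched with per-task dependency counters. This is a single-threaded toy for clarity (a real job system would have worker threads popping runnable tasks concurrently), and the API names are made up:

```cpp
#include <cstddef>
#include <functional>
#include <queue>
#include <vector>
#include <cassert>

// Minimal task-graph sketch: each task counts unfinished predecessors;
// when the count hits zero it becomes runnable. Executing a task
// decrements its successors' counts, possibly unblocking them.
struct TaskGraph {
    struct Task {
        std::function<void()> run;
        std::vector<int> successors;  // tasks waiting on this one
        int pending = 0;              // unfinished predecessors
    };
    std::vector<Task> tasks;

    int add(std::function<void()> fn) {
        tasks.push_back({std::move(fn), {}, 0});
        return int(tasks.size()) - 1;
    }
    void depends(int task, int on) {  // `task` runs after `on`
        tasks[on].successors.push_back(task);
        tasks[task].pending++;
    }
    void execute() {
        std::queue<int> ready;
        for (int i = 0; i < int(tasks.size()); ++i)
            if (tasks[i].pending == 0) ready.push(i);
        while (!ready.empty()) {
            int t = ready.front(); ready.pop();
            tasks[t].run();
            for (int s : tasks[t].successors)
                if (--tasks[s].pending == 0) ready.push(s);
        }
    }
};
```

Independent tasks (e.g. two simulation islands) have no edge between them and can run in any order or in parallel, while a merge task that depends on both only becomes runnable once both have finished. That automatic distribution to "however many cores are available" is exactly the scalability the quoted text describes.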
 
Again, give me an example of things that don't scale well. This is not a 2023 debate but a 2013 or 2014 debate. ND and GG succeeded in massively multi-threading their engines. The Decima physics engine and PhysX too; the first one scales to 16 cores. It would be better to have only one CPU core at 40 GHz, but that's impossible.

There's more to game engines than just physics, and I already gave you an example in the last reply. But you're on ignore for a reason.

And if you're going to use games as an example, don't use ones with relatively simple CPU loads.
 
and the same goes for the Decima physics engine.
Decima scales well in Horizon Zero Dawn, but doesn't really scale beyond 6 cores in Death Stranding, so I guess it's situational?

I will also add Frostbite to the list; it can scale well on 8 cores and beyond.

But all of these examples are still a drop in the bucket compared to the vast majority of engines and games that get CPU limited quickly. Heck, UE4 and friends are still badly CPU limited to this day. UE5 (the engine of next gen) doesn't scale well on multi-core PC CPUs to this day; it responds well to frequency, but not to more cores.

Plus 9 GHz only gets you so far when your RAM is still running at the same speed as it is now.
DDR5 operates at 7800 MT/s now, more than double DDR4, and we are heading to 10000 MT/s soon (almost triple DDR4), yet CPU frequency didn't double in the same period.
ND and GG succeeded in massively multi-threading their engines. The Decima physics engine and PhysX too; the first one scales to 16 cores.
Multi-core scaling on weak console cores is different from scaling on beefy, over-provisioned PC CPU cores; the same workload on PS4 would barely saturate a powerful 4-core PC CPU. We are still nowhere close to saturating a contemporary 8-core PC CPU in 2023.

they could adapt the design to make it function in that way.
Maybe put one lean, mean single core that can reach super clocks in the center of the package, even if they have to change its architecture to a different one and pipeline the hell out of it. This could carry the weight of the entire CPU for badly threaded workloads and boost gaming performance tremendously.
 
Decima scales well in Horizon Zero Dawn, but doesn't really scale beyond 6 cores in Death Stranding, so I guess it's situational?

I will also add Frostbite to the list; it can scale well on 8 cores and beyond.

But all of these examples are still a drop in the bucket compared to the vast majority of engines and games that get CPU limited quickly. Heck, UE4 and friends are still badly CPU limited to this day. UE5 (the engine of next gen) doesn't scale well on multi-core PC CPUs to this day; it responds well to frequency, but not to more cores.


DDR5 operates at 7800 MT/s now, more than double DDR4, and we are heading to 10000 MT/s soon (almost triple DDR4), yet CPU frequency didn't double in the same period.

Multi-core scaling on weak console cores is different from scaling on beefy, over-provisioned PC CPU cores; the same workload on PS4 would barely saturate a powerful 4-core PC CPU. We are still nowhere close to saturating a contemporary 8-core PC CPU in 2023.


Maybe put one lean, mean single core that can reach super clocks in the center of the package, even if they have to change its architecture to a different one and pipeline the hell out of it. This could carry the weight of the entire CPU for badly threaded workloads and boost gaming performance tremendously.

The Decima physics engine was tested on PC, and it scales to 16 cores/32 threads with higher clocks than the consoles. Again, HZD was made for the PS4. Wait for Horizon 3, made for the PS5, and we will see how it scales on at least 8-core PC CPUs. ;)

And if we have 16 cores/32 threads on the PS6/next Xbox, I am sure all these engines will scale well on 16-core/32-thread PC CPUs.
 
There's more to game engines than just physics, and I already gave you an example in the last reply. But you're on ignore for a reason.

And if you're going to use games as an example, don't use ones with relatively simple CPU loads.

No, this is not an example. Doom Eternal's game design has nothing to do with multi-core limitations, and the Decima engine scales well with CPU cores too. Give me a part of the engine that doesn't scale: AI, audio, physics, gameplay code?

@vjPiedPiper gave one example with audio because of latency, but game developers use multi-threading there too.



Another video on multi-threading game engines:
 