Digital Foundry Article Technical Discussion [2025]

Complex lighting ends up having inconsistent performance because its cost worsens with poor quad utilization. Small triangles are the primary culprit, but you are guaranteed to have some quads with imperfect utilization regardless. A deferred renderer is guaranteed to execute the lighting only once per pixel, regardless of quad coverage/utilization, because the lighting no longer has to sample material textures: the relevant texture data is written out to a G-buffer in a first pass, so mipmap selection (the part that actually needs helper lanes) happens there, and the lighting itself runs as a full-screen pass, either a full-screen quad or a compute shader (I don't fully know the details of that part). That makes lighting perform better regardless of small triangles etc.
A full-screen compute shader dispatch could (depending on the hardware configuration) be faster than drawing a full-screen quad made of 2 triangles, because no additional helper invocations are generated along the quad's interior diagonal edge ...
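
To make that concrete, here's a minimal host-side sketch (D3D12/C++; the 8x8 thread group size, the function names, and the assumption that the PSO, root signature and descriptors are already bound are all mine, not from any particular engine). The dispatch tiles the screen in thread groups directly, so no pixel quads straddle a triangle edge; the common draw-side alternative of one oversized triangle at least removes the interior diagonal of a 2-triangle quad.

Code:
#include <d3d12.h>

// Deferred lighting as a full-screen compute dispatch. Thread groups tile the
// screen, so helper invocations along triangle edges never enter the picture.
void DispatchDeferredLighting(ID3D12GraphicsCommandList* cmd,
                              unsigned width, unsigned height)
{
    const unsigned groupSize = 8;                                    // must match [numthreads(8, 8, 1)] in the shader
    const unsigned groupsX   = (width  + groupSize - 1) / groupSize; // round up so edge pixels are covered
    const unsigned groupsY   = (height + groupSize - 1) / groupSize;
    cmd->Dispatch(groupsX, groupsY, 1);
}

// Pixel-shader alternative: a single oversized triangle covering the screen,
// which avoids the helper invocations generated along a quad's interior diagonal.
void DrawFullScreenTriangle(ID3D12GraphicsCommandList* cmd)
{
    cmd->IASetPrimitiveTopology(D3D_PRIMITIVE_TOPOLOGY_TRIANGLELIST);
    cmd->DrawInstanced(3, 1, 0, 0);   // 3 vertices; positions generated from SV_VertexID in the shader
}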

There are other advantages to running the lighting pass in compute shaders, as is now commonly seen, rather than pixel shaders. If you want to utilize other GPU hardware resources such as groupshared memory, you can't do that from the pixel shading pipeline, since the standardized APIs limit that capability to the compute pipeline (though there are likely console extensions that bypass that specific limitation). Running the lighting pass as a compute pass also means you can use async compute to interleave it concurrently with other types of work, like graphics work such as shadow map generation, or copying resources around on a transfer queue ...
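
As a rough illustration of the async compute point (a sketch only: error handling is omitted and all the names here are invented, not from any specific engine), in D3D12 you would record the lighting dispatch on a dedicated compute queue and gate it on a fence signalled by the graphics queue once the G-buffer work is submitted, so shadow map rendering or copy-queue work can overlap it.

Code:
#include <d3d12.h>
#include <wrl/client.h>
using Microsoft::WRL::ComPtr;

// A dedicated compute queue, separate from the DIRECT (graphics) queue.
ComPtr<ID3D12CommandQueue> CreateAsyncComputeQueue(ID3D12Device* device)
{
    D3D12_COMMAND_QUEUE_DESC desc = {};
    desc.Type = D3D12_COMMAND_LIST_TYPE_COMPUTE;
    ComPtr<ID3D12CommandQueue> queue;
    device->CreateCommandQueue(&desc, IID_PPV_ARGS(&queue));
    return queue;
}

// The graphics queue is assumed to have signalled gbufferFence with gbufferValue
// after submitting the G-buffer pass; the compute queue waits on that signal
// (a GPU-side wait) and then runs the lighting dispatch concurrently with
// whatever the graphics queue does next (e.g. shadow maps).
void SubmitLightingAsync(ID3D12CommandQueue* computeQueue,
                         ID3D12Fence* gbufferFence, UINT64 gbufferValue,
                         ID3D12CommandList* lightingCmdList)
{
    computeQueue->Wait(gbufferFence, gbufferValue);
    computeQueue->ExecuteCommandLists(1, &lightingCmdList);
}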
 
I watched the UE livestream where they talked about shader compilation stutter and the tools they have to mitigate it. I felt like it was a fairly honest stream, though I don't remember if they addressed any bugs with the tooling that were causing pain points for devs, as @Dictator brought up in the DF discussion. The point in the stream about distributing pre-compiled shaders like Valve does with Steam Deck was pretty interesting.

Is there anything that really prevents devs from creating "ubershaders" with UE5, or is it the way the material/shader editor works that prevents it? I'm assuming artists may author shaders in external tools as well?
 
Is there anything that really prevents devs from creating "ubershaders" with UE5, or is it the way the material/shader editor works that prevents it? I'm assuming artists may author shaders in external tools as well?
Here's a relevant twitter thread with responses from Wihlidal:



Basically your options boil down to ALWAYS running a worst-case (uber) shader with likely very high register pressure (which results in poor HW occupancy), or wishing away the "optimization passes" (inlining, dead code elimination, constant folding/propagation, etc.) of IHV driver compilers, in which case the generated slop will exhibit slow execution times ...

In both cases you're pessimizing performance ...
 
I watched the UE livestream where they talked about shader compilation stutter and the tools they have to mitigate it. I felt like it was a fairly honest stream, though I don't remember if they addressed any bugs with the tooling that were causing pain points for devs, as @Dictator brought up in the DF discussion. The point in the stream about distributing pre-compiled shaders like Valve does with Steam Deck was pretty interesting.

Is there anything that really prevents devs from creating "ubershaders" with UE5, or is it the way the material/shader editor works that prevents it? I'm assuming artists may author shaders in external tools as well?
Yes, you can make ubershaders with the material editor. But there can be performance implications with ubershaders that an artist does not know about. I think things like keeping track of the number of registers used and so on are better left to the engineers. And the shader graph might not be expressive enough to get optimal shader code for ubershaders. Most artists tend to use static switch nodes instead, which gets you an explosion in the number of shaders quickly. The best workflow is for some technical artists or engineers to make a set of carefully curated materials and for artists to only use those.
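
To put rough numbers on the static switch explosion (the multipliers below are made-up illustration values, not UE's actual variant counts):

Code:
#include <cstdio>

int main()
{
    const unsigned staticSwitches = 10;                    // boolean switch nodes in one material graph
    const unsigned switchVariants = 1u << staticSwitches;  // every on/off combination: 2^10 = 1024
    const unsigned engineVariants = 16;                    // hypothetical passes x vertex factories multiplier
    std::printf("%u permutations for a single material\n",
                switchVariants * engineVariants);          // 16384 shaders to compile and capture in PSO caches
    return 0;
}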

But I also have had artists just duplicate materials and change them a little for one-off things, purely out of convenience. For really lame things, like when a texture map is too bright/dark, instead of editing the map in Photoshop they duplicate the shader and add a bunch of math nodes. These are the things that get you uncontrollable shader stutter. Think of it this way: now you have one object with a custom shader sitting in a big level, and that shader is only used for that object, so you need to be lucky that the object gets loaded during QA, or else you will not collect PSOs for it.

The material system in UE is so user friendly that it is both a blessing and a curse now. It gets you really good looking scenes because there is far more material variety. But sometimes you also have to ask if that is needed. Look for instance at KCD2. CryEngine does not have a material graph, so only engineers can make new materials. This limits the number of materials that exist by a lot, and it shows. To me KCD2 looks really last gen because I pick up on this, but the general public does not care and praises it for the good graphics.
 
or wishing away the "optimization passes" (inlining, dead code elimination, constant folding/propagation, etc.) of IHV driver compilers, in which case the generated slop will exhibit slow execution times ...
Can the API just give developers the choice between a proper compilation with all the optimization passes and a fast compilation without them? That way as many shaders as possible can be properly compiled ahead of time, but whenever a shader needs to be compiled in real-time for whatever reason the fast option can be chosen to minimize stutter and CPU cost (and the PSO added to a queue for proper compilation when there's enough CPU headroom to allow it).
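
For what it's worth, Vulkan already exposes something in this spirit as a per-pipeline hint, VK_PIPELINE_CREATE_DISABLE_OPTIMIZATION_BIT, although it is only a hint and drivers are free to ignore it. A minimal sketch (error handling omitted; the function and parameter names are mine, and the rest of the pipeline state is assumed to be filled in elsewhere):

Code:
#include <vulkan/vulkan.h>

VkPipeline CreatePipelineFastOrOptimized(VkDevice device,
                                         VkGraphicsPipelineCreateInfo info,  // assumed fully filled in by the caller
                                         bool neededThisFrame)
{
    if (neededThisFrame)
        info.flags |= VK_PIPELINE_CREATE_DISABLE_OPTIMIZATION_BIT;  // ask for a quicker, less optimized compile

    VkPipeline pipeline = VK_NULL_HANDLE;
    vkCreateGraphicsPipelines(device, VK_NULL_HANDLE /*no pipeline cache*/,
                              1, &info, nullptr, &pipeline);
    // The fully optimized variant could then be rebuilt on a background thread
    // and swapped in later, matching the queue idea above.
    return pipeline;
}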
 
Can the API just give developers the choice between a proper compilation with all the optimization passes and a fast compilation without them? That way as many shaders as possible can be properly compiled ahead of time, but whenever a shader needs to be compiled in real-time for whatever reason the fast option can be chosen to minimize stutter and CPU cost (and the PSO added to a queue for proper compilation when there's enough CPU headroom to allow it).
Well no IHV out there wants to effectively develop/QA test two independent compiler stacks for their own drivers!

Why else are they converging on notoriously slow common compiler infrastructure like Clang/LLVM for all these past years? Why else is Microsoft interested in integrating SPIR-V with Direct3D? It's because the industry is interested in sharing more work with each other in the open, since it saves them resources, even if it makes the consumer-facing (gamer) experience in question worse!

Also, as a bit of a joke: what's the market potential for a specialized ASIC for faster optimized code generation, so we can charge more to those customers who want to get rid of perceived driver compilation spikes? (One standardized CPU to be able to do everything, and another special CPU/ASIC design for faster optimal code analysis, for anyone who dares.)
 
Well no IHV out there wants to effectively develop/QA test two independent compiler stacks for their own drivers!

Why else are they converging on notoriously slow common compiler infrastructure like Clang/LLVM for all these past years?
It was the industry-standard compilers that made me ask the question - they all have flags for picking an optimization level, with the lower levels resulting in faster compilation but less optimized output. I'm not sure how applicable the comparison is to IHV driver compilers, but I don't think it would require maintaining two entirely separate compilers. On a related note, can some of the optimization passes be moved from the DXIL/SPIR-V -> machine code step on the user's machine to the HLSL/GLSL -> DXIL/SPIR-V step on the developer's machine?
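
For reference, the developer-side HLSL -> DXIL step already has the usual optimization levels; here's a hedged sketch of an offline compile through the DXC API with -O3 requested, so as much optimization as possible happens before the driver ever sees the DXIL (the file name, entry point and target profile are placeholders, and error handling is omitted):

Code:
#include <dxcapi.h>
#include <wrl/client.h>
using Microsoft::WRL::ComPtr;

// Offline HLSL -> DXIL compile at the highest optimization level. The remaining
// DXIL -> machine code step still happens in the IHV driver on the user's machine.
ComPtr<IDxcBlob> CompileLightingShader()
{
    ComPtr<IDxcUtils> utils;
    ComPtr<IDxcCompiler3> compiler;
    DxcCreateInstance(CLSID_DxcUtils, IID_PPV_ARGS(&utils));
    DxcCreateInstance(CLSID_DxcCompiler, IID_PPV_ARGS(&compiler));

    ComPtr<IDxcBlobEncoding> source;
    utils->LoadFile(L"lighting.hlsl", nullptr, &source);   // placeholder file name

    DxcBuffer buffer = {};
    buffer.Ptr      = source->GetBufferPointer();
    buffer.Size     = source->GetBufferSize();
    buffer.Encoding = DXC_CP_UTF8;

    LPCWSTR args[] = { L"-T", L"cs_6_6",   // placeholder target profile
                       L"-E", L"CSMain",   // placeholder entry point
                       L"-O3" };           // full optimization, paid once on the developer's machine
    ComPtr<IDxcResult> result;
    compiler->Compile(&buffer, args, sizeof(args) / sizeof(args[0]),
                      nullptr, IID_PPV_ARGS(&result));

    ComPtr<IDxcBlob> dxil;
    result->GetOutput(DXC_OUT_OBJECT, IID_PPV_ARGS(&dxil), nullptr);
    return dxil;                           // ship this DXIL; the driver still runs its own passes on it
}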
 
@Pjotr id software uses the ubershader approach in the Doom series. My takeaway from what you've written is that they must have engineers or technical artists who create a library of materials that the other artists can use when creating assets for the game. It's an entirely different workflow to ensure performance, but it imposes some limitations on artists, since there's a finite set of materials to work with, or maybe some restrictions on the materials themselves to keep performance from collapsing.
 
It was the industry-standard compilers that made me ask the question - they all have flags for picking an optimization level, with the lower levels resulting in faster compilation but less optimized output. I'm not sure how applicable the comparison is to IHV driver compilers, but I don't think it would require maintaining two entirely separate compilers. On a related note, can some of the optimization passes be moved from the DXIL/SPIR-V -> machine code step on the user's machine to the HLSL/GLSL -> DXIL/SPIR-V step on the developer's machine?
How is a "hardware invariant" compiler optimization model supposed to work in your proposal?

Attempting to move these optimization techniques towards a higher level representation just means that the IHV compiler now has to potentially work even harder to undo these 'supposed' optimizations ...

Take, for instance, the bit manipulation instructions PDEP/PEXT, where AMD architectures prior to Zen 3 had microcoded implementations: how is a compiler supposed to know when to generate those instructions, or to generate alternative binaries for optimal execution times, without that specific knowledge in hand?
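
To illustrate why that knowledge matters, here's a hedged sketch (the "fast PDEP available" flag would have to come from CPUID/microarchitecture detection done elsewhere, which is exactly the information a hardware-invariant offline compile doesn't have): the same operation is either one BMI2 instruction or a scalar loop, and only the target-specific detail of whether PDEP is microcoded tells you which form is the fast one.

Code:
#include <cstdint>
#include <immintrin.h>

// Portable fallback: deposit the low bits of 'src' into the set-bit positions of
// 'mask' (the same operation the PDEP instruction performs).
uint64_t pdep_scalar(uint64_t src, uint64_t mask)
{
    uint64_t result = 0;
    for (uint64_t bit = 1; mask != 0; bit <<= 1) {
        if (src & bit)
            result |= mask & (0 - mask);  // lowest set bit still remaining in the mask
        mask &= mask - 1;                 // clear that bit and move on
    }
    return result;
}

uint64_t pdep_dispatch(uint64_t src, uint64_t mask, bool fastPdepAvailable)
{
#if defined(__BMI2__)
    if (fastPdepAvailable)                // e.g. Intel Haswell+ or AMD Zen 3+, detected at runtime elsewhere
        return _pdep_u64(src, mask);      // single instruction where it's fast, microcoded (slow) on older AMD parts
#endif
    return pdep_scalar(src, mask);
}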
 
The reason that Battlefield trailer gave the DF guys BF3 vibes is, I think, because some of the quick shots were from BF3 maps; I'm almost positive I saw Grand Bazaar (day), on the east side of the map with the pedestrian overpass. I got the feeling they might be going all in on that thing they did in the last BF (was it called BF Portal?), where you could select classic maps and apply whatever game rules you felt like.
 