Digital Foundry Article Technical Discussion [2024]

Status
Not open for further replies.
I am very glad about how strongly they expressed their negative opinions about both FSR1 and 2 (the console settings). They both look terrible, destroy the image quality in many ways, and shouldn't be used at all by developers, period. And finally John is admitting that CBR rendering as used in many Pro games was a bett

CBR can look decent, and I've always appreciated that it seems to handle aspects like post process better than some other temporal methods, albeit that could be just due to the care taken when it's on a platform with that as the only option, plus considering the high starting native resolution as I explain below.

As noted, it can vary significantly in quality, but more importantly it doesn't scale performance nearly as well as other temporal solutions, even FSR2. Now as others have said, that's exactly the problem - they're asking too much of these techniques on consoles by trying to scale from far too low a resolution, but that alone rules out CBR as a replacement for these titles. While there are dynamic implementations of CBR, most titles on the PS4 Pro at least started from 1920x2160 - that's more pixels than FSR2/DLSS Quality mode at 4K! Add the CBR overhead, and you're likely in the performance ballpark of native 1800p - which is probably why Death Stranding: DC nixed checkerboarding on the PS5 and just said 'fuck it, straight 1800p for performance mode'.

It's a decent solution for when 4K is asking too much, but you really can't compare how older games looked with it to modern UE5 titles using FSR2 to reach 4K. It would likely look just as bad, if not worse, if it were tasked with scaling from the resolutions these games are starting from. 1920x2160 probably has a higher pixel count than what some UE5 titles are putting out after FSR2, and before their spatial scaling component! :)
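To put rough numbers on that comparison, here is a quick Python sketch. It assumes "Quality" mode means a 1.5x per-axis scale (so a 2560x1440 input for a 4K output) and takes 1800p as 3200x1800; the helper names and dictionary labels are only illustrative.

```python
# Pixel counts for the render resolutions discussed above.
# Assumption: "Quality" mode uses a 1.5x per-axis scale factor at 4K output.
OUTPUT_4K = (3840, 2160)


def pixel_count(width, height):
    """Total number of pixels in a width x height buffer."""
    return width * height


def quality_mode_input(output, per_axis_scale=1.5):
    """Internal render resolution for a given output and per-axis scale."""
    w, h = output
    return (round(w / per_axis_scale), round(h / per_axis_scale))


resolutions = {
    "CBR 1920x2160": (1920, 2160),
    "Quality mode at 4K": quality_mode_input(OUTPUT_4K),
    "Native 1800p": (3200, 1800),
    "Native 4K": OUTPUT_4K,
}
pixel_counts = {name: pixel_count(*res) for name, res in resolutions.items()}

if __name__ == "__main__":
    for name, count in pixel_counts.items():
        share = count / pixel_counts["Native 4K"]
        print(f"{name:>20}: {count:>9,} pixels ({share:.0%} of 4K)")
```

By this count the checkerboard buffer shades about 12.5% more pixels than a Quality mode input, before any reconstruction cost is added on top.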

Edit: Watched that segment and I'm totally in agreement with Alex, it's disappointing to see so little iteration on FSR2's quality, outside of adding frame gen. UE5's TSR shows there are ways to improve the quality even without machine learning.
 
If this was addressed to me, of course, I was referring to spatial upscaling in the context of temporal upscalers. Spatial upscalers are utilized in the core loop of TAAU and all other temporal upscalers. When TAAU/FSR/whatever else fails to accumulate samples for various reasons, you will see the low resolution spatially upscaled image with all the low res underlying aliasing (on camera cuts, in the disoccluded regions or on the periphery of the screen or in motion when MVs have not been dilated). Robust spatial upscaling is a huge part of the puzzle on the way to better temporal upscalers, if not the main one right now.

Wasn't addressed to you. Was referring to the limitations of spatial upscalers on their own that do not use temporal data. Things like FSR1, DLSS1.
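A minimal sketch of the fallback described in the quote above, in Python. It assumes a 1D row of pixels, nearest-neighbour as the spatial upscaler and a fixed blend weight; all names here (`spatial_upscale`, `temporal_resolve`, `history_valid`) are hypothetical and not taken from FSR, TAAU or any shipping upscaler.

```python
def spatial_upscale(low_res_row, factor):
    """Nearest-neighbour upscale: repeat each low-res pixel `factor` times."""
    return [p for p in low_res_row for _ in range(factor)]


def temporal_resolve(history_row, low_res_row, history_valid, factor=2, blend=0.1):
    """Blend accumulated history with the spatially upscaled current frame.

    Where history is valid, the output moves a small step towards the new
    sample. Where history was rejected (camera cut, disocclusion, screen
    edge), the output is the raw spatial upscale, aliasing and all.
    """
    upscaled = spatial_upscale(low_res_row, factor)
    out = []
    for hist, cur, valid in zip(history_row, upscaled, history_valid):
        if valid:
            out.append(hist + blend * (cur - hist))
        else:
            out.append(cur)  # fallback: spatial upscaler quality is all you get
    return out


if __name__ == "__main__":
    history = [0.2, 0.2, 0.8, 0.8]
    valid = [True, True, False, True]  # third pixel was just disoccluded
    print(temporal_resolve(history, [0.0, 1.0], valid))
```

The rejected pixel takes the upscaled value directly, which is why the quality of the spatial step shows up exactly in the places where accumulation fails.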
 
Fixed the link, starting at the 0:19:40 mark. I agree with them. For MS to stay in the hardware business, a somewhat more open platform and strategy might help. They have 100% of the OS after all. Maybe an MSX-like thing, learn from the Japanese.

Where I don't agree with them is when they say that AI doesn't need to enhance NPC dialogue and so on. Sony filed a patent for PS6 to add AI to the NPCs so they respond to you more naturally and in many more varied ways.
 
A good trip down memory lane.


Not great that you need to enable this to fix the stuttering as most people won't be aware of this option - it was only very recently added, but it's pretty neat:

[Attached image: 1707861969763.png]

Special K was actually the first to do this, at least for OpenGL (don't know if it does it in Vulkan). Basically what this allows you to do is take advantage of the Direct3D swapchain, while the game is of course still rendering using whatever API it's using. This lets you use well-established methods for controlling frame buffering like you can in native Direct3D titles, such as Fast Sync, low latency mode etc, which otherwise wouldn't work with OpenGL/Vulkan. Very handy for older OpenGL games like Wolfenstein: New Order/Old Blood, which have annoying stuttering with the game's own vsync. Before this the only solution was to force OpenGL triple buffering, but that adds noticeable latency. With this you can use Fast Sync and a Rivatuner scanline sync of 1 to get proper frame pacing, but with better latency.

Would love to see this expanded to DX12 titles. It's annoying when you encounter problematic vsync behavior with DX12 titles, as you're completely at the mercy of the developer's skill; there's no way to control it except to force vsync on/off. Don't know if that's technically possible though.
 
The reality is that temporal solutions are only going to increase. Sampling over time is the future. You're just not going to see real gains in the quality of rendering without temporal data. Spatial upscaling is very limited because it will always be some kind of interpolation, whereas temporal has access to real, good samples that you've already generated. Downscaling is just a dead end because it requires generating more samples per frame, which is just brute force. I do think the real issue is pushing temporal upscalers past their capabilities. Upscaling from 720p to 1440p, and then applying an additional spatial upscale to 4K, is never going to look sharp, at least with current solutions (but probably never).
I personally think around 1080p internally is like the limit for FSR2, and devs should be optimizing games for 1080p to begin with. It's kind of crazy how many games on base PS4 were 1080p back then, even when cross gen came around, which is usually when we see cutbacks to last gen
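For scale, a small Python sketch of the area factors involved, assuming the usual 16:9 resolution for each label (720p = 1280x720 and so on); `area_scale` is just a helper name for this example.

```python
# Area (pixel count) scale factors between common 16:9 resolutions.
# Assumption: each "p" label maps to the standard 16:9 resolution below.
RES = {
    "720p": (1280, 720),
    "1080p": (1920, 1080),
    "1440p": (2560, 1440),
    "4K": (3840, 2160),
}


def area_scale(src, dst):
    """How many output pixels each input pixel has to cover."""
    (sw, sh), (dw, dh) = RES[src], RES[dst]
    return (dw * dh) / (sw * sh)


# The two-stage chain from the quote: temporal 720p -> 1440p, then spatial -> 4K.
temporal_factor = area_scale("720p", "1440p")
spatial_factor = area_scale("1440p", "4K")
combined_factor = temporal_factor * spatial_factor

# A single temporal pass from a 1080p or 1440p input straight to 4K.
from_1080p = area_scale("1080p", "4K")
from_1440p = area_scale("1440p", "4K")

if __name__ == "__main__":
    print(f"720p -> 1440p -> 4K: {temporal_factor}x then {spatial_factor}x "
          f"= {combined_factor}x overall")
    print(f"1080p -> 4K: {from_1080p}x, 1440p -> 4K: {from_1440p}x")
```

So the 720p chain asks every rendered pixel to stand in for nine output pixels, against four for a 1080p input at 4K.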
 
Yep. I still remember all the customised-to-death types of AA, such as HRAA, and other cool methods used a decade ago. Yet the development of current upscalers is annoyingly slow, as if the past decade of research in the AA field was suddenly forgotten. How come we are still putting up with low-res aliased edges in motion when morphological AA methods were available a decade ago and cost nothing today? Just prefilter the damn input before upscaling and accumulating samples, as I suggested years ago, as was done in SMAA 2X a decade ago, and as STP has finally implemented now (ctrl+f GEAA. God, thank you!), or use MLAA itself for the spatial upscaling (a simple search-and-replace problem) instead of relying on Lanczos or bicubic filtering for the spatial upsampling inside the TAA loop (are we in the stone age?). There were cheap coverage samples a decade ago, which cost nothing and required just a couple of bits per sample - use them to achieve perfect spatial edge upscaling from the higher-frequency coverage samples. Or at least use barycentrics to calculate the distance to the edge (GBAA from Humus) to properly reconstruct it at higher resolution. Without this essential stuff we will never get upscalers with good enough quality on geometry edges in motion, or good enough quality at upscaling factors higher than 4x.
I think UE5's TSR is sort of walking in this direction as well. From the shader code, they seem to apply a form of morphological AA based on luma whenever the history sample is rejected (could be FXAA/SMAA, or their in-house edge search function).
Although I would argue this spatial data feels limited, and even using it only in disoccluded areas could be visually disruptive (but that's the least we can do lol). I think we can lean more towards ML-accelerated resolving, which has been proven by DLSS2, XeSS and Temporal Metal Upscaler.

I think vendors should embrace more ML-related hardware standards to help make a generic ML upscaler possible across platforms (like what XeSS did. We don't need a single API that does the whole thing, but rather hardware-accelerated instructions to speed up ML operations). However, given the consoles' fixed hardware specs, this feels like something we could only expect in the next generation at the earliest.
 
To give FSR2 credit, the devs have iterated on the algorithms quite a lot over time. There are quite a lot of additions now compared to what I saw in summer 2022, mainly focusing on transparency and motion recognition. These hand-tuned solutions are understandably limited though, and performance takes a hit the more "rules" you add to an algorithm.
However, as a dev who has actually worked on integrating FSR, all I can say is that due to all those non-technical factors, there's no guarantee that the latest iteration of FSR2 (or even DLSS2) is going to be used. Even when it is, not all features are ticked on. As an engineer I always wish I had more time to polish everything, but when the deadline is around the corner: "****, just ship it, it works at least"
 
I disagree here, 1080p input is more than adequate for DLSS but not for FSR2.

In my opinion, and from my testing, it needs to be closer to 1440p to get a half-decent image from FSR2.
 
The new DLSS from Microsoft is called Automatic Super Resolution, but apparently it needs an NPU? (No Ph****** Idea). Will it work with my Ryzen 3700X, and does that even have an NPU? Dunno what it is. Man I was hyped.

[Attached image: M1dTfIn.png]


There is also this tool, which seems suited to OLED screens with HDR and so on.


A tool like this would save me from having to use Lossless Scaling in certain games that don't support XeSS etc. I like to play in silence with RT on, so I keep the GPU from going above 120W of power consumption, lock games like Elden Ring at 30fps and use Frame Generation. Though I have been using Black Frame Insertion from the TV as of late, since it works okay; while not as good as FG, it's better than nothing. This feature would save me the extra processing of Lossless Scaling and give better image quality natively from the OS. But this NPU thing sounds puzzling and I don't think I have that. :sneaky:
 
Your GPU can be used for AI acceleration (NPU = neural processing unit), but whether this setting can use GPUs for it, who knows
 
Hope it works with the GPU too. It would be very limiting to need an "NPU" from a certain device. GPU support would open up good quality AI upscaling for games that have nothing, or basically for any game you want to upscale via the OS. It sounds very good: improving the IQ and framerate of any game, new and old, at the OS level.
 
How much die area does the NPU on Meteor Lake take up? I'm looking at the Intel blurb and the TPU bit on it and can't see anything.

I'm thinking in terms of guessing how practical it would be to have one on next gen consoles. It looks like potential uses could go well beyond upscaling.
 
If you're using it for video game work, you're going to want the hardware integrated with the GPU. A separate NPU would negate any advantage of having it in the first place, travel time, cache sharing etc, would all be ruined by moving the data off chip and back each time you want to run AI models against it.

What's redeeming about the article is that it shows some investment by MS in generating the AI models to do the work. If those AI models (which are costly to develop) are satisfactory enough, they may make their way to consoles. It's not clear right now if MS is using a 3rd party service for this model, or if they made their own.
 
Yeah, I was thinking of it as being integrated into the SoC. BW to the GPU would be most important I imagine, but so long as you can output to RAM, the CPU could use datasets it had produced too.

The main thing I'm trying to get an idea of is how much area it would need - to try and give an idea about what you might have to give up to get it (e.g. CUs).

MS have said they've been researching AI upscaling for years now. Would be good if they had their own solution so they could roll it out wherever they wanted without worrying about licensing costs.
 
Ahh, it's pretty tiny from what I understand. They are effectively simplified chips, a form of tensor silicon with faster paths for running models. So you can't train AI on these NPUs, only run it. To get an idea of what an NPU inside a GPU would look like, nvidia's tensor cores would be a sufficient model to judge by: look at how much die space those take up in their chips (and subtract some, because those can be used for training as well).

Then again, nvidia's tensor cores are general purpose, so they can support a multitude of sizes. I'm not sure if NPUs are fixed in size. It would be interesting to see how consoles would tackle ML. Fixed dedicated silicon for a particular network type would reduce power and silicon requirements, and possibly run faster than generic hardware - but it wouldn't be flexible.
 
I believe AMD recently introduced their version of an NPU in some of their processors: XDNA. If this has enough oomph to allow decent upscaling performance then it may be a great addition to any future console. Don't know if this is the same NPU rumored for the rumored PS5 Pro.
 