> However, the video made it sound like the DP4a path will only run on Intel iGPUs and other GPUs will use a separate SM6.4 path? I am hopefully just misunderstanding something here.

Re-watched that section and that is exactly what the video says. I don't think I'm mistaken that something could be done for other vendors if those vendors would like to pursue it; I think they said something along those lines in an earlier Intel interview. It's just that DP4a currently works only on Intel GPUs.
Which makes zero sense. Why not run the DP4a path on modern GPUs from other vendors?
> Not console, but DF posted a new video.

Thorough and great video. Really ran the algorithm through its paces.
> My takeaway from this video is that DLSS is still the best solution out there and quite a bit superior to XeSS, which has a few quirks to resolve (the strange flickering patterns, chiefly on vegetation, were the most egregious examples in performance mode). Intel needed to knock it out of the park with this solution to make their GPUs appealing. They got much worse performance and aren't up to par with DLSS (though XeSS isn't bad at all). If they want a chance for Arc to compete, they need to price it aggressively. Otherwise, it's DOA.

I suspect that XeSS will always run best on Intel hardware, and I assume they will price it appropriately; they also have additional channels in the lower-end markets that often compete directly with AMD as opposed to Nvidia. Should be okay, and not necessarily DOA if the price is right.
For context, for those who didn't read the article:

> Arc's XMX units process using the int-8 format in a massively parallel way, making it quick. For non-Arc GPUs, XeSS works differently. They use a "standard" (less advanced) machine learning model, with Intel's integrated GPUs using a DP4a kernel and non-Intel GPUs using a kernel based on technologies enabled by DX12's Shader Model 6.4.
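As an aside, what DP4a actually does: it treats two 32-bit registers as four packed 8-bit integers each, multiplies them lane-wise, and adds the four products into a 32-bit accumulator in a single instruction, which is why it suits int8 inference so well. Shader Model 6.4 exposes the same pattern in HLSL through packed dot-product intrinsics such as dot4add_i8packed. A minimal scalar model in C++, purely illustrative and not Intel's kernel code:

```cpp
#include <cstdint>
#include <cstdio>

// Scalar model of a DP4a-style operation: interpret each 32-bit word as four
// packed signed 8-bit lanes, multiply lane-wise, and accumulate the four
// products into a 32-bit integer. Hardware with DP4a does this in one instruction.
int32_t dp4a_model(uint32_t a, uint32_t b, int32_t acc) {
    for (int lane = 0; lane < 4; ++lane) {
        int8_t ai = static_cast<int8_t>((a >> (8 * lane)) & 0xFF);
        int8_t bi = static_cast<int8_t>((b >> (8 * lane)) & 0xFF);
        acc += static_cast<int32_t>(ai) * static_cast<int32_t>(bi);
    }
    return acc;
}

int main() {
    // Four int8 weights and four int8 activations packed into one word each
    // (lane 0 is the least significant byte).
    uint32_t weights     = 0x01FF0203; // lanes: 3, 2, -1, 1
    uint32_t activations = 0x04030201; // lanes: 1, 2, 3, 4
    printf("%d\n", dp4a_model(weights, activations, 0)); // 3*1 + 2*2 + (-1)*3 + 1*4 = 8
    return 0;
}
```

Without DP4a, those per-lane unpacks and multiplies become separate operations, which is the performance gap being discussed for non-Intel hardware.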
They are leaving a lot of performance on the table here, even on RDNA2 GPUs, and frankly I think that's a dealbreaker.

With that behaviour, FSR 2.0 is still the much better option for AMD hardware. Running such a model in FP16 will be slow.

@Dictator Please, can you speak with someone at Intel about this matter? It is of high importance that they use DP4a on cross-vendor GPUs.
Couldn't this just be a driver thing?
I think it's because it's more effort. We may see it realized in the console space, where the number of configurations is lower, or not at all. Got to give some sort of advantage to their own hardware. It's entirely possible that if it's really critical that consoles get ML upscaling, they will be left to do it on their own; the model is, quite frankly, worth too much and too expensive to just give away.
> Couldn't this just be a driver thing?

That's a reasonable assumption, which is why it's possible to see life in a console, but not likely in the PC space.
Intel can write directly to their own hardware as they control the driver, but for AMD and Nvidia wouldn't they have to (or at least want to) use whatever machine learning HLSL support MS makes available? Edit: meaning SM 6.4.

Another edit: so if you've got DP4a it gets used, and if not, you fall back to doing your 8-bit dot-product-accumulate work in a slower fashion? You just compile the HLSL differently?
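That is broadly how a cross-vendor path could be selected in practice. As a purely hypothetical sketch (not Intel's actual code; the blob names and the idea that XeSS works this way are assumptions), an application can ask D3D12 for the highest supported shader model and pick between two precompiled variants of the same HLSL kernel, one built for SM 6.4 with the packed dot-product intrinsics and one with a plain integer fallback:

```cpp
#include <windows.h>
#include <d3d12.h>

// Hypothetical precompiled DXIL blobs for the same upscaling kernel: one
// compiled against Shader Model 6.4 (using dot4add_i8packed) and one against
// an older target that unpacks the int8 values manually. Names are made up.
extern const unsigned char g_kernel_sm64[];
extern const SIZE_T        g_kernel_sm64_size;
extern const unsigned char g_kernel_fallback[];
extern const SIZE_T        g_kernel_fallback_size;

// Prefer the SM 6.4 build where the runtime reports support for it, and fall
// back to the generic variant otherwise.
D3D12_SHADER_BYTECODE PickKernel(ID3D12Device* device) {
    D3D12_FEATURE_DATA_SHADER_MODEL sm = { D3D_SHADER_MODEL_6_4 };
    if (SUCCEEDED(device->CheckFeatureSupport(D3D12_FEATURE_SHADER_MODEL,
                                              &sm, sizeof(sm))) &&
        sm.HighestShaderModel >= D3D_SHADER_MODEL_6_4) {
        return { g_kernel_sm64, g_kernel_sm64_size };
    }
    return { g_kernel_fallback, g_kernel_fallback_size };
}
```

Whether the SM 6.4 intrinsic is then lowered to a native DP4a instruction is up to the driver's shader compiler; the complaint in this thread is that the dedicated DP4a kernel itself is reserved for Intel GPUs regardless.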
The DP4a path is only used on Intel iGPUs. For every other GPU it's using SM 6.4, regardless of whether the GPU supports DP4a.
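To restate that in code form: the reported selection is keyed off the vendor, not off whether the hardware can actually do DP4a. A hypothetical illustration of the dispatch described above (none of these names come from the XeSS SDK):

```cpp
#include <string>

// Hypothetical kernel variants mirroring the behaviour reported in the video.
enum class XessPath {
    XmxInt8,      // Arc GPUs: XMX matrix units running the int8 model
    Dp4aKernel,   // Intel integrated GPUs: dedicated DP4a kernel
    Sm64Fallback, // everything else: generic Shader Model 6.4 kernel
};

// Vendor-based selection: a non-Intel GPU lands on the SM 6.4 path even if it
// supports DP4a, which is what the thread is objecting to.
XessPath SelectPath(const std::string& vendor, bool hasXmx, bool /*hasDp4a*/) {
    if (vendor == "Intel") {
        return hasXmx ? XessPath::XmxInt8 : XessPath::Dp4aKernel;
    }
    return XessPath::Sm64Fallback; // DP4a capability is ignored here
}
```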