> However, the video made it sound like the DP4a path will only run on Intel iGPUs and other GPUs will use a separate SM6.4 path? I am hopefully just misunderstanding something here.

Re-watched that section and that is exactly what the video says. I don't think I'm mistaken that something could be done for other vendors if those vendors would like to pursue it; I think they said something along those lines in an earlier Intel interview. It's just that DP4a currently works only on Intel GPUs.
Which makes zero sense. Why not run the DP4a path on modern GPUs from other vendors?
> Not console, but DF posted a new video.

Thorough and great video. Really ran the algorithm through its paces.
> My takeaway from this video is that DLSS is still the best solution out there and quite a bit superior to XeSS, which has a few quirks to resolve (the strange flickering patterns, chiefly on vegetation, were the most egregious examples in performance mode). Intel needed to knock it out of the park with this solution to make their GPUs appealing. They got much worse performance and aren't up to par with DLSS (though XeSS isn't bad at all). If they want a chance for Arc to compete, they need to price it aggressively. Otherwise, it's DOA.

I suspect that XeSS will always run best on Intel hardware, and I assume they will price it appropriately; they also have additional channels in the lower-end markets that often compete directly with AMD as opposed to Nvidia. Should be okay, and not necessarily DOA if the price is right.
For context, for those who didn't read the article:

> Arc's XMX units process using the int-8 format in a massively parallel way, making it quick. For non-Arc GPUs, XeSS works differently. They use a "standard" (less advanced) machine learning model, with Intel's integrated GPUs using a DP4a kernel and non-Intel GPUs using a kernel based on technologies enabled by DX12's Shader Model 6.4.
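As an aside, what DP4a actually does: it treats two 32-bit registers as four packed 8-bit integers each, multiplies them lane-wise, and adds the four products into a 32-bit accumulator in a single instruction, which is why it suits int8 inference so well. Shader Model 6.4 exposes the same pattern in HLSL through packed dot-product intrinsics such as dot4add_i8packed. A minimal scalar model in C++, purely illustrative and not Intel's kernel code:

```cpp
#include <cstdint>
#include <cstdio>

// Scalar model of a DP4a-style operation: interpret each 32-bit word as four
// packed signed 8-bit lanes, multiply lane-wise, and accumulate the four
// products into a 32-bit integer. Hardware with DP4a does this in one instruction.
int32_t dp4a_model(uint32_t a, uint32_t b, int32_t acc) {
    for (int lane = 0; lane < 4; ++lane) {
        int8_t ai = static_cast<int8_t>((a >> (8 * lane)) & 0xFF);
        int8_t bi = static_cast<int8_t>((b >> (8 * lane)) & 0xFF);
        acc += static_cast<int32_t>(ai) * static_cast<int32_t>(bi);
    }
    return acc;
}

int main() {
    // Four int8 weights and four int8 activations packed into one word each
    // (lane 0 is the least significant byte).
    uint32_t weights     = 0x01FF0203; // lanes: 3, 2, -1, 1
    uint32_t activations = 0x04030201; // lanes: 1, 2, 3, 4
    printf("%d\n", dp4a_model(weights, activations, 0)); // 3*1 + 2*2 + (-1)*3 + 1*4 = 8
    return 0;
}
```

Without DP4a, those per-lane unpacks and multiplies become separate operations, which is the performance gap being discussed for non-Intel hardware.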
They are leaving a lot of performance on the table here, even on RDNA2 GPUs, and frankly I think that's a dealbreaker.

With that behaviour, FSR 2.0 is still the much better option for AMD hardware. Running such a model in FP16 will be slow.

@Dictator Please, can you speak with someone at Intel about this matter? It is of high importance that they use DP4a on cross-vendor GPUs.
Couldn't this just be a driver thing?
I think it's because it's more effort. We may see it realized in the console space, where the number of configurations is lower, or not at all. Got to give some sort of advantage to their own hardware. It's entirely possible that if it's really critical that consoles get ML upscaling, they will be left to do it on their own; the model is, quite frankly, worth too much and too expensive to just give away.
> Couldn't this just be a driver thing?

That's a reasonable assumption, which is why it's possible to see life in a console, but not likely in the PC space.
Intel can write directly to their own hardware as they control the driver, but for AMD and Nvidia wouldn't they have to (or at least want to) use whatever machine learning HLSL support MS makes available? Edit: meaning SM 6.4.

Another edit: so if you've got DP4a it gets used, and if not, you fall back to doing your 8-bit dot-product-accumulate work in a slower fashion? You just compile the HLSL differently?
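That is broadly how a cross-vendor path could be selected in practice. As a purely hypothetical sketch (not Intel's actual code; the blob names and the idea that XeSS works this way are assumptions), an application can ask D3D12 for the highest supported shader model and pick between two precompiled variants of the same HLSL kernel, one built for SM 6.4 with the packed dot-product intrinsics and one with a plain integer fallback:

```cpp
#include <windows.h>
#include <d3d12.h>

// Hypothetical precompiled DXIL blobs for the same upscaling kernel: one
// compiled against Shader Model 6.4 (using dot4add_i8packed) and one against
// an older target that unpacks the int8 values manually. Names are made up.
extern const unsigned char g_kernel_sm64[];
extern const SIZE_T        g_kernel_sm64_size;
extern const unsigned char g_kernel_fallback[];
extern const SIZE_T        g_kernel_fallback_size;

// Prefer the SM 6.4 build where the runtime reports support for it, and fall
// back to the generic variant otherwise.
D3D12_SHADER_BYTECODE PickKernel(ID3D12Device* device) {
    D3D12_FEATURE_DATA_SHADER_MODEL sm = { D3D_SHADER_MODEL_6_4 };
    if (SUCCEEDED(device->CheckFeatureSupport(D3D12_FEATURE_SHADER_MODEL,
                                              &sm, sizeof(sm))) &&
        sm.HighestShaderModel >= D3D_SHADER_MODEL_6_4) {
        return { g_kernel_sm64, g_kernel_sm64_size };
    }
    return { g_kernel_fallback, g_kernel_fallback_size };
}
```

Whether the SM 6.4 intrinsic is then lowered to a native DP4a instruction is up to the driver's shader compiler; the complaint in this thread is that the dedicated DP4a kernel itself is reserved for Intel GPUs regardless.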
The DP4a path is only used on Intel iGPUs. For every other GPU it's using SM 6.4, regardless of whether the GPU supports DP4a.
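To restate that in code form: the reported selection is keyed off the vendor, not off whether the hardware can actually do DP4a. A hypothetical illustration of the dispatch described above (none of these names come from the XeSS SDK):

```cpp
#include <string>

// Hypothetical kernel variants mirroring the behaviour reported in the video.
enum class XessPath {
    XmxInt8,      // Arc GPUs: XMX matrix units running the int8 model
    Dp4aKernel,   // Intel integrated GPUs: dedicated DP4a kernel
    Sm64Fallback, // everything else: generic Shader Model 6.4 kernel
};

// Vendor-based selection: a non-Intel GPU lands on the SM 6.4 path even if it
// supports DP4a, which is what the thread is objecting to.
XessPath SelectPath(const std::string& vendor, bool hasXmx, bool /*hasDp4a*/) {
    if (vendor == "Intel") {
        return hasXmx ? XessPath::XmxInt8 : XessPath::Dp4aKernel;
    }
    return XessPath::Sm64Fallback; // DP4a capability is ignored here
}
```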