Digital Foundry Article Technical Discussion [2022]

Status
Not open for further replies.
And I'd guess they'll be using one or both of these SM 6.4 operations:

uint32 dot4add_u8packed(uint32 a, uint32 b, uint32 acc);

int32 dot4add_i8packed(uint32 a, uint32 b, int32 acc);



These are 4 element 8 bit dot product accumulate operations. How is this in essence not DP4a? And if you have hardware support to accelerate this why wouldn't you use it?
That is a question you want to ask Intel. I am really not sure what's going on here. All I know is that the DP4a path is running only on Intel iGPUs, as per Alex's video.
 
And I'd guess they'll be using one or both of these SM 6.4 operations:

uint32 dot4add_u8packed(uint32 a, uint32 b, uint32 acc);

int32 dot4add_i8packed(uint32 a, uint32 b, int32 acc);



These are (edit: packed) 4 element 8 bit dot product accumulate operations. How is this in essence not DP4a? And if you have hardware support to accelerate this why wouldn't you use it?

Edit: Surely DP4a instructions are in the hardware to accelerate exactly this kind of SM 6.4 (or equivalent Vulkan) operations?
Yea, I just checked the documentation for DML; this appears more correct than the DML path, unless you intend to run the matrix crunchers/tensor cores path, in which case DML is necessary if you want to ship multi-vendor. If you are just doing pure shader operations, it appears SM 6.4 is the way.
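For reference, the semantics of the unsigned variant can be sketched as a scalar emulation (illustrative Python, not Intel's actual kernel code) - it really is just a packed 4-element 8-bit dot product with accumulate, i.e. DP4a:

```python
def dot4add_u8packed(a, b, acc):
    """Scalar emulation of SM 6.4's dot4add_u8packed: treat a and b as
    four packed unsigned 8-bit lanes, multiply pairwise, accumulate."""
    total = acc
    for shift in (0, 8, 16, 24):
        total += ((a >> shift) & 0xFF) * ((b >> shift) & 0xFF)
    return total & 0xFFFFFFFF  # wrap like a uint32
```

On hardware with a native DP4a instruction this whole loop collapses into a single op, which is the entire point of the acceleration.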

looking back
with Intel's integrated GPUs using a DP4a kernel and non-Intel GPUs using a kernel using technologies enabled by DX12's Shader Model 6.4
Yea in order to support 6.4 you require dp4a, so basically it's just dp4a via intel driver kernel, or dp4a via DX12. I think that is what we are reading here. So any card not supporting SM6.4 is out.
 
Yea in order to support 6.4 you require dp4a, so basically it's just dp4a via intel driver kernel, or dp4a via DX12. I think that is what we are reading here. So any card not supporting SM6.4 is out.

Sadly it's not that easy. The 5700XT for example does not support DP4a, but it does support Shader Model 6.4. Is it out or not?

That is the question.
 
Sadly it's not that easy. The 5700XT for example does not support DP4a, but it does support Shader Model 6.4. Is it out or not?

That is the question.
It's per feature; the latest drivers will usually arrive with the latest supported SM model.
But the code should do a check if the feature is supported.
 
Not console but DF posted a new video.


Exhaustively detailed as always @Dictator, excellent work.

So...good, but a work in progress as expected. With not even midrange cards out yet there's definitely time for Intel to continue to iterate before most people are exposed to it, albeit that's assuming they can do that without requiring title updates - that's not entirely clear. That heavy moiré pattern is kind of a deal breaker by itself in SOTT though atm.

As I only came to DLSS particularly late with my 3060, my beef with some implementations is very rarely ghosting, as that was already significantly improved with later updates that occurred before I loaded up my first title. What I do find bothersome in some DLSS implementations (and obviously this applies to reconstruction tech as a whole, judging by these comparison videos), and which I rarely saw mentioned before I experienced it, is how it can struggle with lower-resolution effects. The heavy flickering you see in the water with XeSS in SOTT is very similar to what you get in Horizon Zero Dawn with DLSS when objects are obscured by smoke/steam, and the effect of the hanging lights in the SOTT bar scene that flicker in and out with DOF is very similar to what happens with street lights in Spiderman: RM when web zipping, which employs a slight DOF effect. Death Stranding and Rise of the Tomb Raider (albeit somewhat unfair, as it never had TAA) can exhibit severe specular aliasing when motion blur is enabled with DLSS at points too.

Some of that is to be expected of course - you're asking the tech to reconstruct from an even lower resolution than the base res you're starting with, but when these occur they really stand out. The thing is, it's not just the lower resolution - when these problem areas occur, it's more that the algorithm is adding artifacts that aren't present at all through an extremely low-res TAA image, not just magnifying existing ones. For example, both in Spiderman and HZD, these flickering artifacts never occur to nearly the same degree despite lowering the res to far below what the equivalent DLSS settings are. Presumably these lower resolution effects are rendering exactly like that - far lower than screen res, but DLSS has far more severe artifacting than TAA in these instances, despite presumably working from an even higher internal starting res than my forced scenarios.

So I'm hoping that can be an area of attention going forward - both from game devs and Intel/Nvidia/AMD - as that is easily the most distracting element of reconstruction when it occurs. It's obviously not a de-facto issue though, as if it was as simple as 'lower resolution buffers = artifacting' we would see it in every title with DLSS already, but that's definitely not the case. Ideally this is something that could be improved with future DLSS versions as I suspect we're well past seeing title updates for many of these games at this point.
 
It's per feature; the latest drivers will usually arrive with the latest supported SM model.
But the code should do a check if the feature is supported.
In the latest Intel interview video, they specifically said that it would work on SM 6.4 hardware. Now it could just be that the guys from Intel did not know there were SM 6.4 cards out there that do not support DP4a, but they sounded confident about it. They might need to clarify this one more time. Here is the list from PC gamer. https://www.pcgamer.com/intel-xess-compatibility-nvidia-amd/.
 
looking back
Yea in order to support 6.4 you require dp4a, so basically it's just dp4a via intel driver kernel, or dp4a via DX12. I think that is what we are reading here. So any card not supporting SM6.4 is out.

In the case of Navi 10 / 5700XT, which is claimed to support SM 6.4, could AMD be using a ton of individual int32 instructions (multiply, add, bitshift, bitmask, imNotBitter, that sort of stuff) to support those specific packed SM 6.4 ops?

You might be able to get the correct result from your HLSL, just with it being a load slower than using DP4a. A sort of software fallback.

I'm thinking along the lines of the AMD run time compiler for Navi 10 looking at what the hardware has, and building the GPU machine code to get you there. A proper DP4a instruction optimally, but a much slower fudge for say Navi 10 that appears to be lacking. This way it might be perfectly understandable for Intel to only specifically mention DP4a for their hardware, and leave everything on other hardware to how their SM 6.4 code is run.

I guess performance of XeSS on different supported GPUs could give us some insight into whether something like this is happening....
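A software fallback along those lines can be sketched for the signed variant, where the sign extension is the fiddly part (illustrative Python with a hypothetical helper name, not actual driver code):

```python
def dot4add_i8packed_fallback(a, b, acc):
    """Emulate SM 6.4's dot4add_i8packed with plain 32-bit integer ops
    (shift, mask, multiply, add) - the kind of 'fudge' a driver could
    emit for hardware lacking a native DP4a instruction."""
    def s8(lane):
        # Sign-extend an 8-bit lane to a full integer.
        return lane - 256 if lane & 0x80 else lane

    total = acc
    for shift in (0, 8, 16, 24):
        total += s8((a >> shift) & 0xFF) * s8((b >> shift) & 0xFF)

    # Wrap into the int32 range, as the real instruction would.
    total &= 0xFFFFFFFF
    return total - (1 << 32) if total & 0x80000000 else total
```

A dozen-plus scalar ops per dot product versus one native instruction is roughly the performance gap such a fallback would imply, which is why measured XeSS cost on different GPUs would be revealing.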
 
So XeSS would work on PS5 and XSX then by the look of it. If Intel want this tech to be widely adopted, get it running on consoles and you're halfway there.

Technically on most GPUs, or even PS4/One S; however, most likely not with the same performance as on AI-accelerated hardware like Intel's GPUs that support it. Whether it would be worth it over the consoles' TAA is the question; probably not, as XeSS perhaps has a performance penalty.
Also of importance, RDNA3+ is supposed to come with its own dedicated hardware blocks for AI/ML acceleration like Intel and Nvidia's. Mid-gen refreshes and for sure next generation consoles would sport those then.
 
Article @ https://www.eurogamer.net/halo-infi...creen-campaign-co-op-tested-and-its-excellent

Halo Infinite's cancelled split-screen campaign co-op tested - and it's excellent

Technical issues are minor - it's a brilliant way to play the game.

The news of the cancellation of Halo Infinite's campaign co-op split-screen option has been frustrating - not least because I was looking forward to playing it with my son, just as I have with all prior Halo titles supporting the feature. The fact that 343 Industries isn't supporting it is all the more confusing because right now at least, it's possible to glitch your way into split-screen campaign co-op for up to four players. It's true that the feature is not without its bugs, but in my experience, these are relatively minor and it's possible to play through the entire campaign in split-screen mode. Adding to the sense of disappointment is how close 343 has got to finalising this feature - and it plays brilliantly.

...

Player progress - including Achievement support - is unique to that player and even accessing the in-game map and upgrade systems can be done independently within each mini-screen. However, this can cause a corruption issue if the other player is in standard gameplay. Other issues I encountered included spawning under geometry, time of day drifting between the two players (one can be playing at night, the other in the daytime) and there's no character collision for the players, who can literally walk through each other. Some players have noted game save corruption (which would be entirely understandable), but that wasn't an issue in my play.

...

There have also been suggestions that the mode was cancelled because getting it working on Xbox One may have been too challenging for the under-powered console, especially the OG 'VCR' model from 2013. However, despite understandable graphical drawbacks, it's perfectly serviceable and a solid way to play. The visual compromises are legion, however: dynamic resolution seems to top out at 720p but can drop as low as 540p, making the game look very blurry. Draw distance is compromised to the point where low poly imposters for enemies are drawn in at a very close range, while the range of Halo Infinite's real-time shadows is also savagely pulled in. Performance is also wobbly - in part thanks to the game's inconsistent frame-pacing at 30fps but also through genuine frame-rate drops into the mid-20s in the open world.
 
I couldn't disagree more with their statement "the feature works - and works well." Sure, if you ignore all of these game-breaking and horrible potential user issues it works fine... Come on, no company in their right mind would ever ship anything so unstable in a flagship title. That's the honest reality.
 
It was cancelled because they didn't think they could pull it off, for whatever reason that may have been.

They're a technically good studio so I believe they're experienced enough to know when something can and can't be done to a standard worthy of release and in this case we need to trust their judgement.
 
So XeSS would work on PS5 and XSX then by the look of it. If Intel want this tech to be widely adopted, get it running on consoles and you're halfway there.
Could always work on consoles. Guess SM6.4 just means easier to port to PS5 now.

The difference is XS can use DP4a, PS5 can't (as far as we know).
So it comes down to whether it's worth using on PS5 when FSR2.x is good enough at a slightly higher resolution. Either way, it won't be a big loss for PS5 if it can't. It will help make DF comparisons even more interesting though.

The more important use in console is on XSS where the lower input resolution makes a much bigger difference between FSR2.x and ML based upscaling, so far anyway.
XeSS would look a lot better in comparison to FSR2.x.
 
Could always work on consoles. Guess SM6.4 just means easier to port to PS5 now.

The difference is XS can use DP4a, PS5 can't (as far as we know).
So it comes down to whether it's worth using on PS5 when FSR2.x is good enough at a slightly higher resolution. Either way, it won't be a big loss for PS5 if it can't. It will help make DF comparisons even more interesting though.

The more important use in console is on XSS where the lower input resolution makes a much bigger difference between FSR2.x and ML based upscaling, so far anyway.
XeSS would look a lot better in comparison to FSR2.x.
The issue with XSS is its performance.

I genuinely don't think it has the performance in its tiny GPU to process an AI upscale in the small frame times that are needed.


I'm sure it has like 1/8 of the INT performance of an RTX 2060 or something like that.
 
In the case of Navi 10 / 5700XT, which is claimed to support SM 6.4, could AMD be using a ton of individual int32 instructions (multiply, add, bitshift, bitmask, imNotBitter, that sort of stuff) to support those specific packed SM 6.4 ops?

You might be able to get the correct result from your HLSL, just with it being a load slower than using DP4a. A sort of software fallback.
Yes, I assume this is the fallback behaviour if the hardware support is missing.
 
The issue with XSS is its performance.

I genuinely don't think it has the performance in its tiny GPU to process an AI upscale in the small frame times that are needed.


I'm sure it has like 1/8 of the INT performance of an RTX 2060 or something like that.
Based on what Intel have said about DP4a working well with their integrated GPUs, I would expect it to work on XSS.

But I do agree that's an unknown at the moment in terms of the fixed performance cost.

1/8th? You comparing tensor cores here?
 