Digital Foundry Article Technical Discussion [2021]

Just a wild (and probably completely wrong) thought - could the lack of stutter on PS5 be down to cache scrubbers?

I seem to recall they stop the GPU from stalling when it's not being fed fast enough.
 
During gameplay the difference should increase a bit, if there isn't another bottleneck (e.g. memory bandwidth).
The corridor of doom is strange, though. If the CPU is somehow limiting here, it would make sense, because then the GPU would not have that much work and the PS5's CPU would "upclock" to its maximum. That could explain why they are both almost on par in that scene.
But I really don't know why this scene is so CPU limited. It doesn't really look like anything special is going on there. Maybe it is just one thread limiting things, or an engine inefficiency or something like that.


The Xbox One version should also be built with the same GDK version, as the GDK covers all Xbox One/Series consoles.



That might not be the case, because the shot is not taken from the exact same spot. In the PS5 image the character is at least one step nearer, or the camera angle is further away in the Xbox shot. It might just be that RT isn't done that "far" away ^^. It might be just a step behind the RT border.
Yeah... but it shouldn't deploy the Scarlett version. So there should be a way to compress the Scarlett one and not the XBO versions.
Then again though, IMO it could be a fairly rushed game. I'm reading reports of a lot of instability and crashing on Series X with this title.

Let me see what pops up when I install it on my OG XBO.

edit: you cannot
Ultimate Edition only works on Series X|S consoles. The older platforms aren't part of this binary.

So I would disagree; the compression should have occurred.
 
How is "this frame takes 3x longer" calculated? Just curious (which scene, for example)?

Yeah, sure. So if you miss 16 ms (60 fps frame time) you'd normally drop to 33 ms. And you might alternate between the two, or even drop to a steady 33 ms. Or, if you went lower, see some 50 ms frames appearing in with the 33 ms frames. There's normally a progression to these things. In Control, in addition to those kinds of drops, there are also one-off hitches where everything is smooth and then bam, you miss three frames, and then instantly you're back to not missing any. And it can be worse than that.

For example here (should be at the right time stamp) - you can see things basically stop for a few frames:

It's not just dropping frames like we typically see (which it certainly does do in places too!), it's the flow of the game being interrupted for several frames in a drop and then recovery that's uncharacteristic of the way GPU load usually changes. As they say, it's like something's stalling the game for a while.
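To make the arithmetic concrete, here's a minimal sketch, assuming a simple 60 Hz vsync model and made-up frame times, of how missing a 16.7 ms deadline snaps the displayed frame to 33.3 ms, while a single long stall swallows several refresh intervals at once (the "miss three frames" hitch) before returning to normal:

```python
import math

VSYNC_MS = 1000 / 60  # ~16.7 ms refresh interval at 60 Hz

def displayed_interval(render_ms: float) -> float:
    """Under vsync, a frame stays on screen for a whole number of refresh
    intervals: miss one deadline and you pay for two (33.3 ms), and so on."""
    return math.ceil(render_ms / VSYNC_MS) * VSYNC_MS

# Typical GPU-bound drop: slightly over budget, so every frame costs 33.3 ms.
print(round(displayed_interval(18.0), 1))   # 33.3 ms

# One-off hitch: a single ~55 ms stall misses three deadlines in a row
# ("you miss three frames"), then the very next frame is back to 16.7 ms.
print(round(displayed_interval(55.0), 1))   # 66.7 ms
print(round(displayed_interval(12.0), 1))   # 16.7 ms
```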
 
The PC version was the Ultimate Edition to begin with. :p

*fanboy moment*

He may be onto something though, as it seems the PC RDNA2 GPUs are simply running the original RT implementation in Control that was likely developed with Turing in mind. The consoles obviously have a more custom implementation which is perhaps better tailored to RDNA2's RT shortcomings.

What I can't get over is how crazy fast the 3090 is there. Extrapolating from the 2070S performance to faster Turing cards, it would certainly be well in excess of 2x faster than a 2080S.
 
What I found really jarring while playing the game in Graphics mode was that, since the ray-traced reflections are reconstructed from multiple previous frames, you can spot them lagging behind the reflected world and causing a jelly effect, which is especially noticeable when panning the camera.

I would like to see them use the spare computation headroom on consoles to fix this, for example by adjusting the number of previous frames used depending on the load.
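Purely as an illustration of that idea (not Remedy's actual code; the blend maths and the load heuristic are assumptions), a temporal accumulation pass could shorten its effective history when the GPU has headroom to spare, trading a little extra noise for less reflection lag:

```python
def accumulate_reflection(history, current, gpu_frame_ms, budget_ms=16.6,
                          min_history=2, max_history=8):
    """Blend this frame's noisy ray-traced reflection with the history buffer.

    A long effective history (small alpha) gives a cleaner but laggier result;
    a short history (large alpha) tracks the reflected world more tightly.
    Here the history length scales with how much of the frame budget the GPU
    is actually using: spare headroom means we can afford the extra noise
    that a shorter history implies.
    """
    load = min(gpu_frame_ms / budget_ms, 1.0)               # 0..1 GPU load
    n_frames = min_history + (max_history - min_history) * load
    alpha = 1.0 / n_frames                                  # exponential moving average weight
    return [(1.0 - alpha) * h + alpha * c for h, c in zip(history, current)]

# Toy usage: a few per-pixel values standing in for a reflection buffer.
history = [0.2, 0.2, 0.2]
current = [1.0, 0.0, 0.5]
print(accumulate_reflection(history, current, gpu_frame_ms=10.0))  # light load: short history, snappier
print(accumulate_reflection(history, current, gpu_frame_ms=16.0))  # heavy load: long history, laggier
```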
 
He may be onto something though, as it seems the PC RDNA2 GPUs are simply running the original RT implementation in Control that was likely developed with Turing in mind. The consoles obviously have a more custom implementation which is perhaps better tailored to RDNA2's RT shortcomings.

What I can't get over is how crazy fast the 3090 is there. Extrapolating from the 2070S performance to faster Turing cards, it would certainly be well in excess of 2x faster than a 2080S.

The RT cores got buffed up from Turing to Ampere (like hardware support for motion-blurred ray tracing), so you cannot compare them directly.
 
16% over 21 test samples!

In this case, I am not sure why photo mode would underclock the GPU. On PC, when you turn on photo mode, the game's CPU-related framerate skyrockets as the simulation threads stop entirely. Here I actually think we are looking at a thermal situation on PS5, by means of inference, where the CPU is doing very little and the GPU has all the power it wants for full clocks.

That was indeed most interesting!

An average of 16% is actually very close to the 18% paper FLOPS difference, and I noticed one scene that was ~25% faster on the XSX, which is pretty much bang on the memory bandwidth difference. So it does appear the XSX can use its "width"... if the workload suits it.
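For reference, both of those percentages fall straight out of the publicly quoted specs (XSX: 52 CUs at 1.825 GHz, 560 GB/s; PS5: 36 CUs at up to 2.23 GHz, 448 GB/s); a quick back-of-the-envelope check:

```python
# Paper-spec arithmetic behind the "18% FLOPS" and "25% bandwidth" figures.
# TFLOPS = CUs * 64 shader ALUs * 2 ops per clock (FMA) * clock in GHz / 1000
xsx_tflops = 52 * 64 * 2 * 1.825 / 1000   # ~12.15 TFLOPS
ps5_tflops = 36 * 64 * 2 * 2.23 / 1000    # ~10.28 TFLOPS at the maximum boost clock

print(f"Paper FLOPS advantage: {(xsx_tflops / ps5_tflops - 1) * 100:.0f}%")  # ~18%
print(f"Bandwidth advantage:   {(560 / 448 - 1) * 100:.0f}%")                # 25% (560 vs 448 GB/s)
```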

In the Hot Chips presentation for XSX, the hardware guys did say that RT in particular could be very bandwidth intensive, and indicated that was the reason for their wide bus. So it does look like MS knew what they were doing; they were just designing for bang for buck for the anticipated workloads. And both compute and RT (which itself leverages compute) do appear to be in the ascendancy.

I thought the same thing about the power budget too - with the CPU largely unstressed this is probably a good chance to make a larger amount of power available to the PS5 GPU. Perhaps this helped, perhaps not. It would be interesting to see system power draw at these points to see if we could infer anything.

Worth also noting that some scenes showed almost no difference between the systems, of course. And scenes with lots of alpha-heavy magic-asplode effects may have shifted things back towards the PS5, relatively speaking.

And the corridor of doom really doesn't like any of these systems. :no:
 
There's no VRS in Control, IIRC.
I think that's a bug. Typically VRS doesn't destroy detail that is marked for high quality; it would scale back detail really far away instead. Reflection rendering at a distance is just fine if you look further back behind it. It's possible they just forgot to include the poster as part of the BVH tree.
The floor by her foot on Xbox is also reflecting red onto the steps, which is a continuation of the reflection by her right arm, but we don't see that on PS5, for instance. And on the right side of the stairs near the bottom you can see a red tinge reflected onto the steps on Xbox, and that red tinge is missing on PS5 (or at least is extremely difficult to see).

So once again, I'm not sure if it's just the camera angle, or the PS5 shot being a bit further forward. But I don't think this is a by-product of low vs high settings; it might just be render distance, with PS5 simply being closer, so it rendered, and Xbox being further back. You can see the red box sticking out on the right side of the PS5 screenshot that is not present on Xbox. DF do their best to line up the positions, but it may not be perfect.

I think it's just a forgotten texture. Or it's entirely possible, as we'll soon find out, that there are actual differences between the two games in terms of textures etc. (see below)

That’s interesting.

As an obvious kinda-informed layman, I have a question about RT I just thought about.

Just like we have dynamic res, VRS, and a whole host of performance optimisations...

What is stopping engines from dynamically changing the number of rays (or whatever quality variable) on a per-frame basis, in order to stay within the frame rate budget?

Surely we wouldn’t be able to see the difference while playing. Like I can’t see the difference between various states of dynamic resolution.

Yes it’s Friday in lockdown in London, I may have had a couple of vodka coke zeros. Sue me.
 
That’s interesting.

As an obvious kinda-informed layman, I have a question about RT I just thought about.

Just like we have dynamic res, VRS, and a whole host of performance optimisations...

What is stopping engines from dynamically changing the number of rays (or whatever quality variable) on a per-frame basis, in order to stay within the frame rate budget?

Surely we wouldn’t be able to see the difference while playing. Like I can’t see the difference between various states of dynamic resolution.

Yes it’s Friday in lockdown in London, I may have had a couple of vodka coke zeros. Sue me.
Hmm, an excellent question. If I were to take a stab at it: ray tracing is often budgeted as rays per pixel, so the higher the resolution, the fewer rays you have per pixel.
Reducing the number of rays too much may not leave enough for your denoiser to work with, and certain effects you're relying on for lighting may not be sufficient to finish. I.e. standard compute/raster-based lighting surpasses the quality of RT if RT drops too low in quality, so you may as well choose the traditional path.

On the flip side, to improve the number of rays per pixel and still keep performance high, one could reduce resolution to increase the number of rays per pixel, rely on a really good denoiser to make up for the lack of rays shot, and then upscale the image to a very high resolution while keeping the performance benefit.
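As a purely illustrative sketch of that trade-off (the ray budget, the denoiser floor, and the controller are all invented numbers, not anything a shipping engine is confirmed to do), this is the kind of per-frame decision the question above is describing: spend a fixed ray budget at some RT resolution, and if rays per pixel would fall below what the denoiser can clean up, either drop the RT resolution or fall back to the traditional raster path:

```python
FULL_RES = (3840, 2160)
MIN_RAYS_PER_PIXEL = 0.25   # below this, assume the denoiser can't cope (invented threshold)

def plan_rt_frame(ray_budget, rt_scale=1.0):
    """Spend a per-frame ray budget at a given RT resolution scale.

    At a fixed budget, a higher RT resolution means fewer rays per pixel.
    If rays per pixel fall below the denoiser's floor, either lower the RT
    resolution (more rays per pixel, upscale afterwards) or skip RT and use
    the traditional raster lighting path instead.
    """
    w, h = int(FULL_RES[0] * rt_scale), int(FULL_RES[1] * rt_scale)
    rays_per_pixel = ray_budget / (w * h)
    if rays_per_pixel < MIN_RAYS_PER_PIXEL:
        return {"use_rt": False, "reason": f"only {rays_per_pixel:.2f} rays/pixel at {w}x{h}"}
    return {"use_rt": True, "rt_res": (w, h), "rays_per_pixel": round(rays_per_pixel, 2)}

# Plenty of budget: full-resolution RT works out to ~0.48 rays per pixel.
print(plan_rt_frame(ray_budget=4_000_000))
# Budget halved under load: not enough rays per pixel for the denoiser at full res...
print(plan_rt_frame(ray_budget=2_000_000))
# ...but dropping the RT resolution to 1080p recovers ~0.96 rays per pixel.
print(plan_rt_frame(ray_budget=2_000_000, rt_scale=0.5))
```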
 
It's not just dropping frames like we typically see (which it certainly does do in places too!), it's the flow of the game being interrupted for several frames in a drop and then recovery that's uncharacteristic of the way GPU load usually changes. As they say, it's like something's stalling the game for a while.
Could the console be switching the main thread to a new core? Or do we lean toward it being an OS/API inefficiency?

It would be neat if DF could sync their framerate graphs with power draw.
 
As it doesn't need gameplay logic, photo mode shouldn't tax the CPU at all (compared to gameplay). It should be similar in most cutscenes. McDonald is the one that did the Bloodborne 60 fps hack on his PS4 devkit.

Those conditions are ideal to test the GPU + bandwidth of those consoles. But things get really interesting when the CPU has much more work to do: like during gameplay.
 
Could the console be switching the main thread to a new core? Or do we lean toward it being an OS/API inefficiency?

It would be neat if DF could sync their framerate graphs with power draw.

That's a good question! I wish I knew.

Moving a main thread around seems to have a cost, but AFAIK PCs often do it many times a second to balance thermal load of the most demanding thread so they can boost optimally, so unless something is going wrong (maybe with the scheduler?) I don't think it should be causing hitches like these.
 
As it doesn't need gameplay logic, photo mode shouldn't tax the CPU at all (compared to gameplay). It should be similar in most cutscenes. McDonald is the one that did the Bloodborne 60 fps hack on his PS4 devkit.

Those conditions are ideal to test the GPU + bandwidth of those consoles. But things get really interesting when the CPU has much more work to do: like during gameplay.

This is true, but the results of a more heavily loaded CPU (and its impact on bandwidth and power available for the GPU) might not necessarily change things in favour of a different platform.
 
So when not playing the game the XBSX is faster, but when playing the game the PS5 performs better. Like I said yesterday, maybe the PS5 is just designed better than the XBSX.
 
Hmm, an excellent question. If I were to take a stab at it: ray tracing is often budgeted as rays per pixel, so the higher the resolution, the fewer rays you have per pixel.
Reducing the number of rays too much may not leave enough for your denoiser to work with, and certain effects you're relying on for lighting may not be sufficient to finish. I.e. standard compute/raster-based lighting surpasses the quality of RT if RT drops too low in quality, so you may as well choose the traditional path.

On the flip side, to improve the number of rays per pixel and still keep performance high, one could reduce resolution to increase the number of rays per pixel, rely on a really good denoiser to make up for the lack of rays shot, and then upscale the image to a very high resolution while keeping the performance benefit.

Well we’re theorising anyway but the way I understand it, the big power sucker of RT is the BVH structure. So decreasing the number of rays wouldn’t really help there.

I just think that if Insomniac managed to offer a 60 FPS RT mode, which was clearly lower quality than the standard RT mode, then other games could have the same.

I just HATE that I have to choose between 60fps and RT in other games. All I do is turn RT on, look at myself, then turn it off, play the game, then go back and forth and it’s all VERY CONFUSING.
 
This is true, but the results of a more heavily loaded CPU (and its impact on bandwidth and power available for the GPU) might not necessarily change things in favour of a different platform.
Well, the data show it is actually the case. Most (but not all) gameplay comparisons (i.e. when the CPU has much more work to do) with similar settings show a performance advantage on PS5 (or near-identical performance). And the scenes picked by DF that show XSX performing better were all taken from cutscenes in Hitman 3 and photo mode in Control.
 
Well, the data show it is actually the case. Most (but not all) gameplay comparisons (i.e. when the CPU has much more work to do) with similar settings show a performance advantage on PS5 (or near-identical performance). And the scenes picked by DF that show XSX performing better were all taken from cutscenes in Hitman 3 and photo mode in Control.

So loading the CPU and taking power away from the GPU helps the PS5?

Not sure I'm with you here.
 
Well we’re theorising anyway but the way I understand it, the big power sucker of RT is the BVH structure. So decreasing the number of rays wouldn’t really help there.

I just think that if Insomniac managed to offer a 60 FPS RT mode, which was clearly lower quality than the standard RT mode, then other games could have the same.

I just HATE that I have to choose between 60fps and RT in other games. All I do is turn RT on, look at myself, then turn it off, play the game, then go back and forth and it’s all VERY CONFUSING.
Yeah, agreed.
I think if we give it time, they may find a better hybrid approach than the one that exists today. I'm sure there are alternative solutions in the hybrid space; they just need to be developed. UE5 is a good example: their Lumen tech doesn't require ray tracing hardware, but with it they may be able to enhance or speed up their algorithm considerably. Those types of things seem ideal, I guess, with respect to the goal. The other way is to take what Spider-Man has (60 fps + RT) and push it to a higher resolution, and the problem is largely solved.
 
So loading the CPU and taking power away from the GPU helps the PS5?

Not sure I'm with you here.
It's unintuitive, but, unless the CPU is using specific power-hungry instructions, max power on PS5 should be reached during cutscenes and other scenes that don't tax the CPU (particularly if uncapped).

On PS4 and Pro, for instance, the fan often spins the most during non-gameplay scenes: in cutscenes, or when the GPU is not limited by CPU logic, as in start screens. The start screen of God of War is actually used by DF to measure the max power consumption of the Pro. As stated by Cerny, the map screen of Horizon sends the fan into hyperdrive. Another example would be MGS5: one of the most technically impressive games on PS4 is at its noisiest during the start screen and cutscenes, while it is usually quite quiet during gameplay. There are plenty of other examples on PS4, and we know Cerny and co studied tons of PS4 games when they designed the PS5's dynamic clocks.

So what DF tested with Hitman 3 and here with Control is likely the worst case possible for PS5: in those scenes (notably as they are uncapped) the GPU is more likely to be downclocked than in gameplay scenes, because it isn't being stalled by CPU logic that would make it wait for work.
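To illustrate that reasoning with a toy model (all numbers invented, and nothing like how Sony's power management actually computes clocks): with a fixed shared power budget, a GPU that spends part of each gameplay frame waiting on the CPU never reaches the ceiling and can hold its top clock, while an uncapped cutscene or photo-mode shot lets it run flat out, hit the budget, and shave a little frequency off:

```python
# Toy model of a shared, fixed SoC power budget (all numbers invented).
SOC_BUDGET_W = 200.0
CPU_IDLE_W, CPU_BUSY_W = 15.0, 50.0
GPU_FULL_W = 190.0            # GPU draw at max clock, 100% busy (invented)
GPU_MAX_CLOCK_MHZ = 2230.0

def sustainable_gpu_clock(cpu_busy: bool, gpu_busy_fraction: float) -> float:
    """Return the GPU clock that fits the shared budget.

    Power is modelled very crudely as linear in clock and in the fraction of
    the frame the GPU is actually busy. A GPU left waiting on the CPU draws
    less, so its maximum clock fits the budget; a GPU running flat out has
    to give up some frequency instead.
    """
    cpu_w = CPU_BUSY_W if cpu_busy else CPU_IDLE_W
    gpu_allowance = SOC_BUDGET_W - cpu_w
    gpu_demand = GPU_FULL_W * gpu_busy_fraction
    if gpu_demand <= gpu_allowance:
        return GPU_MAX_CLOCK_MHZ
    return GPU_MAX_CLOCK_MHZ * gpu_allowance / gpu_demand

# Gameplay: CPU busy, but the GPU idles part of each frame waiting on it.
print(sustainable_gpu_clock(cpu_busy=True, gpu_busy_fraction=0.7))   # 2230 MHz, no downclock
# Uncapped cutscene / photo mode: CPU nearly idle, GPU runs flat out.
print(sustainable_gpu_clock(cpu_busy=False, gpu_busy_fraction=1.0))  # ~2171 MHz, slight downclock
```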
 