Requisitions for Images! (should be in .png)
1. DLSS comparison shots: a variety of scenes, modes, and titles vs native
2. FSR comparison shots: a variety of scenes (games other than The Riftbreaker) vs native
Analysis Hot Links
1.1 Doom Eternal RT PS5 vs XSX
1.2 Doom Eternal Balanced PS5 vs XSX
1.3 Doom Eternal Balanced 2 PS5 vs XSX
1.4 Doom Eternal RT 2 PS5 vs XSX
2.0 Gears 5 VRS
3.0 GDC Alpha Point Marketing Material Analysis
4.0 Doom Eternal DLSS Analysis
Preface
I would like to present a basic analysis of image quality in games, as we enter a period of graphics where it is critical to save processing power while keeping image quality high enough to match native TV resolutions.
I have been building and working with some basic tools to examine resolution, AF, VRS, and upscaling technologies, to show gamers what they are getting back from the graphics in terms of usable visual feedback. It is a basic analysis, of course: full of holes, easily exploitable, and so on. But it is certainly an improvement over having no tools at all.
Traditionally the industry has treated resolution and framerate as the two most important metrics for image quality (one covering static resolution, the other motion resolution), but current techniques such as upscaling, DRS, DLSS, FSR, and VRS can now alter the image in ways that static resolution alone can no longer represent successfully. We are now at a stage where we must examine image quality directly, and to do so we must break static images down into their finer parts and examine them tile by tile.
This basic analysis uses Fourier and discrete cosine transform (DCT) algorithms to isolate edge quality while reducing as much noise as possible. Where there is more discrete detail available, the algorithms award a higher metric; where there is less discrete detail (the Vaseline effect), they award less. The system is imperfect for obvious reasons: volumetric fog and similar effects lower the score rather than raise it, even though they are more taxing on the GPU and higher-quality versions of them are more desirable, not less. Conversely, over-sharpening an image would inflate its score, as the current setup is not designed to take this into account (a Butterworth filter would be required; a rough sketch follows). UI and other elements are not removed, so they influence the analysis, but it was not desirable to have the average of the whole frame as a single point representing what the user will receive.
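To illustrate what such a filter would look like, here is a minimal sketch of a frequency-domain Butterworth high-pass mask. This is not part of the current tool; the cutoff d0 and order n are illustrative values only.

```python
import numpy as np

def butterworth_highpass(shape, d0=30.0, n=2):
    """Butterworth high-pass transfer function H = 1 / (1 + (d0 / D)^(2n)),
    laid out for an fftshifted spectrum (DC term at the centre)."""
    rows, cols = shape
    u = np.arange(rows) - rows / 2
    v = np.arange(cols) - cols / 2
    # Distance of each frequency bin from the centre of the spectrum.
    dist = np.sqrt(u[:, None] ** 2 + v[None, :] ** 2)
    dist[dist == 0] = 1e-6  # avoid division by zero at the DC term
    return 1.0 / (1.0 + (d0 / dist) ** (2 * n))
```

The smooth rolloff, unlike the hard rectangular cutoff used in the current scoring, is the starting point for discounting artificial sharpening, though the exact tuning is still to be worked out.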
The Fourier transform covers the edge and frequency analysis, the bulk of the IQ analysis. We transform the image into the frequency-amplitude domain and remove the low-frequency data, leaving only the high-frequency detail. We then average the image block from there; images with more high-frequency data produce higher scores. High-frequency data is data in which you can visually see a difference: a transition from black to black has no frequency, but black to white is high frequency. This is one basic method of edge detection (there are many) to see what the eye can perceive in clusters of pixels; a sketch of the scoring step follows.
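A minimal sketch of that scoring step, assuming a greyscale input; the cutoff radius is a placeholder, not the value used in the actual analysis:

```python
import numpy as np

def highfreq_score(gray, cutoff=30):
    """Score a greyscale block by its mean high-frequency magnitude."""
    spectrum = np.fft.fftshift(np.fft.fft2(gray))
    rows, cols = gray.shape
    cy, cx = rows // 2, cols // 2
    # Zero out the low-frequency block around the centre of the
    # shifted spectrum, leaving only high-frequency detail.
    y0, x0 = max(cy - cutoff, 0), max(cx - cutoff, 0)
    spectrum[y0:cy + cutoff, x0:cx + cutoff] = 0
    # More surviving high-frequency energy -> higher score.
    return np.abs(spectrum).mean()
```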
With the Fourier transform we can get an idea of what is happening in the picture as we look for uniformity in the spectrum. Below is a picture of native rendering: the center represents the low-frequency data points in the image, and the four corners represent high frequency. Here we can see uniform overall brightness across the spectrum. (A sketch of how these spectrum pictures are produced follows the image.)
Source image for Riftbreaker courtesy of @cho
https://forum.beyond3d.com/posts/2211513/
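For reference, the spectrum pictures in this post are simply the log magnitude of the shifted FFT; a minimal sketch, with a placeholder file name and assuming an RGB .png:

```python
import numpy as np
import matplotlib.pyplot as plt

# Placeholder file name; any RGB screenshot saved as .png works.
gray = plt.imread("riftbreaker_native.png")[..., :3].mean(axis=2)
magnitude = np.abs(np.fft.fftshift(np.fft.fft2(gray)))
plt.imshow(np.log1p(magnitude), cmap="gray")
plt.title("Log-magnitude spectrum (centre = low frequency)")
plt.show()
```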
When we move to TAA, we can see that the four corners are darker, indicating a lack of high-frequency detail in the reconstruction from 1440p to 4K. The example shot of A Plague Tale: Innocence below shows this.
Source Image courtesy of VG Tech (A Plague Tale Innocence PS5 vs Xbox Series X|S Frame Rate Comparison - YouTube)
When we look at FSR, we see that the corners and edges of the spectrum are filled in, and the middle is filled in, but there is still a similar dark halo between the low and high frequencies. What is interesting is that at the poles (N, S, E, W) there are additional values filled in, in the form of lines. The algorithm appears to be applying some form of additional sharpening here compared to the native image.
Source image for Riftbreaker courtesy of @cho
https://forum.beyond3d.com/posts/2211513/
The discrete cosine transform (DCT) analysis is based on a paper on detecting deep fakes. Using the DCT, one can identify upscaling artifacts in an image, and these artifacts are easily visible. Once detected, we know that upscaling was leveraged somewhere in the image, which is useful for separating deep fakes from real movies/images as ML-generated deep fakes become more common. Coincidentally, one thing I discovered while running this analysis is that the artifacts sometimes provide enough information to determine the resolution the image had before it was upscaled, since non-scaled images have uniformity in the DCT. This is remarkably effective for pixel counting without having to do it manually; however, it is not applicable to all upscaling methods, and I continue to research this area to find a way to perform automatic resolution counting for upscaled images. A sketch of the DCT inspection follows.
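A minimal sketch of that inspection, assuming scipy is available; the file name is a placeholder:

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.fft import dctn

# Placeholder file name; assumes an RGB .png screenshot.
gray = plt.imread("upscaled_frame.png")[..., :3].mean(axis=2)
coeffs = dctn(gray, norm="ortho")
# Native images decay smoothly; upscaling leaves periodic bands/peaks.
plt.imshow(np.log1p(np.abs(coeffs)), cmap="gray")
plt.title("Log-magnitude 2-D DCT")
plt.show()
```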
The first example below is a Dark Souls 3 image on Xbox Series S, upscaled to 1080p from 900p.
Source Image courtesy of VG Tech (Dark Souls 3 Xbox Series S Frame Rate Test (FPS Boost | Backwards Compatibility) - YouTube)
Another example is a Doom Eternal image upscaled to 4K; the arrow points to 83% of the full frame, which works out to approximately 1800p at a 4K output.
Source Image courtesy of VG Tech: (Doom Eternal PS5 vs Xbox Series X|S Frame Rate Comparison - YouTube)
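The conversion behind that estimate is simple arithmetic: the artifact boundary sits at a fraction of the full-frame axis, so the internal resolution is that fraction of the output resolution.

```python
output_height = 2160       # 4K output
artifact_fraction = 0.83   # measured position of the DCT artifact boundary
print(round(output_height * artifact_fraction))  # 1793, i.e. roughly 1800p
```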
Of the two algorithms, the DCT is less consistent, so it is critical to pixel count to ensure these values are accurate, at least until we can confirm the artifacts represent what I believe they do, and that may change from title to title.
Conclusion:
If you have any feedback, ideas, or criticisms of my methodology, please reply to these posts. If you have questions about the analyses (posts below), please reply to those posts directly. This is far from perfect, of course, but it is meant to give some idea of what is happening behind the scenes that you may not be able to perceive, and to provide an inner look at what happens when these images are reconstructed, shaded at different rates, and so on.
Finally, these analyses are fun distractions from my everyday life. There are other projects I work on (should be working on), but it is sometimes irresistible to do these things, especially when the results stir controversy in other threads. Recent encouragement to see this project progress, and interest in how it works, has nudged me to release at least the bare bits of how it operates. The project is far from complete and requires a significant amount of pressure testing before it amounts to anything significant. There is still a lot to learn in this whole process, and I haven't found many resources to assist here, so any help is always appreciated.
Future Work:
Working on breaking the tiles down and bucketing them into clarity buckets. This way we no longer need to compare native to reconstruction, or native to VRS: we can simply look at how clear each tile is, and compare two configurations by the differences in their tile buckets (a rough sketch below). I am also looking into using some of this for full movies, but processing time is much too slow; 60fps and above is murder right now without a dedicated CUDA script to do this from start to finish.
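A rough sketch of the bucketing idea, reusing the hypothetical highfreq_score helper from the Fourier section; the tile size and bucket edges are illustrative, not final values:

```python
import numpy as np

def bucket_tiles(gray, tile=64, edges=(5.0, 15.0, 30.0)):
    """Split a greyscale frame into tiles, score each tile's clarity,
    and count how many tiles land in each bucket (blurry -> sharp)."""
    rows, cols = gray.shape
    scores = [
        highfreq_score(gray[y:y + tile, x:x + tile], cutoff=8)
        for y in range(0, rows - tile + 1, tile)
        for x in range(0, cols - tile + 1, tile)
    ]
    # Histogram the per-tile scores so two configurations can be
    # compared bucket-by-bucket rather than frame-vs-native.
    return np.histogram(scores, bins=[0.0, *edges, np.inf])[0]
```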