Digital Foundry Article Technical Discussion [2023]

What I am interested in is how much this impacts GPU performance.
I am guessing not much. Texture loading speed is only marginally faster on more powerful hardware, as Alex has shown: a 2060 takes 1.27 seconds, a 4070 takes 1.17 seconds and a 4090 takes 1.07 seconds. So, very minimal difference between vastly different GPUs.

Speaking of Portal Prelude RTX, the performance of Ampere GPUs seems far ahead of Turing GPUs. At 1080p native, the 3060 Ti is 20% faster than the 2080 Ti, while the 3070 is 40% faster. Apparently this was the same in Portal RTX, but it's the first time I've paid attention to it.

 
Hmm. Nvidia already provided similar data. What I am interested in is how much this impacts GPU performance and VRAM. As @Dictator has a specific build without RTX IO, this would've been the perfect opportunity to test that. But we only know the framerate of the RTX IO-off build, because the RTX IO-on image was cropped :/

Alex, could you perhaps provide this information using the two builds? I'm sure a lot of us are interested in that and would appreciate it!

When Nvidia was asked about this one time, they said it's tiny and “probably not measurable”. (https://back2gaming.com/guides/nvidia-rtx-io-in-detail/)
I wonder if that's true.
Literally no difference in my tests - but considering the loads are static and simple... I do not imagine there would be any anyway.
 
If it's not much, then it surely is interesting that Ratchet and Clank only enables it at high settings and up. This implies to me that it's quite an intense task for the GPU. Anyway, we will find out soon enough.

@Dictator I see, thank you. That makes sense. Looking forward to your Ratchet and Clank tests!
 
I remember reading this interview, but... I've pulled stock HDDs from PS4s, hooked them up to a PC, and benchmarked them, and they all have read speeds of about 100MB/s. Well, the working ones, anyway. That's pretty normal for 5400RPM drives. I'm sure there are cheap drives with less cache and worse seek times that are going to affect performance, and obviously the HDD controller in the PS4 would affect performance too. Did I have a fever dream, or did I read that the PS4 only supports one SATA device internally, so the HDD actually goes through a USB-to-SATA converter on the board? I feel like that information came out of the efforts to get Linux working on the PS4. Perhaps the overhead and inefficiencies of that are causing issues.

Anyway, I'm not accusing Insomniac of lying about the HDD performance on PS4. In fact, I have no doubt they wouldn't make such claims without merit. But I do find it curious that drives exist that limit the PS4's performance to 20MB/s. I've never seen a working 5400RPM drive that slow; that's USB flash drive speeds. (A quick sequential-read test, like the sketch below, is enough to check a drive for yourself.)
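For anyone who wants to reproduce this, here is a minimal sketch of the kind of sequential-read timer I mean. It's illustrative only: the file path is a placeholder, and on a real run you'd want a freshly mounted drive (or unbuffered I/O) so the OS page cache doesn't inflate the number.

```cpp
#include <chrono>
#include <cstdio>
#include <vector>

// Rough sequential-read benchmark: reads a large file in 8 MB chunks and
// reports average throughput. Point it at a big file on the drive under test.
int main() {
    const char* path = "/mnt/ps4hdd/bigfile.bin"; // placeholder path
    const size_t chunk = 8u << 20;                // 8 MB per read
    std::vector<char> buf(chunk);

    std::FILE* f = std::fopen(path, "rb");
    if (!f) { std::perror("fopen"); return 1; }

    size_t total = 0;
    auto t0 = std::chrono::steady_clock::now();
    while (size_t n = std::fread(buf.data(), 1, chunk, f))
        total += n;
    auto t1 = std::chrono::steady_clock::now();
    std::fclose(f);

    double secs = std::chrono::duration<double>(t1 - t0).count();
    std::printf("%.0f MB in %.2f s = %.1f MB/s\n",
                total / 1e6, secs, (total / 1e6) / secs);
    return 0;
}
```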

Benchmarking a PS4 HDD won't tell you how much bandwidth Sony can guarantee to a PS4 application. It's not Sony exactly, but the Xbox docs talk about it:

Guarantee

The Xbox One and Xbox One S consoles had a minimum guarantee of 40 MB/s. The Xbox One X console increased the minimum guarantee to 60 MB/s. These numbers are well below the actual hardware limits, which are in the 130 MB/s range. This was entirely due to the overhead caused by the operating system.

DirectStorage removes most of the overhead caused by the operating system. This allows a minimum guarantee closer to the hardware limits. The new minimum performance guarantee is 2.0 GB/s over a 250 ms window for raw data. The use of decompression on the content will push the final bandwidth higher.

Future Xbox consoles will support the addition of dynamic user-installable drives that are also NVMe based. The same minimum performance guarantee that's provided for the internal drive is also provided for the user-installable drive.
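To put those figures in perspective, here's a quick back-of-the-envelope check (my own arithmetic, not from the docs; the 2:1 compression ratio is purely an illustrative assumption):

```cpp
#include <cstdio>

int main() {
    // Quoted guarantee: 2.0 GB/s of raw data, measured over a 250 ms window.
    const double guaranteed_gb_per_s = 2.0;
    const double window_s            = 0.25;
    const double assumed_ratio       = 2.0;  // hypothetical ~2:1 compression, illustrative only

    // Raw budget available to a title within one guarantee window.
    const double raw_per_window_mb = guaranteed_gb_per_s * 1000.0 * window_s;  // 500 MB
    // Effective content bandwidth once decompression is applied on top.
    const double effective_gb_per_s = guaranteed_gb_per_s * assumed_ratio;     // 4.0 GB/s

    std::printf("Raw budget per 250 ms window: %.0f MB\n", raw_per_window_mb);
    std::printf("Effective bandwidth at %.0f:1 compression: %.1f GB/s\n",
                assumed_ratio, effective_gb_per_s);
    return 0;
}
```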
 
Does RTX IO use the CUDA cores or some other part of the GPU? What is the bottleneck for performance?
Regular CUDA/FP32 ALU cores.

Some theorized that RTX IO would run on the tensor cores, especially after GRAID announced it was accelerating RAID calculations using a 3060 GPU on its SR1010 RAID controller cards and claimed to have the most powerful RAID accelerator in the world (by a huge margin). GRAID said they were using the tensor cores and an AI algorithm to achieve this, so many people thought RTX IO would do the same on RTX GPUs.


But that theory is not true of the current form of RTX IO: it runs through DirectStorage in the case of DirectX 12, and through special NVIDIA Vulkan extensions in the case of Vulkan, all on the regular CUDA cores. In the case of Portal Prelude RTX, the game runs on VulkanRT, so it accesses RTX IO through those NVIDIA Vulkan extensions, which is why it currently runs on NVIDIA GPUs alone.
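For anyone curious what the DirectX 12 side of that looks like, here is a minimal sketch using the public DirectStorage 1.1 API to enqueue a read that gets GDeflate-decompressed on the GPU. To be clear, this is just generic DirectStorage usage, not NVIDIA's actual RTX IO integration or the Vulkan extension path Portal Prelude RTX uses; the file name, sizes, and destination buffer are placeholders, and error handling and completion fences are omitted.

```cpp
#include <d3d12.h>
#include <dstorage.h>
#include <wrl/client.h>
using Microsoft::WRL::ComPtr;

// Assumes `device` is an existing ID3D12Device and `buffer` is a default-heap
// ID3D12Resource big enough to hold the uncompressed data.
void EnqueueGpuDecompressedRead(ID3D12Device* device, ID3D12Resource* buffer,
                                UINT32 compressedSize, UINT32 uncompressedSize)
{
    ComPtr<IDStorageFactory> factory;
    DStorageGetFactory(IID_PPV_ARGS(&factory));

    ComPtr<IDStorageFile> file;
    factory->OpenFile(L"asset.gdeflate", IID_PPV_ARGS(&file)); // placeholder path

    DSTORAGE_QUEUE_DESC queueDesc{};
    queueDesc.SourceType = DSTORAGE_REQUEST_SOURCE_FILE;
    queueDesc.Capacity   = DSTORAGE_MAX_QUEUE_CAPACITY;
    queueDesc.Priority   = DSTORAGE_PRIORITY_NORMAL;
    queueDesc.Device     = device;

    ComPtr<IDStorageQueue> queue;
    factory->CreateQueue(&queueDesc, IID_PPV_ARGS(&queue));

    DSTORAGE_REQUEST request{};
    request.Options.SourceType        = DSTORAGE_REQUEST_SOURCE_FILE;
    request.Options.DestinationType   = DSTORAGE_REQUEST_DESTINATION_BUFFER;
    request.Options.CompressionFormat = DSTORAGE_COMPRESSION_FORMAT_GDEFLATE; // GPU decompression
    request.Source.File.Source        = file.Get();
    request.Source.File.Offset        = 0;
    request.Source.File.Size          = compressedSize;
    request.UncompressedSize          = uncompressedSize;
    request.Destination.Buffer.Resource = buffer;
    request.Destination.Buffer.Offset   = 0;
    request.Destination.Buffer.Size     = uncompressedSize;

    queue->EnqueueRequest(&request);
    queue->Submit(); // decompression runs as compute work on the GPU's regular ALUs
}
```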
 
Benchmarking a PS4 HDD won't tell you how much bandwidth Sony can guarantee to a PS4 application. It's not Sony exactly, but the Xbox docs talk about it
I understand this. What I was saying is that the claim that user-replaced drives cause such a hit is wild. As stated in the Xbox doc you quoted, the overhead is "entirely due to the overhead caused by the operating system." They are claiming the user-replaceable drives are half as fast, but in my testing of the hardware, the OEM drives are average in speed. If the drive's speed isn't the limiting factor, you would think that using a slower drive would have a minimal effect on guaranteed speed.
 
I understand this. What I was saying is that the claim that user-replaced drives cause such a hit is wild. As stated in the Xbox doc you quoted, the overhead is "entirely due to the overhead caused by the operating system." They are claiming the user-replaceable drives are half as fast, but in my testing of the hardware, the OEM drives are average in speed. If the drive's speed isn't the limiting factor, you would think that using a slower drive would have a minimal effect on guaranteed speed.
Actually this is partly my bad: I said it was 20MB/s, but in their presentation Insomniac said they arrived at 20MB for each game tile, which needed to be loaded in 0.8 of a second. So a little better, but not massively off. Here is the actual segment of Insomniac's 2019 GDC presentation talking about I/O.


You'll note that @dobwal flagged all of the things that could impact the I/O and limit what the game can rely on, which the Insomniac guy also mentions. Downloading other games or recording video on a slower HDD with a smaller cache can really bottom out your effective read speed, because rather than just reading the game, the drive is also having to write out in-progress downloads and video at the same time.
 
What is Alex looking at? Space? A rainbow? A bird? We need to know!!!
 
I wonder if we'll get any Ratchet and Clank PC reviews tomorrow?

I assume (would hope) some outlets have gotten codes and have been testing/benching the game already for a bit.

Looking forward to the game, and the hopefully good news regarding the port quality!
 
Up until now, it's only Fortnite that's seen Unreal Engine 5's Nanite micro-geometry system deployed on the current generation consoles. Remnant 2 changes all of that, with both Nanite and virtual shadow maps deployed. Can the current-gen consoles handle it? Is 60fps viable in performance and balanced modes? What about Xbox Series S? We had questions and Oliver Mackenzie provides answers.
 
Interesting that the game likely uses UE5 TSR on consoles but the option isn't exposed on PC. In any case the game seems crazy heavy on all systems.

Yah, not shocking that Lumen wasn't included. They probably prioritized geometry and shadows, which is honestly the better choice for that game.
 
Interesting that the game likely uses UE5 TSR on consoles but the option isn't exposed on PC. In any case the game seems crazy heavy on all systems.

Yeah, I had to quickly check it out to see if it was as crazy heavy as it was rumored to be (it was), but man - FSR2 truly looks awful on this, with even more of a gap between it and DLSS than normal. Trying FSR2 Performance at 1440p, it's just a mess, while DLSS still looks presentable. I don't blame them for using UE5's own temporal solution on consoles, but it's odd that it isn't an option on PC.

On the slight upside, like other UE5 titles the VRAM usage is relatively low, and I didn't see any shader stuttering. I think 12-thread CPUs will be the minimum for this; 6-core-only CPUs seem crippled, but that really shouldn't be surprising by now. It is definitely not a well-optimized title regardless, UE5 or not.
 
What's up with no ray tracing for AMD GPUs in Ratchet & Clank: Rift Apart at release?
They must have performance issues on AMD GPUs...
Maybe those super-efficient low-level console API instructions translate rather badly to PC APIs...
Not much room there with all those DirectStorage GPU decompression shenanigans happening at the same time, huh?
 