Digital Foundry Article Technical Discussion [2022]

I meant the cost in render time on GPUs. I simply doubt Nvidia's statement that RTX IO is so lean on a GPU. Like I wrote above, I want to see something, at least a demo of some sort. It should not be that difficult. As for the claim that "a 2 TF sacrifice is enough to catch up with or even overtake the PS5's I/O capabilities", I think that came out of another thread which was linked here a while ago. For one thing, I absolutely think that on an RTX 20xx nobody has 2 TF to spare when it's rendering, say, a second-wave PS5 exclusive. I think trying to mimic the PS5's I/O throughput is much more stressful for a GPU than most people think. It is cute that Jen-Hsun Huang thinks simply flipping a switch on an RTX 20xx/30xx GPU is enough to mimic the PS5's I/O capabilities without adding all kinds of latency to the CPU/GPU communication (which is already much higher than on consoles because of UMA instead of hUMA).

That’s dependent on whatever decompression scheme Nvidia or other developers offer in conjunction with RTX IO. I need to find the link, but an A100 using nvCOMP and a third-party decompression method called Bitcomp can chew through compressed textures or geometry at 400+ GB/s.

An A100 offers about 50% more TFLOPs than something like a 3060, but we're not talking about needing to handle 400 GB/s of compressed data, rather more like 10-15 GB/s.
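To put that in perspective, here's a quick back-of-envelope sketch. The 400 GB/s figure is the nvCOMP/Bitcomp number above; the TFLOP ratio and the assumption that decode throughput scales linearly with compute are mine, so treat the output as an order-of-magnitude estimate only:

```python
# Back-of-envelope: what fraction of a 3060-class GPU's decompression
# capability would a 10-15 GB/s compressed stream actually use?
# Assumptions (not measured): decode throughput scales roughly linearly
# with compute, and the A100 has ~1.5x the TFLOPs of a 3060.

A100_DECOMP_GBPS = 400.0          # nvCOMP + Bitcomp figure quoted above
A100_TO_3060_TFLOPS_RATIO = 1.5   # "about 50% more tflops" (assumption)

est_3060_decomp_gbps = A100_DECOMP_GBPS / A100_TO_3060_TFLOPS_RATIO

for needed_gbps in (10.0, 15.0):
    share = needed_gbps / est_3060_decomp_gbps
    print(f"{needed_gbps:4.1f} GB/s stream -> ~{share:.1%} of the estimated ceiling")

# Estimated ceiling is ~267 GB/s, so 10-15 GB/s is on the order of 4-6% of it,
# if the linear-scaling assumption holds at all.
```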

People keep talking about how current software is designed around HDDs, which presents a limitation. But we forget that current hardware has been designed around HDDs too. Current VRAM sizes on GPUs are the product of the limited bandwidth between HDDs and GPUs/CPUs. Neither Nvidia nor AMD were building ever-growing chips only to starve them; VRAM sizes have increased to accommodate the data needs of the GPUs.

Mother AMD and Father Nvidia didn’t have the capacity to feed Gerry, Preston and Unis on an as-needed basis but they did have the ability to buy a minivan to fill up with groceries that would last for weeks. Just because a grocery store with a drive-thru opened up across the street doesn’t mean the kids have a greater capacity to eat more food.

Removing the main storage bottleneck is going to offer advantages going forward, the main one being that the ideal ratio of RAM to GPU performance will grow smaller.
 
Well, I just wish that at some point we will see a game fully utilising the super-fast I/O on the PS5, because the concept and possibilities described by Cerny sound super fascinating.
But we haven't yet seen a game taking proper advantage of the hardware.
 
I'll add that perhaps people should also wait for evidence that it's necessary in the first place.
For sure both sides are jumping the gun. We don't know how impactful the PS5 IO system is relative to PC in a variety of real world scenarios. For now all we can say is it might offer a bit faster loading based on available data. We know even less about direct storage because we have no available data. It has been 2 years and we don't know any more than we did when it was announced and there is yet to be a single use case we can judge. GPU decompression is even more of a question mark. How can anyone throw it around as a solution to a problem that might not even exist at this point in time?
 
I'll add that perhaps people should also wait for evidence that it's necessary in the first place.

How can anyone throw it around as a solution to a problem that might not even exist at this point in time?

Same for the PS5.

We don't know how impactful the PS5 IO system is relative to PC in a variety of real world scenarios

And vice versa, we don't know how impactful PC I/O will be relative to the PS5. Will the PS5 hold things back in the I/O department?
 
That gen of consoles never surpassed Crysis 1.
In terms of what? Technical proficiency, physics, and rendering features? Perhaps, but even then it's strongly questionable, as Far Cry 2 was on Xbox 360, I think. In terms of visuals/art direction? That's purely subjective.

You should be more specific with your comments because when I read this, I see someone trying to pass their opinion off as a fact.
 
For sure both sides are jumping the gun. We don't know how impactful the PS5 IO system is relative to PC in a variety of real world scenarios. For now all we can say is it might offer a bit faster loading based on available data. We know even less about direct storage because we have no available data. It has been 2 years and we don't know any more than we did when it was announced and there is yet to be a single use case we can judge. GPU decompression is even more of a question mark. How can anyone throw it around as a solution to a problem that might not even exist at this point in time?

I don't think defining it as a solution to a problem that needs to be solved, i.e. if it doesn't work then it's game over, is necessarily the correct characterization. All Direct Storage is doing, whether it's the file IO improvements or GPU decompression, is reducing the CPU overhead associated with high levels of streaming or data loading.

Without Direct Storage at all, that process would still happen, and with a sufficiently powerful CPU could be as fast or faster than the PS5 assuming equally optimised code. But with Direct Storage the CPU requirements should be lower, and more in line with the level of CPU in the PS5 for those tasks.
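To put rough numbers on the CPU cost being offloaded, here's a toy calculation. It's calibrated only against Cerny's "equivalent to about nine Zen 2 cores" remark from the Road to PS5 talk, and the linear scaling is my assumption, so it's illustrative rather than a benchmark:

```python
# Toy model: CPU cores tied up by software decompression at various streaming
# rates, i.e. the overhead that DirectStorage / GPU decompression aims to remove.
# Calibration point (Cerny, Road to PS5): the PS5's decompressor does the work
# of roughly 9 Zen 2 cores at the SSD's 5.5 GB/s raw rate. Linear scaling below
# is my assumption; real codecs and real data don't scale this cleanly.

CALIB_STREAM_GBPS = 5.5
CALIB_CORE_EQUIVALENT = 9.0
per_core_gbps = CALIB_STREAM_GBPS / CALIB_CORE_EQUIVALENT   # ~0.6 GB/s per core

for stream_gbps in (1.0, 2.5, 5.5):
    cores = stream_gbps / per_core_gbps
    print(f"{stream_gbps:3.1f} GB/s compressed -> ~{cores:.1f} Zen 2 cores in software")

# Even a modest 2.5 GB/s stream would eat ~4 cores of a console-class CPU,
# which is exactly the kind of work you want off the CPU on PC as well.
```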

Is it really "jumping the gun" to believe that Direct Storage will indeed reduce CPU load? We've seen the Forspoken devs provide hard data that it does exactly that on the file IO side. Is there any reasonable basis - at all - to suspect the GPU decompression aspects won't do the same when they are ready?
 
Without Direct Storage at all, that process would still happen, and with a sufficiently powerful CPU could be as fast or faster than the PS5 assuming equally optimised code. But with Direct Storage the CPU requirements should be lower, and more in line with the level of CPU in the PS5 for those tasks.
It's my understanding the whole point of the additional units in the I/O complex in PS5 was to ensure PS5's CPU has nothing to do with file I/O, at all.
 
Crysis 2 on consoles doesn't look nearly as good as Crysis 1 on PC.
What's PC got to do with the quality of graphics across PS3 games?

I'm more confused now than my previous post which started with: "I may be misunderstanding, but..." :-?
 
I'm late to the party it seems, but welcome to the forum. I think your arguments are well-intentioned, but perhaps you've picked the wrong counterpoints.
I meant the cost in render time on GPUs. I simply doubt Nvidia's statement that RTX IO is so lean on a GPU. Like I wrote above, I want to see something, at least a demo of some sort. It should not be that difficult. As for the claim that "a 2 TF sacrifice is enough to catch up with or even overtake the PS5's I/O capabilities", I think that came out of another thread which was linked here a while ago. For one thing, I absolutely think that on an RTX 20xx nobody has 2 TF to spare when it's rendering, say, a second-wave PS5 exclusive.
Without a doubt decompression can be crushed by GPUs; it's routinely done on GPUs, and there's nothing particularly difficult that a dedicated ASIC could do that the vast number of ALU compute units can't also do. It should be noted that most GPUs are often only around 50% saturated at any given time, and each 1% above that gets increasingly difficult to extract. This leaves a lot of available time for asynchronous compute calls to come in and perform decompression on assets in between rendering calls.

That being said, your intent is still correct here, but you're looking at the wrong thing. The PS5 has a dedicated I/O decompression unit, so it's dedicated and its latency is guaranteed. GPUs do not have a dedicated chip, but they have more than enough horsepower to complete the task several times over. However, if there is a call to decompress an asset while the GPU is already running a job, that decompression may need to wait, whereas you don't have this problem with a dedicated chip. So this is a latency issue, not a power one.
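As a rough illustration of why this is a scheduling question rather than a raw-throughput one, here's a toy per-frame budget. Both the streaming rate and the GPU decode rate are assumed figures picked for illustration, not measurements:

```python
# Toy per-frame budget: how much GPU time would async-compute decompression of
# one frame's worth of streamed data take?
# Assumptions (illustrative only): 60 fps target, a sustained 10 GB/s compressed
# stream, and ~100 GB/s of sustained GPU decode throughput.

FPS = 60
FRAME_MS = 1000.0 / FPS                      # ~16.7 ms per frame
STREAM_GBPS = 10.0                           # assumed streaming rate
GPU_DECOMP_GBPS = 100.0                      # assumed GPU decode rate

data_per_frame_mb = STREAM_GBPS * 1000.0 / FPS       # ~167 MB per frame
decomp_ms = data_per_frame_mb / GPU_DECOMP_GBPS      # MB / (GB/s) == milliseconds

print(f"{data_per_frame_mb:.0f} MB/frame -> ~{decomp_ms:.1f} ms of decode, "
      f"{decomp_ms / FRAME_MS:.0%} of a {FRAME_MS:.1f} ms frame")

# ~1.7 ms out of 16.7 ms: hidable in idle ALU gaps via async compute on average,
# but if the request lands while the GPU is fully busy it has to wait, which is
# the latency concern a dedicated decompressor sidesteps.
```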

If it would be so much more efficient on the GPU ("barely measurable"), why did Sony or Cerny not take this approach when designing the PS5? It would have been way smarter to skip the whole custom I/O shenanigans and go with a slightly bigger GPU and do the decompression there. Smarter because then, if a game for some reason didn't rely much on decompression, they could use the GPU for better IQ and detail settings instead. Clearly there is something Nvidia is not telling us. And again, where are the demos?
Logic simply suggests that any major custom HW development that adds up to many millions MUST be considered more efficient than using already existing (general-purpose) tech, because all the committees and engineers involved in the process would have pointed that out or even blocked further investigation into custom tech.
With consoles it always comes down to money. They are required to extract the most performance per dollar that they can, and in a world where you can't keep scaling up compute because the cost of the bandwidth to feed it keeps rising, dedicated silicon and other accelerators make for a desirable alternative. I hope this makes sense: you can't just keep adding compute power without more bandwidth, and the latter is really expensive to scale up compared to compute.
 
With consoles it always comes down to money. They are required to extract the most performance per dollar that they can, and in a world where you can't keep scaling up compute because the cost of the bandwidth to feed it keeps rising, dedicated silicon and other accelerators make for a desirable alternative.
For this generation of consoles, this was entirely about chasing performance, not keeping costs down (money). Engineering and manufacturing this custom silicon, and creating the APIs to make it work, cost time and money. Both Microsoft and Sony could have let the APUs in the consoles do the decompression as they always have done, either on the CPU or via the zlib decompressors they had in the previous generation.

In X years, this functionality will definitely be part of the PC's I/O chipset, which is where it should be: data is read, decompressed and routed to main RAM or video RAM with no CPU or GPU intervention. But that's a long way off. Intel needs to drive this, working with OS vendors to build support for it into the OS.
 