Here is a vid tech demo of XVA with emphasis on the benefits of SFS over just using XVA:
Really cool tech. So does this take away from GPU power for other tasks? So essentially trading GPU power for more efficient memory and bandwidth?
What do you mean?
It should do for sure. At the very least, it should relieve developers of the challenge of fitting their games within the memory constraints.
Well, SF is more or less information (feedback) that tells the engine which texture data is actually needed and which is not. So it has to be actively integrated; nothing about it is automatic. It is just like mesh shaders: if the engine or game does not use it, it is more or less useless and everything is done the traditional way.
Is SFS pretty much automatic, or is there a bit of work on the dev end to use it?
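To make the "it has to be integrated" point concrete, here is a rough Python sketch of the kind of decision loop an engine would run around sampler feedback. It is not the D3D12 API: the `feedback` and `resident` grids and the `plan_streaming` function are hypothetical names invented for illustration. The only idea it shows is that the GPU reports which mip level each texture region actually sampled, and the engine has to compare that with what is resident and decide what to load or evict.

```python
# Toy model of a sampler-feedback-driven streaming decision.
# All names here are hypothetical; this is not the D3D12 API.
# Mip 0 is the most detailed level; larger numbers are coarser.

COARSEST_MIP = 3  # assumed to be always resident as a fallback


def plan_streaming(feedback, resident):
    """Compare sampled mips with resident mips, region by region.

    feedback[y][x]: finest mip the GPU sampled in that region last frame,
                    or None if the region was not sampled at all.
    resident[y][x]: finest mip currently held in memory for that region.
    Returns (loads, evictions) as lists of (x, y, mip) tuples.
    """
    loads, evictions = [], []
    for y, row in enumerate(feedback):
        for x, wanted in enumerate(row):
            have = resident[y][x]
            if wanted is None:
                # Nothing sampled here: detail beyond the fallback can go.
                if have < COARSEST_MIP:
                    evictions.append((x, y, have))
            elif wanted < have:
                # The GPU wanted more detail than is resident.
                loads.append((x, y, wanted))
    return loads, evictions


feedback = [[0, 2], [None, 3]]
resident = [[2, 2], [1, 3]]
loads, evictions = plan_streaming(feedback, resident)
print(loads)      # [(0, 0, 0)]
print(evictions)  # [(0, 1, 1)]
```

If the engine never runs something like this loop, the feedback is simply never acted on, which is the sense in which the feature does nothing on its own.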
Yes, the decompressor is a hardware block, as said before. The lack of other IO hardware takes you from 100% of the SSD speed to the 20% real result. See Cerny's slide. No software can overcome that 80% without eating a lot of CPU resources.
This is really not true, or at least it embellishes reality. You do not need 100% of the IO bandwidth all the time. Normally you only need a fraction of the available bandwidth, but when you do need it, you want to have it as fast as possible.
The other reason hardware blocks are good, and something folks don't appreciate until they have debugged a decompression routine, is the hit the cache takes from CPU decompression. CPU decompression is fast because it leverages cache, so if you're already tight on cache running your massive open world, adding CPU decompression on top means more cache pollution.
Even without a hardware block (and at least the Xbox has one), in real-life workloads it might still make only a minor difference. For example, Microsoft concentrated more on loading only the things that are really needed, so IO bandwidth and IO operations become even less of a limiting factor.
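A quick way to get a feel for the CPU cost being argued about is to time software decompression directly. This sketch uses Python's standard `zlib` purely as a stand-in; the consoles use different codecs, so the absolute numbers say nothing about either machine, and the payload and the `measure_decompress` helper are made up for illustration.

```python
import os
import time
import zlib


def measure_decompress(payload, repeats=5):
    """Return (seconds per decompression, compression ratio) for zlib."""
    packed = zlib.compress(payload, 6)
    start = time.perf_counter()
    for _ in range(repeats):
        unpacked = zlib.decompress(packed)
    elapsed = (time.perf_counter() - start) / repeats
    assert unpacked == payload  # round trip must be lossless
    return elapsed, len(payload) / len(packed)


# Half random, half repetitive bytes as a crude stand-in for asset data.
payload = os.urandom(1 << 20) + bytes(1 << 20)
seconds, ratio = measure_decompress(payload)
mb = len(payload) / 1e6
print(f"{mb / seconds:.0f} MB/s on one core, ratio {ratio:.2f}")
```

Whatever throughput one core manages here, sustaining several GB/s this way would tie up multiple cores, which is the cost a dedicated decompression block avoids.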
You can't quote Cerny, who was comparing the PS5 to PC, and then apply that comparison to the XSX. They put in a lot of work to overcome various limitations and maximise the SSD. Everything is explained here
Here is a vid tech demo of XVA with emphasis on the benefits of SFS over just using XVA:
The greater the frametime, the better SFS should work, as it gives a little more time to DMA the requested tile from the SSD. It is at high framerates that I think SFS may get into trouble.
This is very nice. He talks about the speed of feedback and not seeing things being loaded. How quickly does the feedback come: in real time or per frame? That demo seemed to be running in the high hundreds to low thousands of FPS. That is a far cry from 60 or 30 fps, where the feedback is slower and there is more to load.
The multiplier for memory and IO does not change but will the overall experience?
The SSD will be the same, but I wonder whether it works as well if the feedback is delayed by 33 ms.
Just my musings.
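The frametime question can be put into rough numbers. The sketch below assumes 64 KB tiles (the standard D3D12 tiled-resource tile size) and a raw SSD throughput of 2.4 GB/s (the publicly quoted Series X figure, not something stated in this thread); `tiles_per_frame` is a hypothetical helper. It ignores request latency, decompression and everything else competing for IO, so treat the result as an upper bound on how many tiles could arrive within one frame.

```python
TILE_BYTES = 64 * 1024       # assumed tile size (64 KB)
SSD_BYTES_PER_SEC = 2.4e9    # assumed raw throughput, not from the thread


def frame_time_ms(fps):
    """Duration of one frame in milliseconds."""
    return 1000.0 / fps


def tiles_per_frame(fps, bytes_per_sec=SSD_BYTES_PER_SEC):
    """Upper bound on whole tiles transferable within a single frame."""
    bytes_per_frame = bytes_per_sec / fps
    return int(bytes_per_frame // TILE_BYTES)


for fps in (30, 60, 1000):
    print(f"{fps:>4} fps: {frame_time_ms(fps):6.2f} ms/frame, "
          f"up to {tiles_per_frame(fps)} tiles")
```

Under these assumptions a 30 fps frame has room for roughly 1,220 tiles, while a 1,000 fps frame has room for about 36. So a longer frame does leave more time to bring tiles in, but it also means any feedback acted on a frame late is a full 33 ms old.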
We need some true current-generation games that use all of the next-gen console tech. It should be amazing.
(5) Importantly, DirectStorage cuts down on latency by optimising path length, bypassing the indirection of the filesystem and of the FTL and volume layers. This is most likely being achieved through FlashMap, which is tailor-made for that exact function (https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/flashmap_isca2015.pdf). If true, this would confirm memory mapping of at least a portion of the SSD.
"So does this take away from GPU power for other tasks? So essentially trading GPU power for more efficient memory and bandwidth?"
On the PC, the GPU does the decompression, since it doesn't have a decompression block like the one in the Series X. That is still much better than using the CPU. Even better, the data is decompressed once it reaches VRAM.
Sampler Feedback doesn't, decompression (on PC) does.
At this point Sampler Feedback Streaming is only available on Xbox Series X/S?
I have heard that it will come to PC via DirectX 12 Ultimate, but at this point it hasn't?
Microsoft said they added specific hardware to the Xbox alone for SFS from what I recall.
James Stanard said that only Sampler Feedback was a DirectX 12 Ultimate feature, and not the streaming.
So question is, can it be applied to Nvidia GPUs for instance if they don't have the same hardware as Xbox?
I haven't seen Nvidia advertise it, but I have seen reports that it is coming to PC via DX12U.
I think people are getting confused with SF and SFS.
So is it coming to PC, or is Stanard right that it's not a DX12U feature?
I could be wrong, but I will give it a try. It looks like SFS requires DirectStorage, and DS on XSX/S is built with FlashMap as its backbone. IMO that is where the problem lies.
I don't think this can be achieved with a simple DX upgrade. FlashMap is not a simple IO improvement; it completely redesigns how the SSD is accessed. Perhaps it will come later as an update to the OS? Maybe MSFT is not planning to release it on PC; I have no idea. There is nothing hardware-wise that says it cannot be done, though.
In the paper about sampler feedback that @Ronaldo8 linked on the previous page, I don't see anything hardware-specific to AMD. I think it shouldn't be a problem for a modern Nvidia GPU.
So in saying all that, could SFS be implemented on GPUs that don't have the same customizations as Series X/S?
Sampler feedback is a feature already available in RTX 20 series cards (introduced two years back). The only hardware customizations (although significant) in the Series consoles that are not, as of now, included in available GPU cards are the specialized texture filters and the feedback map implemented in caches (though I guess the latter can still be implemented somehow?).
FlashMap, if it is indeed the solution adopted by MS, is a purely software implementation that enables SSD memory-mapping and collapses the FTL and filesystem layers into a single one (a software wrapper that treats every file like a single small SSD). PCs and datacenters are the more obvious deployment environments, to be honest.
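For anyone unsure what memory-mapping storage means in practice, here is a minimal sketch using Python's standard `mmap` module on an ordinary file. This is the generic OS facility, not FlashMap and not whatever the Xbox actually does; the file name and block size are arbitrary choices for the example. The point is only that once a file is mapped, the program reads a byte range by slicing memory and the OS pages the data in on demand.

```python
import mmap
import os
import tempfile

BLOCK = 4096  # arbitrary block size for the example

# Build a throwaway "asset file" of 4 blocks, each filled with its index.
path = os.path.join(tempfile.mkdtemp(), "assets.bin")
with open(path, "wb") as f:
    for i in range(4):
        f.write(bytes([i]) * BLOCK)

with open(path, "rb") as f:
    # Map the whole file read-only; the OS reads pages lazily as touched.
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as view:
        # Touch only the third block instead of reading the whole file.
        chunk = view[2 * BLOCK: 2 * BLOCK + 16]

print(chunk == bytes([2]) * 16)  # True
```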