Velocity Architecture - Limited only by asset install sizes

With all that said, could SFS be implemented on GPUs that don't have the same customizations as the Series X/S?

Looking at the specs
https://microsoft.github.io/DirectX-Specs/d3d/SamplerFeedback.html
it seems to me that SFS can be implemented on any Sampler Feedback-capable GPU. And it seems I was wrong about DirectStorage being required. In the YouTube presentation on how XVA helps with streaming and compression when used with SFS, there was a comment in the chat:
"SFS results in approx 2.5x the effective IO throughput (SSD perf) and memory usage above and beyond hardware capabilities on average.
In other words, it gives a speed boost up to 6GB/s (raw) / 12GB/s (compressed) on Series X."
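For what it's worth, on the PC side whether a GPU is "SF capable" is something an engine can just query at runtime. A minimal sketch, assuming a valid ID3D12Device and a Windows SDK recent enough to expose D3D12_FEATURE_D3D12_OPTIONS7:

#include <windows.h>
#include <d3d12.h>

// Returns true if the GPU/driver reports any Sampler Feedback support,
// which is the baseline hardware requirement for building SFS on top of it.
bool SupportsSamplerFeedback(ID3D12Device* device)
{
    D3D12_FEATURE_DATA_D3D12_OPTIONS7 options7 = {};
    if (FAILED(device->CheckFeatureSupport(D3D12_FEATURE_D3D12_OPTIONS7,
                                           &options7, sizeof(options7))))
    {
        return false; // OPTIONS7 not recognized: runtime/driver predates sampler feedback
    }
    return options7.SamplerFeedbackTier != D3D12_SAMPLER_FEEDBACK_TIER_NOT_SUPPORTED;
}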
 
Microsoft's patent for SFS:
https://patents.google.com/patent/US10388058B2/en
 
DirectX allows for Sampler Feedback and Sampler Feedback Streaming; the Series X/S employs a custom texture filter that isn't available with the PC version of DX.

https://microsoft.github.io/DirectX-Specs/d3d/SamplerFeedback.html

How to adopt Sampler Feedback for Streaming
To adopt SFS, an application does the following:

  • Use a tiled texture (instead of a non-tiled texture), called a reserved texture resource in D3D12, for anything that needs to be streamed.
  • Along with each tiled texture, create a small “MinMip map” texture and small “feedback map” texture.
    • The MinMip map represents per-region mip level clamping values for the tiled texture; it represents what is actually loaded.
    • The feedback map represents the per-region desired mip level for the tiled texture; it represents what needs to be loaded.
  • Update the mip streaming engine to stream individual tiles instead of mips, using the feedback map contents to drive streaming decisions.
  • When tiles are made resident or nonresident by the streaming system, the corresponding texture’s MinMip map must be updated to reflect the updated tile residency, which will clamp the GPU’s accesses to that region of the texture.
  • Change shader code to read from MinMip maps and write to feedback maps. Feedback maps are written using special-purpose HLSL constructs.
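To make the first two bullets a bit more concrete, here's a hedged sketch of the resource setup on PC D3D12 (assuming an ID3D12Device8, a spare UAV descriptor slot, and made-up texture dimensions; the app-managed MinMip/residency map and the shader-side feedback writes are left out):

#include <d3d12.h>

// Creates the streamed tiled texture plus its paired MIN_MIP feedback map.
void CreateStreamedTexture(ID3D12Device8* device,
                           UINT64 texWidth, UINT texHeight, UINT16 mipCount,
                           D3D12_CPU_DESCRIPTOR_HANDLE feedbackUavSlot,
                           ID3D12Resource** outTexture,
                           ID3D12Resource** outFeedback)
{
    // 1) The streamed texture is a reserved (tiled) resource: no memory is
    //    committed up front, tiles are mapped in later via UpdateTileMappings.
    D3D12_RESOURCE_DESC texDesc = {};
    texDesc.Dimension        = D3D12_RESOURCE_DIMENSION_TEXTURE2D;
    texDesc.Width            = texWidth;
    texDesc.Height           = texHeight;
    texDesc.DepthOrArraySize = 1;
    texDesc.MipLevels        = mipCount;
    texDesc.Format           = DXGI_FORMAT_BC7_UNORM;   // assumed asset format
    texDesc.SampleDesc       = {1, 0};
    texDesc.Layout           = D3D12_TEXTURE_LAYOUT_64KB_UNDEFINED_SWIZZLE; // required for reserved textures
    device->CreateReservedResource(&texDesc, D3D12_RESOURCE_STATE_COMMON,
                                   nullptr, IID_PPV_ARGS(outTexture));

    // 2) The feedback map is a small opaque MIN_MIP resource paired with the
    //    texture, written by shaders through a special-purpose UAV.
    D3D12_HEAP_PROPERTIES heapProps = {};
    heapProps.Type = D3D12_HEAP_TYPE_DEFAULT;

    D3D12_RESOURCE_DESC1 fbDesc = {};
    fbDesc.Dimension        = D3D12_RESOURCE_DIMENSION_TEXTURE2D;
    fbDesc.Width            = texWidth;
    fbDesc.Height           = texHeight;
    fbDesc.DepthOrArraySize = 1;
    fbDesc.MipLevels        = 1;
    fbDesc.Format           = DXGI_FORMAT_SAMPLER_FEEDBACK_MIN_MIP_OPAQUE;
    fbDesc.SampleDesc       = {1, 0};
    fbDesc.Layout           = D3D12_TEXTURE_LAYOUT_UNKNOWN;
    fbDesc.Flags            = D3D12_RESOURCE_FLAG_ALLOW_UNORDERED_ACCESS;
    fbDesc.SamplerFeedbackMipRegion = {256, 256, 1};     // one feedback texel per 256x256 region (tunable)
    device->CreateCommittedResource2(&heapProps, D3D12_HEAP_FLAG_NONE, &fbDesc,
                                     D3D12_RESOURCE_STATE_UNORDERED_ACCESS,
                                     nullptr, nullptr, IID_PPV_ARGS(outFeedback));

    // Bind the two together: this is the UAV the shader writes feedback through.
    device->CreateSamplerFeedbackUnorderedAccessView(*outTexture, *outFeedback,
                                                     feedbackUavSlot);
}

None of this needs Series X/S-specific hardware; the custom parts on console sit underneath this API surface (the texture filter and the I/O path), not in the resource model itself.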
 
Perhaps this has been discussed before, but it looks like the roadmap for games that utilize this means they will need to run from the SSD, and the current option of using external HDDs will only be for cold storage. Or am I off-base here?
 
We've known that games that make use of Velocity Architecture will only run on the built-in SSD or the little SSD add-on thingy for over a year now.
 

I figured I was stating the obvious. I just wonder how this will be conveyed to the consumer, if at all.
 
The custom texture filter hw is the only one we know about. I wouldn't be surprised if there are other hw accelerators in the decomp block for other related features.
Not the only thing.
If I remember correctly, it also stores some texture map lookup data in hardware or something (can't remember exactly right now), but that's also custom. Possibly something else as well.

In the end it's a question of how much difference even these minor customizations make; they could be things that bring a big effective performance improvement, or better quality of results, etc.
 
SFS can be adopted on any D3D12 feature level 12_2 capable system, but right now it only makes sense on Xbox Series X|S because of the UMA and low-latency I/O subsystem.

Unless by streaming you mean streaming from main memory to VRAM.
 
Sampler feedback is usable on any D3D12 system that supports it, and it's perfectly fine to use on PC; there's no reason it wouldn't be usable. If you're pulling in resources, regardless of whether from I/O or from VRAM, you may still want to know what you sampled.
 
Andrew Yeung (from the DirectStorage team at MS) basically states that PC developers should start getting ready for next gen now, by implementing Sampler Feedback in their engines and getting their engines set up to be very granular and parallel for these workloads.

Basically, PC DX12U GPUs are perfectly capable of Sampler Feedback Streaming; however, to truly get the most out of the technology, you're going to want your engine to be DirectStorage-ready so that you can actually lean hard into the idea of only streaming what's necessary.
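To give a rough idea of what "DirectStorage-ready" looks like in practice, here's a hedged sketch of enqueueing a single 64 KB tile read with the DirectStorage for Windows API. The pack file name, offsets, and the upload buffer are placeholders, and exact struct fields may differ between SDK versions; the point is lots of small, parallel requests rather than one big blocking read.

#include <dstorage.h>
#include <wrl/client.h>
using Microsoft::WRL::ComPtr;

// Assumes d3dDevice, uploadBuffer, fence, fenceValue and tileOffsetInFile
// already exist elsewhere; the names here are placeholders for illustration.
void EnqueueTileRead(ID3D12Device* d3dDevice, ID3D12Resource* uploadBuffer,
                     ID3D12Fence* fence, UINT64& fenceValue, UINT64 tileOffsetInFile)
{
    ComPtr<IDStorageFactory> factory;
    DStorageGetFactory(IID_PPV_ARGS(&factory));

    ComPtr<IDStorageFile> file;
    factory->OpenFile(L"assets/textures.pak", IID_PPV_ARGS(&file)); // hypothetical pack file

    DSTORAGE_QUEUE_DESC queueDesc{};
    queueDesc.Capacity   = DSTORAGE_MAX_QUEUE_CAPACITY;
    queueDesc.Priority   = DSTORAGE_PRIORITY_NORMAL;
    queueDesc.SourceType = DSTORAGE_REQUEST_SOURCE_FILE;
    queueDesc.Device     = d3dDevice;

    ComPtr<IDStorageQueue> queue;
    factory->CreateQueue(&queueDesc, IID_PPV_ARGS(&queue));

    // One request = one small, self-contained read; SFS-style streaming issues
    // many of these in parallel, driven by the feedback maps.
    DSTORAGE_REQUEST request{};
    request.Options.SourceType      = DSTORAGE_REQUEST_SOURCE_FILE;
    request.Options.DestinationType = DSTORAGE_REQUEST_DESTINATION_BUFFER;
    request.Source.File.Source      = file.Get();
    request.Source.File.Offset      = tileOffsetInFile;
    request.Source.File.Size        = 64 * 1024;          // one 64 KB tile
    request.Destination.Buffer.Resource = uploadBuffer;
    request.Destination.Buffer.Offset   = 0;
    request.Destination.Buffer.Size     = 64 * 1024;
    request.UncompressedSize            = 64 * 1024;

    queue->EnqueueRequest(&request);
    queue->EnqueueSignal(fence, ++fenceValue); // fence is signaled when this batch completes
    queue->Submit();
}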
 
I'm kind of curious about what the 100GB figure actually means.

The 100GB figure was just a number used for marketing. They can use more or they can use less.
 
I'm kind of curious about what the 100GB figure actually means.

I did read this again: https://news.xbox.com/en-us/2020/07/14/a-closer-look-at-xbox-velocity-architecture/

But it doesn't really say anything about that. Is it just an example of a typical game size? Is it a method of addressing virtual memory of 100GB? Is the number actually relevant for anything?
It's the amount of virtual RAM available to a game that's instantly accessible to the CPU/GPU. It's possible that there is more than 100GB, though, depending on whether there's enough space to store the translation of virtual to physical addresses. Possibly 100GB is a rough estimate for the minimum amount that developers should bank on.
 
Possible that there is more than 100GB

As said multiple times earlier in this thread, the 100GB figure was strictly something said for easy marketing. They can use more than that.
 
The virtual memory is the size of the assets in the game. If the game's install size is 200GB, the vast majority of that will be assets.
 
It's actually an address space spanning data on the SSD (disk) and physical RAM that the game (running on the CPU and GPU) treats as if it were all in physical RAM. When an asset is needed it's brought into physical RAM, and the CPU/GPU doesn't notice; it still sees one flat address space. So there has to be some limit on the size of the virtual memory, depending among other things on the size of RAM and the other caches available to store the virtual-to-physical address translations. I did a rough calculation and got 500GB of virtual RAM for the Series X, but I could be very wrong. MSFT stating 100GB means that 16GB of physical RAM and 84GB of the game install on the SSD will appear to be in the same place (virtual RAM). So developers can bank on this when making their games: they don't need 84GB of extra physical RAM when they can instantly access that data from the SSD as needed. But as others have stated, 100GB was used as a marketing figure, and it wouldn't be surprising if a larger portion of a game install (more than 84GB of it on the SSD) can be mapped into the virtual address space.
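To put a rough number on the "space to store the translation" part, here's a back-of-envelope calculation. It assumes 64 KB tiles (the D3D12 reserved-resource tile size) and an assumed 8 bytes of bookkeeping per tile; the real mapping structures on the console aren't public, so treat this purely as an illustration.

#include <cstdint>
#include <cstdio>

int main()
{
    constexpr uint64_t virtualSpaceBytes = 100ull << 30;  // the 100 GB figure, taken as GiB
    constexpr uint64_t tileBytes         = 64ull  << 10;  // 64 KB tiles
    constexpr uint64_t tileCount         = virtualSpaceBytes / tileBytes;   // 1,638,400 tiles
    constexpr uint64_t bytesPerEntry     = 8;              // assumed per-tile mapping entry
    constexpr double   overheadMiB       = tileCount * bytesPerEntry / (1024.0 * 1024.0);

    // ~1.64 million tiles, ~12.5 MiB of mapping data: tracking a 100 GB
    // virtual space is cheap next to 16 GB of physical RAM.
    std::printf("tiles: %llu, mapping overhead: %.1f MiB\n",
                static_cast<unsigned long long>(tileCount), overheadMiB);
    return 0;
}

Even if the real per-tile bookkeeping is several times larger than assumed here, the overhead stays tiny, which fits the idea that 100GB is a convenient floor for marketing rather than a hard architectural limit.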
 

I know you already know this, but just in case others haven't noticed your previous posts.

This can obviously also be smaller than 100 GB. There's no reason to reserve a larger amount of space on the SSD than any particular game requires. Not all games will have 100 GB of game assets that need to be stored in a directly accessible manner on the SSD.

Ori and the Will of the Wisps is ~15 GB installed (corrected from the ~80 GB I originally wrote), so assets that would benefit from being directly accessible as if they were in memory will be some amount less than 15 GB.

I know people like to compare it to virtual memory, but people have to remember that it's not like virtual memory as it works in Windows. In Windows, programs are swapped into and out of virtual memory, which incurs both read and write costs. As well, IIRC, you can't easily read individual items out of virtual memory.

Velocity arch. is an extension of the changes they've made to the I/O subsystem as a whole. There's no swapping of data into and out of this SSD storage space, so there is no write performance cost. You can also read only what you immediately require out of this space, versus swapping either the entire program and data, or pages/chunks of virtual memory, into memory. And all of this is being done with the hardware assist that's been built into the SOC for the I/O subsystem (they can likely leverage the hardware that is used for SFS). This means you can likely pull fragments of data directly from this space rather than whole items.

Regards,
SB
 
Ori and the Will of the Wisps is ~80 GB installed, so assets that would benefit from being directly accessible as if they were in memory will be some amount less than 80 GB.
Umm....Ori and the Will of the Wisps is nowhere near 80gb installed.
 