Velocity Architecture - Limited only by asset install sizes

Is your complaint that I used the word 'fluff'? Do you consider that derogatory? This is a B3D technical thread in a B3D technical forum. The quality of our reference materials matters for how we interpret their language to gain a technical understanding. As this piece is not designed for forum dwellers like us, it doesn't have much technical merit in this discussion, no?

I don't get what you're arguing. I'm pointing out that because this isn't a technical document, "just in time" doesn't necessarily mean "with minimal latency"; it's likely simplified language meant to tell that audience "quickly, compared with how things were done before".

In a technical discussion about the potential non-stick chemistry of a new range of pots, would you look to the language of the marketing materials to try and understand what they are doing?

My first impression was that you were calling the mere mention of instant access to 100 GB of data on the SSD "fluff".

I still have trouble with the term in reference to the article. The article adds nothing new for us, but our knowledge of the situation is built from consuming and sharing a plethora of articles, interviews, tweets and other sources. The purpose of the article is to enlighten the reader with the gist of the knowledge we have gathered, without all the effort.

It adds no practical value to our discussion because that’s not its intent. Its purpose isn’t to better inform the already well informed.

Marketing fluff often involves a ton of wordiness and superlatives.
 
It adds no practical value to our discussion because that’s not its intent. Its purpose isn’t to better inform the already well informed.
Which is all I meant by 'fluff'. It's not a reference point for us in this discussion. I wasn't commenting on its value as a piece of public-facing marketing material; that's OT for this discussion.
 
I think you've clearly addressed the lack of clear, new and detailed information. A lot of people still think there is some sort of separate 100GB of storage or something weird like that. Although the way they've mentioned virtual RAM makes me believe they've added extra hardware to make this much more efficient. I thought they'd add an HBCC for finer granularity in paging and RAM utilization for all data. For example, when Phil says that the developer doesn't have to think about the limits of RAM when developing a game, it just screams HBCC, and that's not a trivial statement. Request any file and the controller will determine which page is actually in physical memory. And I know game devs like low-level control of the hardware, and AFAIK AMD's HBCC is programmable. If there's one thing where I think they need to add more clarification, it's the virtual RAM. If it were just normal virtual RAM they shouldn't have made it a marketing point.
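To make the "developer doesn't think about RAM limits" idea concrete, here's a rough software analogy (purely my own toy example using an OS memory map, not anything Microsoft has described): map a big asset file as if it were memory, touch whatever address you need, and the fault-handling machinery pulls the page in behind your back. An HBCC-style controller would just do that servicing in hardware.

```python
# Toy analogy only: mmap exposes a (hypothetical) huge asset file as plain
# memory. Touching a page that isn't resident raises a fault that the OS
# services transparently; an HBCC-like controller would do the same in hardware.
import mmap

with open("assets.bin", "rb") as f:                # hypothetical 100 GB asset blob
    view = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    texel = view[5 * 1024 * 1024 * 1024]           # read one byte ~5 GB in;
                                                   # only the touched page needs
                                                   # to be brought into RAM
    view.close()
```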

As per Radeon's Pro SSG card, it could be similar to how they potentially use their onboard NVMe ... a 100GB memory-mapped space ...

"AMD indicated that it is not using the NAND pool as a memory-mapped space currently, but it can do so, and it is a logical progression of the technology. The company also commented that some developers are requesting a flat memory model. Even though the company provides the entire pool as storage, in the future it can partition a portion of the pool, or all of it, to operate as memory, which will open the door to many more architectural possibilities."

This would be my assumption too. HBCC could describe what the XSX is doing perfectly based on the limited information that we have; it's AMD tech, which is largely what the XSX is composed of, and there's precedent for its use in this fashion in the form of the ProSSG. Future PCs with a similarly evolved HBCC controller could potentially do the same, but with DRAM acting as an extra level of cache between the SSD and VRAM to make up for any shortcomings of lower-end PC IO.
 
This would be my assumption too. HBCC could describe what the XSX is doing perfectly based on the limited information that we have; it's AMD tech, which is largely what the XSX is composed of, and there's precedent for its use in this fashion in the form of the ProSSG. Future PCs with a similarly evolved HBCC controller could potentially do the same, but with DRAM acting as an extra level of cache between the SSD and VRAM to make up for any shortcomings of lower-end PC IO.

Makes me wonder how HBCC (if the consoles employ similar functionality) really works. Does the DRAM actually mimic a traditional cache? It would seem to me that if you wanted to avoid writes to the SSD as much as possible, then that would seriously influence the cache design of the DRAM/SSD portions of the hierarchy.
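For what it's worth, if the DRAM tier only ever holds immutable game assets, the write problem largely disappears: evicted pages can simply be dropped rather than written back. A tiny sketch of that idea (my own speculation, not how the consoles are actually implemented):

```python
# Sketch of a read-only staging cache between SSD and VRAM (pure speculation
# about how such a hierarchy could behave, not a description of real hardware).
# Because game assets are immutable at runtime, evicted pages are just dropped;
# nothing is ever written back to the SSD, which sidesteps endurance concerns.
from collections import OrderedDict

class ReadOnlyPageCache:
    def __init__(self, capacity_pages, read_page_from_ssd):
        self.capacity = capacity_pages
        self.read_page = read_page_from_ssd      # callable: page_id -> bytes
        self.pages = OrderedDict()               # LRU order

    def get(self, page_id):
        if page_id in self.pages:
            self.pages.move_to_end(page_id)      # mark as recently used
            return self.pages[page_id]
        data = self.read_page(page_id)           # miss: fetch from SSD
        self.pages[page_id] = data
        if len(self.pages) > self.capacity:
            self.pages.popitem(last=False)       # clean eviction, no write-back
        return data
```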
 
From one of James Stanard's recent tweets, he seems to allude to the "100GB of assets" wording simply referring to the fact that the average game install size is around 100GB.
 
From one of James Stanard's recent tweets, he seems to allude to the "100GB of assets" wording simply referring to the fact that the average game install size is around 100GB.

From what I can tell from the Vega whitepaper, HBCC specifically allows the game to see the SSD and video memory as a single unified pool of memory. The HBCC controller automatically moves the most recently or commonly used pages from the "slow memory segment" (SSD) to the "fast memory segment" (VRAM) to keep things running as fast as possible, but pages can be called directly from the slow memory segment if for whatever reason they haven't been pre-cached in the fast memory segment.

This, coupled with relatively fast, low-latency access to the slow memory segment (SSD), would certainly fit the Microsoft description of what the XSX is doing with "100GB of instantly accessible game data", given the presumed game size of 100GB.

The game effectively sees the entire SSD as VRAM.
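Here's a toy model of that "single unified pool" behaviour, just to illustrate the whitepaper description (all names and sizes are invented): the game addresses one flat range, and a residency table decides whether a given page is served from the fast segment or demand-loaded from the slow one.

```python
# Toy model of the unified pool view from the Vega whitepaper: the game
# addresses one flat range; a page table records whether each page currently
# lives in the fast segment (VRAM) or the slow segment (SSD). All names and
# sizes here are illustrative, not real driver structures.
PAGE = 64 * 1024                          # 64 KB pages, as an example

class HbccLikePool:
    def __init__(self, resident_pages):
        self.resident = set(resident_pages)   # pages pre-cached in VRAM

    def locate(self, address):
        page = address // PAGE
        segment = "fast (VRAM)" if page in self.resident else "slow (SSD)"
        return segment, page, address % PAGE

pool = HbccLikePool(resident_pages={0, 1, 2})
print(pool.locate(100))                   # ('fast (VRAM)', 0, 100)
print(pool.locate(10 * PAGE + 4))         # ('slow (SSD)', 10, 4) -> demand-load
```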
 
From what I can tell from the Vega whitepaper, HBCC specifically allows the game to see the SSD and video memory as a single unified pool of memory. The HBCC controller automatically moves the most recently or commonly used pages from the "slow memory segment" (SSD) to the "fast memory segment" (VRAM) to keep things running as fast as possible, but pages can be called directly from the slow memory segment if for whatever reason they haven't been pre-cached in the fast memory segment.

This, coupled with relatively fast, low-latency access to the slow memory segment (SSD), would certainly fit the Microsoft description of what the XSX is doing with "100GB of instantly accessible game data", given the presumed game size of 100GB.

The game effectively sees the entire SSD as VRAM.
Is there some special way the game files have to be packaged on the SSD for this all to work?
 
It seems that the DirectStorage riddle has been resolved:

We have long suspected that MS has figured out a way of memory-mapping a portion of the SSD to reduce the I/O overhead considerably. I looked for research on SSD storage from Xbox research members with no success, until I realised that I was looking in the wrong place to begin with. MS Research happens to count within its ranks Anirudh Badam as Principal Research Scientist, who has a paper published by IEEE on the concept of FlashMap, which subsumes three layers of address translation into one (https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/flashmap_isca2015.pdf). The claimed performance gain is a reduction in SSD access latency of up to 54%.
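The core trick in the paper, reduced to a back-of-the-napkin sketch (data structures invented for illustration; the real design is far more involved): instead of walking the process page table, the file system's extent map and the SSD's flash translation layer separately, you keep one combined table from virtual page straight to flash location.

```python
# Illustration of the FlashMap idea from the linked paper: collapse three
# chained translations into one combined lookup. Dictionaries stand in for
# the real data structures purely for demonstration.

# Conventional path: three hops
page_table    = {0x1000: ("game.pak", 0x4000)}    # virtual page -> file offset
fs_extent_map = {("game.pak", 0x4000): 0x9_0000}  # file offset  -> logical block
ftl           = {0x9_0000: 0x3F_2000}             # logical block -> flash page

def translate_3_level(vpage):
    file_off = page_table[vpage]
    logical  = fs_extent_map[file_off]
    return ftl[logical]

# FlashMap-style path: one combined hop
flashmap = {0x1000: 0x3F_2000}                    # virtual page -> flash page

def translate_flashmap(vpage):
    return flashmap[vpage]

assert translate_3_level(0x1000) == translate_flashmap(0x1000)
```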
 
It seems that the DirectStorage riddle has been resolved:

We have long suspected that MS has figured out a way of memory-mapping a portion of the SSD to reduce the I/O overhead considerably. I looked for research on SSD storage from Xbox research members with no success, until I realised that I was looking in the wrong place to begin with. MS Research happens to count within its ranks Anirudh Badam as Principal Research Scientist, who has a paper published by IEEE on the concept of FlashMap, which subsumes three layers of address translation into one (https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/flashmap_isca2015.pdf). The claimed performance gain is a reduction in SSD access latency of up to 54%.

Yet many thought they knew better than MS and were sure they couldn't resolve this :p
 
I can't see why there would be.
Currently many assets are compressed and bundled into .PAK files (very similar to .zip) that live in the NTFS filesystem, so to access a particular asset you are dealing with the NTFS filesystem and file I/O. I'd wager that not putting all the assets into massive .PAK files that need to be navigated would improve latency. How much though? ¯\_(ツ)_/¯
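For context, a .PAK-style archive is usually just a small index of name → (offset, size) entries, so a fetch is one seek and one read inside an already-open file rather than a separate open() through the filesystem per asset. A generic sketch of that pattern (not any particular engine's format):

```python
# Generic .PAK-style read: a small in-memory index maps asset names to
# (offset, size) inside one big archive file. Index contents are hypothetical.
def read_asset(pak_path, index, name):
    offset, size = index[name]               # index built once at load time
    with open(pak_path, "rb") as pak:
        pak.seek(offset)                     # one seek...
        return pak.read(size)                # ...one read, no per-asset open()

# hypothetical index:
# index = {"rock_albedo.dds": (0, 1048576), "rock_normal.dds": (1048576, 1048576)}
# data = read_asset("game.pak", index, "rock_normal.dds")
```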
 
Currently many assets are compressed and bundled into .PAK files (very similar to .zip) that live in the NTFS filesystem, so to access a particular asset you are dealing with the NTFS filesystem and file I/O. I'd wager that not putting all the assets into massive .PAK files that need to be navigated would improve latency. How much though? ¯\_(ツ)_/¯
You wouldn't use any packaging at all for optimal performance. Well, maybe a small package for things that must always be loaded. As long as you don't get any bandwidth problems (which I don't expect at all at the high bandwidths the new consoles will give), there is no need to pack anything at all. Packing would only lead to wasted bandwidth (because you will load things you don't need) or to wasted CPU cycles. "Files" (or whatever we want to call the data) can still be compressed individually. E.g. textures that are divided into small chunks can still be compressed individually, or you use one file and only load the parts of the texture you need, but then you can't compress the file without wasting CPU or bandwidth when you want to read only small parts of it.
This would be overkill for an HDD but not for an SSD.
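A quick sketch of the "compress each chunk individually" idea (illustrative only, using zlib as a stand-in for whatever codec a console would actually use): each fixed-size chunk compresses on its own, so any part of a texture can be fetched and decompressed without touching the rest, at the cost of a slightly worse overall ratio.

```python
# Per-chunk compression: any chunk can be read and decompressed independently.
import zlib

CHUNK = 64 * 1024                                   # 64 KB chunks, arbitrary

def pack_chunks(data):
    return [zlib.compress(data[i:i + CHUNK]) for i in range(0, len(data), CHUNK)]

def read_chunk(chunks, index):
    return zlib.decompress(chunks[index])           # touch only what you need

chunks = pack_chunks(b"\x00" * (256 * 1024))        # stand-in for texture data
assert read_chunk(chunks, 2) == b"\x00" * CHUNK
```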
 
It seems that the DirectStorage riddle has been resolved:

We have long suspected that MS has figured out a way of memory-mapping a portion of the SSD to reduce the I/O overhead considerably. I looked for research on SSD storage from Xbox research members with no success, until I realised that I was looking in the wrong place to begin with. MS Research happens to count within its ranks Anirudh Badam as Principal Research Scientist, who has a paper published by IEEE on the concept of FlashMap, which subsumes three layers of address translation into one (https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/flashmap_isca2015.pdf). The claimed performance gain is a reduction in SSD access latency of up to 54%.
So, that is why the OS still needs that much memory?
[Attached image: flashmapindex.png]

Or that is the reason why they say 100GB and not 1TB ^^
 
You wouldn't use any packaging at all for optimal performance. Well, maybe a small package for things that must always be loaded. As long as you don't get any bandwidth problems (which I don't expect at all at the high bandwidths the new consoles will give), there is no need to pack anything at all. Packing would only lead to wasted bandwidth (because you will load things you don't need) or to wasted CPU cycles. "Files" (or whatever we want to call the data) can still be compressed individually.

Yeah, this is a tricky one. If Series X's filesystem is predicated on NTFS then pulling tens of thousands of individual asset files out of .PAK files for every game would result in massive filesystem bloat. I think Microsoft probably have some sensible middle ground.
 
It seems that the DirectStorage riddle has been resolved:

We have long suspected that MS has figured out a way of memory-mapping a portion of the SSD to reduce the I/O overhead considerably. I looked for research on SSD storage from Xbox research members with no success, until I realised that I was looking in the wrong place to begin with. MS Research happens to count within its ranks Anirudh Badam as Principal Research Scientist, who has a paper published by IEEE on the concept of FlashMap, which subsumes three layers of address translation into one (https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/flashmap_isca2015.pdf). The claimed performance gain is a reduction in SSD access latency of up to 54%.

Superb find!

This is more or less what I was trying to suggest earlier in this thread, as it would explain the lower overhead, a finite size for the mapped space (the 100GB) and, crucially, the talk of low latency. Except this is a lot more detailed. And done by people who are actually clever.

So, that is why the OS still needs that much memory?
[Attached image: flashmapindex.png]

Or that is the reason why they say 100GB and not 1TB ^^

My speculation a little while back was that the "100 GB" comment was entirely due to limiting the amount of reserved OS space required. ~200MB would be a small price to pay in terms of reserved memory, and as it's in the OS space the developer never needs to worry about it. Thinking about it, being able to store parts of the dash and in-game user interface in a similar fashion might well actually allow a smaller OS reserve overall. It's down from 3GB to 2.5GB despite potentially storing a "Flash Map" for the game.
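For what it's worth, the ~200MB figure falls out of a simple back-of-envelope calculation if you assume 4KB pages and 8 bytes per translation entry (both of which are my assumptions, not anything Microsoft has stated):

```python
# Back-of-envelope check of the ~200 MB figure above, under my own assumptions
# (4 KB pages, 8 bytes per map entry); nothing here comes from Microsoft.
mapped_bytes = 100 * 1024**3              # the "100 GB" of mapped game data
page_size    = 4 * 1024                   # assumed page granularity
entry_size   = 8                          # assumed bytes per translation entry

entries   = mapped_bytes // page_size     # 26,214,400 pages
map_bytes = entries * entry_size          # 209,715,200 bytes
print(map_bytes / 1024**2)                # 200.0 MiB
```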

And I'm going to hold to my guess that they'd build the "Flash Map" during install, and simply load it along with the game, or when you switch to a "recently played" game from a resume slot.

Shit, I'm missing the big presentation!
 
My speculation a little while back was that the "100 GB" comment was entirely due to limiting the amount of reserved OS space required. ~200MB would be a small price to pay in terms of reserved memory, and as it's in the OS space the developer never needs to worry about it. Thinking about it, being able to store parts of the dash and in-game user interface in a similar fashion might well actually allow a smaller OS reserve overall. It's down from 3GB to 2.5GB despite potentially storing a "Flash Map" for the game.
I'd suggest that unless Microsoft have plans for Series X that we don't know about, like allowing it to run Windows, they just don't need the full implementation and bloat of metadata storage for their filesystem. You can save a ton of space, and increase I/O, by chucking out everything you don't need.
 
Regarding SFS, does the "approximately 2.5x the effective I/O throughput and memory usage above and beyond the raw hardware capabilities on average" statement come on top of the 4.8GB/s compressed number?
 
Regarding SFS, does the "approximately 2.5x the effective I/O throughput and memory usage above and beyond the raw hardware capabilities on average" statement come on top of the 4.8GB/s compressed number?

Yes. It is multiplicative, in that it minimizes how much data you need to send.

Assume you originally need to load 2.5 GB of data without compression and without SFS; here is how they interact...

So with compression of 2x that original 2.5 GB of data may be down to 1.25 GB of data.
So with SFS that original 2.5 GB of data may be down to 1 GB of data.
So with SFS and compression that 2.5 GB of data may be down to 0.5 GB of data.
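Or, spelled out as a quick calculation (using the 2x and 2.5x averages quoted in the thread; real savings will vary scene by scene):

```python
# The arithmetic from the post above, spelled out. The 2x and 2.5x factors are
# the quoted averages; actual savings depend on the content.
raw               = 2.5                    # GB needed with no compression, no SFS
compression_ratio = 2.0                    # average compression factor
sfs_factor        = 2.5                    # average SFS "only load what's sampled" factor

print(raw / compression_ratio)                     # 1.25 GB: compression only
print(raw / sfs_factor)                            # 1.0  GB: SFS only
print(raw / (compression_ratio * sfs_factor))      # 0.5  GB: both combined
```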
 
Yes. It is multiplicative, in that it minimizes how much data you need to send.

Assume you originally need to load 2.5 GB of data without compression and without SFS; here is how they interact...

So with compression of 2x that original 2.5 GB of data may be down to 1.25 GB of data.
So with SFS that original 2.5 GB of data may be down to 1 GB of data.
So with SFS and compression that 2.5 GB of data may be down to 0.5 GB of data.

Thanks for your answer. Is SFS something that adds more work for developers, in that they have to manually manage which textures to load and when, based on the sampler feedback?

I guess, given how effective SFS is, the vast majority of the data being streamed from the SSD is textures?
 
Regarding SFS, does the "approximately 2.5x the effective I/O throughput and memory usage above and beyond the raw hardware capabilities on average" statement come on top of the 4.8GB/s compressed number?
6GB/s, the max number they advertised, is "approximately 2.5x the effective I/O throughput and memory usage above and beyond the raw hardware capabilities on average". Is that just a coincidence? Asking those who know.
 