Velocity Architecture - Limited only by asset install sizes

Please excuse the mention of the PC, @BRiT, but I want to tie a few things together with reference to the XSX and the capabilities of the machine.

[Attached image: RTX_IO.jpg - Nvidia's RTX IO slide]


What I find interesting here is that these "RTX IO" numbers kinda match what MS said about XSX.

If you take the green bar at 14 GB/s and divide it by the 2.4 GB/s of the XSX SSD you get 5.83. Now divide the 0.5 cores for "RTX IO" by that 5.83 and you get roughly 0.086 cores.

That's (un)suspiciously close to the CPU overhead figures MS were throwing around for Direct Storage when they were revealing the XSX specs, even accounting for different CPU cores and workloads. The savings are similarly staggering.

[Edit: I may have boobed a little. I assumed "Read Bandwidth GB/s" meant the drive's read bandwidth - literally what is being read from the actual drive (e.g. 14 GB/s from two RAIDed PCIe 4 SSDs). But if the bars are showing output after reading from the drive and processing / decompression using something like BCPack (which wouldn't make much sense to me) ... well ... that only makes the XSX look *even better* in terms of relative overhead. Anyway, it doesn't change my conclusions below one bit!]
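For anyone who wants to sanity check the arithmetic, here's the back-of-the-envelope version as a quick Python sketch. The 14 GB/s and 0.5 cores are just what I'm reading off Nvidia's slide, the 2.4 GB/s is the XSX's raw SSD rate, and the straight-line scaling is purely my assumption:

    # Rough, linear scaling of RTX IO's quoted CPU cost down to the XSX's SSD rate.
    rtx_io_bandwidth_gbs = 14.0   # green bar on Nvidia's slide
    rtx_io_cpu_cores = 0.5        # CPU cores Nvidia quote at that rate
    xsx_ssd_gbs = 2.4             # XSX raw SSD bandwidth

    ratio = rtx_io_bandwidth_gbs / xsx_ssd_gbs     # ~5.83x
    scaled_cores = rtx_io_cpu_cores / ratio        # ~0.086 cores
    print(f"bandwidth ratio: {ratio:.2f}x, scaled CPU cost: {scaled_cores:.3f} cores")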


So anyway, we already know that the XSX GPU can read directly from the SSD without it having to go into memory first, that the GPU can process that data and then write it out to GPU memory, and that doing so using Direct Storage has a similarly low (almost negligible) overhead on the CPU.

Basically, XSX can already do what Nvidia have cleverly branded "RTX IO". At very low cost XSX can pull data directly into the GPU, process it, and write it out to memory for later use. The only differences I can see at this point are that XSX can (optionally) put it through its hardware decompression block first, and that Nvidia aren't tied to a 2.4 GB/s drive.

Then again, it's not like MS can't release an optional faster drive at some point ... in theory. Whether that would make sense is another matter, but I'm pretty sure they could, and they could pump the data straight to the GPU to do whatever decompression they wanted to just like Nvidia are showing in the slide above. It's not like the XSX couldn't afford the CPU overhead. ;)
 
Basically, XSX can already do what Nvidia have cleverly branded "RTX IO"
I don't really see them as competing.
The Velocity Architecture is the implementation of the DirectStorage API plus a superset of features on top.
RTX IO is making use of DirectStorage as well, which doesn't seem to be ready yet, on PC anyway.
They both will make use of the new API, so they had to call their thing something, even if it is marketing, the same way AMD will call theirs something too.
 
Quickly back to my previous post: if those "Read Bandwidth" figures are not for the SSD itself but actually mean "read / decompress / output" (which wouldn't apply to the first 3 entries on the graph, so wth), then for them to mean anything they would need to be referring to some kind of fixed-rate compression algorithm. The only one I can think of offhand that gives an exact 2:1 ratio is BCPack.

Which might fit, as 2:1 lossless compression of textures would make perfect sense to do on the GPU itself.
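Purely as an illustration of that reading (the 2:1 ratio and the implied 7 GB/s raw figure below are my assumptions, not anything stated on the slide):

    # If "Read Bandwidth" were post-decompression output with a fixed-ratio codec,
    # the implied raw drive rate would just be the bar divided by that ratio.
    output_gbs = 14.0      # the 14 GB/s bar on the slide
    fixed_ratio = 2.0      # assumed fixed 2:1 compression (e.g. BCPack-style)
    implied_raw_gbs = output_gbs / fixed_ratio   # 7.0 GB/s raw from the drive(s)
    print(f"implied raw drive bandwidth: {implied_raw_gbs} GB/s")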

I don't really see them as competing.
The Velocity Architecture is the implementation of the DirectStorage API plus a superset of features on top.
RTX IO is making use of DirectStorage as well, which doesn't seem to be ready yet, on PC anyway.
They both will make use of the new API, so they had to call their thing something, even if it is marketing, the same way AMD will call theirs something too.

Yeah, I wasn't meaning they were competing as such, more that this is the next stage for everyone and MS are basically already there. But Nvidia - like Apple - are really good at presentation and naming and making it look like their implementation of something is a new idea. Can't blame them.

I've seen some people in some less enlightened corners of the internet declaring that RTX IO makes the XSX look obsolete. But RTX IO is basically Nvidia GPUs supporting MS's new storage API - at some point in the future. Whereas MS, otoh, have had to deliver a complete solution, in totality, from SSD to controllers to decompression blocks to GPU customisations.

I mean, you probably already got that. I just thought it was worth pointing out! :D

Anyway, I just hope that PC RDNA 2 has the same kind of Direct Storage support that XSX does!
 
I'll be pretty amazed if RDNA2 on the PC isn't doing something very similar to RTX IO.

Yeah, my hope is that MS and AMD worked on this together with a view to it going into RDNA 2. Same way Cerny said some of their requests for PS5 would make their way into RDNA 2 for other platforms if they made sense.

Well, this certainly makes sense and it's certainly the right time. And no doubt MS would be chuffed for both PC DX and for XSX/S if this got a good level of support.

But on the other hand ... I'm always expecting to be disappointed.
 
They probably mean in speed; it's faster than both the XSX and PS5 SSD tech. I think not many expected this on the PC.

I think there is some weird thing where people put Sony on a pedestal and think they are ahead of the curve. However, a redesign of IO for Windows has been in the works for almost 5 years at this point. This isn't something new that they thought of since November.
 
They probably mean in speed; it's faster than both the XSX and PS5 SSD tech. I think not many expected this on the PC.

Well, the folks I've seen seem to think this is a new paradigm that makes XSX fundamentally outdated. Even though it's mostly just .... doing in 2021/2022 what XSX is doing this year.

And I very strongly suspected this was possible, I just wasn't sure anyone would be up for changing everything from SSD firmware through to chipset drivers through to GPU. But it's kinda awesome that this is happening. MS have done it on console, and Nvidia will now support their end on PC, and so everyone else has no choice but to come along at some point. Good stuff!

Considering that XSX can decompress on its GPU like Ampere, or on the CPU, or through its custom decompression block (with no hit to GPU or CPU), and that CPU overhead is if anything lower than on PC even with DS and RTX IO .... I'm pretty sure that SSD IO is going to be the least of XSX's problems.*

*That's going to be sales.
 
I think there is some weird thing where people put Sony on a pedestal and think they are ahead of the curve. However, a redesign of IO for Windows has been in the works for almost 5 years at this point. This isn't something new that they thought of since November.

That's because Sony attracts a generally younger audience with PlayStation. NV and MS have probably invested a ton in this tech; NV said in the billions for Ampere.

Well, the folks I've seen seem to think this is a new paradigm that makes XSX fundamentally outdated. Even though it's mostly just .... doing in 2021/2022 what XSX is doing this year.

The PC solution is just faster, but perhaps the same tech. So no, it's not outdated in any way.

But could it be equally or more capable than what is likely quite a cheap hardware unit? I see no evidence to suggest it wouldn't be, and I see Nvidia claiming very specifically that it will be.

And I very strongly suspected this was possible, I just wasn't sure anyone would be up for changing everything from SSD firmware through to chipset drivers through to GPU. But it's kinda awesome that this is happening. MS have done it on console, and Nvidia will now support their end on PC, and so everyone else has no choice but to come along at some point. Good stuff!

Considering that XSX can decompress on its GPU like Ampere, or on the CPU, or through its custom decompression block (with no hit to GPU or CPU), and that CPU overhead is if anything lower than on PC even with DS and RTX IO .... I'm pretty sure that SSD IO is going to be the least of XSX's problems.*

MS and NV/PC are not competing directly; MS worked with Nvidia to create this IO/SSD tech. Together they achieved the fastest SSD tech so far.

Edit: where did pjbliverpool's post-quote come from lol.
 
For Microsoft it actually comes from needing better performance on mobile devices like the Neo. It was a big focus for Windows 10X. The problem is that Windows is like a huge boat and it takes quite a while for the boat to change course. Not only does Microsoft have to do a lot of work to make sure none of this breaks anything, like programs from 30 years ago, but they also need to make sure it comes out as hardware vendors are ready for it. If they had announced it last year and Nvidia didn't have hardware support and NVMe makers didn't have hardware for it, what would be the point?
 
For Microsoft it actually comes from needing better performance on mobile devices like the Neo. It was a big focus for Windows 10X. The problem is that Windows is like a huge boat and it takes quite a while for the boat to change course. Not only does Microsoft have to do a lot of work to make sure none of this breaks anything, like programs from 30 years ago, but they also need to make sure it comes out as hardware vendors are ready for it. If they had announced it last year and Nvidia didn't have hardware support and NVMe makers didn't have hardware for it, what would be the point?

NV has the hardware expertise; GPUs are excellent for decompression, and they perhaps made some changes somewhere to accommodate it. MS certainly has the software and resources; they're also working on XSX / next-gen SSD tech and probably want at least parity on the PC platform.
 

Then again, it's not like MS can't release an optional faster drive at some point ... in theory. Whether that would make sense is another matter, but I'm pretty sure they could, and they could pump the data straight to the GPU to do whatever decompression they wanted to just like Nvidia are showing in the slide above. It's not like the XSX couldn't afford the CPU overhead. ;)

I would guess this is possible, but I heavily suspect they are limited by their encryption.

Xbox One encrypts data in and out of memory / HDD. It is only ever decrypted within the APU. I would think they will continue this, as the prior security has proven very resistant to attack.

What they do now I doubt we will know until the end of the generation, but I think any change to IO will have to fit around whatever is implemented now, and how much overhead this has is unknown.
 
I would guess this is possible, but I heavily suspect they are limited by their encryption.

Xbox One encrypts data in and out of memory / HDD. It is only ever decrypted within the APU. I would think they will continue this, as the prior security has proven very resistant to attack.

What they do now I doubt we will know until the end of the generation, but I think any change to IO will have to fit around whatever is implemented now, and how much overhead this has is unknown.

This is a very good point.

My feeling though is that as MS are going to be using large volumes of the Anaconda silicon in their Azure blades, and that as the Azure team were working closely with Xbox throughout development, they would make sure they could make full use of the PCIe x4 bandwidth.

I wouldn't be surprised to see the Azure devices equipped with faster SSDs than consumer XSX units. Given how expensive server farms are to set up and run I can't imagine they would "cheap out" on a relatively tiny block of silicon, and have no option to use faster drives.

Like you say MS aren't going to be giving out too many details, but down the line perhaps they would be willing to say if their Azure Anaconda units can use top end Gen 4 NVMe drives.
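For reference, a rough calculation of what "full use of the PCIe x4 bandwidth" could mean on a Gen 4 link. These are just the standard PCIe 4.0 figures, nothing XSX- or Azure-specific:

    # Theoretical PCIe 4.0 x4 bandwidth: 16 GT/s per lane with 128b/130b encoding.
    gt_per_s_per_lane = 16.0
    encoding_efficiency = 128.0 / 130.0
    lanes = 4
    bytes_per_transfer = 1.0 / 8.0   # one bit per transfer, 8 bits per byte
    gbs = gt_per_s_per_lane * encoding_efficiency * lanes * bytes_per_transfer
    print(f"PCIe 4.0 x{lanes}: ~{gbs:.2f} GB/s theoretical")   # ~7.88 GB/s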
 
MS does not get anywhere near the kudos it deserves for its incredible ability to produce a new OS that doesn't break everything that runs on the old OS. DirectStorage is genuinely exciting, even if the prospect of regular GPU, SSD and mobo BIOS updates while the tech matures is not.
I mostly agree, but a shit ton of applications/games now need the right Windows compatibility layer settings to run at all. In particular, a lot of games from the late 2000s are just a nightmare to run. Try running Fallout 3 or Elder Scrolls IV: Oblivion without having sacrificed a chicken and arranged its entrails in the right orientation.

Not a knock against Microsoft; you inevitably get to a point where you just need to break code to move forward, but there have been more software-breaking changes to Windows in the last ten years than in the previous 20. Fortunately VMs are such a seamless solution that it's no longer a big deal.
 
This is a very good point.

My feeling though is that as MS are going to be using large volumes of the Anaconda silicon in their Azure blades, and that as the Azure team were working closely with Xbox throughout development, they would make sure they could make full use of the PCIe x4 bandwidth.

I wouldn't be surprised to see the Azure devices equipped with faster SSDs than consumer XSX units. Given how expensive server farms are to set up and run I can't imagine they would "cheap out" on a relatively tiny block of silicon, and have no option to use faster drives.

Like you say MS aren't going to be giving out too many details, but down the line perhaps they would be willing to say if their Azure Anaconda units can use top end Gen 4 NVMe drives.

I can see that the hypervisor it runs, or the mode code is executed under, could be different for Azure data work. This might be able to bypass the encryption, although any route around it is an attack vector.
Given the virtual nature of the hardware provision and the ability to run 4 Xbox One S instances, I think you're right: this will have Azure-specific customisation built into the APU.

With the split board design I think IO is on the daughterboard; this might be completely different for the Azure blades, which is likely not a coincidence.
 
We can see what seems to be a short clip demonstrating SFS/Velocity Architecture in the Inside the Series S vid:

Starts around 2:51 in the vid.
The loading demo is a little surprising. It is the second loading-time demo, and it still only shows about 4x the loading speed of a last-gen console with an HDD.

Is it still using "unoptimized" code? Or is this a typical case, since both loading demos show a 4x~5x actual loading speed improvement?
 