Velocity Architecture - Limited only by asset install sizes

The loading demo is a little surprising. It is the second load-time demo, and it still shows only about a 4x speedup over a last-gen console with an HDD.

Is it still running "unoptimized" code? Or is this the typical case, given that both loading demos show a 4x~5x improvement in actual loading speed?
Which part are you referring to? Outer Worlds?
 
Yea, that's running a BC title. I'm not exactly sure that is the ideal comparison with respect to what new-generation titles would be doing.
It will feel odd if smaller 8th generation games load slower than 9th generation games, but we could indeed be looking at that. I recall similar load times with State of Decay 2 on Series X.
 
SoD2 was a quick switch, so would have been saving down the prior game state and loading a new one.

I think with TOW loading we're not seeing the benefits of hardware decompression.
 
Everything on current gen is zlib compressed and decompressed in hardware before the assets are stored in memory. Zlib is still the default way to decompress things on Xbox Series (except textures, which will use BCPack). So this demo has no choice but to use Zlib via the decompression hardware, since that is how the data is compressed in XB1 games.

"Our second component is a high-speed hardware decompression block that can deliver over 6GB/s," reveals Andrew Goossen. "This is a dedicated silicon block that offloads decompression work from the CPU and is matched to the SSD so that decompression is never a bottleneck. The decompression hardware supports Zlib for general data and a new compression [system] called BCPack that is tailored to the GPU textures that typically comprise the vast majority of a game's package size.


https://www.eurogamer.net/articles/digitalfoundry-2020-inside-xbox-series-x-full-specs
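
To put rough numbers on "matched to the SSD", here is a small sketch. It assumes the 2.4 GB/s raw SSD figure mentioned elsewhere in this thread, and it treats the quoted 6 GB/s as a hard ceiling on the block's output (the quote actually says "over 6GB/s"). The compression ratios and the function name are hypothetical, picked only for illustration.

```python
SSD_RAW_GBPS = 2.4           # raw SSD read speed, as cited elsewhere in the thread
DECOMP_BLOCK_MAX_GBPS = 6.0  # treated as a ceiling here (assumption)


def effective_gbps(compression_ratio):
    """Decompressed data rate reaching memory for a given compression ratio."""
    return min(SSD_RAW_GBPS * compression_ratio, DECOMP_BLOCK_MAX_GBPS)


# Hypothetical compression ratios, not measured values.
for ratio in (1.0, 2.0, 2.5, 3.0):
    print(f"{ratio:.1f}:1 -> {effective_gbps(ratio):.1f} GB/s")
```

Under these assumptions the decompression block would only become the limit once data compresses better than roughly 2.5:1, which fits the claim that it is sized so decompression is not the bottleneck.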
 
Yea, that's running a BC title. I'm not exactly sure that is the ideal comparison with respect to what new-generation titles would be doing.
Well, that is still what MS is showing: just backcompat titles.

MS really needs to focus on newer games here. But I still expect we will sit through many, many loading screens. Not so much because of the games on the SSD, but because of the games that were backed up on an external HDD and must be copied over before they can start. After all, even 1TB is not much, and once the OS takes its share all the consoles have considerably less.
 
Indeed, it is unfortunate that the software side of their strategy really does appear to have fallen through the floor for this launch.
 
The Inside Series S video shows a quick switch between Fallen Order and Minecraft Dungeons. That took under 5 seconds, a bit quicker than the Xbone switching we saw at the X reveal. The 360 games loaded quicker, so I'm not sure if they've optimised quick switch or if Fallen Order/Minecraft are just not using all of the Xbox One's 5.5 GB.
 
I could swear there was a tweet about the games running in BC mode and not being ported/optimized for the SSD. And now I cannot find it.
 
Yes. 12 seconds on XSS and 53 seconds on X1S. Again it shows a 4x~5x loading speed improvement.
Seems possible that the bottleneck is the file processing / decompression (the CPU side is ~3-4x faster, and it's also Unreal Engine sooo..). Durango was limited to 40MB/s (60MB/s for Scorpio), so the SSD should blow away any bottleneck for plain file transfer, leaving it up to processing the data.
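
A rough way to see why a far faster drive might still only give 4x~5x: treat load time as raw transfer time plus CPU-side processing time. This is a toy model, not a measurement. The 520 MB level size and the 40 s of last-gen processing time are made-up numbers chosen so the model lands near the quoted 53 s and 12 s; only the 40 MB/s, 2.4 GB/s and ~3-4x CPU figures come from posts in this thread.

```python
def load_time_s(data_mb, drive_mb_per_s, cpu_seconds):
    """Toy model: total load time = raw transfer time + CPU processing time."""
    return data_mb / drive_mb_per_s + cpu_seconds


def speedup(old_s, new_s):
    """How many times faster the new load is than the old one."""
    return old_s / new_s


# Figures quoted in the thread: 53 s on X1S, 12 s on XSS.
measured_speedup = speedup(53, 12)

# Hypothetical workload (assumptions, picked to roughly match those figures).
DATA_MB = 520         # data read from storage for the load
OLD_CPU_S = 40.0      # last-gen CPU time spent processing / decompressing
CPU_GAIN = 3.5        # middle of the "~3-4x faster CPU" estimate

old_s = load_time_s(DATA_MB, 40, OLD_CPU_S)                # 40 MB/s HDD
new_s = load_time_s(DATA_MB, 2400, OLD_CPU_S / CPU_GAIN)   # 2.4 GB/s SSD

print(f"measured speedup: {measured_speedup:.1f}x")
print(f"model: {old_s:.1f} s -> {new_s:.1f} s ({speedup(old_s, new_s):.1f}x)")
```

In this sketch the transfer time all but disappears on the SSD, so the overall speedup ends up close to the CPU gain rather than the drive gain, which is the bottleneck argument made above.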
 
SoD2 was a quick switch, so would have been saving down the prior game state and loading a new one.
Fair point, switching is a load and a save. And any special efficiencies introduced to make the save faster on next-gen hardware are not going to apply to current-gen games unless/until they are patched.
 
At the very least, all the UE4 games will need to pull in the latest engine patch: UE 4.25, if not a later version.
 
Yea, that's running a BC title. I'm not exactly sure that is the ideal comparison with respect to what new-generation titles would be doing.
The Outer Worlds and both expansions are optimized for Xbox Series X.
It is one of the games supporting Smart Delivery.

https://www.eurogamer.net/articles/xbox-smart-delivery-system-list-6400
https://news.xbox.com/en-us/2020/07/30/the-outer-worlds-peril-on-gorgon-expansion/

The Outer Worlds is not a BC title. It shows about a 4~5x improvement in loading speed.
In fact, the last loading demo also showed a similar improvement.

Does real-world loading speed actually improve by 4~6x?
 
The description of that video you linked:

"Demo uses backward compatible Xbox title to demonstrate load time technology and does not represent gameplay optimized for Xbox Series X."
 
Optimized is not the same as being built for that hardware.
 
If you look up several posts:

https://forum.beyond3d.com/posts/2153858/
 
I guess for starters you'd begin with compression. That should cut the time by about 50% (in theory).
Secondly, you only send as many textures as you need to begin the level, so there is no need to load the level as far out as XBO needs to; you can begin the level and be confident your SSD will fill in the rest.

So by needing less, and compressing further, you should be able to start the level faster. I suspect this is the aim for next-gen fast loading.
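
The two effects above multiply, which a quick sketch makes concrete. Everything here is an assumption for illustration: the 4000 MB level, the 25% up-front fraction, and the function name are hypothetical. The 2:1 ratio stands in for the "50% in theory" figure and 2400 MB/s for the XSX SSD speed cited elsewhere in the thread.

```python
def time_to_start_s(level_mb, fraction_needed, compression_ratio, drive_mb_per_s):
    """Seconds until the level can start, counting only data that must be loaded first."""
    on_disk_mb = level_mb * fraction_needed / compression_ratio
    return on_disk_mb / drive_mb_per_s


LEVEL_MB = 4000   # hypothetical uncompressed level size
DRIVE = 2400      # MB/s

everything_raw = time_to_start_s(LEVEL_MB, 1.0, 1.0, DRIVE)
compressed = time_to_start_s(LEVEL_MB, 1.0, 2.0, DRIVE)           # 2:1 compression
compressed_partial = time_to_start_s(LEVEL_MB, 0.25, 2.0, DRIVE)  # plus only 25% up front

print(f"load everything, uncompressed: {everything_raw:.2f} s")
print(f"load everything, 2:1:          {compressed:.2f} s")
print(f"load 25% up front, 2:1:        {compressed_partial:.2f} s")
```

The sketch ignores CPU-side processing and assumes the rest of the level can be streamed in fast enough during play, which is exactly the bet being described.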
 
Please excuse the mentioning of the PC @BRiT, but I want to tie a few things together with reference to the XSX, and the capabilities of the machine.

[Attached image: RTX_IO.jpg]


What I find interesting here is that these "RTX IO" numbers kinda match what MS said about XSX.

If you take the green bar at 14 GB/s and divide it by the 2.4 GB/s of the XSX SSD you get 5.83. Now divide the 0.5 cores for "RTX IO" by that 5.83 and you get about 0.086 cores, i.e. a bit under a tenth of a core.

That's (un)suspiciously close to the CPU overhead figures MS were throwing around for Direct Storage when they were revealing the XSX specs, even accounting for different CPU cores and workloads. The savings are roughly as staggeringly big.
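
For anyone who wants to check the arithmetic, here it is spelled out. It assumes CPU cost scales linearly with read bandwidth, which is a simplification and not something Nvidia or MS have stated; the variable names are just for illustration.

```python
RTX_IO_READ_GBPS = 14.0   # green bar on the slide
RTX_IO_CPU_CORES = 0.5    # CPU cost shown for "RTX IO"
XSX_SSD_GBPS = 2.4        # XSX raw SSD read speed

# How much more bandwidth the slide's setup has than the XSX SSD.
bandwidth_ratio = RTX_IO_READ_GBPS / XSX_SSD_GBPS

# Scale the CPU cost down by the same factor (linear-scaling assumption).
xsx_equivalent_cores = RTX_IO_CPU_CORES / bandwidth_ratio

print(f"bandwidth ratio: {bandwidth_ratio:.2f}")
print(f"equivalent CPU cost at 2.4 GB/s: {xsx_equivalent_cores:.3f} cores")
```

That works out to a ratio of about 5.83 and a cost of roughly 0.086 cores at XSX drive speeds.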

[Edit: I may have boobed a little. I assumed "Read Bandwidth GB/s" meant the drive's read bandwidth - literally what is being read from the actual drive (e.g. 14 GB/s from two raided PCIe 4 SSDs). But if the bars are showing output after reading from the drive and processing / decompression using something like BCPack (would make no sense to do that to me) ... well ... that only makes the XSX look *even better* in terms of relative overhead. Anyway, doesn't change my conclusions below one bit!]


So anyway, we already know that the XSX GPU can read directly from the SSD without it having to go into memory first, that the GPU can process that data and then write it out to GPU memory, and that doing so using Direct Storage has a similarly low (almost negligible) overhead on the CPU.

Basically, XSX can already do what Nvidia have cleverly branded "RTX IO". At very low cost XSX can pull data directly into the GPU, process it, and write it out to memory for later use. The only differences I can see at this point are that XSX can (optionally) put it through its hardware decompression block first, and Nvidia aren't tied to a 2.4 GB/s drive.

Then again, it's not like MS can't release an optional faster drive at some point ... in theory. Whether that would make sense is another matter, but I'm pretty sure they could, and they could pump the data straight to the GPU to do whatever decompression they wanted to just like Nvidia are showing in the slide above. It's not like the XSX couldn't afford the CPU overhead. ;)

I think this bolded part is a bit misleading. On the XSX there is no system/video RAM distinction, so the decompression block sends data into RAM first and then the GPU can access it. The diagram you attached shows how the decompression sw/hw on the NVIDIA cards bypasses system RAM to place data into VRAM, after which it is used from there. The GPU can only use byte-addressable data from RAM, so it cannot read directly from the SSD.
 