Xbox Series X [XBSX] [Release November 10 2020]

Interesting ML technique for turning a foveated render into a full image. Another form of reconstruction.
Not sure if this would be something they could use, but in theory you could use VRS to render out a very sparse, foveated image and have it reconstructed.
I don't think there is enough processing power to do it, of course, but it's interesting nevertheless.
 
So, I've been going through Microsoft's patents again, and found a few interesting points that I haven't seen referenced before. One was about ML-based approaches to texture compression, which I've linked over in the ML thread, and the other is this one, which may be relevant to the mysterious BCPack:

REDUCING THE SEARCH SPACE FOR REAL TIME TEXTURE COMPRESSION

Methods and devices for real time texture compression may include accessing graphics hardware incompatible compressed textures in a format incompatible with the GPU, and a metadata file associated with the graphics hardware incompatible compressed textures, wherein the metadata file includes at least one hint that provides information to use for compression of decompressed textures from the graphics hardware incompatible compressed textures into hardware compatible compressed textures. The methods and devices may include converting the graphics hardware incompatible compressed textures into the decompressed textures. The methods and devices may include selectively compressing the decompressed textures into the hardware compatible compressed textures usable by the GPU according to the at least one hint from the metadata file. The methods and devices may include transmitting the hardware compatible compressed textures to the GPU.

If I'm reading it right (and abstract aside, it's relatively readable, by patent standards), they're proposing that rather than trying to crunch already-compressed BCn textures down even further, they can store the texture in a non-GPU compatible format that can achieve higher compression ratios, then recompress it in realtime to BCn on load, using metadata hints provided alongside the texture to speed up the (re)compression process.

Of course, it could just be an idea that they came up with during the R&D phase, and not something they've actually implemented in the final product, but maybe it's just crazy enough to actually work.
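To make the idea concrete, here is a rough Python sketch of how such a pipeline could look. It is purely illustrative and not from the patent: zlib stands in for whatever higher-ratio, GPU-incompatible format is actually used, the per-block "hint" is just a pre-computed endpoint pair, and the runtime side emits plain BC1 with no endpoint search.

```python
import struct
import zlib
import numpy as np

# Offline (tool side): store the texture in a GPU-incompatible, higher-ratio
# format (zlib over raw RGB here, purely as a stand-in) plus a metadata file
# holding one pre-searched endpoint pair per 4x4 block.
def pack_offline(texture):                              # texture: (H, W, 3) uint8
    hints = []
    for by in range(0, texture.shape[0], 4):
        for bx in range(0, texture.shape[1], 4):
            block = texture[by:by + 4, bx:bx + 4].reshape(16, 3)
            hints.append((block.max(0).tolist(), block.min(0).tolist()))
    return zlib.compress(texture.tobytes()), hints

def to565(c):                                           # 8-bit RGB -> RGB565
    return (int(c[0]) >> 3) << 11 | (int(c[1]) >> 2) << 5 | (int(c[2]) >> 3)

# Runtime (load time): decompress, then block-compress straight to BC1 using
# the hinted endpoints, so the expensive part of the search is skipped.
def transcode_to_bc1(blob, hints, shape):
    tex = np.frombuffer(zlib.decompress(blob), dtype=np.uint8).reshape(shape)
    out, hint_iter = bytearray(), iter(hints)
    for by in range(0, shape[0], 4):
        for bx in range(0, shape[1], 4):
            hi, lo = (np.float32(v) for v in next(hint_iter))
            c0, c1 = to565(hi), to565(lo)
            if c0 < c1:                                 # keep BC1's 4-colour mode
                c0, c1, hi, lo = c1, c0, lo, hi
            palette = np.stack([hi, lo, (2 * hi + lo) / 3, (hi + 2 * lo) / 3])
            block = tex[by:by + 4, bx:bx + 4].reshape(16, 3).astype(np.float32)
            idx = ((block[:, None] - palette[None]) ** 2).sum(-1).argmin(1)
            bits = 0
            for i, v in enumerate(idx):
                bits |= int(v) << (2 * i)               # 2 bits per texel
            out += struct.pack("<HHI", c0, c1, bits)    # 8 bytes per BC1 block
    return bytes(out)

tex = (np.random.rand(16, 16, 3) * 255).astype(np.uint8)
blob, hints = pack_offline(tex)
bc1_data = transcode_to_bc1(blob, hints, tex.shape)     # ready for the GPU
```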
 

There is confirmation that this ML implementation is being actively used by some MS studios already. It will pair nicely with sampler feedback.
 

If my memory is not failing me, this is exactly what Carmack did for Rage's megatextures, as well as our boi sebbbi.

And they implemented in software pretty much the same stuff that later features like tiled resources and sampler feedback came to address. They were very ahead of their time there.

As crazy as it is, it could be done on a 360 at 60fps, and sure as hell could have been done on PS4/Xbone if anyone bothered, but lazy devs are gonna lazy dev.

None of this shit is new; it's just come to the point where you'd have to be crazy NOT to go that route now. It's always been a smarter and more elegant approach to texturing if you were willing to put in the work to implement it. Thank god it seems the industry at large has accepted they can't postpone this any longer.
 
edit: I did not read the patent. This interpretation is wrong.
This one is a bit different. It's similar to how DLSS is trained from a 1080p image against a 16K reference image; the NN's job is to learn how to make that translation.

In this patent, they took as input a texture compressed with some heavy-duty compression, and provided as output the compressed BCn format. So the neural network isn't decompressing; it's just looking at the values and swapping across. It has no clue what it's doing. It just knows how to translate the input into its respective BCn texture compression.

That's pretty novel, as it's not the same as taking a BCn texture and using, say, Kraken to compress it further, then decompressing to get the BCn texture back. In that process you already have a compressed file and you are compressing it further.

From what I understand, what this patent does is allow the image in raw format to be compressed with heavy compression directly, and then that file is converted directly into the respective BCn format for consumption.

You can't use the hardware decompressor to do this, since it's not decompressing; you'd want tensor cores, though. If this works, it offers the interesting advantage of letting you delay when you convert the texture: you can keep it compressed in memory as a buffer and wait until a critical point, at which you perform asynchronous compute to convert it to a BCn texture.
 

As I remember, Carmack streamed textures from a custom format, decompressed them AND re-compressed them into BCn in software. Sebbbi did the same, only the source textures were tiled and merged with decals during the virtual page build (for Rage that was pre-baked, while in Trials it was done at runtime). I think sebbbi said they used straight-up JPEG for the compression.

So, yeah, they were pretty close. The only novelty is the AI doing the decompress and recompress in one go, with the added imprecision that entails. Still very intriguing.
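For flavour, a toy Python sketch of that kind of software path, not anyone's actual code: Pillow decodes JPEG tiles and stamps decals into a virtual-texture page, and a naive min/max endpoint fit re-compresses the page to BC1. The page size, tile layout, and the straight min/max fit are all assumptions made for the sketch.

```python
import io
import struct
import numpy as np
from PIL import Image

def build_page(jpeg_tiles, decals, size=128):
    # Assemble a virtual-texture page from JPEG tiles, then stamp decals on top.
    page = Image.new("RGB", (size, size))
    for (x, y), data in jpeg_tiles:
        page.paste(Image.open(io.BytesIO(data)), (x, y))
    for (x, y), decal in decals:
        page.paste(decal, (x, y), decal)         # RGBA decal supplies its own mask
    return np.asarray(page)

def bc1_encode(img):
    # Straight min/max endpoint fit per 4x4 block: no search, cheap enough to
    # run in software, which is roughly what the megatexture engines relied on.
    out = bytearray()
    to565 = lambda c: (int(c[0]) >> 3) << 11 | (int(c[1]) >> 2) << 5 | (int(c[2]) >> 3)
    for by in range(0, img.shape[0], 4):
        for bx in range(0, img.shape[1], 4):
            block = img[by:by + 4, bx:bx + 4].reshape(16, 3).astype(np.float32)
            lo, hi = block.min(0), block.max(0)
            c0, c1 = to565(hi), to565(lo)
            if c0 < c1:
                c0, c1, hi, lo = c1, c0, lo, hi
            palette = np.stack([hi, lo, (2 * hi + lo) / 3, (hi + 2 * lo) / 3])
            idx = ((block[:, None] - palette[None]) ** 2).sum(-1).argmin(1)
            bits = 0
            for i, v in enumerate(idx):
                bits |= int(v) << (2 * i)
            out += struct.pack("<HHI", c0, c1, bits)
    return bytes(out)

buf = io.BytesIO()
Image.new("RGB", (64, 64), (120, 90, 60)).save(buf, "JPEG")
page = build_page([((0, 0), buf.getvalue())], [])
print(len(bc1_encode(page)))                     # 8 bytes per 4x4 block
```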
 
Oh. Lol. Wow. Some heavy-duty lifting in software there.
Indeed, it was JPEG. They compressed 1 TB of textures down to 17 GB.
Insane.
 

I suggest you peruse the patent a little deeper, as it explicitly talks about a compression engine with the capacity to decompress and block-compress in realtime, and the term decompression is used throughout.

In one implementation, an ML algorithm is run offline to repeatedly block-compress the texture under many configurations to find the configuration that provides the best quality or the best compression ratio. That data is used to generate a metadata file, which is compressed losslessly and stored along with the texture (which is compressed in a lossy format). The compression engine uses the metadata to set the configuration of the block compression once the texture has been decompressed.
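A rough sketch of what that offline step might look like, with heavy caveats: the patent describes ML steering the search, whereas this just brute-forces two hypothetical endpoint-selection strategies per block and records the winner in a metadata file. The strategy names and the JSON hint format are inventions for the example.

```python
import json
import numpy as np

# Two candidate block-compression "configurations" (hypothetical strategies).
def endpoints_minmax(block):
    return block.min(axis=0), block.max(axis=0)

def endpoints_pca(block):
    # Project the block's colours onto their principal axis, take the extremes.
    centred = block - block.mean(axis=0)
    axis = np.linalg.svd(centred, full_matrices=False)[2][0]
    t = centred @ axis
    return block[t.argmin()], block[t.argmax()]

CONFIGS = {"minmax": endpoints_minmax, "pca": endpoints_pca}

def block_error(block, lo, hi):
    # Reconstruction error of the 4-entry palette BC1 would build from (lo, hi).
    lo, hi = lo.astype(np.float32), hi.astype(np.float32)
    palette = np.stack([lo, hi, (2 * lo + hi) / 3, (lo + 2 * hi) / 3])
    d = ((block[:, None, :].astype(np.float32) - palette[None]) ** 2).sum(-1)
    return float(d.min(axis=1).sum())

def build_hint_file(texture):          # texture: (H, W, 3) uint8
    hints = []
    for by in range(0, texture.shape[0], 4):
        for bx in range(0, texture.shape[1], 4):
            block = texture[by:by + 4, bx:bx + 4].reshape(16, 3)
            errs = {name: block_error(block, *fn(block)) for name, fn in CONFIGS.items()}
            hints.append(min(errs, key=errs.get))     # best config for this block
    return json.dumps(hints)           # shipped (losslessly compressed) with the texture

tex = (np.random.rand(8, 8, 3) * 255).astype(np.uint8)
print(build_hint_file(tex))
```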
 
Ah, thanks. Yeah, I'll be honest, I didn't read it past the description, lol.
This makes more sense now that you've written it out.

I seem to have mixed up this one:
https://patentscope.wipo.int/search....wapp2nA?docId=US253950223&tab=PCTDESCRIPTION

As such, at runtime and/or installation of the application, the devices and methods may quickly convert the GPU-incompatible compressed texture into a format usable by the GPU by using machine learning to decompress the compressed textures directly into a format usable by the GPU. In addition, the devices and methods may quickly reconstruct the first image and/or any removed intermediate images of any modified MIP chains for use with textures in the application. Thus, based on the present disclosure, textures may be compressed using higher compression ratio compression algorithms (relative to GPU-compatible compression algorithms) for storage and/or transmission, and decompressed directly into a format usable by the GPU. Decompressing the GPU incompatible texture may result in a compressed GPU compatible texture.
with the one quoted earlier:
http://www.freepatentsonline.com/y2019/0304138.html
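For what it's worth, a toy numpy sketch of what "decompress straight into a GPU format with ML" could look like in the simplest possible terms: a single, untrained, randomly initialised linear layer maps a hypothetical 32-float compressed block representation directly to BC1 block fields. A real system would obviously need a trained network and a sensible latent format; this only shows the shape of the idea.

```python
import struct
import numpy as np

rng = np.random.default_rng(0)
LATENT = 32                                   # hypothetical compressed block size
W = rng.normal(size=(LATENT, 6 + 16))         # untrained stand-in weights
B = rng.normal(size=(6 + 16,))

def latent_to_bc1(latent):
    out = latent @ W + B                      # one linear layer, no explicit decode step
    endpoints = np.clip(out[:6], 0.0, 1.0)    # two RGB endpoints in [0, 1]
    indices = np.clip(out[6:], 0, 3).astype(np.uint32)   # 2-bit index per texel

    def to565(c):
        r, g, b = (c * 255).astype(int)
        return (r >> 3) << 11 | (g >> 2) << 5 | (b >> 3)

    c0, c1 = to565(endpoints[:3]), to565(endpoints[3:])
    if c0 < c1:
        c0, c1 = c1, c0                       # keep BC1's 4-colour mode
    bits = 0
    for i, v in enumerate(indices):
        bits |= int(v) << (2 * i)
    return struct.pack("<HHI", int(c0), int(c1), bits)   # 8 GPU-consumable bytes

print(latent_to_bc1(rng.normal(size=LATENT)).hex())
```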
 
This sounds promising.

https://developer.microsoft.com/en-...ud-is-helping-game-developers-stay-connected/

What I love most about gaming is its ability to bring people together. Whether playing with your family on the couch or with a group of friends scattered throughout the world, gaming helps fulfill one of our most basic needs: connection with one another. With many of us self-isolating and working from home, this connection is even more important.

Like many people, the Xbox and Project xCloud team has been adapting to the new environment we find ourselves in. We are having to change how we communicate, collaborate and connect with one another to maintain the inclusive environment we have worked hard to build.

The same is true with our developer community and we know firsthand the challenges that working from home presents. Specifically, many developers are unable to access their Xbox development kits. By not having access to their usual tools, game developers, artists, and designers may not be able to maintain the rapid iteration cycles needed to turn out new content for gamers. Many of the traditional Xbox remote access tools are designed to be used within an office environment; bringing developer kits home is often not an option and activities like playtesting and gameplay tuning require high frame rates and minimal latency.

We've set out to solve these remote access challenges by re-allocating our Project xCloud resources; going beyond mobile and creating a PC app experience for developers that runs a low-latency 60fps gaming experience that allows for continued game development. By giving developers access to our PC Content Test App (PC CTA) they can remotely connect to their Xbox Development Kits from their PC, allowing them to test, play and iterate as if they were in the office. It also prevents them from having to download daily builds to local hardware in the home, which can often take hours.

To date, developers across many of the biggest gaming studios have used the PC CTA to significantly improve their remote-working environment. We have received great feedback on the overall quality from those within Xbox Game Studios as well as from several of our third-party partners, including:

  • Eidos-Montréal
  • Infinity Ward
  • Ninja Theory
  • Playground Games
  • Rare
  • Turn 10 Studios
  • Undead Labs
Many of our partners were early adopters of this solution and shared this with the team:

"xCloud will give the opportunity to dev teams and also internal and external QA teams to put their hands on our latest game builds from everywhere minutes after their release. By allowing the teams to connect remotely to their devkits and take advantage of the high bandwidth LAN network from our various office locations, xCloud will also add another layer of security as the content created will stay on our corporate network." - Guillaume Le-Malet, Infrastructure Director – Eidos-Montréal

"Our transition to work from home introduced some significant hurdles into our QA and development process. We went overnight from being able to test 2-3 builds daily to being limited to one build for the whole team, downloaded overnight. This was especially painful if that build failed in any way and could wipe out whole days. Using the PC Content Test app enables us to bring back our old workflows wholesale. Installs to kits on-site are now minutes rather than the hours it takes to download remotely, and we have the flexibility to react when something goes wrong." - Sean Davies, Technical Director – Rare

Microsoft's key principle is enabling people to achieve more, and our goal in Xbox has always been to bring gaming to more people. Whether it's by offering up the latest games on a diverse set of platforms or enabling developers to work more effectively at home, we are always looking at ways to help our partners achieve more. We recently announced a new initiative from Azure to create a fast and secure build transfer solution for game developers working remotely, and now we're offering even more ways to work remotely by using Project xCloud technology.

For Xbox developers interested in utilizing the PC CTA, start by contacting your Microsoft Program Representative who can grant you access. Once you have installed the application you can use the Direct Connect functionality to remotely access your Xbox Development Kit. From here you can use all the developer tools needed to develop your game, such as performance simulations, network testing, system resource monitoring and general troubleshooting.
 
To be honest, just looking at this page of discussion, this type of technology (had it been used to overhaul, say, the Rage engine) would make things work better for the current generation, which doesn't have an SSD.
If this runs on the current generation, I/O becomes less of a problem, as does memory bandwidth. If you just ship extremely low-resolution textures for streaming, there exists an interesting possibility:

  • The hard drive streams very small tiles to memory
  • Low-end textures stay resident in DDR3
  • When required, compressed low-resolution textures are sent directly to the GPU for processing
  • The GPU decompresses, ML upscaling is performed, and the results are saved in ESRAM
  • Or the GPU decompresses, upscales, and immediately renders before writing out to ESRAM
  • GPU dispatch to render as required, keeping the CPUs out of the loop for long GPU processing chains
So with that we can sort of say we may have solved the I/O and memory issue.
The CPU is tougher, but moving to GPU dispatch should relieve a lot of CPU work, and that makes any of this actually possible.
So perhaps there is some middle ground for the current gen to stay alive in.
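Sketching that pipeline in Python, with every stage stubbed out (the tile size, the zlib stand-in for the on-disk format, and the nearest-neighbour "ML" upscale are all placeholders, not a claim about how any real engine would do it):

```python
import zlib
import numpy as np

TILE = 64                          # hypothetical tile size streamed from the HDD
resident_pool = {}                 # low-res tiles kept resident in DDR3

def read_tile_from_hdd(tile_id):
    # Stand-in for the HDD read: a small, heavily compressed low-res tile.
    raw = (np.random.default_rng(tile_id).random((TILE, TILE, 3)) * 255).astype(np.uint8)
    return zlib.compress(raw.tobytes())

def gpu_decompress_and_upscale(compressed):
    # Stand-in for the GPU pass: decompress, then a 2x "ML" upscale
    # (nearest-neighbour here; a real version would run a network), with the
    # result sized to live in ESRAM.
    tile = np.frombuffer(zlib.decompress(compressed), dtype=np.uint8).reshape(TILE, TILE, 3)
    return tile.repeat(2, axis=0).repeat(2, axis=1)

def render_frame(visible_tiles):
    for tid in visible_tiles:
        if tid not in resident_pool:              # 1. stream a small tile to memory
            resident_pool[tid] = read_tile_from_hdd(tid)
        upscaled = gpu_decompress_and_upscale(resident_pool[tid])   # 2-4. GPU work
        assert upscaled.shape == (2 * TILE, 2 * TILE, 3)
        # 5. GPU dispatch to draw with the upscaled tile goes here (omitted)

render_frame([0, 1, 2])
```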

The goal would be to oversaturate compute as much as possible to make up for the deficit in bandwidth and I/O. Not sure how far one can stretch 1.2 TF, though.

This would be better discussed in the engine/game scaling thread.

So, just some rough math, looking at bandwidth levels.
If you require the XBO to match the XSX at peak bandwidth, then:
the XSX running 4K textures is 2^12 at 2.5 GB/s;
the XBO is 100 MB/s peak, so let's say 25x less;
so the XBO would need to run 2^5 smaller,
i.e. the XBO would need to run at 2^7, or 128x128 pixel textures.
If you can upscale by 4x the size, that gets you to 2^9, which still falls short of the goal.

So it can't match peak bandwidth. Either you get very creative or you don't do instant level loads.

But if you're talking standard texture streaming, 4K/8K textures, let's say that uses about 500-800 MB/s,
then... that's only at most 2^3 less.
So... it can stream 2^9 (512x512) from I/O and upscale to 2^11 (2048x2048), standard 2K textures. If 900p is the target resolution,
then you can get away with a 2^10 upscale (1024x1024).

So in that scenario, I/O is solved.
The memory footprint is dramatically improved too: the memory is only 2x smaller, but we're storing textures 2^3 smaller, so we're good there as well.

None of this accounts for compression, but we don't need to decompress right away since there is still the ESRAM.

So the only thing is CPU.
 
The XSX has only 2.5 GB reserved for the OS. No doubt the SSD helps, but if MS could sacrifice some dash responsiveness and get the X1 reserve down to 2.5 GB as well, then that would grant some extra memory for pooling textures from the HDD.

In fact, to hell with it. The X1 is nearing the end of its retail cycle. Slick switching between game and dash is no longer a selling point; they just both need to work properly in their own right. Cut the reserve to 2 GB by any means necessary for cross-gen games (the user can just wait longer to get back into the dash), give the X1 another GB for pooling Oodle-compressed assets, and accept there may be some visible LoD pop.

Let users who care enough about pop-in run from an external USB3 SSD - I had an external SSD for GTA 5 on the 360*. Maybe even offer a cheap adapter for the XSX expansion drive, so X1 owners can "pre-buy" the expansion drive and know they can still use it fully when they get an XSX / Lockhart: get the most from your X1 now and know you're not wasting money. Games should automatically scale to anything faster than the basic 500 GB 5400 rpm drive, up until you hit your I/O or CPU threshold for streaming.

If MS want a good cross-gen period, there's probably a lot they can do to facilitate this: reducing the OS reserve, freeing up the rest of the 7th core, bringing a virtual-memory-style system over to the X1 (one implementation for all 4 supported Xbox consoles), backporting as many API features and dev environment changes as possible... MS have a huge amount of experience with this from both PC development tools and PC AAA games like Gears 5.

It'd be pretty cool to see just what they can bring to the X1 to help cross-gen games out.

*Put the GTA 5 install disc on an external SSD and then put the "play from optical" disc on the internal HDD. Streaming performance was far better than anything I saw from either PS3 or X360 in the DF tech analysis! Very solid experience.
 

The Xbone lacks the first step of the process: sampler feedback to figure out what needs to be streamed (if need be). It's the information returned by sampler feedback that enables the rest of the streaming stack. I guess you could solve it by prefetching, but then we are back to square one: the memory footprint ballooning and having to figure out what is going to be in the view frustum. (Sampler feedback requires no guesswork and minimises prefetching.)
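As a rough illustration of why that first step matters, here's a small Python sketch of feedback-driven streaming. The feedback map here is faked by hand; on Series consoles the equivalent data comes back from the hardware sampler feedback feature, while on Xbox One you would have to approximate it in software (e.g. with extra shader work), which is exactly the cost being discussed.

```python
import numpy as np

TILES_X, TILES_Y = 8, 8
resident = {}                      # (tile_x, tile_y, mip) -> loaded tile bytes

def fake_sampler_feedback():
    # For each texture tile, the finest mip actually sampled last frame
    # (-1 means the tile was never touched). Here it's just made up.
    fb = np.full((TILES_Y, TILES_X), -1, dtype=np.int8)
    fb[2:5, 3:6] = 1
    return fb

def stream_from_feedback(feedback):
    requests = []
    for y, x in zip(*np.nonzero(feedback >= 0)):
        key = (int(x), int(y), int(feedback[y, x]))
        if key not in resident:
            requests.append(key)   # only tiles the GPU actually sampled
    for key in requests:
        resident[key] = b"tile bytes loaded from disk"
    return requests

print(stream_from_feedback(fake_sampler_feedback()))
```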
 
In fact, to hell with it. The X1 is nearing the end of its retail cycle. Slick switching between game and dash is no longer a selling point; they just both need to work properly in their own right.
The X1 is struggling enough as it is; to be honest, if they could cut back the OS more, they probably would have by now.
It is what it is, I reckon.
 
I'm curious how they even cut 1 GB off the OS reservation for the Series X. I would love to see a memory map of how much is used and for what purpose. I'm also curious about the other side, as they have not said how much their games can use.

I don't think it was from the applications, as the Series X still supports having apps run while gaming; that's how they support streaming or playing music. Possibly they shrank some of the buffers?
 
The X1 is struggling enough as it is; to be honest, if they could cut back the OS more, they probably would have by now.
It is what it is, I reckon.

You could be right, but on the other hand I don't think the X1 has been struggling particularly because of memory, but rather because of power (and the DDR3 bottleneck). Releasing more memory wouldn't really help for PS4 games where the X1 is using smaller buffers anyway, so you might as well make the dash faster to switch into. MS have cut an awful lot of chaff since launch (Kinect, Snap, etc.), but the OS reserve has never been reduced. In fact, it's as big as on the X1X, which buffers 4K video streams!

I'm thinking along the lines of stuff that would enable better cross-gen ports, and unlike in current games, a large buffer to compensate for not having an SSD would be useful in a lot of circumstances. So I think there's an impetus there that hasn't existed previously.

That's an optimisation, not an enabler. Successful texture streaming has been achieved on platforms without sampler feedback.

Indeed, and for cross-gen purposes MS could hide a kind of "pseudo SFS mode" behind an API. That way, you optimise for the XSX and it "just works" on the X1, even if the implementation is a bit of a kludge and not as accurate and efficient as the real deal.

Anything that allows one solid, next-gen implementation to be back-ported with a reasonable degree of effectiveness would be useful for the next couple of years. Even if it's no better than a native implementation that's been possible for the last 7 years, the commonality could have a high degree of value.
 