Xbox Series X [XBSX] [Release November 10 2020]

Great, it's got the full feature set, and I think they've made that clear for some time now. But it's also clear it does not have the Infinity Cache. With the L2 labelled as 5MB, Infinity Cache would have a weird position to sit in.

Die shot once again: There's nothing here.
https://d2skuhm0vrry40.cloudfront.net/2020/articles/2020-08-21-16-48/die_shot.png
 
I believe the Infinity Cache is purely there to compensate for the VRAM bandwidth deficit.
yea it is. In the same manner that esram was there to make up for the bandwidth deficit for XBO.
It makes sense to have infinity cache on the larger ones, the more CUs you have the more bandwidth you need to feed things. But that gets extremely expensive as you keep going up so you need an alternative method. This looks like a decent method of doing it, I was expecting HBM at first, but clearly still not ready for prime time here.

Yea, I don't think PS5 or XSX has this cache. The CUs are probably too low for it to matter.
 
Because based on its INT4/8 capabilities the XSX is half as fast at ML as an RTX 2060. And a 4K upscale on an RTX 2060 takes 2.5ms. That means it will take ~5ms on the XSX, which is nearly a third of the frame time at 60fps. This may still be worth it, but it's also dependent on this single dev studio creating a model that's comparable in quality and performance to Nvidia's, with all their $billions in R&D, access to the world's fastest ML supercomputers, synergy with their own hardware/tensor cores, and massive ML experience (being the world leaders in the hardware that runs it and all).

So while I'm not saying it's impossible, I do think it's sensible to take any such claims with a massive pinch of salt until real-world results have been shown and independently verified. After all, "ML upscaling" could mean almost anything and doesn't necessarily have to be comparable to DLSS, which is outputting anywhere between 2.25x and 9x the original number of pixels.
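As a rough back-of-envelope check of that estimate (the 2.5ms figure and the "half the INT4/8 throughput" ratio are the post's assumptions, not measurements):

```python
# Back-of-envelope for the estimate above; all inputs are the post's
# quoted/assumed figures, not measured numbers.
rtx2060_upscale_ms = 2.5        # claimed 4K ML upscale cost on an RTX 2060
xsx_relative_ml_rate = 0.5      # XSX assumed to have ~half the INT4/8 throughput

xsx_upscale_ms = rtx2060_upscale_ms / xsx_relative_ml_rate   # ~5 ms

frame_budget_ms = 1000 / 60                                  # ~16.7 ms at 60 fps
share = xsx_upscale_ms / frame_budget_ms                     # ~0.30

print(f"~{xsx_upscale_ms:.1f} ms per upscale, {share:.0%} of a 60 fps frame")
```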

TBF, couldn't MS leverage their own resources and expose a lot of this functionality to devs through the APIs? That way, the work devs need to do to implement their own upscaling models, etc., is lessened. They are already doing this to a large extent with Auto HDR, and while there are some cases of it not working as intended with a small handful of BC titles, by and large it seems to do as advertised, and I'm guessing it can be tuned to what a dev wants in particular if they wish to expend the resources.

I don't see any major barriers preventing a similar model for image upscaling; MS already mentioned an upscaler in Series S. Granted, that could just be a reference to the standard upscalers a lot of devices use, but what would be the point in explicitly bringing it up for a games console when they can't be oblivious to the tech-focused discussions happening here and elsewhere WRT resolution upscaling techniques?

At the very least I'm hoping they have some type of API stack for image upscaling through DirectML that devs can leverage immediately, and fine-tune to their own needs if they wish. Nvidia has definitely been king in this field, but it's not like MS lacks the resources or R&D teams to build and train image upscaling models of their own, or the customized silicon to run them at the hardware level. In fact I'd argue they have more resources in that department; the question is whether they have put them to such a purpose. Guess we'll find out sometime soon, but AMD seemed pretty bold about this when addressing DX12U support in their new GPUs today.

FWIW, a lot of the same could be said for Sony as well; I'm sure they have been doing some legwork on training data models to push checkerboarding techniques further. They have patents for further implementation of foveated rendering, among other things. Maybe they have a set of APIs for devs on their platform to leverage for image upscaling that's relatively easy to implement into existing engines and frameworks, but has flexibility where needed. In either case, we'll find out within a year where the consoles are on this front.
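Going back to the DirectML point above: here's a minimal sketch of what a dev leveraging such an API stack might look like, using ONNX Runtime's DirectML execution provider. The model file, its input name/layout, and the single 2x-upscale output are all hypothetical assumptions for illustration, not a real MS-provided API.

```python
# Hypothetical sketch: running an upscaling network through DirectML via
# ONNX Runtime (needs the onnxruntime-directml package on Windows).
# "upscaler_2x.onnx" and its NCHW float32 input are illustrative assumptions.
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession(
    "upscaler_2x.onnx",
    providers=["DmlExecutionProvider"],  # route inference through DirectML
)

# Dummy 1080p RGB frame, normalized to [0, 1], NCHW layout.
frame_1080p = np.random.rand(1, 3, 1080, 1920).astype(np.float32)

input_name = session.get_inputs()[0].name
outputs = session.run(None, {input_name: frame_1080p})
print(outputs[0].shape)  # assuming a single output, e.g. (1, 3, 2160, 3840)
```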


Welp, there it is. Still not sure it says too much regarding PS5, though. No mention of cache scrubbers today, for example; that could, however, be one of the features Cerny talked about that AMD adopts in a future design (maybe an RDNA 3 GPU?).

I think it's VERY clear though that DX12U runs very deep in RDNA2's design, so if Sony are providing equivalent features but can't use DX12U, then it's clear they'd have had to customize large parts of their GPU out of necessity to provide them.


yea it is. In the same manner that esram was there to make up for the bandwidth deficit for XBO.
It makes sense to have infinity cache on the larger ones, the more CUs you have the more bandwidth you need to feed things. But that gets extremely expensive as you keep going up so you need an alternative method. This looks like a decent method of doing it, I was expecting HBM at first, but clearly still not ready for prime time here.

Yea, I don't think PS5 or XSX has this cache. The CUs are probably too low for it to matter.

Maybe there's still a slim chance the Infinity Cache is present, just in a cut-down fashion. I remember people doing an SRAM count of the Series X GPU but a decent chunk of it still wasn't accounted for. Some things like constant caches were mentioned, but maybe there's a small chance a bit of it is also for Infinity Cache?

The 6900 series is basically two PS5s smashed together; maybe there's still a small chance that system has some Infinity Cache implementation too? I'm just being a wishful dreamer here, but it's fun.

Also quick mention: keep in mind AMD switched the cache labeling for RDNA2. L2 is really L3 on their GPUs, L1 is L2, and L0 is L1. So the L2 (really L3) might be 5 MB, but that doesn't mean the L1 (really L2) collectively is smaller.

And while it's a small chance, they could still have some form of Infinity Cache built in? I'd suspect it's scalable; so much of RDNA 2 seems scalable as-is.
 
Yea, I don't think PS5 or XSX has this cache. The CUs are probably too low for it to matter.
Yeah. Consider the memory access pattern of GPUs: mostly streaming, with random access only over a small range, so it makes sense that simply having good enough raw bandwidth is good enough.
 
yea it is. In the same manner that esram was there to make up for the bandwidth deficit for XBO.
It makes sense to have infinity cache on the larger ones, the more CUs you have the more bandwidth you need to feed things. But that gets extremely expensive as you keep going up so you need an alternative method. This looks like a decent method of doing it, I was expecting HBM at first, but clearly still not ready for prime time here.

Yea, I don't think PS5 or XSX has this cache. The CUs are probably too low for it to matter.

Also ROPs. The 6800XT/6900XT have 128 ROPs, so feeding those would require buckets of bandwidth just to keep up with the fillrate. Consoles are probably on the edge of what they need to support 64 ROPs, but even then they wouldn't get full blending rates @ 32bpp, so a larger LLC would still be appreciable I think.

Even 64MB would have been interesting to see on console. That fits 2x32bpp 4K render targets, and that's keeping in mind that AAA developers are seemingly sticking to a variation of Clustered Forward+ instead of deferred / fat G-buffer.

Between a 320-bit bus and 5MB L2 vs 256-bit and 64MB IC.... I'd think the extra die cost would be worthwhile.
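Quick sanity check on those numbers; the clocks below are rounded figures used purely for illustration.

```python
# Quick sanity check of the figures above; clocks are rounded and just for
# illustration.
def rop_bw_gbs(rops, clock_ghz, bytes_per_pixel=4, blend=False):
    """GB/s needed to keep the ROPs fed at full rate.
    Blending roughly doubles traffic (read + write of the destination)."""
    return rops * clock_ghz * bytes_per_pixel * (2 if blend else 1)

print(rop_bw_gbs(128, 2.0))               # ~1024 GB/s just writing, 128 ROPs @ ~2 GHz
print(rop_bw_gbs(64, 1.825))              # ~467 GB/s writing, 64 ROPs at XSX-ish clocks
print(rop_bw_gbs(64, 1.825, blend=True))  # ~934 GB/s with blending, well past 560 GB/s

# Two 32bpp 4K render targets vs a 64 MiB last-level cache:
rt_bytes = 3840 * 2160 * 4
print(2 * rt_bytes / 2**20)               # ~63.3 MiB, so two targets just fit in 64 MiB

# The bus trade-off from the post (GDDR6, total bus bandwidth):
print(320 // 8 * 14)                      # 560 GB/s (XSX-style 320-bit @ 14 Gbps)
print(256 // 8 * 16)                      # 512 GB/s (6800 XT-style 256-bit @ 16 Gbps)
```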
 
Also ROPs. The 6800XT/6900XT have 128 ROPs, so feeding those would require buckets of bandwidth just to keep up with the fillrate. Consoles are probably on the edge of what they need to support 64 ROPs, but even then they wouldn't get full blending rates @ 32bpp, so a larger LLC would still be appreciable I think.

Even 64MB would have been interesting to see on console. That fits 2x32bpp 4K render targets, and that's keeping in mind that AAA developers are seemingly sticking to a variation of Clustered Forward+ instead of deferred / fat G-buffer.
Pull a cut and run ;) Say goodbye to ROPs... go compute-only. That's sort of the advantage of consoles: you don't need to support every type of game made. Games will adjust to the hardware.
 
Welp, there it is. Still not sure it says too much regarding PS5, though. No mention of cache scrubbers today, for example; that could, however, be one of the features Cerny talked about that AMD adopts in a future design (maybe an RDNA 3 GPU?).
Coherency is probably not worth mentioning for a press event like this; it's too technical. For example, the CUs in GCN and later GPUs are coherent among themselves, but nobody (that is, enthusiasts and journalists) cares.
Maybe there's still a slim chance the Infinity Cache is present, just in a cut-down fashion. I remember people doing an SRAM count of the Series X GPU but a decent chunk of it still wasn't accounted for. Some things like constant caches were mentioned, but maybe there's a small chance a bit of it is also for Infinity Cache?
There are more caches and buffers than just L3 L2 L1 [L0] in both GPUs and CPUs.
 
Great, it's got the full feature set, and I think they've made that clear for some time now. But it's also clear it does not have the Infinity Cache. With the L2 labelled as 5MB, Infinity Cache would have a weird position to sit in.

Die shot once again: There's nothing here.
https://d2skuhm0vrry40.cloudfront.net/2020/articles/2020-08-21-16-48/die_shot.png

I'm still not sure what Infinity Cache even is. Can you recommend a decent link on the subject? I'm only finding speculation-type stuff via my weak Google-Fu.

The ideas I've seen seem to ping about between a large separate cache and it being some type of fabric for more normal caches.
 
I'm still not sure what Infinity Cache even is. Can you recommend a decent link on the subject? I'm only finding speculation-type stuff via my weak Google-Fu.

The ideas I've seen seem to ping about between a large separate cache and it being some type of fabric for more normal caches.
Seems like a very large smart cache that replaces L2 if I'm looking at it correctly.
 
I'm still not sure what Infinity Cache even is. Can you recommend a decent link on the subject? I'm only finding speculation-type stuff via my weak Google-Fu.

The ideas I've seen seem to ping about between a large separate cache and it being some type of fabric for more normal caches.
Looks to be a very large L3 to me.
 
Want 1 of 3 Xbox Series X fridges? Now's your chance....


The one. The only. Xbox Series X Fridge giveaway. Follow and retweet with #XSXFridgeSweeps for a chance to win the Xbox Series X Fridge. Ends 11/04/20. Rules: https://xbx.lv/3mnFCYi

To enter, you must be a legal resident of any Xbox Live supported region (https://www.xbox.com/en-US/live/countries)

Snoop Dogg, owner of Fridge 1, posted a longer video...


Tommy McClain
 
Seems like a very large smart cache that replaces L2 if I'm looking at it correctly.
I think the L2 is still present in RDNA2. There are sets of rectangles on either side of the command processor block in the middle that look like they would correspond to the L2. Some of the code changes that first mentioned the existence of the new cache still reference the L2 as well.
 
I'm not sure if you misunderstood my meaning, but I wasn't talking about a fixed 100GB, merely that MS have effectively claimed before that they can read data from the SSD into the GPU "instantly", which ties into the idea of bypassing VRAM.



Indeed, but that's something entirely different and far simpler.



Because based on its INT4/8 capabilities the XSX is half as fast at ML as an RTX 2060. And a 4K upscale on an RTX 2060 takes 2.5ms. That means it will take ~5ms on the XSX, which is nearly a third of the frame time at 60fps. This may still be worth it, but it's also dependent on this single dev studio creating a model that's comparable in quality and performance to Nvidia's, with all their $billions in R&D, access to the world's fastest ML supercomputers, synergy with their own hardware/tensor cores, and massive ML experience (being the world leaders in the hardware that runs it and all).

So while I'm not saying it's impossible, I do think it's sensible to take any such claims with a massive pinch of salt until real-world results have been shown and independently verified. After all, "ML upscaling" could mean almost anything and doesn't necessarily have to be comparable to DLSS, which is outputting anywhere between 2.25x and 9x the original number of pixels.

If the XSX took 5ms to upscale from 1080p to 4K on average, then any game that natively runs at 1080p with a frame time of 11.6ms could be upscaled to 4K 60fps. It becomes a matter of giving up 30% of your fps in return for 4x the resolution. At 4K50 (VRR is now a thing) it's just a matter of giving up 25% of your fps for the 4x boost in resolution. It's the difference between 4K at 30 fps and 1080p at 35 fps.
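That tradeoff as a small calculation (the fixed 5ms upscale cost is the assumed figure from above):

```python
# The tradeoff above in numbers: render at 1080p, pay a fixed ~5 ms
# (assumed) for the ML upscale, and see what output frame rate survives.
def fps_after_upscale(native_frame_ms, upscale_ms=5.0):
    return 1000 / (native_frame_ms + upscale_ms)

print(fps_after_upscale(11.6))       # 1080p in 11.6 ms (~86 fps) -> ~60 fps at 4K
print(fps_after_upscale(1000 / 35))  # 1080p at 35 fps -> ~30 fps at 4K
```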

And Xbox devs wouldn't necessarily be forced to create their own solutions.

https://venturebeat.com/2020/02/03/...t-generation-of-games-and-game-development/2/

MS is at least considering offering this as a part of GameStack.

You were talking about machine learning and content generation. I think that's going to be interesting. One of the studios inside Microsoft has been experimenting with using ML models for asset generation. It's working scarily well. To the point where we're looking at shipping really low-res textures and having ML models uprez the textures in real time. You can't tell the difference between the hand-authored high-res texture and the machine-scaled-up low-res texture, to the point that you may as well ship the low-res texture and let the machine do it... Like literally not having to ship massive 2K by 2K textures. You can ship tiny textures... The download is way smaller, but there's no appreciable difference in game quality. Think of it more like a magical compression technology. That's really magical. It takes a huge R&D budget. I look at things like that and say — either this is the next hard thing to compete on, hiring data scientists for a game studio, or it's a product opportunity. We could be providing technologies like this to everyone to level the playing field again.
 
I think the L2 is still present in RDNA2. There are sets of rectangles on either side of the command processor block in the middle that look like they would correspond to the L2. Some of the code changes that first mentioned the existence of the new cache still reference the L2 as well.
So does the L2 then go to the Infinity Cache, and the Infinity Cache to the memory controllers? Or are we looking at something closer to the ESRAM setup, where it's deployed as a scratchpad that is not developer-accessible?
 
Ark enhancements.

What do they mean by 30-60fps?

Says it's available now, so hopefully we get an XSX|S look at it.
 

Attachments: IMG_20201028_185343.jpg