Xbox Series X [XBSX] [Release November 10 2020]

I'm not sure, maybe cache groups in L2.

My point is not that a value in the table looks larger, but that MS is trying something very different from what is implemented in Navi 21.

It's going to be very interesting to see how the Series X compares to the comparable RDNA2 desktop part, CU for CU. I do wonder why the changes have been made. I suppose that with more waves per SIMD you might be able to keep GPU utilization higher under a game workload. I also wonder how the additional waves per SIMD affect the GPU's compute capability. It would be very interesting to see the tradeoffs between the two configurations.
 

Maybe it calculates an INT "offline", or maybe it uses the data in the caches and computes ray/tri tests asynchronously, as in this Imagination patent.

https://patents.google.com/patent/US9607426
 
Would the bolded in particular suggest potentially better RT than an equivalent desktop Navi 21 variant that would have roughly similar CU counts, SE and SA arrangements?
Does this have anything to do with "GPU Work Creation"?

"GPU Work Creation – Xbox Series X and Xbox Series S add hardware, firmware and shader compiler support for GPU work creation that provides powerful capabilities for the GPU to efficiently handle new workloads without any CPU assistance. This provides more flexibility and performance for developers to deliver their graphics visions."

From here: https://news.xbox.com/en-us/2020/03/16/xbox-series-x-glossary/amp/
 
MS has been talking a lot about "beyond hardware specs", so from what I've seen, Xbox uses the dual pipe to process things "offline", using idle cycles from the SIMDs to deliver things like Auto HDR or to start the BVH offline.
Having two graphics queues was brought up with Navi 10, although it seemed like it didn't find much use in that generation.
The WGP0 and WGP1 distinction is unclear in its meaning to me. The counts don't match up with prior generations enough to imply there's a difference in functionality for BC purposes, and no obvious physical difference from the die shots. It might be a subdivision within an SA for other purposes, like load-balancing or which groups in an SA can afford defect recovery.

More waves per SIMD too. Sounds a little more flexible than the desktop variant.
Backwards compatibility with GCN would probably merit keeping the 20 waves per SIMD from RDNA1, since GCN's 10 waves per 16-wide SIMD would need to be packed into the context of a 32-wide RDNA SIMD.
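
The arithmetic behind that packing can be sketched roughly as follows. This assumes the commonly cited layouts of four 16-wide SIMDs per GCN CU and two 32-wide SIMDs per RDNA CU; those per-CU SIMD counts, the 16-wave comparison figure, and the helper name are assumptions for illustration and are not taken from the post above.

```python
def waves_per_cu(simds_per_cu: int, waves_per_simd: int) -> int:
    """Total wave slots a single CU can keep resident."""
    return simds_per_cu * waves_per_simd

# Assumed layout: GCN CU = 4 SIMD16, 10 waves each.
gcn_slots = waves_per_cu(simds_per_cu=4, waves_per_simd=10)

# Assumed layout: RDNA CU = 2 SIMD32, 20 waves each (the RDNA1 figure from the post).
rdna_20_slots = waves_per_cu(simds_per_cu=2, waves_per_simd=20)

# Hypothetical comparison: the same RDNA CU with only 16 waves per SIMD.
rdna_16_slots = waves_per_cu(simds_per_cu=2, waves_per_simd=16)

print(gcn_slots, rdna_20_slots, rdna_16_slots)
```

Under these assumptions, 20 waves per SIMD gives an RDNA CU the same number of wave slots as a GCN CU, while 16 waves per SIMD would leave it short.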

just curious, but do you know what the 'num_gl2c' refers to? I see that it has a few more than the desktop part
Looks like L2 cache slices, which would indicate that a smaller GPU has more slices than the top-tier GPU. Whether those slices are equivalent between GPUs in how much bandwidth or capacity they have isn't indicated.
 
Would the bolded in particular suggest potentially better RT than an equivalent desktop Navi 21 variant that would have roughly similar CU counts, SE and SA arrangements?

There's a rumor that RT is only for the top AMD cards, and that the low end would focus on perf/watt.

There is another rumor that Navi 21 would have 20 CUs reserved for RT, which would make sense if Navi 21 is a single-pipe GPC.

In short, dual pipe would be a way for the Xbox to have HW RT without needing "dedicated CUs", keeping the size of the silicon the same.
 
There are no "dedicated CUs", it would be beyond stupid. Each CU has an RT unit in its texture complex, and none of the RT tasks "reserve" the ALUs of those CUs either.
 
Having two graphics queues was brought up with Navi 10, although it seemed like it didn't find much use in that generation.
The WGP0 and WGP1 distinction is unclear in its meaning to me. The counts don't match up with prior generations enough to imply there's a difference in functionality for BC purposes, and no obvious physical difference from the die shots. It might be a subdivision within an SA for other purposes, like load-balancing or which groups in an SA can afford defect recovery.

Yes, I saw that it also appears in GCN.

My point is: with them saying that Auto HDR does not use the CPU or GPU, where is it computed? If not in the dual pipe, then where? The cloud?
 
There are no "dedicated CUs", it would be beyond stupid. Each CU has an RT unit in its texture complex, and none of the RT tasks "reserve" the ALUs of those CUs either.


But you need to compute it anyway, and this is apparently done in the CUs. Whether it is done in CU 12 or CU 73 doesn't matter; what you need is to set a budget, right?

Or are you saying that CUs can compute ray ops and texture at the same time?
 
No, I'm not saying they can do both tex and ray ops at the same time (as in, simultaneously), but that they do whichever is needed whenever it's needed, not by dedicating a bunch of CUs to sit idle half the time.
 
https://news.xbox.com/en-us/2020/10...x-series-s-backward-compatibility-update/amp/

"Auto HDR is enabled by the console’s hardware, there is absolutely no performance cost to the CPU, GPU or memory and there is no additional latency added ensuring you receive the ultimate gaming experience"
I'm not sure if that implies separate blocks, or just that for backwards compatibility the GPU would restrict the number of CUs available to the game and have leftover CUs for additional processing.
 
Does this have anything to do with "GPU Work Creation"?

"GPU Work Creation – Xbox Series X and Xbox Series S add hardware, firmware and shader compiler support for GPU work creation that provides powerful capabilities for the GPU to efficiently handle new workloads without any CPU assistance. This provides more flexibility and performance for developers to deliver their graphics visions."

From here: https://news.xbox.com/en-us/2020/03/16/xbox-series-x-glossary/amp/
That's been around since XBO was developed. This particular passage seems to describe the command known as:
DX12: ExecuteIndirect
Vulkan: VK_NV_device_generated_commands
CUDA: kernels can be launched from within kernels (<<< >>>) since Kepler

There are some customizations by MS that allow slightly more flexibility in ExecuteIndirect than what is available in the PC space. In terms of available functionality, we think the order is:
PC < Xbox One < Xbox One X < XBSX
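
The general idea behind that kind of GPU-driven work creation can be illustrated with a toy model. This is plain Python, not the D3D12, Vulkan or CUDA API; every name in it (DispatchArgs, culling_pass, execute_indirect) is hypothetical, and it only assumes the basic pattern where one pass writes launch arguments into a buffer and a later dispatch reads them without the CPU deciding the counts.

```python
from dataclasses import dataclass

@dataclass
class DispatchArgs:
    group_count: int  # how many groups the follow-up dispatch should launch

def culling_pass(objects, args_buffer):
    """Stand-in for a GPU pass that decides how much follow-up work exists."""
    visible = [o for o in objects if o["visible"]]
    # The "GPU" writes the launch size itself; the CPU never sees this number.
    args_buffer.append(DispatchArgs(group_count=len(visible)))
    return visible

def execute_indirect(args_buffer, kernel):
    """Stand-in for an indirect dispatch: sizes are read from the buffer at run time."""
    results = []
    for args in args_buffer:
        for group_id in range(args.group_count):
            results.append(kernel(group_id))
    return results

objects = [
    {"id": 0, "visible": True},
    {"id": 1, "visible": False},
    {"id": 2, "visible": True},
]
args_buffer = []
visible = culling_pass(objects, args_buffer)
shaded = execute_indirect(args_buffer, lambda g: f"shade object {visible[g]['id']}")
print(shaded)
```

The point of the sketch is only that the amount of second-stage work is determined by the first stage's output, which is what lets the real mechanisms skip a CPU round trip.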
 
I'm not sure if that implies separate blocks, or just that for backwards compatibility the GPU would restrict the number of CUs available to the game and have leftover CUs for additional processing.

They said it is a machine learning algorithm, tuned game by game and can be turned off.

They have hw for inference, but would it cost zero for the rest of the hw?

 
They said it is a machine learning algorithm, tuned game by game and can be turned off.

We believe that for Auto HDR Microsoft is using the INT8/INT4 portions of the GPU, which are entirely unused for BC games, hence the "for free" statement.

I vaguely recall it being a parallel path to the rest of the GPU, so it doesn't have a resource impact. But that part is fuzzy in my memory.
 
It certainly has allocation of hw for the OS, but I doubt it will be used for anything else
Let me change the term to a more appropriate one: system reservation rather than OS reservation.
By definition, any system-level task can be done in it.
We believe that for Auto HDR Microsoft is using the INT8/INT4 portions of the GPU, which are entirely unused for BC games, hence the "for free" statement.
The only problem with that being done outside the system reservation is that it's still using GPU resources that would otherwise be used by the game, unless they can guarantee that it fits in the bubbles of async compute.
So I would see it being both lower precision and system reservation.
 