Xbox Series X [XBSX] [Release November 10 2020]

Of course hardware is involved. RDNA 2 is VRS tier 2 compatible.

The patent in question consistently refers to the methods described as “application-directed”, as in, the operations performed by the GPU are not completely self-determined in hardware.
Perhaps I’m not understanding. The developer has to choose how VRS works; it’s not a completely fixed-function unit that just does work without being programmed to do it.
 
I think a comparison with Ampere is more interesting, seeing that Turing is a 2018 product.
There are now two distinctions with RDNA2 RT acceleration:

1- It can't accelerate BVH traversal, only ray intersections; traversal is performed by the shader cores.
2- Ray intersection hardware is shared with the texture units.

In comparison, Turing RT cores:
1- Accelerate BVH traversal on their own
2- Ray intersection hardware is independent and not shared with anything else

So in a sense RDNA2's solution is hybrid, as the work is split between the texture units and the shaders, compared to Turing's fully dedicated solution.
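To make the distinction concrete, here is a rough sketch of the hybrid model (illustrative only, not AMD's actual implementation; hw_intersect_node() is a stand-in for the hardware box/triangle test that shares its datapath with the texture units):

```cpp
// Illustrative toy, not AMD's implementation: the shader owns the BVH
// traversal loop and only hands each node test to fixed-function hardware.
#include <cstdint>
#include <cstdio>
#include <vector>

struct Ray  { float org[3], dir[3]; float t_max; };
struct Node { bool leaf; uint32_t child[4]; float leaf_t; };

static uint32_t hw_intersect_node(const Node& n, const Ray& r)
{
    (void)r;                    // a real unit does slab/triangle math here
    if (n.leaf) return 1u;      // pretend the leaf triangle is hit
    uint32_t mask = 0;
    for (int i = 0; i < 4; ++i) // pretend every valid child box is hit
        if (n.child[i] != 0) mask |= 1u << i;
    return mask;
}

// On RDNA2 this loop runs on the shader ALUs; on Turing the RT core walks
// the BVH autonomously and only reports the final hit back to the shader.
static float trace(const std::vector<Node>& bvh, const Ray& ray)
{
    std::vector<uint32_t> stack = {0};  // traversal stack kept by the shader
    float closest = ray.t_max;
    while (!stack.empty()) {
        const Node& node = bvh[stack.back()];
        stack.pop_back();
        const uint32_t hits = hw_intersect_node(node, ray); // HW-assisted step
        if (node.leaf) {
            if (hits && node.leaf_t < closest) closest = node.leaf_t;
        } else {
            for (int i = 0; i < 4; ++i)
                if (hits & (1u << i)) stack.push_back(node.child[i]);
        }
    }
    return closest;
}

int main()
{
    // Tiny 3-node BVH (root + two leaves), just to exercise the loop.
    const std::vector<Node> bvh = {
        {false, {1, 2, 0, 0}, 0.f},
        {true,  {0, 0, 0, 0}, 5.f},
        {true,  {0, 0, 0, 0}, 3.f},
    };
    const Ray ray{{0, 0, 0}, {0, 0, 1}, 1e30f};
    printf("closest hit t = %.1f\n", trace(bvh, ray)); // 3.0
}
```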

(also posted in the RDNA 2 PC thread)
 
Of course hardware is involved. RDNA 2 is VRS tier 2 compatible.

The patent in question consistently refers to the methods described as “application-directed”, as in, the operations performed by the GPU are not completely self-determined in hardware.
Isn't that supposed to be obvious? You don't even need to dig into the patent to see it; it's all in the D3D12 specifications.
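For illustration, here is roughly what "application-directed" means in D3D12 terms — a minimal Tier 2 VRS setup sketch (device and resource creation omitted; commandList and shadingRateImage are assumed to exist already):

```cpp
// Minimal sketch of D3D12 Tier 2 VRS setup. The GPU only exposes the
// capability; the application has to query it and program the rates.
#include <d3d12.h>

void SetupVrs(ID3D12Device* device,
              ID3D12GraphicsCommandList5* commandList,
              ID3D12Resource* shadingRateImage) // app-filled screen-space image
{
    // 1. Check that the adapter reports Tier 2 at all.
    D3D12_FEATURE_DATA_D3D12_OPTIONS6 options6 = {};
    if (FAILED(device->CheckFeatureSupport(D3D12_FEATURE_D3D12_OPTIONS6,
                                           &options6, sizeof(options6))) ||
        options6.VariableShadingRateTier < D3D12_VARIABLE_SHADING_RATE_TIER_2)
        return; // no Tier 2 VRS: keep shading at full rate

    // 2. The app chooses the base rate and how it combines with the
    //    per-primitive rate and the screen-space rate image.
    const D3D12_SHADING_RATE_COMBINER combiners[2] = {
        D3D12_SHADING_RATE_COMBINER_PASSTHROUGH, // base vs. per-primitive
        D3D12_SHADING_RATE_COMBINER_MAX,         // result vs. rate image
    };
    commandList->RSSetShadingRate(D3D12_SHADING_RATE_1X1, combiners);

    // 3. Tier 2 only: bind the image that says which screen tiles may be
    //    shaded coarsely (e.g. 2x2) this frame.
    commandList->RSSetShadingRateImage(shadingRateImage);
}
```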
 
I stand corrected a bit on RDNA scheduling — according to the ISA documentation, RDNA actually does track data dependencies in hardware, and does not require manually inserted wait states even though it now exposes the multi-cycle execution pipeline (unlike GCN).

So while I can’t rule out that it can do superscalar issue “by the book”, my two cents is that it probably doesn’t, since it can already issue from a huge pool of wavefronts even with each of them contributing just one instruction per clock.

From what I've seen, the one instruction per clock per wavefront rule still applies. A bunch of the wait states listed for prior generations may have been removed due to the addition of the second scalar unit and scheduler. Resources that used to be shared between multiple SIMDs were linked to only one, removing an area of contention between wavefronts.
The various waitcnt scenarios were another category of dependence tracking, and have generally grown in number versus the official table of wait states. However, there are a number of errata concerning instruction combinations or branch scenarios that get cited as requiring NOPs, stalls, or non-dependent instructions for 1-2 cycles for GFX10. AMD labels those as bugs, so perhaps RDNA2 fixes them, although the end result for Navi still means wait states.
 
From what I've seen, the one instruction per clock per wavefront rule still applies. A bunch of the wait states listed for prior generations may have been removed due to the addition of the second scalar unit and scheduler. Resources that used to be shared between multiple SIMDs were linked to only one, removing an area of contention between wavefronts.
The various waitcnt scenarios were another category of dependence tracking, and have generally grown in number versus the official table of wait states. However, there are a number of errata concerning instruction combinations or branch scenarios that get cited as requiring NOPs, stalls, or non-dependent instructions for 1-2 cycles for GFX10. AMD labels those as bugs, so perhaps RDNA2 fixes them, although the end result for Navi still means wait states.
I am referring to the general case for most VALU/SALU instructions, now that the “hazard-free” (ish) back-to-back issuing is gone in Navi. It is true that indefinite long-latency requests like VMEM or export still require explicit waiting, and that there are errata. But otherwise the arch does not require NOPs to be inserted by the compiler to fill the pipeline bubbles when two consecutive instructions have a data dependency. The tracking/scoreboarding apparently also covers registers used by e.g. VMEM instructions in limited scenarios.

That said, I think RDNA still relies on compile-time scheduling to minimise stalls, since there is no sign that it moves away from in-order execution.
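To illustrate the distinction in a toy way (a made-up in-order pipeline model, nothing like real RDNA hardware): with a hardware scoreboard, a dependent instruction simply stalls at issue until its sources are written back, instead of the compiler having to pad the gap with NOPs.

```cpp
// Made-up in-order pipeline model. It just shows the behaviour described
// above: the scoreboard delays a dependent instruction until its source
// registers become valid, so no compiler-inserted NOPs are needed.
#include <algorithm>
#include <cstdio>
#include <vector>

struct Instr { int dst, src0, src1, latency; };

int main()
{
    // v2 = v0 + v1 (3-cycle pipe), then v3 = v2 + v0: back-to-back RAW hazard.
    const std::vector<Instr> program = {
        {2, 0, 1, 3},
        {3, 2, 0, 3},
    };
    long cycle = 0;
    long ready[8] = {};  // cycle at which each register becomes valid
    for (const Instr& in : program) {
        // Hardware dependency tracking: stall issue until sources are ready.
        const long issue = std::max({cycle, ready[in.src0], ready[in.src1]});
        ready[in.dst] = issue + in.latency;
        printf("issued at cycle %ld, v%d ready at cycle %ld\n",
               issue, in.dst, ready[in.dst]);
        cycle = issue + 1;  // in-order, at most one instruction per clock
    }
}
```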
 
Perhaps I’m not understanding. The developer has to choose how VRS works; it’s not a completely fixed-function unit that just does work without being programmed to do it.

Yes, that is my point. I mean, inherently things are done differently by virtue of DX12 vs. whatever software backend Sony has. The key is that I don’t read anything in the HW description that seems beyond the tier 2 VRS standard, unless I missed something.
 
which metric makes more sense in your opinion?
Wouldn't ray-tri be better than rays shot in terms of understanding how much performance is available? You can shoot rays into nothingness and never get a hit-return on your intersection.
If I understand it correctly, AMD's test is basically shooting against a scene of a single triangle (or box).
Nvidia uses some form of average/common depth until hit, combining box tests and final triangle tests in a real-world scene.

So to get a guess at what performance we have with Microsoft's numbers for a scene, we need the average number of box intersections before polygon intersections and the average polygon count within the final AABB.
With those we could have some sort of guess at performance for rays, in the simplest case with coherent rays.

Performance in the case of incoherent rays, and all sorts of sorting etc. methods, needs proper testing with the same scenes.
Same goes for shading and possible traversal restrictions and so on.
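To make the estimation recipe concrete, a back-of-envelope sketch (all four inputs below are placeholder numbers for illustration, not measured figures for any GPU; plug in real averages once someone profiles a scene):

```cpp
// Back-of-envelope version of the estimate described above.
#include <cstdio>

int main()
{
    const double box_tests_per_sec = 380e9; // hypothetical box-test rate
    const double tri_tests_per_sec = 95e9;  // hypothetical triangle-test rate
    const double boxes_per_ray     = 24.0;  // avg box tests before a leaf
    const double tris_per_ray      = 4.0;   // avg triangles in the final AABB

    // A ray's cost is the sum of the time spent in each test type, so the
    // effective rate is a harmonic-style combination of the two throughputs.
    const double sec_per_ray = boxes_per_ray / box_tests_per_sec
                             + tris_per_ray  / tri_tests_per_sec;
    printf("~%.1f Grays/s (coherent rays, no shading)\n",
           1.0 / sec_per_ray / 1e9);
}
```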
 
I suppose this presentation makes Lockhart really come into focus.

The big cost killer for XSX is the SoC itself: as big as X1X's, but on a much more expensive process. The second biggest killer is GDDR6. XSX has a lot of it, and the cost of DRAM is dropping only very slowly now.

The two biggest savings with Lockhart are on the die (GPU and memory bus significantly reduced) and also the amount of GDDR6 (likely down by 37.5%, i.e. from 16 GB to 10 GB).

There probably couldn't be the XSX we're seeing here without Lockhart.
 
I don’t think that’s a necessary conclusion. The PS5 has a remarkably similar BOM (estimated within $40) and will in all likelihood go on to sell at a higher volume.

Exactly, MS don't want to be eating $40 per unit only to be wiped out in the market. If that was the plan XSX simply wouldn't exist IMO. But we're moving away from pure tech talk now (which is probably my fault!). :)

Back on topic, anyone want to take a guess at how much MS can shave off the XSX die for Lockhart? As we have a die shot for XSX, and we know roughly the TFLOPs and memory configuration, we can probably start making some rough guesses now. Time to crack open MS Paint and load up this:

[image: XSX die shot]
 
Exactly, MS don't want to be eating $40 per unit only to be wiped out in the market. If that was the plan XSX simply wouldn't exist IMO. But we're moving away from pure tech talk now (which is probably my fault!). :)

Back on topic, anyone want to take a guess at how much MS can shave off the XSX die for Lockhart? As we have a die shot for XSX, and we know roughly the TFLOPs and memory configuration, we can probably start making some rough guesses now. Time to crack open MS Paint and load up this:
Well, I guess they could reuse the whole chip to increase production output. With 56 CUs physically on the die, chips that don't make the XSX bin may still have at least 20 working CUs. This could greatly decrease production cost.
And then there might be another chip just for Lockhart with 22-24 CUs but the rest more or less unchanged (well, a smaller GDDR6 interface). So I would guess a ~30% smaller die size.
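As a sanity check on that ~30% figure, some very rough area math (the ~360 mm² XSX die size is the published figure, but the per-CU and per-PHY areas below are eyeballed guesses for illustration, not measurements from the die shot):

```cpp
// Very rough sanity check on the ~30% guess; inputs are mostly guesses.
#include <cstdio>

int main()
{
    const double xsx_die_mm2      = 360.4;   // published XSX SoC area
    const double mm2_per_cu       = 2.0;     // guess: one CU plus wiring share
    const double mm2_per_channel  = 5.0;     // guess: one 32-bit GDDR6 PHY slice
    const int    cus_removed      = 56 - 24; // 56 physical CUs -> ~24 for Lockhart
    const int    channels_removed = 10 - 4;  // 320-bit bus -> 128-bit bus

    const double saved = cus_removed * mm2_per_cu
                       + channels_removed * mm2_per_channel;
    printf("~%.0f mm^2, i.e. ~%.0f%% smaller\n",
           xsx_die_mm2 - saved, 100.0 * saved / xsx_die_mm2);
}
```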
 
Won't having two different SKUs affect the economies of scale?

It definitely could. Ideally the differences between XBSX and XBSS are about what XBSS does not have, so that you can run both motherboards through the same production line and simply not add additional chips at certain points of the line. Of course that would maximise your economy of scale on assembly and production, but it would hurt you in that every XBSS is now shipping with an m/b that is too feature-rich for the job and thus not that much cheaper. On the other hand, you can go with two separate lines and tailor both devices, leaving you with very little overlap in terms of assembly but perhaps a materially tighter BoM that lets you seriously undercut PS5 with XBSS.

The costs of getting these manufacturing choices wrong are very, very high (look at how many drastic PS3 revisions we got), so that's why I wasn't expecting a two-SKU launch from MS (I don't count not shipping an ODD in one model as a multiple-SKU strategy). It will be very interesting to see the degree of commonality between the two models, as in some way it's a sign of confidence in their strategy: low commonality means they are going hard on maximising that discount on XBSS; high commonality suggests they are unsure of what their manufacturing split will be in the real world and want flexibility to adjust the mix up and down with little manufacturing impact.
 
It definitely could. Ideally the differences between XBSX and XBSS are about what XBSS does not have, so that you can run both motherboards through the same production line and simply not add additional chips at certain points of the line. Of course that would maximise your economy of scale on assembly and production, but it would hurt you in that every XBSS is now shipping with an m/b that is too feature-rich for the job and thus not that much cheaper. On the other hand, you can go with two separate lines and tailor both devices, leaving you with very little overlap in terms of assembly but perhaps a materially tighter BoM that lets you seriously undercut PS5 with XBSS.

The costs of getting these manufacturing choices wrong are very, very high (look at how many drastic PS3 revisions we got), so that's why I wasn't expecting a two-SKU launch from MS (I don't count not shipping an ODD in one model as a multiple-SKU strategy). It will be very interesting to see the degree of commonality between the two models, as in some way it's a sign of confidence in their strategy: low commonality means they are going hard on maximising that discount on XBSS; high commonality suggests they are unsure of what their manufacturing split will be in the real world and want flexibility to adjust the mix up and down with little manufacturing impact.
XBSX is a two-board design - it is more expensive.
XBSS won't have it.
But of course they will have similarly designed parts (the PS3 with the EE part on the motherboard looked like it was torn from a PS2).
 
XBSX is a two-board design - it is more expensive.
XBSS won't have it.
But of course they will have similarly designed parts (the PS3 with the EE part on the motherboard looked like it was torn from a PS2).
Have they confirmed XBSS is a single-board design?
[image: split motherboard]
This looks like the APU + RAM modules are on board A and the I/O ports and SSD are on board B, but I can't find any imagery to confirm that.

If they do abandon the dual-board design then they are very confident in their strategy and will be going very hard for a lower price, as you have basically trashed every part of the XBSX physical design at that point and will be going custom on everything for the XBSS (excepting the APU, presumably).
 
Have they confirmed XBSS is a single-board design?
[image: split motherboard]
This looks like the APU + RAM modules are on board A and the I/O ports and SSD are on board B, but I can't find any imagery to confirm that.

If they do abandon the dual-board design then they are very confident in their strategy and will be going very hard for a lower price, as you have basically trashed every part of the XBSX physical design at that point and will be going custom on everything for the XBSS (excepting the APU, presumably).
There was a rumor that the S is not a tower.
Also, the X devkit looks like the Scorpio devkit (dunno if it has a single board).
There is no reason to make a not-very-power-hungry device with two boards.
 
Have they confirmed XBSS is a single-board design?
[image: split motherboard]
This looks like the APU + RAM modules are on board A and the I/O ports and SSD are on board B, but I can't find any imagery to confirm that.

If they do abandon the dual-board design then they are very confident in their strategy and will be going very hard for a lower price, as you have basically trashed every part of the XBSX physical design at that point and will be going custom on everything for the XBSS (excepting the APU, presumably).

MS have been managing the One S/X just fine, and they're arguably more divergent than the Series S/X. Same for PS4/Pro.
 