> It's the only option that makes sense. In choice of GPU, everyone is going wider, because efficiency is far, far higher at lower clock speeds and GPU workloads are inherently parallel. You'd be daft to go narrow and clocked high if you didn't have to.

I thought so too for a long time. But it really does not make sense.
Not so much that Cerny was lying as that he was trying to justify their choice to a fanbase that had just seen the relatively massive GPU Microsoft rolled out. And it worked, because the PS5 subreddit is now full of people parroting Cerny's comments as a way of saying the PS5 is somehow more powerful than the XSX.
For a given TF count, yes. It's the same as with CPUs: one core clocked at 8 GHz is preferable to 8 cores clocked at 1 GHz. A GPU capable of 12 TFs with 32 CUs would be better than a GPU capable of 12 TFs using 52 CUs. However, that's not possible, and we need width and parallelism to increase our processor performance. There's no way MS could have put 12 TFs into XBSX using, say, 36 CUs, so for their performance target, they went wide, as is the norm for GPUs. Note that 'CUs' isn't at all representative of the parallelism we're talking about here; we're talking thousands of stream/shader processors.

The Cherno (an ex-EA dev who has worked on a few game engines) seemed to agree that fast and narrow is better, or rather easier, for devs to work with and get the best from.
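To put numbers on the wide-versus-narrow trade-off: peak FP32 throughput is just CUs × shaders per CU × FLOPs per cycle × clock, so the same TF figure can be reached wide-and-slow or narrow-and-fast. A quick sketch, assuming RDNA's 64 shaders per CU and 2 FLOPs per cycle from fused multiply-add:

```python
# Peak FP32 throughput for an RDNA-style GPU:
# TFLOPS = CUs * shaders/CU * FLOPs/cycle * clock (GHz) / 1000
def tflops(cus, clock_ghz, shaders_per_cu=64, flops_per_cycle=2):
    return cus * shaders_per_cu * flops_per_cycle * clock_ghz / 1000

xsx = tflops(52, 1.825)   # wide and slow
ps5 = tflops(36, 2.23)    # narrow and fast
print(f"XSX: {xsx:.2f} TF, PS5: {ps5:.2f} TF")
```

With the real console figures (52 CUs at 1.825 GHz versus 36 CUs at 2.23 GHz) this lands at roughly 12.15 TF and 10.28 TF respectively, which is why the "same TF, fewer CUs" comparison above is hypothetical.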
I do feel for the Sony fans who don't care for BC; they are right that BC held them back from a higher-potential PS5. I have no idea what the future will hold for the next PS device, but this problem needs to be resolved.
> But where does the point of diminishing returns come into play?

At the knee of the power/performance scale for the given architecture on the given lithographic process.
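The knee exists because dynamic power scales roughly with V² × f, and voltage has to rise with frequency, so power grows super-linearly while throughput only grows linearly with clock. A toy model of that curve (the voltage and capacitance coefficients here are made-up illustrative values, not measurements of any real chip):

```python
# Dynamic power ~ C * V^2 * f, with V rising as f rises.
# All coefficients below are illustrative placeholders.
def power_watts(clock_ghz, v_base=0.9, v_per_ghz=0.15, cap_coeff=100):
    voltage = v_base + v_per_ghz * clock_ghz  # assumed linear V/f curve
    return cap_coeff * voltage**2 * clock_ghz

for f in (1.0, 1.5, 2.0, 2.5):
    print(f"{f} GHz -> {power_watts(f):.0f} W, {power_watts(f)/f:.0f} W per GHz")
```

The watts-per-GHz column climbs as the clock rises, which is the knee: each extra hertz costs more power than the last, so past some point it's cheaper to add width instead.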
I wonder to what degree Azure and/or XCloud factored into MS's ability to leverage a larger chip. My thinking here is that MS would have a use for chips that don't have power/thermal characteristics suitable for use in a console in their server installations and that this might reduce the pressure on yields somewhat.
Presumably a rejected chip with <52 active CUs may still be able to run 3 Xbox One instances for xCloud, instead of 4.
Bandwidth, memory requests. Power is the main bottleneck for performance, which is why we don't keep clocking higher. But provided you can, memory requests and bandwidth become the next issue: your compute is running so much faster than your memory arrives that it's not really doing much, and you'll run into other issues.

Sony could be working on a more software-oriented solution for the next PS (if one is coming before cloud takes over).
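The compute-outrunning-memory point is essentially the roofline model: attainable throughput is the lesser of the compute peak and bandwidth times the kernel's arithmetic intensity. A minimal sketch (560 GB/s is an XSX-like bandwidth figure, but the 4 FLOPs/byte kernel is an invented example):

```python
# Roofline model: a kernel can't run faster than either the compute peak
# or what the memory system can feed it (bandwidth * FLOPs-per-byte).
def attainable_tflops(peak_tflops, bandwidth_gbs, flops_per_byte):
    memory_limit = bandwidth_gbs * flops_per_byte / 1000  # GFLOPS -> TFLOPS
    return min(peak_tflops, memory_limit)

# Raising clocks lifts the compute peak but not bandwidth, so a
# memory-bound kernel (low FLOPs/byte) gains nothing from the bump:
print(attainable_tflops(10.0, 560, 4))   # bandwidth-bound
print(attainable_tflops(12.0, 560, 4))   # higher peak, same result
```

Both calls return the same 2.24 TF, which is the "compute waiting on memory" situation in the post above.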
But where does the point of diminishing returns come into play? I think DF took this up at one point: the higher you clock, the less gain you see in performance. That was regarding RDNA1, though (and basically all GPUs now).
It's the same with CPU overclocking. For example, on an i7 920, an OC from 2.67 to 3.2 GHz sees a relatively large boost; going from 3.2 to 3.4 GHz, less so, etc.
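Those diminishing returns fall out of an Amdahl-style split: only the compute-bound fraction of a workload scales with clock, while the memory-bound part doesn't move. A sketch using the i7 920 clocks above (the 30% memory-bound fraction is an arbitrary assumption for illustration, not a measured figure):

```python
# Amdahl-style model: only the compute-bound portion speeds up with clock.
def speedup(base_clock, new_clock, mem_bound_frac=0.3):
    compute_time = (1 - mem_bound_frac) / (new_clock / base_clock)
    return 1 / (mem_bound_frac + compute_time)

print(f"{speedup(2.67, 3.2):.3f}")  # 2.67 -> 3.2 GHz: sizeable gain
print(f"{speedup(2.67, 3.4):.3f}")  # 2.67 -> 3.4 GHz: little extra on top
```

Under these assumptions, the ~20% clock bump to 3.2 GHz yields about a 13% speedup, while the further 6% bump to 3.4 GHz adds only ~4% more, matching the pattern described above.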
> provided you don't mean lockhart is a scarlett derivative, I can't say I can get quite up to 4 devices.

I know 4 projects that use the APU from the XSX devices.
> There's no way MS could have put 12 TFs into XBSX using, say, 36 CUs, so for their performance target, they went wide, as is the norm for GPUs.

This is what I've said a few times; it's about the target TF that was required.
> This is what I've said a few times; it's about the target TF that was required.

If the bandwidth is available at a reasonable price point, it can be as high as 80 CUs for Big Navi. So HBM is going to be a requirement at that point.
Is 52 CUs particularly wide for 12 TF?
Is it running particularly slowly for 12 TF?
Forgetting it's an APU in a console, even for a discrete GPU?
Is Big Navi going to be less than 52 CUs due to these issues people talk about? (I doubt it.)
Anandtech's SSD article has me wondering how cheap the expansion cards will be. They probably don't need DRAM and can use the internal controller, so the cost is mostly NAND chips. They're not going to be using top-end chips either; they're poised to ride the cheaper end of the NAND market.
Yes, that could be possible. But I would predict that prices are not that far away from a really fast NVMe SSD, just because there is only one supplier.
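As a back-of-envelope on "the cost is mostly NAND": a DRAM-less card's bill of materials is dominated by the flash itself, so the card price mostly tracks the NAND market. Every price below is a hypothetical placeholder, not a sourced figure:

```python
# Hypothetical BOM sketch for a DRAM-less expansion card.
nand_price_per_gb = 0.10   # assumed $/GB for mid-tier TLC NAND
capacity_gb = 1000
controller_cost = 5.0      # assumed: no DRAM, modest controller
board_and_enclosure = 3.0  # assumed

nand_cost = nand_price_per_gb * capacity_gb
total = nand_cost + controller_cost + board_and_enclosure
print(f"NAND share of BOM: {nand_cost / total:.0%}")
```

Under these made-up numbers the NAND is over 90% of the bill of materials, which is why a single-supplier markup on top of it could still keep the card near fast-NVMe prices.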
I think you might still need the controller on the expansion card as something has to handle communication across the PCIe bridge. It's over my head for sure, but I get the feeling you couldn't just directly connect flash to the far end of a PCIe bridge.
Might be another reason why MS chose a relatively mainstream performance controller with modest power consumption, and went the custom firmware and decompression hardware route.