> It's the only option that makes sense. In choice of GPU, everyone is going wider, because efficiency is far, far higher at lower clock speeds and GPU workloads are inherently parallel. You'd be daft to go narrow and clocked high if you didn't have to.

I thought so too for a long time. But it really does not make sense.
Not so much that Cerny was lying as that he was trying to justify their choice to a fanbase that had just seen the relatively massive GPU Microsoft rolled out. And it worked, because the PS5 subreddit is now full of people parroting Cerny's comments as a way of saying the PS5 is somehow more powerful than the XSX.
For a given TF count, yes. It's the same as with CPUs: one core clocked at 8 GHz is preferable to 8 cores clocked at 1 GHz. A GPU capable of 12 TFs with 32 CUs would be better than a GPU capable of 12 TFs using 52 CUs. However, that's not possible, and we need width and parallelism to increase our processor performance. There's no way MS could have put 12 TFs into XBSX using, say, 36 CUs, so for their performance target, they went wide, as is the norm for GPUs. Note that 'CUs' isn't at all representative of the parallelism we're talking about here; we're talking thousands of stream/shader processors.

The Cherno (an ex-EA dev who has worked on a few game engines) seemed to agree that fast and narrow is better, or rather easier, for devs to work with and get the best from.
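To put numbers on the wide-versus-narrow trade-off: peak FP32 throughput is just CUs × shaders per CU × FLOPs per cycle × clock, so the same TF figure can be reached wide-and-slow or narrow-and-fast. A quick sketch, assuming RDNA's 64 shaders per CU and 2 FLOPs per cycle from fused multiply-add:

```python
# Peak FP32 throughput for an RDNA-style GPU:
# TFLOPS = CUs * shaders/CU * FLOPs/cycle * clock (GHz) / 1000
def tflops(cus, clock_ghz, shaders_per_cu=64, flops_per_cycle=2):
    return cus * shaders_per_cu * flops_per_cycle * clock_ghz / 1000

xsx = tflops(52, 1.825)   # wide and slow
ps5 = tflops(36, 2.23)    # narrow and fast
print(f"XSX: {xsx:.2f} TF, PS5: {ps5:.2f} TF")
```

With the real console figures (52 CUs at 1.825 GHz versus 36 CUs at 2.23 GHz) this lands at roughly 12.15 TF and 10.28 TF respectively, which is why the "same TF, fewer CUs" comparison above is hypothetical.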
I do feel for the Sony fans who don't care for BC; they are right that BC held them back from a higher-potential PS5. I have no idea what the future will hold for the next PS device, but this problem needs to be resolved.
> But where does the point of diminishing returns come into play?

At the knee of the power/performance scale for the given architecture on the given lithographic process.
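The knee exists because dynamic power scales roughly with V² × f, and voltage has to rise with frequency, so power grows super-linearly while throughput only grows linearly with clock. A toy model of that curve (the voltage and capacitance coefficients here are made-up illustrative values, not measurements of any real chip):

```python
# Dynamic power ~ C * V^2 * f, with V rising as f rises.
# All coefficients below are illustrative placeholders.
def power_watts(clock_ghz, v_base=0.9, v_per_ghz=0.15, cap_coeff=100):
    voltage = v_base + v_per_ghz * clock_ghz  # assumed linear V/f curve
    return cap_coeff * voltage**2 * clock_ghz

for f in (1.0, 1.5, 2.0, 2.5):
    print(f"{f} GHz -> {power_watts(f):.0f} W, {power_watts(f)/f:.0f} W per GHz")
```

The watts-per-GHz column climbs as the clock rises, which is the knee: each extra hertz costs more power than the last, so past some point it's cheaper to add width instead.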
I wonder to what degree Azure and/or XCloud factored into MS's ability to leverage a larger chip. My thinking here is that MS would have a use for chips that don't have power/thermal characteristics suitable for use in a console in their server installations and that this might reduce the pressure on yields somewhat.
Presumably a rejected chip with <52 active CUs may still be able to run 3 Xbox One instances for xCloud, instead of 4.
Bandwidth, memory requests. Power is the main bottleneck for performance, which is why we don't keep clocking higher. But provided you can, memory requests and bandwidth become the next issue: your compute is running so much faster than your memory arrives that it's not really doing much, and you'll run into other issues.

Sony could be working on a more software-oriented solution for the next PS (if one is coming before cloud takes over).
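The compute-outrunning-memory point is essentially the roofline model: attainable throughput is the lesser of the compute peak and bandwidth times the kernel's arithmetic intensity. A minimal sketch (560 GB/s is an XSX-like bandwidth figure, but the 4 FLOPs/byte kernel is an invented example):

```python
# Roofline model: a kernel can't run faster than either the compute peak
# or what the memory system can feed it (bandwidth * FLOPs-per-byte).
def attainable_tflops(peak_tflops, bandwidth_gbs, flops_per_byte):
    memory_limit = bandwidth_gbs * flops_per_byte / 1000  # GFLOPS -> TFLOPS
    return min(peak_tflops, memory_limit)

# Raising clocks lifts the compute peak but not bandwidth, so a
# memory-bound kernel (low FLOPs/byte) gains nothing from the bump:
print(attainable_tflops(10.0, 560, 4))   # bandwidth-bound
print(attainable_tflops(12.0, 560, 4))   # higher peak, same result
```

Both calls return the same 2.24 TF, which is the "compute waiting on memory" situation in the post above.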
But where does the point of diminishing returns come into play? I think DF took this up at one point: the higher you clock, the less gain you see in performance. That was regarding RDNA1, though (and basically all GPUs now).
It's the same with CPU overclocking. For example, on an i7 920, an OC from 2.67 to 3.2 GHz sees a relatively large boost; going from 3.2 to 3.4 GHz, less so, etc.
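Those diminishing returns fall out of an Amdahl-style split: only the compute-bound fraction of a workload scales with clock, while the memory-bound part doesn't move. A sketch using the i7 920 clocks above (the 30% memory-bound fraction is an arbitrary assumption for illustration, not a measured figure):

```python
# Amdahl-style model: only the compute-bound portion speeds up with clock.
def speedup(base_clock, new_clock, mem_bound_frac=0.3):
    compute_time = (1 - mem_bound_frac) / (new_clock / base_clock)
    return 1 / (mem_bound_frac + compute_time)

print(f"{speedup(2.67, 3.2):.3f}")  # 2.67 -> 3.2 GHz: sizeable gain
print(f"{speedup(2.67, 3.4):.3f}")  # 2.67 -> 3.4 GHz: little extra on top
```

Under these assumptions, the ~20% clock bump to 3.2 GHz yields about a 13% speedup, while the further 6% bump to 3.4 GHz adds only ~4% more, matching the pattern described above.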
> provided you don't mean lockhart is a scarlett derivative, I can't say I can get quite up to 4 devices.

I know 4 projects that use the APU from the XSX devices.
> There's no way MS could have put 12 TFs into XBSX using, say, 36 CUs, so for their performance target, they went wide, as is the norm for GPUs.

This is what I've said a few times; it's about the target TF that was required.
> This is what I've said a few times; it's about the target TF that was required.

If the bandwidth is available at a reasonable price point, it can be as high as 80 CUs for Big Navi. So HBM is going to be a requirement at that point.
Is 52 CUs particularly wide for 12 TF?
Is it running particularly slowly for 12 TF?
Forgetting it's an APU in a console, even for a discrete GPU?
Is Big Navi going to be less than 52 CUs due to these issues people talk about? (I doubt it.)
Anandtech's SSD article has me wondering how cheap the expansion cards will be. They probably don't need DRAM and can use the internal controller, so the cost is mostly NAND chips. They're not going to be using top-end chips either; they're poised to ride the cheaper end of the NAND market.
Yes, that could be possible. But I would predict that prices are not that far away from a really fast NVMe SSD, just because there is only one supplier.
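As a back-of-envelope on "the cost is mostly NAND": a DRAM-less card's bill of materials is dominated by the flash itself, so the card price mostly tracks the NAND market. Every price below is a hypothetical placeholder, not a sourced figure:

```python
# Hypothetical BOM sketch for a DRAM-less expansion card.
nand_price_per_gb = 0.10   # assumed $/GB for mid-tier TLC NAND
capacity_gb = 1000
controller_cost = 5.0      # assumed: no DRAM, modest controller
board_and_enclosure = 3.0  # assumed

nand_cost = nand_price_per_gb * capacity_gb
total = nand_cost + controller_cost + board_and_enclosure
print(f"NAND share of BOM: {nand_cost / total:.0%}")
```

Under these made-up numbers the NAND is over 90% of the bill of materials, which is why a single-supplier markup on top of it could still keep the card near fast-NVMe prices.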
I think you might still need the controller on the expansion card as something has to handle communication across the PCIe bridge. It's over my head for sure, but I get the feeling you couldn't just directly connect flash to the far end of a PCIe bridge.
Might be another reason why MS chose a relatively mainstream performance controller with modest power consumption, and went the custom firmware and decompression hardware route.