Current Generation Hardware Speculation with a Technical Spin [post launch 2021] [XBSX, PS5]

I think you may have accidentally used 'IF' (Infinity Fabric) instead of 'IC' (Infinity Cache). Infinity Cache is the big GPU L3; Infinity Fabric is an interconnect between various components, used on both AMD GPUs and CPUs, e.g. connecting the two CCXs on a Zen 2 chiplet, or connecting one chip in a multi-chip module to another.

If you're talking about Infinity Cache, then I'd agree with everything you're saying.

There are an awful lot of acronyms out there. And if you dig into one you normally find out it's made up of several more. It's acronyms all the way down ....

I think you mean Infinity Cache. Infinity Fabric is something else and is present in more or less every AMD SoC.

Yea yea, I don't know what I was thinking there. lol Well aware it's Infinity Cache! Obviously had Infinity Fabric on the brain for some reason.
 
Locuza has released a twitter thread on his interpretation of those recently released high quality PS5 die shots.

28 tweets in the thread, this is just the first (the "128-bit FPU on CPU" is from an old tweet, don't get excited!)


Pretty much the same conclusions as from earlier shots ... but now with amazing photos! Worth a couple of minutes if you're interested in that sort of thing.

Reminded me of the different render back ends in PS5 (RDNA 1 like) and XS (RDNA 2 like). Mulling over a couple of thoughts on that front. I think they could explain one or two things we've seen ... maybe.

PS5 seems to have twice the depth ROPs, and redundant units to boot. Big additional silicon cost vs XSX, and you lose a (potentially) useful RDNA 2 feature. But I think there could be cases - particularly edge cases - where it nets the PS5 a not insignificant advantage. I'll try and think of an example to show what I'm getting at, and where we might see something like it.

Anyway....

 

His (and others') analysis of both APUs showed that:

- The PS5 CPU does not (really) have a cut-down FPU; allegedly only a few exotic instructions are missing, but the native AVX 256 instructions should be there (see the sketch after this list). Cerny made a very precise statement about PS5 supporting native AVX 256 instructions, and he doesn't talk in riddles or PR speak:
PlayStation 5 is especially challenging because the CPU supports 256-bit native instructions that consume a lot of power.

- Xbox Series X has heavily cut-down ROPs (about half the size of the PS5 ROPs; PS5 has double the Z/stencil ROPs), which could explain plenty of the "weird", "unexplainable" framerate drops in many games from launch until now (the latest being Little Nightmares II; almost all of those games use DRS, across different engines).

- Both Xbox and PS5 have an "RDNA 1.5" Primitive Unit / rasterizer structure (well, RDNA 1 actually), not RDNA 2.
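
To make the first point concrete, here's a minimal sketch (mine, not Locuza's or Cerny's) of the kind of 256-bit "native" AVX2/FMA code being referred to. On Zen 2 each of these instructions occupies a full-width 256-bit FP pipe, which is why a tight loop of them is about the worst case for FPU power draw:

Code:
#include <immintrin.h>  // AVX2 / FMA intrinsics
#include <cstddef>

// acc[i] += a[i] * b[i], eight floats per iteration.
// Needs a CPU with AVX2 + FMA (compile with -mavx2 -mfma, or /arch:AVX2 on MSVC).
void fma_accumulate(const float* a, const float* b, float* acc, std::size_t n)
{
    std::size_t i = 0;
    for (; i + 8 <= n; i += 8) {
        __m256 va = _mm256_loadu_ps(a + i);
        __m256 vb = _mm256_loadu_ps(b + i);
        __m256 vc = _mm256_loadu_ps(acc + i);
        _mm256_storeu_ps(acc + i, _mm256_fmadd_ps(va, vb, vc)); // 256-bit fused multiply-add
    }
    for (; i < n; ++i)  // scalar tail
        acc[i] = a[i] * b[i] + acc[i];
}

Physics and animation batches in game engines do hit loops like this, which is why "how often is it actually sustained" matters for the power/clock discussion further down.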
 
His (and others') analysis of both APUs showed that:

- The PS5 CPU does not (really) have a cut-down FPU; allegedly only a few exotic instructions are missing, but the native AVX 256 instructions should be there. Cerny made a very precise statement about PS5 supporting native AVX 256 instructions, and he doesn't talk in riddles or PR speak.

Yeah, Cerny is a cool guy. I tried in my last post to point out that Locuza was not saying that the PS5 APU was 128-bit here, and that what appeared in the tweet was a re-tweet of an old comment he would later elaborate on.

Personally, I like the idea that the PS5 FPU is able to be smaller due to thermal density. PS5 has a pre-determined (and uniform across all consoles) relationship between activity and heat/power, and it's at a lower threshold than other Zen 2 units. So why not take advantage of that!
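
To illustrate what I mean by a pre-determined relationship between activity and power, here's a toy of my own (not Sony's actual algorithm, and every coefficient in it is invented): power is estimated from activity counters via a fixed model, and the clock steps down deterministically whenever the estimate would exceed the budget. Same workload, same clock, on every console, regardless of ambient temperature or silicon quality.

Code:
#include <cstdio>

// Toy model: estimate CPU power from workload "activity" and pick the highest
// clock that keeps the estimate under a fixed budget. All numbers invented.
struct Activity {
    double fpu;   // 0..1, how hard the 256-bit FP units are being pushed
    double rest;  // 0..1, everything else in the core
};

double estimated_cpu_power_w(const Activity& a, double ghz)
{
    const double fpu_w_per_ghz  = 9.0;  // invented: FPU contribution at full activity
    const double rest_w_per_ghz = 7.0;  // invented: rest of the core at full activity
    return ghz * (a.fpu * fpu_w_per_ghz + a.rest * rest_w_per_ghz);
}

double pick_clock(const Activity& a, double budget_w, double max_ghz)
{
    double f = max_ghz;
    while (f > 1.0 && estimated_cpu_power_w(a, f) > budget_w)
        f -= 0.05;  // step the clock down until the modelled power fits
    return f;
}

int main()
{
    const double budget_w = 48.0;  // invented CPU share of the SoC budget
    Activity typical{0.3, 0.8};
    Activity avx_heavy{1.0, 0.8};
    std::printf("typical workload:   %.2f GHz\n", pick_clock(typical,   budget_w, 3.5));
    std::printf("AVX-heavy workload: %.2f GHz\n", pick_clock(avx_heavy, budget_w, 3.5));
}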

- Xbox Series X has heavily cut-down ROPs (about half the size of the PS5 ROPs; PS5 has double the Z/stencil ROPs), which could explain plenty of the "weird", "unexplainable" framerate drops in many games from launch until now (the latest being Little Nightmares II; almost all of those games use DRS, across different engines).

I've been thinking about that, and it's not that the XSX ROPs are "cut down"; it's that with RDNA 2 AMD decided to use the die area elsewhere. But that doesn't mean the place where the die area had previously been spent (PS5 / RDNA 1) is irrelevant. If you've got it, and you can use it, there will be times when it shows.

It's a mistake IMO to think that one balance is universally better than another, or that the shift from the old balance to the new one isn't gradual. The RDNA 2 RBE is a best guess at the future, but it's not a "best in all cases" thing.
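
To put a (very) rough number on "there will be times when it shows": taking the doubled Z/stencil rate claim at face value (the per-clock figure is the thread's claim, not confirmed; the clocks are the public ones), peak depth/stencil throughput would differ by about 2 x 2.23 / 1.825 ≈ 2.4x in PS5's favour. A fill-bound, depth-only workload - shadow map rendering, a depth pre-pass, stencil volumes - is exactly the kind of edge case where a gap like that could surface, even though the two machines look close on paper elsewhere.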

- Both Xbox and PS5 have an "RDNA 1.5" Primitive Unit / rasterizer structure (well, RDNA 1 actually), not RDNA 2.

It's more nuanced than that, I think. XSX supports mesh and amplification shaders, and AFAIK PS5 doesn't, so they aren't on the same level of progression. XSX branched later, IMO. But the significance of that is yet to be proven.

Neither is as advanced as the latest AMD products, and we shouldn't get too held up on '1' vs '2' (not saying you personally are!).
 
Neither is as advanced as the latest AMD products, and we shouldn't get too held up on '1' vs '2' (not saying you personally are!).
Huh? What exactly is XSX lacking compared to Navi2x? Other than one extra layer of cache which is architecturally irrelevant.
 
Huh? What exactly is XSX lacking compared to Navi2x? Other than one extra layer of cache which is architecturally irrelevant.

In terms of features it's bang on (as far as we can tell), but the front end is structured differently to the block diagrams for PC RDNA 2 parts that released later. So it's probably older.

Truth is that GPUs are modular and in constant development. Modules are ready at different times. Consoles don't have to fit in exactly with PC product roadmaps, and vice versa.

As I said, PC components aren't necessarily the be-all and end-all for non-PC components that exist on a different part of the (branched) roadmap.
 
You can't really read details like that from highly stylized artist pieces vs actual die shots. Even if there is a difference, it could just be a layout thing with the exact same functionality, to fit the SoC better.
 
You can't really read details like that from highly stylized artist pieces vs actual die shots. Even if there is a difference, it could just be a layout thing with the exact same functionality, to fit the SoC better.

The differences in the front end seem architectural rather than layout IMO. Don't know how significant those changes are tho...
 
The differences in the front end seem architectural rather than layout IMO. Don't know how significant those changes are tho...
What makes you think that? I mean seriously, we have heavily stylized representations by artists for Navi2x, not die shots.
 
What makes you think that? I mean seriously, we have heavily stylized representations by artists for Navi2x, not die shots.

Sorry @Kaotik, I was rushing during an online game and also very drunk last night. Here's a slide from Locuza's thread on the different systems.

[Image: comparison slide from Locuza's thread]

I got it backwards last night. But yeah, I get your point that this isn't the same as having some clever folks actually identify the parts on a good die shot, so some degree of withholding judgement might be a good idea. Fair point.

I'll see if I can find where he got this info from.
 
Well, the official block diagrams for PC RDNA 1 very definitely show 2 Primitive Units and 2 Rasterisers per Shader Engine (presumably 1 each per Shader Array), but only 1 Primitive Unit and 1 Rasteriser per Shader Engine in PC RDNA 2.

I'd assume this is exactly the kind of detail that these block diagrams are supposed to convey, so I'll roll with this for now.


So they do seem to have been elevated from the SA level up to the SE level for PC RDNA 2. Prolly allows for better utilisation of the overall resource.
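
Quick worked example of why the move could help (numbers invented): suppose that over some window one shader array ends up with 70% of the triangles. Two per-SA units, each with rate r, finish in 0.7·W/r because one of them sits partly idle; a single SE-level unit with rate 2r serving both arrays finishes in 0.5·W/r. Same total throughput on paper, but roughly 30% better in the unbalanced case, which is presumably the kind of utilisation win AMD was after.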
 
- The PS5 CPU does not (really) have a cut-down FPU; allegedly only a few exotic instructions are missing, but the native AVX 256 instructions should be there.
I haven't heard of missing 256-bit instructions. What Sony+AMD apparently did was just use high-density transistor libraries instead of high-performance ones. Heat density when running 256-bit FP instructions @ iso clocks should be considerably greater compared to PC/Series Zen 2 APUs, but it seems those are pretty rare in gaming workloads, so even if it pushes the CPU clocks down (like Cerny suggested) it should result in a negligible performance difference.

Perhaps another interesting question is why Microsoft opted out of these area savings, considering the rising costs of high end nodes. They do need to use the Series X SoC on Azure servers, but is that the case with the Series S?
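
Back-of-envelope on the heat density point (the area figure is invented, just to show the shape of it): if the dense-library FPU draws the same watts under sustained 256-bit FMA but occupies, say, 30% less area, the local heat density rises by 1 / 0.7 ≈ 1.4x. That's the sort of hotspot you either budget for with clocks (Sony's route, apparently) or avoid by spending the area (Microsoft's, apparently).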
 
I haven't heard of missing 256-bit instructions. What Sony+AMD apparently did was just use high-density transistor libraries instead of high-performance ones. Heat density when running 256-bit FP instructions @ iso clocks should be considerably greater compared to PC/Series Zen 2 APUs, but it seems those are pretty rare in gaming workloads, so even if it pushes the CPU clocks down (like Cerny suggested) it should result in a negligible performance difference.

Perhaps another interesting question is why Microsoft opted out of these area savings, considering the rising costs of high end nodes. They do need to use the Series X SoC on Azure servers, but is that the case with the Series S?
They can also get away with higher heat density thanks to the liquid metal cooling.

Edit: Corrected
 
At Hot Chips, MS said that the cap on their CPU frequency was heat / cooling noise when the AVX units were under heavy stress. And they guarantee never to throttle - uniform clocks were a design decision.

So MS's heat / noise threshold was 3.8 GHz with what are quite possibly less densely packed FPUs than in PS5.

During Road to PS5, Cerny said (IIRC) that they had trouble guaranteeing 3 GHz with static clocks. I think, as with MS, they had probably hit a heat / noise threshold with the FPUs under stress. 256-bit instructions and their associated power did seem to be what Cerny was associating with the variable clocks. So Sony and MS may well have taken different paths in part because of the FPU heat density.

Perhaps another interesting question is why Microsoft opted out of these area savings, considering the rising costs of high end nodes. They do need to use the Series X SoC on Azure servers, but is that the case with the Series S?

Series S has static gaming clocks too, and probably needs to in order to parallel XSX development. And going by Cerny's comments, if 3 GHz was difficult for PS5 to guarantee, I wonder how low MS would have needed to set their static clocks given the simpler / cheaper cooler in Series S?

If Series S had to be set at 2.8 GHz (or whatever) to accommodate a more densely packed FPU, I think that would have been a problem.
 