Current Generation Hardware Speculation with a Technical Spin [post GDC 2020] [XBSX, PS5]

When we get to 32 MB, it's then ESRAM ;)
Let's just call it what it is... XBO backwards compatibility.

That's likely how all the Xbox One first party titles will manage to run on PCs!
 
indeed. they are based on RDNA 2. But custom means you can add/remove blocks.

In this case, it looks like they either opted to remove infinity cache, or if it's present it's smaller.
This is why I was trying to understand what infinity cache was the other day.
As it's possible all/some consoles use it, just not the same size.
i.e. the cache architecture and logic could be infinity based.

Also they could easily consider infinity cache to be part of the memory interface and not RDNA2 feature set.
So you could have RDNA2 laptop implementation without infinity cache using GDDR5 or something. Yet it would be considered full RDNA2.
 
This is why I was trying to understand what infinity cache was the other day.
As it's possible all/some consoles use it, just not the same size.
i.e. the cache architecture and logic could be infinity based.

Also they could easily consider infinity cache to be part of the memory interface and not RDNA2 feature set.
So you could have RDNA2 laptop implementation without infinity cache using GDDR5 or something. Yet it would be considered full RDNA2.
I mean... with the recent revelation that textures can be streamed into the GPU directly, bypassing VRAM.

I do question if that means textures can go straight into something like an infinity cache.
The infinity cache just works to retain one copy of each thing. The L2 will check if the IC has it; if not, it requests from the memory controllers?
And perhaps there are APIs you can just dump stuff directly into the IC?

I dunno... seems far-fetched and too wishful
 
indeed. they are based on RDNA 2. But custom means you can add/remove blocks.

In this case, it looks like they either opted to remove infinity cache, or if it's present it's smaller.
Feature-wise, aside from infinity cache, which is more like bandwidth augmentation, it's got all the same features.

Yeah, in terms of the feature set available to developers it's got everything + more. It's probably not even exposed to PC developers.

Infinity cache, as you say, is about bandwidth augmentation, but it wouldn't benefit (or would benefit much less) a system that already has a wider bus than the 6900 XT but has significantly less bandwidth consumption.
 
Xbox Series X has full RDNA2 integration, plus it has Sampler Feedback Streaming exclusively; for developers, it's a real game changer.

What is the full RDNA2 feature set? Or rather, what bits of the feature set is the PS5 missing? I think it was discussed somewhere else in this thread that SFS is nice/good to have, but not a big game changer.
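For context on what Sampler Feedback actually buys a developer, here's a rough conceptual sketch in Python. None of this is the real D3D12 API; FeedbackMap, TileStreamer and the print stand-in are made-up names purely to illustrate the idea: the GPU records which texture tiles were actually sampled, and the streaming side only keeps those resident.

```python
# Conceptual sketch of sampler-feedback-driven texture streaming.
# NOT the D3D12 API; every name here is hypothetical and only
# illustrates the idea of "only stream what was actually sampled".

class FeedbackMap:
    """Filled in on the GPU while rendering: which (mip, tile)
    regions of a texture were actually sampled this frame."""
    def __init__(self):
        self.sampled = set()

    def record(self, mip, tile_xy):
        self.sampled.add((mip, tile_xy))

class TileStreamer:
    """Keeps only the tiles the feedback map says are needed."""
    def __init__(self):
        self.resident = set()   # tiles currently in GPU memory

    def update(self, feedback):
        # Stream in tiles that were sampled but aren't resident yet.
        for tile in feedback.sampled - self.resident:
            print(f"stream in {tile}")   # stand-in for the real IO path
            self.resident.add(tile)
        # Drop tiles nothing sampled, freeing memory for other assets.
        self.resident &= feedback.sampled

# One simulated frame: the renderer only touched two tiles of mip 0,
# so only those two ever get streamed in.
fb = FeedbackMap()
fb.record(0, (3, 7))
fb.record(0, (3, 8))
TileStreamer().update(fb)
```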
 
Yeah, in terms of the feature set available to developers it's got everything + more. It's probably not even exposed to PC developers.

Infinity cache, as you say, is about bandwidth augmentation, but it wouldn't benefit (or would benefit much less) a system that already has a wider bus than the 6900 XT but has significantly less bandwidth consumption.
I guess for me, the question is if they bypass VRAM for SFS streaming as per Beard Man's comments and it's going directly to the GPU.
Well then... where is it going?
L2 is connected to the memory controllers...
L1 serves the Shader Arrays...
L0 serves the CUs...
So where are the textures being dumped? How do we quickly distribute that incoming data to all the shader arrays that require it?

If there was a cache of some size, not necessarily an L3, whose purpose is to hold one copy of everything and which L1 or L2 could check before going out to memory... then perhaps this setup might make sense even in a smaller configuration, even something as small as ESRAM.
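To make that concrete, here's a toy model of the lookup path I'm speculating about. This is pure speculation, not how any documented RDNA2 part works: L0 per CU, L1 per shader array, L2 in front of the memory controllers, plus a hypothetical shared last-level cache that holds the single copy everyone checks before going out to DRAM, and which streamed data could be dumped into directly.

```python
# Toy model of the speculated lookup path. The "LLC" level is the
# hypothetical infinity-cache-like block; sizes and policies are illustrative.
from collections import OrderedDict

class Cache:
    def __init__(self, name, capacity):
        self.name = name
        self.capacity = capacity
        self.lines = OrderedDict()            # address -> data, in LRU order

    def lookup(self, addr):
        if addr in self.lines:
            self.lines.move_to_end(addr)      # refresh LRU position on a hit
            return self.lines[addr]
        return None

    def fill(self, addr, data):
        if len(self.lines) >= self.capacity:
            self.lines.popitem(last=False)    # evict the least recently used line
        self.lines[addr] = data

def read(addr, levels, dram):
    """Walk L0 -> L1 -> L2 -> shared LLC; only touch DRAM if every level misses."""
    for cache in levels:
        data = cache.lookup(addr)
        if data is not None:
            return data, cache.name
    data = dram[addr]
    for cache in reversed(levels):            # fill every level on the way back
        cache.fill(addr, data)
    return data, "DRAM"

def stream_into_llc(addr, data, llc):
    """The 'dump streamed textures straight into the big cache' idea:
    the data lands in the shared LLC without ever sitting in DRAM."""
    llc.fill(addr, data)

# Usage: first read misses everywhere and goes to DRAM, the second hits in L0,
# and a streamed tile placed in the LLC is found without touching DRAM at all.
levels = [Cache("L0", 4), Cache("L1", 16), Cache("L2", 64), Cache("LLC", 256)]
dram = {0x100: "tile A"}
print(read(0x100, levels, dram))     # ('tile A', 'DRAM')
print(read(0x100, levels, dram))     # ('tile A', 'L0')
stream_into_llc(0x200, "tile B", levels[-1])
print(read(0x200, levels, dram))     # ('tile B', 'LLC')
```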
 
I think there might be some conflation between RDNA2's instruction set (which is important for developers) and RDNA2's architectural enhancements for power saving and increased clocks (which are transparent to developers).

It seems to me the SeriesX has the full RDNA2 instruction set of the PC cards, whereas the PS5 is more custom. The SeriesX has no Infinity Cache because it's using a wider external bus to the GDDR6, and (perhaps as a result) it's not clocking anywhere near the desktop GPUs. The PS5 does clock close to the desktop parts and it has a narrower bus to GDDR6, so it might have a similar solution to Infinity Cache.

Regardless, it makes sense that the SeriesX is the most feature-set compatible with the desktop cards, considering they're both targeting the same DX12 Ultimate API.
 
It seems to me the SeriesX has the full RDNA2 instruction set of the PC cards, whereas the PS5 is more custom. The SeriesX has no Infinity Cache because it's using a wider external bus to the GDDR6, and (perhaps as a result) it's not clocking anywhere near the desktop GPUs.
The 6800 has an 1800 MHz game clock.
That's an exact fit for the XSX clocking in at 1825 MHz. I'm sure if they had decided on variable clocks it would match the 6800 quite well. This is likely a sweet spot for it around the 56 CU range.

And a ton of cards have a 256-bit bus. You're making it seem like this is anemic suddenly.
You're thinking a 300 mm² chip with 36 CUs has the same bandwidth requirements as the 60/72/80 CU GPUs?

Check this picture out.
It's 360 mm².
Remove the 16 CUs and 2 memory controllers, and you lose about 60 mm². Then find a way to put in a 100 mm², 128 MB cache.

[Image: die_shot.png]
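Quick back-of-envelope on those figures. This assumes the cache area scales linearly from the rough 128 MB ≈ 100 mm² estimate above, which real SRAM plus its control logic wouldn't do exactly, so treat it as ballpark only:

```python
# Linear scaling of cache area from the assumed 128 MB ~= 100 mm² figure.
# Real density would differ (control logic doesn't shrink linearly),
# so these are ballpark numbers only.
AREA_PER_MB = 100 / 128   # mm² per MB, from the 128 MB ~ 100 mm² estimate

for size_mb in (128, 64, 32):
    print(f"{size_mb:>3} MB cache ~ {size_mb * AREA_PER_MB:.0f} mm²")

# 128 MB ~ 100 mm², 64 MB ~ 50 mm², 32 MB ~ 25 mm² -- even the smaller
# options are a big chunk of a ~300 mm² console APU budget.
```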
 
I guess for me, the question is if they bypass VRAM for SFS streaming as per Beard Man's comments and it's going directly to the GPU.
Well then... where is it going?
L2 is connected to the memory controllers...
L1 serves the Shader Arrays...
L0 serves the CUs...
So where are the textures being dumped? How do we quickly distribute that incoming data to all the shader arrays that require it?

If there was a cache of some size, not necessarily an L3, whose purpose is to hold one copy of everything and which L1 or L2 could check before going out to memory... then perhaps this setup might make sense even in a smaller configuration, even something as small as ESRAM.
I'm still not sure what Mr Beard Guy fully meant.
But going straight to the GPU and bypassing memory wouldn't necessarily mean bypassing any caches.
Could just mean putting it straight into the infinity cache.

I'm not saying I think it has infinity cache. I don't know; I don't have a view, as I don't know what it actually means, only what it results in for the dGPUs.
 
I'm still not sure what Mr Beard Guy fully meant.
But going straight to the GPU and bypassing memory wouldn't necessarily mean bypassing any caches.
Could just mean putting it straight into the infinity cache.

I'm not saying I think it has infinity cache. I don't know; I don't have a view, as I don't know what it actually means, only what it results in for the dGPUs.
He said bypass memory.

That leaves me to guess where it could go. I agree, I would like more detail on how this is accomplished.
 
The 6800 has an 1800 MHz game clock.
Boosting up to 2105 MHz and probably averaging a lot closer to that value, if the RDNA1 cards are anything to go by.

And a ton of cards have a 256-bit bus. You're making it seem like this is anemic suddenly.
VRAM bandwidth-per-compute is indeed anemic on Big Navi compared to its closest relative that we know, Navi 10.
What is weird here is your persistence in claiming it's not, considering this was one of AMD's own talking points when presenting the new cards today.


You're thinking a 300 mm² chip with 36 CUs has the same bandwidth requirements as the 60/72/80 CU GPUs?
Quotation needed.
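For what it's worth, here's the bandwidth-per-compute gap being argued about, using the publicly quoted peak figures (approximate; the Series X line uses its 560 GB/s 10 GB pool, and Infinity Cache is deliberately left out, since that's the whole point of contention):

```python
# GB/s of raw GDDR6 bandwidth per peak FP32 TFLOP.
# Figures are the publicly quoted peak numbers (approximate).
gpus = {
    "RX 5700 XT (Navi 10)": (448, 9.75),
    "RX 6800 XT (Navi 21)": (512, 20.7),
    "RX 6900 XT (Navi 21)": (512, 23.0),
    "Xbox Series X":        (560, 12.15),
    "PlayStation 5":        (448, 10.28),
}

for name, (gbps, tflops) in gpus.items():
    print(f"{name:<22} {gbps / tflops:5.1f} GB/s per TFLOP")

# Big Navi ends up around 22-25 GB/s per TFLOP vs ~46 for Navi 10,
# which is exactly the gap the Infinity Cache is there to cover.
```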
 
I dunno. Maybe, maybe not. I knew that Sony wouldn't have stuff like SFS exactly because that's MS's nomenclature for that particular technology, which would be their own implementation. It's not that Sony may not have their own version of it, but if so they'd likely have indicated it back in March, where it was pertinent.

Thinking on it for a bit tho, one thing that does stand out to me is how deeply ingrained DX12U is into the RDNA2 architecture. It seems to go a ways beyond simply AMD making their product and then MS trying to ensure their stuff is as compatible as possible afterwards. No, this is looking like it was a deep co-operation between them from the get-go. Which, there were already rumors regarding that (tbf, there were also rumors that Sony co-designed Navi along with AMD)... but some of the stuff mentioned today shows it's deeper than even first thought.

DX12U has probably been in the planning for several years, with the likes of Nvidia and AMD talking to MS about what they'd like to provide and what would sell to their customers, and MS talking about what features they and their customers would see as being valuable in the future.

It makes absolute sense that RDNA2 is designed to support the next significant iteration of DX, but that doesn't mean that these features only exist because of what MS wanted for DX12U. VRS for example makes sense whatever the API, as it's a great way to save on performance (a rough estimate of the potential saving is sketched at the end of this post) - Nvidia had the capability years before DX12U was published and they'll have been in on the discussion for a long time.

If Sony don't have a form of VRS it's almost certainly not because it wouldn't benefit them (the principle is universal), it's because other factors (perhaps when they branched out from the RDNA line of technologies) meant it wasn't there for them. It would be the same for hardware functionality enabling what MS call Sampler Feedback, if it turns out Sony don't have a similar feature (which they still might, of course!).

Like, it's to a point where I now see that whatever customizations Sony's done on their GPU might've been out of sheer necessity to make sure they had the silicon needed for their API stacks, which clearly aren't DX12-based (because they have their own API stuff). Maybe there is a chance that in the future AMD spins off an RDNA GPU card that is more compatible with Vulkan, if what they're going to be putting out right now turns out to not be too Vulkan-friendly?

Vulkan and DX12 have pretty similar goals, so it's unlikely that you'd target hardware specifically at Vulkan. If something's good for graphics (e.g. VRS) it should be good regardless of API. And if the API isn't good for what's good for graphics, then the API needs to change.

I think the same will be true for PS5 and its API. One API might be a little better than others for certain things, but normally if there's a feature that an API has that's worthwhile, it'll be adopted elsewhere over time. Ultimately the job of any API is to abstract hardware, and so with PS5 the starting point of development is unlikely to have been "this is our API" but "what do we want the hardware to do, what hardware does AMD have, and what customisations are desired and possible within our budget / timeframe".
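The rough VRS estimate mentioned above: if a frame at 4K shaded, say, 40% of the screen at a 2x2 coarse rate (the 40% is a number I've made up purely for illustration), the pixel-shading work drops by about 30%:

```python
# Rough estimate of pixel-shading work saved by 2x2 coarse shading.
# The 40% coverage fraction is an arbitrary assumption for illustration.
width, height = 3840, 2160
full_rate_pixels = width * height                 # one PS invocation per pixel

coarse_fraction = 0.40                            # assumed share of screen at 2x2 rate
coarse_invocations = full_rate_pixels * coarse_fraction / 4   # one invocation per 2x2 block
fine_invocations = full_rate_pixels * (1 - coarse_fraction)

total = coarse_invocations + fine_invocations
saving = 1 - total / full_rate_pixels
print(f"Invocations: {total:,.0f} vs {full_rate_pixels:,.0f} ({saving:.0%} less shading work)")
```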
 
Game clocks are AMD's expected values.
So it won't be holding closer to boost values.
It's expected to hold at game clock values, as per their own statements on what game clock is.

VRAM bandwidth-per-compute is indeed anemic on Big Navi compared to its closest relative that we know, Navi 10.
What is weird here is your persistence in claiming it's not, considering this was one of AMD's own talking points when presenting the new cards today.
A 256-bit bus is used for cards all the way up to the 2080, at 10 TFs.
For 36 CUs it's not anemic.

Quotation needed.
What do you want the quotation on? 300 mm²? Or that you are implying that 128 MB is on the PS5? If the latter, okay, you said similar.
Fair enough.
But even 64 MB is still far too large; it's going to take up 50 mm².
 
If Sony don't have a form of VRS it's almost certainly not because it wouldn't benefit them (the principle is universal), it's because other factors (perhaps when they branched out from the RDNA line of technologies) meant it wasn't there for them. It would be the same for hardware functionality enabling what MS call Sampler Feedback, if it turns out Sony don't have a similar feature (which they still might, of course!).
One of MS's quotes was about waiting until they had access to the full RDNA2 feature set.
But I'm not reading too much into that, as that doesn't by extension mean that Sony didn't.

My initial guess is SF, and then MS added SFS. It'd be nice if DF or someone could get a clear answer to this though.
 
"In our quest to put gamers and developers first we chose to wait for the most advanced technology from our partners at AMD before finalizing our architecture."

This to me strongly implies that Sony just chose to finalize their hardware sooner than MS...

Seems unlikely that they'd have a custom equivalent to some of these things before AMD themselves had them... Now what that ultimately means in practice is another thing.
 