Current Generation Hardware Speculation with a Technical Spin [post GDC 2020] [XBSX, PS5]

function · Oct 28, 2020

iroboto said:
I guess for me, the question is if they bypass VRAM for SFS streaming as per Beard Man's comments and it's going directly to the GPU.
Well then... where is it going?
L2 is connected to memory controllers...
L1 is the Shader Arrays
L0 is the CUs...
so where are the textures being dumped? How do we quickly distribute that incoming data to all the shader arrays that require it?

If there were a cache of some size (not an L3) whose purpose is to hold one copy of everything, so that L1 or L2 can check it before going out to memory, then perhaps this setup would make sense even in a smaller configuration, even something as small as ESRAM.

This is an interesting question. I don't know enough about this even in general, but thinking about it, I suppose if you are indeed treating it like it's in vram (which has been mentioned a few times) your IO unit would pass the data to whatever requested it, in as similar a manner as possible to how vram is accessed. So maybe you evict something from some level of cache and dump it there. For textures I suppose L1 would make most sense?

Those "SoC memory coherency" things on the die shot seem kind of beefy, and there seems to be one per shader engine. I'll point my finger at them and say based on nothing in particular that they manage the job.

I suppose you'd have to be able to choose whether data was copied into vram afterwards, or simply used and discarded to be fetched again if needed.

Deleted member 2197 · Oct 28, 2020

Quick question: Do XBSX and PS5 use the same RT/acceleration approach?

chris1515 · Oct 28, 2020

pharma said:
Quick question: Do XBSX and PS5 use the same RT/acceleration approach?

Sony said they use standard RDNA 2 RT.

iroboto · Oct 28, 2020

This is going to be an interesting week.

https://twitter.com/x/status/1321501744623357952

Silent_Buddha · Oct 28, 2020

AzBat said:
https://twitter.com/x/status/1321555899811565568

Tommy McClain

If true, I wonder how Sony is doing their RT? Or if the only difference is how the RT hardware is accessed? Or if there's 1 or 2 features of AMD's hardware RT that are on XBS and not PS5?

Mesh Shaders was already suspected to be different from whatever Sony is doing. VRS and SF [edit: fixed, previously I erroneously had SFS there] were suspected only because Sony haven't said anything about it. NOTE - previously this didn't mean it couldn't have it, just that it wasn't mentioned. If this is true, then it's probably that it doesn't have them or that Sony aren't using AMD's hardware implementation for them.

Regards,
SB

AzBat · Oct 28, 2020

Silent_Buddha said:
If true, I wonder how Sony is doing their RT? Or if the only difference is how the RT hardware is accessed? Or if there's 1 or 2 features of AMD's hardware RT that are on XBS and not PS5?

Mesh Shaders was already suspected to be different from whatever Sony is doing. VRS and SFS were suspected only because Sony haven't said anything about it.

Regards,
SB

Not using DirectX?

Tommy McClain

turkey · Oct 28, 2020

AzBat said:
https://twitter.com/x/status/1321555899811565568

Tommy McClain

How would Microsoft know what's in or not in Sony's APU?

Or how much is this actually an advantage rather than semantics covering an equivalent custom solution or even a deficiency.

Eg mesh shader Vs geometry engine which is rumoured to be in RDNA3. (Not looked it up in detail, just using it as an example)

PR is going to PR....?

Edit: talking of semantics, the VRS is custom and not RDNA2 or that is at least what Microsoft has stated so far....

Silent_Buddha · Oct 28, 2020

AzBat said:
Not using DirectX?

Tommy McClain

Not using DirectX doesn't preclude using the hardware in RDNA2 that DX uses for RT. What the tweet is implying is that XBS consoles are the only ones that fully implement all of the RT hardware that DX gives developers access to. IE - either Sony are doing something different or there's some bit of RDNA2 RT hardware that doesn't exist on the PS5.

This also doesn't mean that Sony don't have extra hardware added to support their implementation of RT on the PS5 SOC.

Regards,
SB

eastmen · Oct 28, 2020

turkey said:
How would Microsoft know what's in or not in Sony's APU?

Or how much is this actually an advantage rather than semantics covering an equivalent custom solution or even a deficiency.

Eg mesh shader Vs geometry engine which is rumoured to be in RDNA3. (Not looked it up in detail, just using it as an example)

Could it be that in both Navi 2 and 3 AMD has used Microsoft-patented tech to make VRS or sampler feedback or whatever work?

So either Sony doesn't have it, or they had to figure out another non-patented way of doing it?

snc · Oct 28, 2020

Silent_Buddha said:
Not using DirectX doesn't preclude using the hardware in RDNA2 that DX uses for RT. What the tweet is implying is that XBS consoles are the only ones that fully implement all of the RT hardware that DX gives developers access to. IE - either Sony are doing something different or there's some bit of RDNA2 RT hardware that doesn't exist on the PS5.

This also doesn't mean that Sony don't have extra hardware added to support their implementation of RT on the PS5 SOC.

Regards,
SB

Not necessarily. The tweet only implies that Xbox alone fully supports RDNA 2, and then just describes RDNA 2 features (Infinity Cache is missing, though). So it implies that the PS5 GPU lacks some of these features, but not necessarily all of them.

pjbliverpool · Oct 28, 2020

AzBat said:
https://twitter.com/x/status/1321555899811565568

Tommy McClain

So if MS are specifically calling out all those features could that mean the PS5 lacks them all?? Note they specified DirectX RT, not RT in general, allowing room for Sony to have developed their own solution.

iroboto · Oct 28, 2020

both of the consoles would benefit greatly from infinity cache, especially PS5 given how high clocked it is. Any trip to VRAM would stall the pipeline and toss away its potential throughput (cycles) while waiting for data to come in.
So there's definitely a reason to have this in there.
But
for a variety of reasons it also should not be in there, and at least to me, it outweighs the pros.
a) silicon budget/die costs
b) backwards compatibility is going to be an issue
c) shrinking of the die is still tougher.

we've also seen this sort of cache augmentation in the past, and the combined bandwidth 76 GB/s + 192 GB/s way surpassed what was available on PS4 176 GB/s, and it still got its ass whooped. 32mb of esram at 1/4 resolution. 128mb of infinity cache at 4k.
shrug.

so I just don't see the consoles going this way. It makes sense for both of them to steer clear of IC.
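To put the bandwidth comparison above in numbers, here is a quick sketch using only the figures quoted in this post. The figures are taken as stated and not re-verified, and the variable names are just for illustration.

```python
# Bandwidth figures as quoted in the post above (GB/s); taken at face value.
main_memory_bw = 76   # main memory pool on the ESRAM-equipped console
esram_bw = 192        # 32 MB ESRAM
ps4_bw = 176          # PS4's single unified pool

# Naive "combined" figure: simply adds the two pools together.
combined_bw = main_memory_bw + esram_bw
ratio = combined_bw / ps4_bw

print(f"combined: {combined_bw} GB/s, {ratio:.2f}x the PS4 figure")
```

Adding the two pools is the most generous reading, since the small pool only helps for whatever data actually fits in it.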

eastmen · Oct 28, 2020

portable xbox series s with x amount of infinity cache to decrease ram costs and power consumption ? maybe based on 5nm ?

mpg1 · Oct 28, 2020

Interesting quote from this Anandtech article about RDNA2 RT:

https://www.anandtech.com/show/1620...-starts-at-the-highend-coming-november-18th/2

"Ray tracing itself does require additional functional hardware blocks, and AMD has confirmed for the first time that RDNA2 includes this hardware. Using what they are terming a ray accelerator, there is an accelerator in each CU. The ray accelerator in turn will be leaning on the Infinity Cache in order to improve its performance, by allowing the cache to help hold and manage the large amount of data that ray tracing requires, exploiting the cache’s high bandwidth while reducing the amount of data that goes to VRAM.

AMD is not offering any performance estimates at this time, or discussing in depth how these ray accelerators work. So that will be something else to look forward to once AMD offers deep dives on the technology."

So basically the RT hardware seems highly dependent on the Infinity Cache in terms of performance. What does this mean for consoles?...

Kugai Calo · Oct 28, 2020

turkey said:
How would Microsoft know what's in or not in Sony's APU?

Edit: talking of semantics, the VRS is custom and not RDNA2 or that is at least what Microsoft has stated so far....

Mentioned features are patented?

BRiT · Oct 28, 2020

mpg1 said:
So basically the RT hardware seems highly dependent on the Infinity Cache in terms of performance. What does this mean for consoles?...

It can't be, considering BVH is 1 GB to 1.5 GB and IC is only 128 Meg.
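A rough sketch of that size mismatch, assuming the 1 GB to 1.5 GB BVH range and the 128 MB cache size mentioned in this thread. `cache_fraction` is a made-up helper name.

```python
def cache_fraction(cache_mb, bvh_gb):
    """Fraction of a BVH of bvh_gb gigabytes that fits in cache_mb megabytes."""
    return cache_mb / (bvh_gb * 1024)

# Sizes from the thread: 128 MB of Infinity Cache, BVH of 1 GB to 1.5 GB.
for bvh_gb in (1.0, 1.5):
    print(f"BVH {bvh_gb} GB: {cache_fraction(128, bvh_gb):.1%} fits in cache")
```

This only says the whole structure cannot be resident at once. It says nothing about how often the cached part gets reused.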

iroboto · Oct 28, 2020

mpg1 said:
So basically the RT hardware seems highly dependent on the Infinity Cache in terms of performance. What does this mean for consoles?...

I’m not sure if this is a statement from AMD or something Ryan is pondering. Best to ask him directly here @Ryan Smith

I can’t see much use for a cache with incoherent rays. And given the size of the bvh structures (based upon our knowledge of Turing) we are looking at 1GB to 1.5GB vram reservation IIRC. @Dictator will likely have a more accurate average for games here.

I would welcome statements from both Ryan and Alex here on what they think IC will mean for RT performance.

Deleted member 13524 · Oct 28, 2020

iroboto said:
Game clocks are AMDs expected values.

If only there were reviewers who measured average clocks on Navi 10 in games and found out those are always above AMD's stated "game clocks"..

iroboto said:
256-bit bus is used for cards all the way up to 2080 - 10TFs.
For 36 CUs it's not anemic.

I very specifically mentioned Big Navi on 256bit, which is neither related to the 2080 nor has 36CUs in any of its SKUs.

iroboto said:
What do you want the quotation on?

On your claims over what I'm thinking.
I never mentioned 128MB. Why would a 36CU GPU have the same cache amount as a 80CU GPU? 128MB and its die area on the PS5 is something you fabricated by yourself, please refrain from putting words in my mouth.

iroboto said:
even 64mb is still far too large it's going to take up 50mm^

Quotation needed.

pTmdfx · Oct 28, 2020

iroboto said:
I can’t see much use for a cache with incoherent rays. And given the size of the bvh structures (based upon our knowledge of Turing) we are looking at 1GB to 1.5GB vram reservation IIRC.

Presumably you could at least have the first few levels of the BVH cached? Traversals all have to start there, so reuse rate should be alright there.

That’s assuming the BVH packs in such a way that levels are contiguously laid out in memory.
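A minimal sketch of how many top levels might fit, under assumptions that are not from AMD: a complete tree laid out level by level, a branching factor of 4, and 64-byte nodes. All three are hypothetical placeholders, since the real node format has not been disclosed.

```python
def levels_that_fit(cache_bytes, branching=4, node_bytes=64):
    """Count how many complete top levels of a BVH fit in the cache.

    Assumes a complete tree stored breadth-first, so level k holds
    branching**k nodes. branching and node_bytes are guesses.
    """
    levels = 0
    used = 0
    nodes_in_level = 1  # the root level has a single node
    while used + nodes_in_level * node_bytes <= cache_bytes:
        used += nodes_in_level * node_bytes
        levels += 1
        nodes_in_level *= branching
    return levels

print(levels_that_fit(128 * 1024 * 1024))  # 128 MB cache
```

With these made-up parameters, 11 full levels fit in 128 MB. Change the node size or branching factor and the count shifts, but the top of the tree stays tiny compared with the whole structure.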

Kugai Calo · Oct 28, 2020

ToTTenTranz said:
On your claims over what I'm thinking.
I never mentioned 128MB. Why would a 36CU GPU have the same cache amount as a 80CU GPU? 128MB and its die area on the PS5 is something you fabricated by yourself, please refrain from putting words in my mouth.

Quotation needed.

The 32MB L3 on Zen 2 CPUs takes up roughly half of the 75mm^2 chiplet, so ~50mm^2 for 64MB really is a lower bound estimate here.
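The scaling behind that estimate can be sketched as below. It assumes area grows linearly with capacity and that the Zen 2 L3 density carries over unchanged, which is a simplification. The 75mm^2 and "roughly half" figures are the ones quoted above.

```python
# Figures quoted above: 32 MB of L3 takes roughly half of a ~75 mm^2 chiplet.
chiplet_mm2 = 75
l3_fraction = 0.5
l3_mb = 32

# Assumption: area scales linearly with capacity at the same SRAM density.
mm2_per_mb = chiplet_mm2 * l3_fraction / l3_mb

def cache_area(mb):
    """Estimated die area in mm^2 for a cache of mb megabytes."""
    return mb * mm2_per_mb

for size_mb in (64, 128):
    print(f"{size_mb} MB -> ~{cache_area(size_mb):.0f} mm^2")
```

Straight linear scaling gives about 75mm^2 for 64MB, so ~50mm^2 already allows for a denser layout than Zen 2's L3.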

Also if you can’t be bothered to do your own research, please at least be polite to other members.
