Current Generation Hardware Speculation with a Technical Spin [post GDC 2020] [XBSX, PS5]

Mark Cerny specifically mentioned an example of 36 CUs vs. 48 CUs. IMO maybe this is how they decided to go with 36 CUs.

But I still wonder how much running the other GPU parts at 2.23 GHz would help compensate for the fewer CUs.

True, but he did state not to take the numbers too seriously and that it was just a thought experiment.

I'm quite looking forward to seeing how the smaller, higher clocked GPU fares against the larger, lower clocked GPU. It's probably going to be the most interesting part of the generation from a technical perspective.

I just hope the fanboys bugger off out of here soon enough (seriously, why can't they retreat back to one of the forums that shall not be named?) so we can all get back to being decent people with rational debates.

But the SSD advantage is huge, and the CPU advantage is minuscule.

Sure, the SSD advantage is more significant than the CPU advantage, but my point was more that neither will bring either platform any real benefits when it comes to multiplatform games. The PS5 will load games twice as fast, the XSX will be less prone to stutters. Neither is a deal breaker or system seller for most people, not when casual Joe is comparing systems and deciding which one to buy.
 
Is it safe to say that RAM is the real bottleneck this gen? Not enough and too slow?

RAM bandwidth is certainly going to be one on the PS5, and to a lesser extent on the XSX.
22.2% more rasterization throughput on 20% less bandwidth.
That's roughly 52% more stress on bandwidth from the non-compute pipeline on the PS5 than on the XSX.
Forgoing 16 Gbps GDDR6 chips was a mistake. Thanks but no thanks, Sony Japan!
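To spell out where those figures come from: fixed-function rasterization rate scales with GPU clock, so the comparison is just the clock ratio divided by the bandwidth ratio. A quick sanity check (peak clocks and bandwidths are the publicly stated specs; everything else is simple arithmetic):

```python
# Back-of-envelope for the rasterization-vs-bandwidth ratio quoted above.
ps5_clk, xsx_clk = 2.23, 1.825    # GHz, peak GPU clocks
ps5_bw,  xsx_bw  = 448.0, 560.0   # GB/s, peak GDDR6 bandwidth (XSX fast pool)

raster_ratio = ps5_clk / xsx_clk        # ~1.222 -> ~22.2% more fixed-function rate
bw_ratio     = ps5_bw / xsx_bw          # 0.80   -> 20% less bandwidth
stress_ratio = raster_ratio / bw_ratio  # ~1.527 -> ~53% more demand per GB/s

print(f"{(raster_ratio - 1) * 100:.1f}% more rasterization rate")      # 22.2%
print(f"{(1 - bw_ratio) * 100:.1f}% less bandwidth")                   # 20.0%
print(f"{(stress_ratio - 1) * 100:.1f}% more stress per unit of bw")   # 52.7%
```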
 
RAM bandwidth is certainly going to be one on the PS5, and to a lesser extent on the XSX.
22.2% more rasterization throughput on 20% less bandwidth.
That's roughly 52% more stress on bandwidth from the non-compute pipeline on the PS5 than on the XSX.
Forgoing 16 Gbps GDDR6 chips was a mistake. Thanks but no thanks, Sony Japan!

Can you explain this a bit further? Only if you have the time, that is.
 
RAM bandwidth is certainly going to be one on the PS5, and to a lesser extent on the XSX.
22.2% more rasterization throughput on 20% less bandwidth.
That's roughly 52% more stress on bandwidth from the non-compute pipeline on the PS5 than on the XSX.
Forgoing 16 Gbps GDDR6 chips was a mistake. Thanks but no thanks, Sony Japan!

I think you might be confusing the rasterizer with blend/depth/stencil units! ;)

A rasterizer's only job is to determine the pixel coverage of triangles, and that operation consumes no memory bandwidth since it is pipelined on-chip. A blend unit's job is, as its name implies, to blend those fragments into the final colour by reading from and writing to the render targets, and it is one of the specific stages of the graphics pipeline that does consume bandwidth.
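As a rough illustration of how much bandwidth blending alone can consume (the resolution, overdraw, and frame rate below are made-up example numbers, not anything either console has stated):

```python
# Illustrative only: bandwidth eaten by colour blending, which the rasterizer itself never touches.
width, height   = 3840, 2160   # assumed 4K render target
bytes_per_pixel = 4            # RGBA8
overdraw        = 4            # assumed layers of blended transparency per pixel
fps             = 60

# Alpha blending reads the destination colour and writes the blended result back.
bytes_per_blend = 2 * bytes_per_pixel
per_second = width * height * bytes_per_blend * overdraw * fps
print(f"{per_second / 1e9:.1f} GB/s just for colour blending")   # ~15.9 GB/s
```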
 
RAM bandwidth is certainly going to be one on the PS5, and to a lesser extent on the XSX.
22.2% more rasterization throughput on 20% less bandwidth.
That's roughly 52% more stress on bandwidth from the non-compute pipeline on the PS5 than on the XSX.
Forgoing 16 Gbps GDDR6 chips was a mistake. Thanks but no thanks, Sony Japan!

I don't think the peak bandwidths tell the whole story for either of the consoles.

I expect the CPU to take bandwidth away from the GPU on both the PS5 and XSX. I expect the effect to be larger on the XSX due to the CPU needing a longer time slice to transfer the same amount of data. What this does to the effective bandwidth is something we will have to wait and see.
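One toy way to picture that time-slice effect (nothing here is a measured figure; the CPU demand and the rate at which CPU requests are serviced are pure assumptions for illustration):

```python
# Toy contention model: while the memory controller is serving the CPU, the GPU isn't
# being served. A slower CPU path means a longer time slice for the same amount of data.
def effective_gpu_bw(gpu_peak_bw, cpu_demand, cpu_service_rate):
    """gpu_peak_bw: GB/s the GPU gets when it owns the bus;
    cpu_demand: GB/s of traffic the CPU needs;
    cpu_service_rate: GB/s at which CPU requests are serviced (assumed)."""
    cpu_time_share = cpu_demand / cpu_service_rate   # fraction of time spent on the CPU
    return gpu_peak_bw * (1.0 - cpu_time_share)

# Assumed 20 GB/s of CPU traffic on both; PS5 serves it at full rate,
# XSX is assumed here to serve CPU traffic from the 336 GB/s pool.
print(effective_gpu_bw(448.0, 20.0, 448.0))   # ~428 GB/s left for the GPU
print(effective_gpu_bw(560.0, 20.0, 336.0))   # ~527 GB/s left for the GPU
```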
 
I think some are forgetting that regardless of how fast your SSD is, things ultimately have to go through system RAM, so the amount of system RAM and its bandwidth are still limiting factors. Things can get swapped in and out faster, and that's important and will be beneficial on both systems, but it's not going to be some kind of miracle game changer graphically.

So I've been thinking about this a bit, but it doesn't HAVE to be this way.
It sounds like, at least on the MS side, they fully intend to make a portion of the SSD directly addressable by the CPU/GPU.
This would in fact give MS three different types of memory allocation: fast RAM, slow RAM, and SSD.
They have already indicated they are going to provide a smart memory allocator to make sure the right parts of the program get memory allocated from the right block re: the fast/slow RAM pools,
and adding a third tier to the memory allocation is a small change from there.
So in best-case theory you could access the SSD memory directly without it going through RAM first. Whether anyone actually does that is another question.
But I'll always take the system with higher GPU/system memory bandwidth.
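Purely as a sketch of what that three-tier allocation could look like from a developer's point of view (none of these names or rules come from Microsoft's actual SDK; they're assumptions for illustration):

```python
# Hypothetical three-tier allocator sketch: steer each allocation to fast RAM,
# slow RAM, or directly addressable SSD space based on a usage hint.
from enum import Enum, auto

class MemTier(Enum):
    FAST_RAM = auto()   # GPU-optimal pool (560 GB/s on XSX)
    SLOW_RAM = auto()   # standard pool (336 GB/s on XSX)
    SSD      = auto()   # directly addressable flash, orders of magnitude slower

def allocate_tier(usage_hint):
    """Pick a tier from a usage hint; a real allocator would also track pool pressure."""
    if usage_hint in ("render_target", "hot_geometry", "texture_working_set"):
        return MemTier.FAST_RAM
    if usage_hint in ("cpu_gameplay", "audio", "os"):
        return MemTier.SLOW_RAM
    return MemTier.SSD   # cold streaming data stays on flash until it's actually touched

print(allocate_tier("render_target"))    # MemTier.FAST_RAM
print(allocate_tier("world_streaming"))  # MemTier.SSD
```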


Honestly I think the Sony solution is great, but going from ~40 MB/s to 2.4 GB/s and then to 5.5 GB/s,
what you get from that last extra ~3 GB/s isn't going to be much.
Especially when your RAM is running at close to 500 GB/s.
 
So I've been thinking about this a bit, but it doesn't HAVE to be this way.
It sounds like, at least on the MS side, they fully intend to make a portion of the SSD directly addressable by the CPU/GPU.
This would in fact give MS three different types of memory allocation: fast RAM, slow RAM, and SSD.
They have already indicated they are going to provide a smart memory allocator to make sure the right parts of the program get memory allocated from the right block re: the fast/slow RAM pools,
and adding a third tier to the memory allocation is a small change from there.
So in best-case theory you could access the SSD memory directly without it going through RAM first. Whether anyone actually does that is another question.
But I'll always take the system with higher GPU/system memory bandwidth.


Honestly I think the Sony solution is great, but going from ~40 MB/s to 2.4 GB/s and then to 5.5 GB/s,
what you get from that last extra ~3 GB/s isn't going to be much.
Especially when your RAM is running at close to 500 GB/s.

From what I made out of the Road to PS5 presentation, it was pretty clear that devs can treat the SSD as its own pool of RAM. Unless I am mistaken.
 
From what I made out of the Road to PS5 presentation, it was pretty clear that devs can treat the SSD as its own pool of RAM. Unless I am mistaken.

It's a smart decision for both of them. It's effectively just a PCIe device with DMA now, so it should be pretty easy to set up/configure.
 
From what I made out of the Road to PS5 presentation, it was pretty clear that devs can treat the SSD as its own pool of RAM. Unless I am mistaken.

So it's a VERY VERY large pool of VERY VERY slow RAM!
That's sorta how I understood it too, also why I said I'd prefer the faster system RAM instead.
448 + 5.5 < 560 + 2.4.

Overall I can't wait to find out what they do with these systems.
Even if you fully ignore the ability of the GPU to access the SSD, what this does for world and systems design is amazing.
A CPU with a memory space of 50+ GB that can be accessed at ~3 GB/s: the possibilities are endless!
 
Geometry engines have been in GCN cards for years. Mesh shaders are an RDNA2-only thing, as they seem to be a step above the primitive shaders in RDNA1. @3dilettante help!
GPU shader engines have had geometry engines, or geometry processors, depending on what AMD has decided to call them. If there's a distinction to be made between the two terms, I haven't seen that communicated.

RDNA has the one block called a geometry processor, although there are elements at the shader array level that would seem to cover parts of what fell under that label in prior generations.
I haven't seen further explanation about what has changed there.

Sony's presentation has something called the Geometry Engine (all caps in the presentation, but it seems to fit Sony's liking for labeling something an Engine of some kind), where there seems to be emphasis that there's a specific collection of programmable features and associated IP that can handle geometry and primitive setup. It's discussed as something separate from the conventional setup process.
This has some implications that are similar to mesh shaders as put forward by DX12 or Nvidia, but only some high-level possibilities were mentioned in the presentation. Primitive shaders are apparently a way to provide synthesis or procedural generation of geometry, but only general outcomes rather than details were given, so it's hard to know how they slot in with regard to the stages associated with mesh shading in other formulations.
The PS5 seems to list primitive shaders' most basic formulation to be along the lines of the culling triangle-sieve formulation from the PS4 and GCN, though it seems Sony is committing itself to offering primitive shaders in a form that developers can interact with and program.

Where on the Venn diagram the functionality overlaps with our understanding of prior forms of what AMD has discussed (compiler-created primitive shaders, future versions of primitive shading, surface shaders) and other formulations of the geometry front end (DX12 amplification and mesh shaders, the Xbox Series X's potentially custom elements for it, and/or unique details of Nvidia's task and mesh shaders) would need more disclosures.
 
Mesh shaders are just primitive shaders in AMD terminology. It's Microsoft who's at fault for not using the vendor's official terminology and instead borrowing another hardware vendor's terminology to describe their implementation. AMD's inner circle has never made any mention of mesh shaders in their documentation or in their open source drivers.

"Mesh shaders" is an abstract concept that Microsoft/Nvidia made up that just coincidentally happens to somewhat map to AMD hardware.

"Task shaders" is not a thing that maps to AMD hardware hence why that shader stage needs to be emulated with an indirect dispatch and a compute shader since their hardware doesn't automatically have a shader stage where they can just outright launch or spawn more mesh shaders like a task shader would.

Another big thing about mesh shaders on AMD hardware is that they have an option to be "less explicit". Their mesh shader implementation can be potentially compiler generated! Their open source driver code even shows an option to automatically apply sphere culling, which is one of the use cases for mesh shaders. A graphics programmer could already be using mesh shaders without having specifically coded for it, and they wouldn't even know unless they checked the disassembled shader code.

By having mesh shaders be potentially compiler 'triggered', AMD is attempting to offer developers more ways of hitting the driver's fast paths without changing the programming model. That means a graphics programmer can keep using the regular vertex/geometry/tessellation shaders from the traditional graphics pipeline and still get the performance benefits of mesh shaders without ever touching the new shader stages.
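To make the sphere-culling use case concrete, here's a rough CPU-side sketch of the per-meshlet bounding-sphere test that such a compiler-generated path would perform on the GPU (the function name, plane convention, and meshlet assumptions are purely illustrative, not AMD's actual driver code):

```python
# Per-cluster sphere culling: reject a whole meshlet if its bounding sphere
# lies entirely outside any frustum plane.
def sphere_outside_frustum(center, radius, planes):
    """planes: (nx, ny, nz, d) tuples with normals pointing into the frustum.
    Returns True if the bounding sphere is provably outside the frustum."""
    cx, cy, cz = center
    for nx, ny, nz, d in planes:
        if nx * cx + ny * cy + nz * cz + d < -radius:
            return True          # fully behind this plane: cull the meshlet
    return False                 # potentially visible: emit its triangles

# A meshlet sitting well behind a z = 0 plane that faces +z gets culled.
near_plane = (0.0, 0.0, 1.0, 0.0)
print(sphere_outside_frustum((0.0, 0.0, -5.0), 1.0, [near_plane]))   # True
```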
 
So it's a VERY VERY large pool of VERY VERY slow RAM!
That's sorta how I understood it too, also why I said I'd prefer the faster system RAM instead.
448 + 5.5 < 560 + 2.4.

Overall I can't wait to find out what they do with these systems.
Even if you fully ignore the ability of the GPU to access the SSD, what this does for world and systems design is amazing.
A CPU with a memory space of 50+ GB that can be accessed at ~3 GB/s: the possibilities are endless!

Is this treating the SSD like RAM in a way distinct from memory-mapping files, or from the disk-based half of the PS4's flexible memory allocation?
As far as treating the SSD like RAM in terms of bandwidth or latency, it's still orders of magnitude worse.
If it's treating the SSD's contents like other locations in the virtual address space, there are established ways of doing that even with things other than SSDs.
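For reference, memory-mapping a file is one of those established ways of putting SSD contents into the virtual address space; a minimal Python sketch (the file name is hypothetical):

```python
# Map a file into the address space so reads fault pages in from the SSD on demand.
import mmap

with open("world_chunk.bin", "rb") as f:   # hypothetical asset file
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as view:
        header = view[:16]                 # touching this range pages it in from flash
        print(header.hex())
# The catch: each page fault still pays SSD latency, not DRAM latency.
```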

Mesh shaders are just primitive shaders in AMD terminology. It's Microsoft who's at fault for not using the vendor's official terminology and instead borrowing another hardware vendor's terminology to describe their implementation. AMD's inner circle has never made any mention of mesh shaders in their documentation or in their open source drivers.

"Mesh shaders" is an abstract concept that Microsoft/Nvidia made up that just coincidentally happens to somewhat map to AMD hardware.
Parts of them map to AMD's primitive shaders, or at least to the formulation we know of. The primitive shaders we know about lack developer access and generalized input, and do nothing beyond culling. In that regard, both the Microsoft and Nvidia formulations devote only a small part of their descriptions to culling; Nvidia's initial announcements on mesh shading had something like one sentence that encapsulates everything AMD's method does.

Other differences are that primitive shaders as we know them slot into the existing geometry pipeline, which DX12 and Turing do not do.
Primitive shaders work with existing triangle topologies (triangles, strips, fans); DX12 and Turing allow programmer-defined topologies.
AMD's method takes the existing entry point of the geometry pipeline as its input, which the others don't since they exclude the existing pipeline.
Primitive shaders depend on a surface shader to allow the tessellation stage to feed into the primitive shader. The others keep it separate or in DX12's case aim to completely discard that stage.

At least in the public disclosures, what AMD has said about anything besides culling amounts to maybe a sentence in the Vega white paper.
There are indications in driver code that whatever interfaces are in play for Navi aren't the same as Vega's--with changes explicitly deleting the hooks that were put in for Vega. At least for now, the outward descriptions for what's left in Navi are substantially lower in throughput than what was claimed for Vega, so they under-promise but at least deliver in Navi's case.

"Task shaders" is not a thing that maps to AMD hardware hence why that shader stage needs to be emulated with an indirect dispatch and a compute shader since their hardware doesn't automatically have a shader stage where they can just outright launch or spawn more mesh shaders like a task shader would.
Task shaders, or amplification shaders if going by Microsoft's formulation--which may not map fully to Nvidia's task shaders.

Another big thing about mesh shaders on AMD hardware is that they have an option to be "less explicit". Their mesh shader implementation can be potentially compiler generated!
The culling shaders can be auto-generated, as AMD does for RDNA and gave up on for Vega; this parallels the triangle-sieve shader customization created for the PS4.
Cerny's presentation gives this as the most basic use case for the PS5's Geometry Engine.

What the PS5, Microsoft, and Nvidia have documented or have committed to is significantly beyond what AMD has described for primitive shaders. Perhaps AMD will call whatever it uses to match the DX12 or PS5 functions primitive shaders, but it would be a very different set of functionality than what currently has that name.
 