Current Generation Hardware Speculation with a Technical Spin [post GDC 2020] [XBSX, PS5]

Asynchronous compute and compute-driven pipelines were a new thing this gen. Someone probably dipped their toes into those topics earlier, but this current gen made those approaches mainstream. Next gen takes this even further thanks to mesh/primitive shaders.
 
One has to separate GPGPU/compute and asynchronous compute. Asynchronous compute was something fairly new this gen, and Nvidia's Kepler and Pascal architectures in particular struggled with it; Turing is a lot better. Asynchronous compute became a thing because the current-gen console hardware supported it very well and created a nice baseline for developers to use and optimize for. If you go back to the early days of this gen you will find plenty of developers clamoring for compute-driven pipelines and for letting the GPU create its own workloads instead of having the CPU drive everything.
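Just to illustrate the idea of independent workloads being queued and overlapped instead of the CPU serializing everything: here's a minimal CUDA streams sketch. It's only loosely analogous to async compute queues on consoles (which sit behind D3D12/GNM rather than CUDA), and the kernel names and sizes are made up purely for illustration.

```
// Rough sketch: two independent workloads on separate streams, loosely
// analogous to graphics + async compute queues. Names are illustrative.
#include <cuda_runtime.h>

__global__ void shadeLikeWork(float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = out[i] * 0.5f + 1.0f;   // stand-in for "graphics" work
}

__global__ void simulateLikeWork(float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = out[i] * out[i];        // stand-in for "compute" work
}

int main() {
    const int n = 1 << 20;
    float *a, *b;
    cudaMalloc(&a, n * sizeof(float));
    cudaMalloc(&b, n * sizeof(float));
    cudaMemset(a, 0, n * sizeof(float));
    cudaMemset(b, 0, n * sizeof(float));

    cudaStream_t gfx, comp;
    cudaStreamCreate(&gfx);
    cudaStreamCreate(&comp);

    // Both launches are asynchronous; the hardware scheduler is free to
    // overlap them and fill otherwise idle ALUs, which is the whole point
    // of async compute.
    shadeLikeWork<<<n / 256, 256, 0, gfx>>>(a, n);
    simulateLikeWork<<<n / 256, 256, 0, comp>>>(b, n);

    cudaDeviceSynchronize();
    cudaStreamDestroy(gfx);
    cudaStreamDestroy(comp);
    cudaFree(a);
    cudaFree(b);
    return 0;
}
```

The point is simply that both pieces of work are submitted without the CPU waiting in between, and the GPU decides how to interleave them.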
 
The UE5 demo has no ray tracing.

Not exactly true!

Quoting from Eurogamer:

"Lumen uses ray tracing to solve indirect lighting, but not triangle ray tracing," explains Daniel Wright, technical director of graphics at Epic. "Lumen traces rays against a scene representation consisting of signed distance fields, voxels and height fields. As a result, it requires no special ray tracing hardware."

To achieve fully dynamic real-time GI, Lumen has a specific hierarchy. "Lumen uses a combination of different techniques to efficiently trace rays," continues Wright. "Screen-space traces handle tiny details, mesh signed distance field traces handle medium-scale light transfer and voxel traces handle large scale light transfer."

Lumen uses a combination of techniques then: to cover bounce lighting from larger objects and surfaces, it does not trace triangles, but uses voxels instead, which are boxy representations of the scene's geometry. For medium-sized objects Lumen then traces against signed distance fields which are best described as another slightly simplified version of the scene geometry. And finally, the smallest details in the scene are traced in screen-space, much like the screen-space global illumination we saw demoed in Gears of War 5 on Xbox Series X. By utilising varying levels of detail for object size and utilising screen-space information for the most complex smaller detail, Lumen saves on GPU time when compared to hardware triangle ray tracing.
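For anyone curious what "tracing against a signed distance field" means in practice, the standard trick is sphere tracing: step along the ray by the distance the field reports, because by definition no surface can be closer than that distance. This is just a generic sketch with a placeholder single-sphere SDF, not Epic's Lumen code.

```
// Sphere tracing against a signed distance field (SDF) - generic sketch.
// sceneSDF() here is a single sphere; a real engine would evaluate a much
// richer distance field built from the scene's meshes.
#include <cstdio>
#include <cmath>

struct Vec3 { float x, y, z; };

static Vec3 add(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
static Vec3 scale(Vec3 a, float s) { return {a.x * s, a.y * s, a.z * s}; }
static float length(Vec3 a) { return std::sqrt(a.x * a.x + a.y * a.y + a.z * a.z); }

// Placeholder SDF: a unit sphere centred at the origin.
static float sceneSDF(Vec3 p) { return length(p) - 1.0f; }

// March along the ray; the SDF value is a safe step size because no surface
// can be closer than that distance.
static bool sphereTrace(Vec3 origin, Vec3 dir, float maxDist, float* hitT) {
    float t = 0.0f;
    for (int i = 0; i < 128 && t < maxDist; ++i) {
        float d = sceneSDF(add(origin, scale(dir, t)));
        if (d < 1e-3f) { *hitT = t; return true; }  // close enough: hit
        t += d;                                      // safe step forward
    }
    return false;                                    // ray escaped the scene
}

int main() {
    float t;
    Vec3 origin = {0.0f, 0.0f, -3.0f};
    Vec3 dir    = {0.0f, 0.0f,  1.0f};
    if (sphereTrace(origin, dir, 100.0f, &t))
        printf("hit at t = %.3f\n", t);   // expect ~2.0 for this setup
    return 0;
}
```

A real implementation would trace against the per-mesh distance fields Wright describes, but the marching loop looks much the same.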
 
I think he means the UE5 demo does not use AMD's real-time ray tracing hardware.
 
Computation was not performed prior to this gen?

We had a fixed-function pipeline in the past, where each shader type had its own type of unit, so vertex shaders ran on vertex units, pixel shaders on pixel units, and so on. With the 360 era we got unified shaders, meaning one kind of unit for all shader types. This really opened the door for general computation on the GPU, because in the fixed-function model each dedicated unit type was a potential bottleneck. Instead of data going from vertex units to pixel units, the unified shader array processes the vertex work and then pipes the results back into itself to run the pixel shading afterwards. The scheduler and command processor are responsible for organizing the work that flows through the unified shader units. There are pros and cons with this method, of course.

The compute path, i.e. compute shaders, goes from memory into the ALUs and back into memory. You specify a kernel to run and indicate how many threads and thread blocks you want that kernel to run over (with the data). The pro is how massively parallel the work can be; the con is that each grouping of ALUs (a CU in AMD's case) can only share data with its neighbouring CUs, so programs written for compute are structured with that in mind.
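As a concrete (if oversimplified) illustration of "specify a kernel and how many threads and thread blocks it runs over", here is a minimal CUDA sketch; console compute shaders use different APIs, but the dispatch model has the same shape. Everything here (names, sizes) is illustrative only.

```
// Sketch of the compute model described above: data comes from memory,
// runs through the ALUs in many threads, and goes back to memory.
#include <cuda_runtime.h>
#include <cstdio>

// The "kernel": a small program run by every thread on its own element.
__global__ void scaleKernel(const float* in, float* out, float k, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // global thread index
    if (i < n) out[i] = in[i] * k;
}

int main() {
    const int n = 1 << 16;
    float *in, *out;
    cudaMallocManaged(&in, n * sizeof(float));
    cudaMallocManaged(&out, n * sizeof(float));
    for (int i = 0; i < n; ++i) in[i] = float(i);

    // "How many threads and thread blocks": 256 threads per block,
    // enough blocks to cover all n elements.
    int threadsPerBlock = 256;
    int blocks = (n + threadsPerBlock - 1) / threadsPerBlock;
    scaleKernel<<<blocks, threadsPerBlock>>>(in, out, 2.0f, n);
    cudaDeviceSynchronize();

    printf("out[10] = %f\n", out[10]);  // expect 20.0
    cudaFree(in);
    cudaFree(out);
    return 0;
}
```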
 
Computation was not performed prior to this gen?
I took 'compute' to mean 'compute features on the GPU'. Obviously computation was performed or we'd have boxes rather than consoles. ;) If you mean all-round processing power, that's probably impossible to answer. Do we compare raw flops, or what's actually achieved? At which point, maybe the earliest consoles had the largest deltas? Pong to the Atari 2600, or the 2600 to the NES. Way off topic!

I'm pretty sure I heard about GPGPU operations on the X360, in and outside of videogames.
Well, GPGPU isn't really compute. Compute was introduced with shaders designed for generic workloads, as opposed to the graphics-only shaders that devs had to map other workloads onto. GPGPU can be run on any GPU but doesn't have the versatility of compute, so it sits in between as an evolutionary interim. No console prior to this gen considered running anything on the GPU other than graphics work, save maybe some freaky fringe case.
 
Indeed, for PlayStation we have the exact opposite given last gen Cell's SPUs had to help out the GPU.

Haha, I was following the convo about GPU computing just now and was just about to mention the reverse Uno card of the Cell processor... really crazy stuff. God of War 3 and Ascension, as well as Killzone 2 and 3 and stuff like Beyond: Two Souls, are still great looking.

In particular, the kinds of things God of War was pulling off clearly put those games ahead of what last generation was capable of in normal scenarios. They're still two of the best-looking games of all time, ten years from release. Without those SPUs that would not have been possible with the weak PS3 GPU and general CPU computation alone. I don't think any of those PS exclusives could have been mimicked on the 360.

Of course, that's just because they were focusing on the strengths of the hardware. Last gen is very interesting in retrospect for me because the tech priorities of the two companies were way different, and each had a clear secret sauce.

Sony with the Cell processor and MS with the 10 MB of eDRAM. Games like Ninja Gaiden 2 and Bayonetta, which inherently relied on that eDRAM to allow tons of enemies on screen and tons of transparencies going off every second, definitely were not possible on PS3, especially given how they had to change Sigma 2 for the PS3 release, and the less said about Bayo's PS3 port the better. The same goes for Halo 3 and Reach... Bungie really maximized usage of the eDRAM with that HDR lighting and those huge open spaces.

I'm excited for this gen in a similar way... MS has a negligible CPU advantage, a notable GPU advantage and a weird RAM and bandwidth situation. Sony has a sizable SSD and I/O advantage they are touting, but is weaker in the other ways mentioned. How that will translate to games is anyone's guess, but I am here for it.
 
Indeed, for PlayStation we have the exact opposite given last gen Cell's SPUs had to help out the GPU.
Indeed. It was such an overpowered CPU, for that time period. Was there a GPU, during that time period (2006), that wouldn't have been GPU limited (when matched with the Cell)?
 
I'm excited for this gen in a similar way... MS has a negligible CPU advantage, a notable GPU advantage and a weird RAM and bandwidth situation. Sony has a sizable SSD and I/O advantage they are touting, but is weaker in the other ways mentioned. How that will translate to games is anyone's guess, but I am here for it.
Sony also has that crazy powerful SPU-like 3D audio Tempest Engine (which can also be used for some traditionally CPU-side processing).
 
Sony also has that crazy powerful SPU-like 3D audio Tempest Engine (which can also be used for some traditionally CPU-side processing).

Sony seems to have added tons of fixed-function hardware to their machine to take stress off the GPU, CPU and RAM... but I don't know how much that will contribute, so I didn't say anything.
 
Indeed. It was such an overpowered CPU, for that time period. Was there a GPU, during that time period (2006), that wouldn't have been GPU limited (when matched with the Cell)?
That depends on the workload. Cell required work to be done in parallel to extract performance out of it, and it wasn't exactly like modern multi-core CPUs: the work of the PPU and the SPEs was different, and each was limited in different ways. Also, PS3's memory configuration held it back when compared to the 360. With the 360's eDRAM, I'm sure there are plenty of games where the GPU is bandwidth bound on PS3 and performance bound on 360, regardless of the CPU it is paired with.
 
That depends on the workload. Cell required work to be done in parallel to extract performance out of it, and it wasn't exactly like modern multi-core CPUs: the work of the PPU and the SPEs was different, and each was limited in different ways. Also, PS3's memory configuration held it back when compared to the 360. With the 360's eDRAM, I'm sure there are plenty of games where the GPU is bandwidth bound on PS3 and performance bound on 360, regardless of the CPU it is paired with.
To get maximum performance, Cell required work to be done in parallel. On that point alone, parallel work is the only way to get maximum performance out of multiple cores.

You could be correct about the GPU potentially being bandwidth limited in certain cases. Going by a lot of the top PS3 GDC tech breakdowns, it didn't seem to be a limiting factor for those studios. It could have been for other studios. I wish we had more data on that.

To bring this back on topic, does anyone have any ballpark numbers for 3D audio convolution reverb, from the least expensive calculations to the most expensive? I have little to no knowledge in the area. These fixed-function hardware blocks seem like they could be quite handy tools.
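For a very rough back-of-envelope (ballpark, not measurements): direct time-domain convolution of one 48 kHz source against a one-second impulse response (48,000 taps) costs about 48,000 × 48,000 ≈ 2.3 billion multiply-accumulates per second per source, while FFT-based partitioned convolution brings that down to something on the order of tens of millions of operations per second per source, a couple of orders of magnitude cheaper. Either way, it adds up quickly across dozens of sources, which is presumably why offloading it to dedicated hardware is attractive.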
 
You could be correct about the GPU potentially being bandwidth limited in certain cases. Going by a lot of the top PS3 GDC tech breakdowns, it didn't seem to be a limiting factor for those studios. It could have been for other studios. I wish we had more data on that.
With regard to PS3 GPU bandwidth limitations being a common concern: many PS3 games used quarter-resolution alpha effects while the Xbox 360 versions did not.
 