Next Generation Hardware Speculation with a Technical Spin [post E3 2019, pre GDC 2020] [XBSX, PS5]

There are 4 TMUs per CU in RDNA, so I think a wider GPU would have a better sampling rate than a narrower, faster GPU. But I don't know how often texture sampling is really a bottleneck anyway.
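For a rough sense of the texel-rate arithmetic (a minimal sketch; the CU counts and clocks below are hypothetical, not confirmed specs), peak bilinear texel rate is roughly CUs × 4 TMUs × clock:

```cpp
// Rough peak bilinear texel rate: CUs * 4 TMUs per CU * clock in GHz = Gtexels/s.
// The CU counts and clocks are hypothetical, purely for illustration.
#include <cstdio>

int main() {
    const int    wideCUs   = 56;   // hypothetical wider, slower GPU
    const double wideClk   = 1.7;  // GHz
    const int    narrowCUs = 36;   // hypothetical narrower, faster GPU
    const double narrowClk = 2.0;  // GHz

    printf("wide:   %.1f Gtexels/s\n", wideCUs * 4 * wideClk);     // 380.8
    printf("narrow: %.1f Gtexels/s\n", narrowCUs * 4 * narrowClk); // 288.0
    return 0;
}
```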
Interesting added implications given the RT logic lives in the TMUs.
 
This has been the point that has made me hesitant about the GitHub leak: 80 fewer TMUs is 80 fewer RTUs, plus 4 TF less compute for anything else RT-related. Forget resolution or FPS; if I'm a dev I'm dropping RT on a weaker PS5 first and foremost.
 

RT should scale with resolution pretty linearly, I think, so I don’t know why that’d be necessary.
 
I thought the first thing they did in BF5 was to decouple rays from resolution, so I guess we should expect RT to be upscaled to the 4K target. How much wiggle room is there?

RT scaling versus resolution is highly dependent on the specific effect desired and on scene complexity. Ambient occlusion, rough reflections, and other "blurry" effects offer the possibility of scaling very well, assuming surfaces in view cover at least a relatively large number of contiguous pixels (they are near each other and there isn't, say, a bunch of single-pixel-wide grass blades far away from each other). This is because you can spatially denoise fairly well with low sample counts: blur the raytracing of pixels near each other in a smart way and keep raytracing samples low, and as resolution increases the number of nearby pixels onscreen tends to increase, so you can just do that more.

The good news is this is also true of framerate, as temporal supersampling works just as well or better for raytracing (forward projection assumes the previous frame's rays are still correct and re-uses them to shade). Basically it's like spatial denoising, just with time as another spatial dimension, so 60fps raytraced games should be possible. All of this relies on fairly low scene complexity (all the pixels have to be near each other), and combined with raytracing speed currently relying on relatively static scenes so the BVH doesn't have to be rebuilt much, I'd imagine not all titles will actually use polygonal raytracing much. E.g. a really dense forest with everything blowing in the wind is going to break a lot of the things RT relies on to be fast, while a lot of Call of Duty maps, with their small static environments, can support a lot of raytracing effects even at 60fps.
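A minimal sketch of the temporal accumulation idea described above, with illustrative names, a simple depth-based validity check, and an exponential blend assumed for clarity (not any engine's actual denoiser): reproject the previous frame's result, check it still looks valid, and blend it with this frame's low-sample ray result.

```cpp
// Illustrative temporal accumulation for a raytraced effect (not any engine's
// actual denoiser). Where reprojection is valid, last frame's rays are reused,
// so only a few new rays per pixel are needed each frame.
#include <cstdio>

// history      : last frame's accumulated value, fetched at the reprojected
//                position (current pixel minus its motion vector)
// newSample    : this frame's low-sample-count ray result
// depthDelta   : |reprojected depth - current depth|, a cheap validity test
// historyAlpha : how much history to keep when reprojection looks valid
float temporalAccumulate(float history, float newSample,
                         float depthDelta, float historyAlpha = 0.9f) {
    // If the surface was disoccluded or moved, the old rays are no longer
    // correct: fall back to the fresh (noisier) sample instead of ghosting.
    const float alpha = (depthDelta < 0.01f) ? historyAlpha : 0.0f;
    return history * alpha + newSample * (1.0f - alpha);
}

int main() {
    // Static pixel: converges toward the history despite a noisy new sample.
    printf("%.3f\n", temporalAccumulate(0.50f, 0.80f, 0.001f)); // 0.530
    // Disoccluded pixel: history rejected, new sample used as-is.
    printf("%.3f\n", temporalAccumulate(0.50f, 0.80f, 0.100f)); // 0.800
    return 0;
}
```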
 
I thought the first thing they did in BF5 was to decouple rays from resolution, so I guess we should expect RT to be upscaled to the 4K target. How much wiggle room is there?

You’ll have to remind me of the full quote, or of some technical details of bf5.
 
I believe RT is impacted more by the number of triangles in the scene, so greater scene complexity should take a harder hit on compute. If BVH intersection works the way I think it does, at least.
 

I'm sort of remembering how BF5 worked. They divided the screen into tiles, they had a total number of rays, and that total would be divided amongst the tiles based on the necessity of rays in each one. Something like that. Then they have quality settings that set a few parameters to determine the total number of rays allowed and a roughness factor for reflections. My assumption would be that, for a given scene and the same roughness factor, performance should scale roughly linearly with the total number of rays. So 50 rays should give roughly twice the performance of 100 rays if they are distributed in roughly the same way. That may be wrong, but logically it seems correct.
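As a sketch of that tile-budget idea as I understand it (the weights, tile counts, and proportional split are all assumptions for illustration, not DICE's actual heuristic): a fixed screen-wide ray budget is split across tiles in proportion to how much each tile needs rays, so halving the budget roughly halves the rays everywhere.

```cpp
// Hypothetical sketch: a fixed screen-wide ray budget split across tiles in
// proportion to a per-tile importance weight (e.g. how much reflective,
// low-roughness surface the tile contains). Not DICE's actual heuristic.
#include <cstdio>
#include <vector>

std::vector<int> distributeRays(const std::vector<float>& tileWeight,
                                int totalRays) {
    float sum = 0.0f;
    for (float w : tileWeight) sum += w;

    std::vector<int> raysPerTile(tileWeight.size(), 0);
    if (sum <= 0.0f) return raysPerTile;

    for (size_t i = 0; i < tileWeight.size(); ++i)
        raysPerTile[i] = static_cast<int>(totalRays * (tileWeight[i] / sum));
    return raysPerTile;
}

int main() {
    // Four tiles: one mirror-like, two moderately reflective, one fully rough.
    std::vector<float> weights = { 1.0f, 0.5f, 0.5f, 0.0f };
    // Halving the budget roughly halves the rays in every tile, which is why
    // cost should scale about linearly with the total ray count.
    const int budgets[] = { 100000, 50000 };
    for (int budget : budgets) {
        std::vector<int> rays = distributeRays(weights, budget);
        printf("budget %6d ->", budget);
        for (int r : rays) printf(" %d", r);
        printf("\n");
    }
    return 0;
}
```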
 
The biggest takeaway for me is that RT and VRS are not the only defining points for RDNA 2.0.

As writers have cited earlier, RDNA 1.0 was a hybrid of RDNA and GCN technology. That is what the architecture is.
RDNA 2.0 is a full departure from GCN.
What's the criterion for a full departure?
RDNA1 made the effort to move parts of its ISA back to console-era encodings, when the nearest GPU to depart from was Vega. AMD still seems concerned with some level of compatibility with GCN, and I haven't seen any claim that Navi 2x is changing this.
I'm not sure a full departure is an optimal solution in the face of backwards compatibility.
Sony would have the most examples of struggling with significant architectural discontinuities, such as the Cell to x86 transition, which I'd argue is at a minimum a large if not full departure.

Chrome - the new lens flare/bloom... :runaway:

New?

I just looked at that picture AMD used on the RT slide and oh my lord my eyes hurt. Mercury water, chrome balls, chrome walls - this is worse than lens flare.
Perhaps it is time...
https://forum.beyond3d.com/posts/711822/
 
What's the criterion for a full departure?
RDNA1 made the effort to move parts of its ISA back to console-era encodings, when the nearest GPU to depart from was Vega. AMD still seems concerned with some level of compatibility with GCN, and I haven't seen any claim that Navi 2x is changing this.
I'm not sure a full departure is an optimal solution in the face of backwards compatibility.
Sony would have the most examples of struggling with significant architectural discontinuities, such as the Cell to x86 transition, which I'd argue is at a minimum a large if not full departure.
I'm unsure; I think we'll need more details on this for sure.
These types of articles came to mind:
https://wccftech.com/amd-radeon-rx-5000-7nm-navi-gpu-rdna-and-gcn-hybrid-architecture/
I believe AdoredTV said something similar in his last video as well: the idea that the 5000 series was a GCN/Navi hybrid, but that RDNA 2 would be the first pure RDNA GPU.

Unfortunately I have no way to qualify that statement.
 
IIRC they have dubbed the ISA as GCN and that isn’t changing, despite the architecture being named RDNA.
 

The ISA is a significant part of what is considered the architecture. It's perhaps a more crowded field with GPUs, in particular AMD's GPUs, because AMD has again lumped virtually everything in the SoC in as being part of the architecture.

Perhaps they're talking about a situation similar to the transition from VLIW4 to GCN, where Southern Islands had hardware similarities to the prior architecture outside of the CU array and ISA.
Sea Islands would implement more of the front-end features that would be most recognizable today, and some additional changes to the ISA as well.

Even if this happens in the case of RDNA2, I would need to see how big a departure things would be. Even within the compute portion of the architecture, there are elements or themes that are reminiscent of GCN still. Some of the particulars may stem from inherited elements, such as the number of wavefronts and barriers per CU, or for example some similar rules for wait counts and the same general pipeline organization for the instruction types.
 
I believe RT is impacted more by the number of triangles in the scene, so greater scene complexity should take a harder hit on compute. If BVH intersection works the way I think it does, at least.
I think RT suffers less from scene complexity than rasterization, because of the log n term in its time complexity.
If you imagine a binary tree, you can double the triangle count at the cost of adding one more level to the tree, so each ray has only one step more of traversal, but twice the detail.
In contrast, with rasterization you have constant cost per triangle.
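A quick worked example of that trend (idealized: a perfectly balanced binary BVH with one triangle per leaf, which real builders only approximate): doubling the triangle count adds about one traversal level per ray, so depth grows like log2(n) while rasterization's per-triangle work grows like n.

```cpp
// Idealized comparison: per-ray BVH traversal depth (~log2 n for a balanced
// binary tree with one triangle per leaf) versus rasterization's per-triangle
// cost (~n). Real BVHs are not perfectly balanced, so this is just the trend.
#include <cmath>
#include <cstdio>

int main() {
    // Triangle count grows 32768x here, but traversal depth only goes ~10 -> ~25.
    for (long long tris = 1000; tris <= 64000000; tris *= 2) {
        const double levels = std::ceil(std::log2(static_cast<double>(tris)));
        printf("%9lld triangles -> ~%2.0f BVH levels per ray\n", tris, levels);
    }
    return 0;
}
```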

(This is also the reason why the LOD support for RT I'm constantly requesting may not be that important for now. Supporting it adds a lot of cost, and the base performance of current GPUs could be too low to justify it.)

EDIT: People using RTX often claim RT suffers less from adding detail, and while the above makes sense, I'm personally not sure about it, lacking experience. The assumption is that we have far fewer rays than triangles, so the cost of the additional tree levels per ray is smaller than the cost of triangle culling and setup.

FYI, this guy revealed that on the new systems you can feed the GPU a custom BVH, and it also has hardware-accelerated ray-AABB intersection tests.
Can you elaborate more? Sounds game changing!
The link only shows a GitHub page without any context for RT.
 

He's on Discord; you should hop on to see all the juicy dumps from real developers!

Currently, DXR only defines fixed-function ray-triangle intersection. To do intersection tests against fancier geometric representations such as spheres, voxels, signed distance fields, or really any type of procedural geometry, you would have to 'emulate' the ray-AABB intersection tests using intersection shaders. The patent describing the "ray intersection engine" mentions the capability to do those "ray-box intersection tests" along with the ray-triangle intersection tests. Highly optimal for doing voxel rendering.
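For reference, the ray-box test being described is essentially the classic slab test, sketched below in scalar form (this is the generic algorithm, not the patent's actual hardware design):

```cpp
// Classic slab-test ray/AABB intersection -- the generic algorithm behind a
// "ray-box intersection test", not the patent's hardware implementation.
#include <algorithm>
#include <cstdio>

struct Vec3 { float x, y, z; };

// Returns true if the ray (origin o, direction d) hits the box [bmin, bmax].
// invD is 1/d per component, precomputed so the test is only mul/min/max.
bool rayAabb(const Vec3& o, const Vec3& invD,
             const Vec3& bmin, const Vec3& bmax,
             float tMin, float tMax) {
    const float ox[3] = { o.x, o.y, o.z };
    const float id[3] = { invD.x, invD.y, invD.z };
    const float lo[3] = { bmin.x, bmin.y, bmin.z };
    const float hi[3] = { bmax.x, bmax.y, bmax.z };
    for (int a = 0; a < 3; ++a) {
        float t0 = (lo[a] - ox[a]) * id[a];
        float t1 = (hi[a] - ox[a]) * id[a];
        if (id[a] < 0.0f) std::swap(t0, t1);
        tMin = std::max(tMin, t0);
        tMax = std::min(tMax, t1);
        if (tMax < tMin) return false; // slabs no longer overlap: miss
    }
    return true;
}

int main() {
    // Ray from (0,0,-5) heading mostly +z, direction (0.1, 0.1, 1.0).
    const Vec3 o    = {  0.0f,  0.0f, -5.0f };
    const Vec3 invD = { 10.0f, 10.0f,  1.0f };
    const Vec3 bmin = { -1.0f, -1.0f, -1.0f };
    const Vec3 bmax = {  1.0f,  1.0f,  1.0f };
    printf("hit = %d\n", rayAabb(o, invD, bmin, bmax, 0.0f, 1e30f)); // hit = 1
    return 0;
}
```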
 
Well, forget about ERA, GAF, and pastebin. Discord is the new source of real leaks! :D

I wonder how much of this is compute utilizing the triangle-intersection hardware, versus some kind of HW-accelerated general traversal interface that is compatible with custom data structures.
If true, I'm excited. And I wonder which platforms will have this feature...
 
You’ll have to remind me of the full quote, or of some technical details of bf5.

This was from memory, so not the best reply for the technical forum; here is what I have found from what I remember reading / watching.

The first mention of this was in Alex's first video when BF5 was first shown with RTX; this was pre-release.


At 10:45: in the pre-release version, performance is tied to rays shot per pixel, and this is set at the native rendering resolution; the other part is the BVH, which is also generated at the native rendering resolution. Something DICE were looking at.
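Back-of-envelope on why tying rays to native resolution hurts (assuming a purely hypothetical 1 ray per pixel, just for scale):

```cpp
// Back-of-envelope ray counts when rays are tied to native resolution.
// 1 ray per pixel is a hypothetical setting, used only to show the scaling.
#include <cstdio>

int main() {
    const long long rays1080p = 1920LL * 1080;  // ~2.07M rays per frame
    const long long rays4k    = 3840LL * 2160;  // ~8.29M rays per frame
    printf("1080p: %lld rays, 4K: %lld rays (%.1fx)\n",
           rays1080p, rays4k, double(rays4k) / rays1080p);  // 4.0x
    return 0;
}
```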

Fast forward to the post-launch patches and checking with the lower-range cards:

https://www.eurogamer.net/articles/...attlefield-5-vs-rtx-2060-ray-tracing-analysis

So what do you lose by dropping DXR from ultra to medium? There are two compromises. First of all, the resolution of RT-reflected surfaces drops. Secondly, whether surfaces receive ray tracing or not is tied to their roughness

This seems to read as though they addressed the per-pixel nature and made it part of a quality setting.

@Dictator may be able to add more; I believe there was other DF content on this, covering how performance had improved and possibly details from DICE.
 