Digital Foundry Article Technical Discussion [2023]

Status
Not open for further replies.
Certainly possible. Though with the prior pipeline, Pascal was still much better than even RDNA 1 when it came to geometry.
It seems to fall in line with what we’ve read though there is some conflicting information.

If we first regard the information from Alex:
I think primitive shaders lack an amplification stage, right? Not the same thing?

I talked with a Remedy TD at Gamescom about their mesh shader usage; they are not using the amplification stage for AW2, but they have experimented with it for Nanite-like stuff in the future.
Then we include this bit here:

*Side note: Martin Fuller is Xbox ATG, if I recall correctly.

Combined with this nugget that @3dilettante summarizes for us from the Vega whitepaper:
The whitepaper gives the following statement:
"The “Vega” 10 GPU includes four geometry engines which would normally be limited to a maximum throughput of four primitives per clock, but this limit increases to more than 17 primitives per clock when primitive shaders are employed."
The key phrase is likely "when primitive shaders are employed", and in this context probably means the total number of primitives processed (submitted + culled).
The compute shaders used to cull triangles in Frostbite (per a GDC16 presentation) cull one triangle per thread in a wavefront, which executes 16 lanes per clock. There would be more than one clock per triangle, but this peak may assume certain shortcuts for known formats, trivial culling cases, and/or more than one primitive shader instantiation culling triangles.
Some of what I'm suggesting is probably inaccurate, but if we assume these smaller RDNA cards with 1 or 2 shader engines have significantly less triangle-discard hardware, then the geometry processor is picking up the slack, discarding (I'm going to assume) anywhere between 4-8x more triangles per clock.

Discard is typically not an issue, but with the triangle density in this game, perhaps it's enough to cause a bottleneck here that, without mesh shaders or a Nanite equivalent, there isn't a method to work around.
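The per-lane triangle culling described above can be sketched like this. This is a toy illustration, not Frostbite's actual shader: the wave width, the 2D signed-area test, and all names are assumptions standing in for the real back-face/degenerate/small-triangle tests.

```python
# Illustrative sketch of per-thread triangle culling: each "lane" of a
# 64-wide wavefront tests one triangle and emits a keep/cull decision.
# Real implementations also test small-triangle and off-screen cases;
# here a single 2D signed-area test rejects back-facing and zero-area tris.

WAVE_WIDTH = 64  # GCN/RDNA wave64 (assumption for this sketch)

def signed_area_2x(v0, v1, v2):
    """Twice the signed screen-space area of a triangle."""
    return (v1[0] - v0[0]) * (v2[1] - v0[1]) - (v2[0] - v0[0]) * (v1[1] - v0[1])

def cull_wavefront(triangles):
    """One triangle per lane: keep only front-facing, non-degenerate tris."""
    assert len(triangles) <= WAVE_WIDTH
    survivors = []
    for tri in triangles:  # conceptually parallel, one lane per triangle
        if signed_area_2x(*tri) > 0:  # back-facing / zero-area are discarded
            survivors.append(tri)
    return survivors

front = ((0, 0), (1, 0), (0, 1))   # CCW -> positive area -> kept
back  = ((0, 0), (0, 1), (1, 0))   # CW  -> negative area -> culled
thin  = ((0, 0), (1, 1), (2, 2))   # zero area -> culled
print(len(cull_wavefront([front, back, thin])))  # -> 1
```

The point of the per-lane layout is that a full wave retires up to 64 culling decisions per issue, which is where the "more than 17 primitives per clock" class of figures can come from once several waves run concurrently.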

So this would explain the performance on consoles and mesh-shader-compliant cards. On to the 5700 XT, which should only have driver-level support for primitive shaders (PrSh).
We know it exists, we have the Radeon profiler to showcase it.
Though the level of performance coming from Navi 10 is questionable per the above (if it's to be believed), the driver does a much better job at PrSh conversion on RDNA 2 than on RDNA 1.

We seem to be receiving conflicting information on whether AMD removed it or not, though I'm not particularly sure if stream out means what I think it means.
 
One of the main benefits of RTXDI is that it should scale well with large numbers of light sources including area lights like windows.
Sure but it's not magic - the tradeoff is dumping a lot more noise and reliance on temporal data (which is always a tradeoff with ghosting) into the inputs of the reconstruction pass(es). While those are definitely getting pretty sophisticated, there will always be a point where you just don't have enough samples (in motion, disocclusion, etc. depending on what space the samples are cached in).
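That sample-budget tradeoff can be illustrated with a toy Monte Carlo sketch (all numbers and names hypothetical): an exponential moving average over frames stabilizes even a 1-sample-per-frame estimate, but a freshly disoccluded pixel has no history and is stuck with raw single-sample noise.

```python
# Toy illustration of the sampling tradeoff: a per-frame estimate of a
# constant signal is noisy at low sample counts, and temporal accumulation
# (an exponential moving average over frames) buys back stability only
# while the history remains valid.
import random

random.seed(0)
TRUE_VALUE = 1.0

def noisy_estimate(samples):
    # each sample is the true value plus zero-mean noise
    return sum(TRUE_VALUE + random.uniform(-0.5, 0.5) for _ in range(samples)) / samples

def accumulate(frames, samples_per_frame, alpha=0.1):
    """Blend each frame's noisy estimate into a running history."""
    history = noisy_estimate(samples_per_frame)
    for _ in range(frames - 1):
        history = (1 - alpha) * history + alpha * noisy_estimate(samples_per_frame)
    return history

# With accumulation, even 1 sample/frame settles near the true value...
print(round(abs(accumulate(200, 1) - TRUE_VALUE), 3))  # small residual error
# ...but a freshly disoccluded pixel is just one raw sample:
print(round(abs(noisy_estimate(1) - TRUE_VALUE), 3))   # error can be up to 0.5
```

The ghosting tradeoff mentioned above lives in `alpha`: a smaller blend factor means less noise but a longer-lived (and more wrong-after-motion) history.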

If the majority of the lighting interactions are fully static and you have the art/iteration budget, some precomputed lighting still makes sense.

Which DF article?
 
We seem to be receiving conflicting information on whether AMD removed it or not, though I'm not particularly sure if stream out means what I think it means.
Stream out is basically the D3D equivalent of transform feedback. One of the major features of primitive shaders/NGG is that they're functionally compatible with the old geometry pipeline, which includes features like stream out. Primitive shaders/NGG have the capability to trivially emulate the stream out stage of the older geometry pipeline ...

The merge request disables primitive shaders/NGG on GFX10.X whenever stream out/transform feedback is used and re-activates the hardware's old geometry pipeline (LS/HS/ES/GS/VS) in that case since they find that combination more stable and the old geometry pipeline will always be available on any GFX10.X hardware implementations ...
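For anyone unsure what stream out actually guarantees, here's a rough illustration (the transform and all names are made up; a plain list stands in for the capture buffer): post-transform vertices land in the buffer in input primitive order, which is exactly the ordering constraint an NGG path has to honor when emulating it.

```python
# Sketch of stream out / transform feedback: post-transform vertices are
# captured, in input order, into a buffer for later readback or reuse.
# A primitive-shader emulation has each wave reserve buffer space and
# write its transformed vertices there, preserving this ordering.

def transform(vertex, scale=2.0):
    """Stand-in vertex shader: a uniform scale."""
    return tuple(scale * c for c in vertex)

def draw_with_stream_out(vertices, capture_buffer):
    for v in vertices:  # hardware preserves input primitive order
        out = transform(v)
        capture_buffer.append(out)  # the "stream out" write
    return capture_buffer

captured = draw_with_stream_out([(1, 0), (0, 1)], [])
print(captured)  # [(2.0, 0.0), (0.0, 2.0)] -- same order as the input
```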
Whatever methods were used in the available synthetic benchmarks: tessellation, standard triangle setup, culling, etc.
Nvidia's advantage in HW tessellation was due to its polymorph engine, but that advantage is eroding as more games move away from using the feature itself ...
 
Nvidia's advantage in HW tessellation was due to its polymorph engine, but that advantage is eroding as more games move away from using the feature itself ...
With tessellation support being added to nanite would the polymorph engines be relevant at all? Is there any way to garner use of them in software rasterization?

When looking at the geometry pipeline prior to mesh/primitive shading, what methods would Nvidia not be faster at?
 
With tessellation support being added to nanite would the polymorph engines be relevant at all? Is there any way to garner use of them in software rasterization?
Nanite also uses the mesh shading pipeline, which by default makes HW tessellation inaccessible there. They can still try to emulate tessellation itself, but that still wouldn't let them use the polymorph engine; at that point you're just emulating the polymorph engine or any fixed-function HW tessellation unit ...

On console APIs, you can actually use the HW tessellation unit with the mesh/primitive shading pipeline!
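A minimal sketch of what such software tessellation emulation amounts to, assuming uniform 1:4 subdivision (real tessellators support fractional and per-edge factors, which this ignores; all names are illustrative):

```python
# Hypothetical software tessellation: uniform 1:4 subdivision by splitting
# each edge at its midpoint. A mesh shader path would perform this
# amplification itself instead of relying on a fixed-function tessellator.

def midpoint(a, b):
    return tuple((a[i] + b[i]) / 2 for i in range(len(a)))

def subdivide(tri):
    """Split one triangle into four by connecting edge midpoints."""
    v0, v1, v2 = tri
    m01, m12, m20 = midpoint(v0, v1), midpoint(v1, v2), midpoint(v2, v0)
    return [(v0, m01, m20), (m01, v1, m12), (m20, m12, v2), (m01, m12, m20)]

def tessellate(tris, levels):
    """Apply 1:4 subdivision `levels` times: 4**levels output per input."""
    for _ in range(levels):
        tris = [t for tri in tris for t in subdivide(tri)]
    return tris

base = [((0.0, 0.0), (1.0, 0.0), (0.0, 1.0))]
print(len(tessellate(base, 3)))  # -> 64 (4**3 triangles)
```

The 4^levels blow-up is also why this is a natural fit for an amplification stage: the amplification shader decides the level, and each mesh shader workgroup emits one bounded batch of output triangles.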
When looking at the geometry pipeline prior to mesh/primitive shading, what methods would Nvidia not be faster at?
One use case with geometry shaders where AMD might be relatively faster than Nvidia is geometry amplification, but no game really abused this ...
 
With tessellation support being added to nanite would the polymorph engines be relevant at all? Is there any way to garner use of them in software rasterization?

The polymorph engines do a bunch of fixed-function stuff besides tessellation: vertex attribute fetch, clipping, and preparing triangles for rasterization. The mesh shader pipeline can't use the tessellator or vertex attribute fetch, but it still uses the clipping and setup hardware. A pure compute rasterizer wouldn't use any of the polymorph engine stuff.

AD102 has 72 polymorph engines so presumably Nvidia still has an advantage in raw triangle processing in vertex/mesh shader pipelines. There aren’t any good geometry throughput tests these days though with which to test that theory.
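As a back-of-envelope upper bound, assuming (hypothetically) one triangle per polymorph engine per clock; real rates depend on SM/TPC clustering, culling, and attribute fetch, so treat this strictly as a ceiling:

```python
# Rough peak setup-rate arithmetic for AD102. The one-triangle-per-engine-
# per-clock figure and the boost clock are assumptions for illustration.
polymorph_engines = 72    # AD102, per the post above
boost_clock_ghz = 2.52    # advertised RTX 4090 boost clock (assumption)

peak_tris_per_sec = polymorph_engines * boost_clock_ghz * 1e9
print(f"{peak_tris_per_sec / 1e9:.2f} G triangles/s upper bound")  # 181.44
```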
 
Is DF planning to cover the recent mobile resident evil port? I’m quite interested in their opinions.
I did some primitive testing, but the iPhone version doesn't seem to support fps readings. To my eyes, it cannot maintain a stable 30fps even at the lowest resolution with MetalFX upscaling set to Performance. There are a lot of hiccups when wandering in the first indoor area with the baby. Tbh, this doesn't look well optimized... or maybe my iPhone is overheating, but then again they should account for that anyway.
 
Is DF planning to cover the recent mobile resident evil port? I’m quite interested in their opinions.
I did some primitive testing, but the iPhone version doesn't seem to support fps readings. To my eyes, it cannot maintain a stable 30fps even at the lowest resolution with MetalFX upscaling set to Performance. There are a lot of hiccups when wandering in the first indoor area with the baby. Tbh, this doesn't look well optimized... or maybe my iPhone is overheating, but then again they should account for that anyway.
Yeah, we will cover it.
 
I think what _all_ of us are trying to say is that what Alex says in this video does not invalidate what we have written, but validates it.
That thread talks about RDNA 1 having primitive shaders. It covers the PS5 having primitive shaders. It covers that they do not have amplification shaders. It covers the subtle difference between a primitive shader and a mesh shader (of which there is only one discernible difference).

Nothing Alex has said in this video goes against anything anyone has written.
Well, fair enough if this is the case, but it seems a bit confusing, as somebody else points out that this 'critical' dividing aspect between primitive and mesh shaders is worded like an optional part of mesh shaders according to some documentation. So yeah, maybe it's just a bit of semantic vagueness confusing things. I apologize if I was being a little aggressive about this.
 
Well, fair enough if this is the case, but it seems a bit confusing, as somebody else points out that this 'critical' dividing aspect between primitive and mesh shaders is worded like an optional part of mesh shaders according to some documentation. So yeah, maybe it's just a bit of semantic vagueness confusing things. I apologize if I was being a little aggressive about this.
It’s fine. No worries. Just know that not all of us respond with the intention of winning an argument. I’m long past that point in my time here at b3d. Sometimes I see value in bringing up counter views just so that we don’t have an echo chamber snowball, and sometimes I’m trying to slightly nudge people towards the answers they seek.

In this case, it's messy. And as others have written, each generation of card is actually getting better hardware support for mesh shaders, so it gets much more complex than just saying primitive or mesh. Some people are looking at the haves and have-nots as a performance indication, but it's not really like that.

More like do it this way or that way. And right now it looks like compute shaders combined with mesh/primitive shading are winning out over amplification + mesh in the multiplatform space. But support for the feature is limited because it's more challenging to do it this way.

Though, if you don’t have to support PS5, the latter is doable, and I suspect amplification + mesh is easier for most developers to do.
 
It’s fine. No worries. Just know that not all of us respond with the intention of winning an argument. I’m long past that point in my time here at b3d. Sometimes I see value in bringing up counter views just so that we don’t have an echo chamber snowball, and sometimes I’m trying to slightly nudge people towards the answers they seek.

In this case, it's messy. And as others have written, each generation of card is actually getting better hardware support for mesh shaders, so it gets much more complex than just saying primitive or mesh. Some people are looking at the haves and have-nots as a performance indication, but it's not really like that.

More like do it this way or that way. And right now it looks like compute shaders combined with mesh/primitive shading are winning out over amplification + mesh in the multiplatform space. But support for the feature is limited because it's more challenging to do it this way.

Though, if you don’t have to support PS5, the latter is doable, and I suspect amplification + mesh is easier for most developers to do.

IIRC, I read somewhere that Remedy commented that while they are using mesh shaders they aren't using amplification shaders for AW2. However, they are looking into using amplification shaders for their next title so that they can potentially have Nanite levels of geometric density.

This might imply a soft ceiling on what can be accomplished with mesh shaders without using amplification shaders.

Regards,
SB
 

It's a piece of crap. Ninja Gaiden, GTA, MGS: three franchises that defined the early 2000s were all remastered, and all of the remasters did a shit job.
 

It's a piece of crap. Ninja Gaiden, GTA, MGS: three franchises that defined the early 2000s were all remastered, and all of the remasters did a shit job.
The games you mentioned are all old enough to be remade, and no one would bat an eye. GTA is a huge ask, but the others would be better off as remakes, in my opinion.
They would have been better off just adding the BC versions of Metal Gear Solid 2 and 3 to the Series consoles and calling it a day.
 
This might imply a soft ceiling on what can be accomplished with mesh shaders without using amplification shaders.

Regards,
SB

One intended usage for amplification shaders is culling entire meshlets that can be skipped if they’re out of view. AW2 may be processing every meshlet and doing per triangle culling in the mesh shader instead.
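The difference between the two strategies can be sketched with a toy 1D "frustum" (all names and numbers are illustrative, not AW2's actual scheme): amplification-stage meshlet culling skips whole meshlets before any per-triangle work happens, while the mesh-shader-only path pays a per-triangle test for every meshlet.

```python
# Sketch of the two culling strategies: an amplification stage rejecting
# whole meshlets via a bounding sphere vs. per-triangle tests inside the
# mesh shader. A 1D interval stands in for the view frustum.

FRUSTUM = (0.0, 100.0)  # visible range along one axis, for simplicity

def meshlet_visible(center, radius):
    """Amplification-stage test: bounding sphere vs. frustum interval."""
    return center + radius >= FRUSTUM[0] and center - radius <= FRUSTUM[1]

def cull(meshlets, use_amplification):
    triangle_tests = 0
    survivors = []
    for center, radius, tris in meshlets:
        if use_amplification and not meshlet_visible(center, radius):
            continue  # whole meshlet skipped: zero per-triangle work
        for t in tris:
            triangle_tests += 1  # mesh-shader per-triangle culling
            if FRUSTUM[0] <= t <= FRUSTUM[1]:
                survivors.append(t)
    return survivors, triangle_tests

# two meshlets of 64 triangles each; the second is far outside the frustum
meshlets = [(50.0, 10.0, [50.0] * 64), (500.0, 10.0, [500.0] * 64)]
kept_a, tests_a = cull(meshlets, use_amplification=True)
kept_b, tests_b = cull(meshlets, use_amplification=False)
print(tests_a, tests_b)  # 64 vs 128: same survivors, half the triangle work
```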
 