Current Generation Games Analysis Technical Discussion [2023] [XBSX|S, PS5, PC]

No, not "two different workloads". Both are pure compute workloads with simulation.
They are different workloads.
One is the most complex rendering technique on this planet, the other is just hair rendering.
So they are different then.
A 3.6ms workload on a 90 TFLOPs GPU is absurd.
How do you know?

Have you any technical information about how they're doing the hair? Strand counts? Collision details?

No? Then you can't claim it's absurd.
So? It's the exception, not the rule.

Just like COD games are when it comes to performance on AMD vs Nvidia.

And?
 
I've just tested it myself

  • No hair: 139fps
  • Medium: 112fps
  • Max: 110fps
Or

  • No hair: 7.3ms per frame
  • Max: 8.8ms per frame

All other settings max at a native 1440p.
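
As a quick sanity check of those numbers: frame time is just the reciprocal of frame rate. The minimal conversion below (Python, purely illustrative) won't line up exactly with the posted ms figures, which presumably come from a separate GPU-time readout:

```python
# Back-of-the-envelope check of the numbers above.
# Assumption: frame time (ms) ~= 1000 / fps; the in-game ms readout may be
# measured differently (e.g. GPU time only), so small discrepancies are expected.

def frame_time_ms(fps: float) -> float:
    """Convert an average frame rate to an average frame time in milliseconds."""
    return 1000.0 / fps

no_hair_fps, max_hair_fps = 139.0, 110.0

no_hair_ms = frame_time_ms(no_hair_fps)    # ~7.2 ms
max_hair_ms = frame_time_ms(max_hair_fps)  # ~9.1 ms

print(f"No hair:  {no_hair_ms:.1f} ms")
print(f"Max hair: {max_hair_ms:.1f} ms")
print(f"Hair cost: {max_hair_ms - no_hair_ms:.1f} ms per frame")
```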
 
I personally do not find the hair on PC overly taxing for what it is doing. It looks a lot better than the console version's hair in a number of ways, which means Capcom put in the time to give the PC version higher-quality hair.

The only issue is that the hair does not scale down to console quality in terms of strand count, strand width, or lighting model (the lighting model for hair is different on console, and worse). Even at matched resolution, the "normal" setting is quite obviously better than what is going on on console. I would prefer it if there were a "low" setting below normal that matched what consoles have.

[attached screenshot: hair comparison]
 
"Realistic" hair rendering (in this case I believe stand based) is complex which is the reason why characters in games for the most part don't have actual hair but basically wear hats. It faces a similar issue that ray tracing has in that we're kind of used the "good enough" fake method (especially with static shot comparison, again something similar to RT) with everyone wearing hard hats.
Hairworks is nearly 10 years old. Here from The Witcher 3:
On average, 10,000 to 40,000 strands of tessellated hair are applied to HairWorks-enhanced models at combat view distances, with up to 60,000 applied to the game's furriest creatures. Up close, when creatures are futilely trying to end Geralt's life, hair counts are cranked up to ensure maximum fidelity, bringing the 40,000 average up to 125,000 in some cases. As they then flee for their lives, hair counts are dynamically scaled back until the enemy is a mere dot in the distance. On Geralt, hair counts average at 30,000, and scale up to 115,000, with around 6,000 hairs just for his beard.

There is no level of detail in Resident Evil. The performance loss is identical as long as hair is visible.
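
To make the quoted HairWorks behaviour concrete -- and, by contrast, what RE apparently isn't doing -- here's a rough sketch of a distance-based strand-count LOD. This is not HairWorks or RE Engine code; the strand counts and distances are just loosely borrowed from the quote above for illustration:

```python
# Illustrative sketch of distance-based hair LOD, loosely modelled on the
# HairWorks description quoted above. The strand counts and distances are
# assumptions for illustration, not values from the game or the SDK.

def strand_count(distance_m: float,
                 near_count: int = 125_000,
                 far_count: int = 40_000,
                 near_dist: float = 2.0,
                 far_dist: float = 30.0) -> int:
    """Linearly scale strand count from near_count (right in your face) down to
    far_count (combat view distance); beyond far_dist keep only a small fraction."""
    if distance_m <= near_dist:
        return near_count
    if distance_m >= far_dist:
        # "a mere dot in the distance": keep only a small fraction of the strands
        return max(far_count // 10, 1_000)
    t = (distance_m - near_dist) / (far_dist - near_dist)
    return int(near_count + t * (far_count - near_count))

for d in (1, 5, 15, 30, 60):
    print(f"{d:>3} m -> {strand_count(d):>7} strands")
```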

Deactivate Raytracing and try again.
 

Different techniques and radically different quality. There's no reason to do LODs here; the characters with hair are basically consistent in screen size at all times. Using LODs with a coarser simulation would mean more static hair, pops in quality during sudden camera changes (like kicks, cutscene intros, etc. -- likely presenting as the hair jiggling/repositioning), and probably negligible performance improvements when the game is running near the 60fps target.

Regarding your bolded point: Like I said above, nobody is optimizing games to run at 500fps, it's just not practical unless every part of the computer is vastly faster than the target spec. The game runs at over 100 fps with everything on -- the fact that it can't scale up towards ~200-300 fps when you turn off some settings is not a failure, the game is running how it was designed. At a certain point with standard content you can't scale down indefinitely to feed a balanced workload to a fast gpu and you start having to stall and wait on some things. If stalling and waiting means that your game only runs at 144 fps instead of 240 fps... you're fine.
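
A toy calculation of that stall-and-wait point, with entirely made-up numbers: if some slice of the frame is a fixed cost that doesn't shrink when you dial settings down, fps stops scaling long before the GPU runs out of theoretical headroom.

```python
# Toy illustration of the "stall and wait" point above (numbers are made up).
# If some portion of the frame is a fixed cost that doesn't shrink when you
# turn settings off, the achievable frame rate stops scaling long before
# the GPU's theoretical headroom runs out.

fixed_cost_ms = 3.0      # e.g. a serial/compute section that doesn't scale
scalable_cost_ms = 5.0   # everything you can cheapen with settings/resolution

for fraction_left in (1.0, 0.5, 0.25, 0.0):
    frame_ms = fixed_cost_ms + scalable_cost_ms * fraction_left
    print(f"scalable work at {fraction_left:>4.0%}: "
          f"{frame_ms:.1f} ms -> {1000.0 / frame_ms:.0f} fps")
```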


Lastly, I'm pretty sure there are more hair strands here than there are triangles in an average scene in Portal RTX. Not every technique scales the same way!
 
Deactivate Raytracing and try again.

Easy...
  • No hair: 146fps
  • Medium: 117fps
  • Max: 111fps
Or
  • No hair: 6.6ms per frame
  • Max: 9ms per frame
Your system is at fault here, as proven by the fact that your RTX 4090's power consumption drops by over 100W when advanced hair is enabled.

It shouldn't be dropping power consumption at all; if anything, turning on advanced hair rendering increases the workload, so it should be drawing more power.

My 4070 Ti is not only rendering the advanced hair with less of a performance hit than your RTX 4090, but its power consumption also doesn't drop when enabling advanced hair.

Your system is the problem, not the game.
 
The 4090 is around 70%+ faster than a 4070 Ti. When the limitation is in the software stack, scaling drops and the difference is smaller than that. Is this 1080p? Without hair rendering my 4090 OC would be twice as fast as your 4070 Ti, but with hair rendering it is less than 40% faster...
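
One hypothetical way to square those percentages (assumed numbers, not measurements): if the hair costs roughly the same fixed ~3ms on both cards because it can't fill the bigger GPU, while the rest of the frame scales with raw GPU speed, the gap compresses by about the amount being reported.

```python
# Rough model of why the 4090's lead shrinks with hair on (assumed numbers,
# not measurements): suppose hair costs roughly the same ~3 ms on both cards
# because it can't fill the bigger GPU, while the rest of the frame scales
# with raw GPU speed.

hair_ms = 3.0
rest_4070ti_ms = 6.0
gpu_speed_ratio = 1.7                  # 4090 assumed ~70% faster on scalable work
rest_4090_ms = rest_4070ti_ms / gpu_speed_ratio

def fps(*parts_ms): return 1000.0 / sum(parts_ms)

print(f"Hair off: 4090 is {fps(rest_4090_ms) / fps(rest_4070ti_ms) - 1:.0%} faster")
print(f"Hair on:  4090 is {fps(hair_ms, rest_4090_ms) / fps(hair_ms, rest_4070ti_ms) - 1:.0%} faster")
```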
 
Unless you think your 4090 dropping over 100w in power consumption when you turn on a demanding setting is normal?
I'm not a hardware/IT expert, but it isn't that implausible to me that it could be normal. If the game has to finish the compute workload and store the hair data before it can start rasterizing and doing RT, it's possible that the 4090 is mostly underutilized during the hair compute, and then only fires up to its higher clocks for a couple of ms to do the rest of the rendering -- less average clock speed across the time spent rendering. There might just not be a high enough quality hair setting to fully feed the 4090; if they had a setting with double the strands, it would show more of an advantage compared to more reasonably specced GPUs.
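
As a toy illustration of that speculation (all durations and wattages below are assumed, not measured), time-weighted average power over the frame can drop even though a heavier setting was enabled:

```python
# Toy average-power model for the speculation above (all numbers assumed).
# If the hair compute only lights up part of the 4090 and the "full fat"
# rendering phase is short, average board power over the frame can drop
# even though a heavier setting was enabled.

def avg_power(phases):
    """phases: list of (duration_ms, watts). Returns time-weighted average power."""
    total_ms = sum(ms for ms, _ in phases)
    return sum(ms * w for ms, w in phases) / total_ms

hair_off = [(6.0, 420.0)]                # whole frame keeps the GPU busy
hair_on = [(3.0, 180.0), (6.0, 420.0)]   # underutilized hair phase, then the rest

print(f"Hair off: {avg_power(hair_off):.0f} W average")
print(f"Hair on:  {avg_power(hair_on):.0f} W average")
```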
 
And yet when I dropped my RTX 4070 Ti to 720p native and turned advanced hair rendering on, my power consumption stayed the same at around ~250W.

720p on my 4070 Ti should present the same load as 1080p does to a 4090.
 
The 4090 has like a zillion more cores than the 4070, so it's reasonable to me that it might draw less when it's not fully saturated. At a certain point there's only so much compute work to do, and it's probably close to what it takes to fully saturate a 4070. The hair doesn't really scale with resolution, so it's not easy to measure performance on just the hair -- you might see similar things in otherwise expensive scenes with hair turned down to medium, where you use fewer watts than you'd expect after turning hair on?
 
See? The problem is not my PC. The problem is the software stack running above the driver. Going from +70% (or nearly +100% at 1440p) to +22% (or +42%) shows perfectly well that this hair rendering is unoptimized for nVidia hardware.
 
And the demo is incredibly well threaded, so it should still be rendering with those extra cores and putting them to use.

So is there a CPU limitation in there?
There could be, but what I'm imagining (pure speculation, no warranties) is that computing the hair takes ~X number of cores for ~Y number of cycles. Until the hair is computed, no other rendering work can start -- you need the hair to render shadowmaps, you need the hair to render depth, and you need the hair to render the scene.

On a reasonably high-end GPU, which has approximately X number of cores in the first place, the hair fully saturates the gpu, takes ~3ms, and then the rest of the render starts and takes ~4ms -- perfect for about a 120fps budget. On a super high end gpu, maybe it has ~X*3 cores. Rather than seeing a 3x speedup, two thirds of the gpu sits idle during hair compute. In a perfect world maybe you would find extra compute work to do here (you could just add a lot more hair strands to fill the gpu up all of the way and maybe satisfy some 4090 owners' egos, but there would be no visual improvement, so it would be wasted resources), but there just isn't any extra work worth doing. (Generally the kind of compute work you would do to fill out space is optimizations to make other parts of the render run faster, but the other parts are already preposterously fast here, so that's unlikely to help.)

The result is that the super high end gpu also takes ~3ms to do hair. Then it renders the rest of the scene in ~3ms -- so there's only a ~20% improvement, despite a larger theoretical gap in gpu specs between the high end and super high end gpus.

(I agree with you, of course, that the game is incredibly well threaded and optimized. I think complaining that it doesn't scale up to ~1000 fps on a suitably fast gpu is silly; I'm just trying to dig into why that might be.)
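
If it helps, that hand-wavy model can be put into a few lines. Everything here is invented for illustration, with the hair assumed to saturate only ~X cores so its ~3ms cost is the same on both GPUs:

```python
# A few lines putting the hand-wavy model above into numbers. Everything here
# is invented for illustration: hair is assumed to saturate only ~X cores, so
# its ~3 ms cost is the same on both GPUs, while the rest of the frame gets
# the (imperfect) speedup the bigger GPU can actually deliver.

gpus = {
    "high end (~X cores)":        {"hair_ms": 3.0, "rest_ms": 4.0},
    "super high end (~3X cores)": {"hair_ms": 3.0, "rest_ms": 3.0},
}

fps = {name: 1000.0 / (t["hair_ms"] + t["rest_ms"]) for name, t in gpus.items()}

for name, value in fps.items():
    print(f"{name}: {value:.0f} fps")

gain = fps["super high end (~3X cores)"] / fps["high end (~X cores)"] - 1
print(f"Super high end advantage: {gain:.0%}")  # ~17%, despite 3x the cores
```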
 
3090 + 5800X3D @ 3840x1600

I tried to grab frame captures in Nsight but unfortunately it kept crashing the game. Oh well.

RT off / Hair off: 11.5 ms
RT off / Hair max: 14.0 ms
RT on / Hair off: 12.0 ms
RT on / Hair on: 14.3 ms

RT basically costs almost nothing and also does nothing in the demo :sneaky:. The settings menu shows a preview of the impact of each change which is nice. In the preview turning RT on makes a huge difference in water reflections. However water reflections in the actual game are arguably worse than the SSR version. Hopefully something is just broken and will be fixed before release.

I can't tell whether hair strands look better on or off. All of the hair options in the preview -- off/normal/high -- look equally good to me.

RT-OFF (settings menu preview): [screenshot attached]

RT-ON (settings menu preview): [screenshot attached]
 
On a reasonable high end GPU, which has approximately X number of cores in the first place, the hair fully saturates the gpu, take ~3ms, and then the rest of the render starts and takes ~4ms -- perfect for about a 120fps budget. On a super high end gpu, maybe it has ~X*3 cores. Rather than seeing a 3x speedup, two thirds of the gpu sits idle during hair compute.

I think you're on to something. Hair strands seem to add the same ~2ms on a 3090 and a 4090, and it doesn't take much longer on a 4070 Ti. Pretty compelling evidence that the hair workload isn't scaling with clocks or with SMs. Would be interesting to see RDNA numbers.

Edit: ran a few more tests and I'm convinced hair is a very narrow workload that isn't filling the GPU. At all resolutions, GPU clocks are 100-200MHz higher with hair enabled, which points to fewer SMs being active, allowing clocks to ramp. At higher resolutions some work seems to run in parallel with hair rendering, as the net cost of enabling hair is lower: ~2.5ms at 1600p compared to ~3.4ms at 720p. So there is some async compute at play.

1080p hair off: 5.8 ms
1080p hair on: 8.8 ms

720p hair off: 3.9 ms
720p hair on: 7.3 ms

On a side note CPU scaling is extremely impressive. Even at very high framerates > 250fps all 16 CPU threads are active and the load is very evenly distributed. Well done RE engine!
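
For what it's worth, the amount of overlap can be estimated straight from those numbers, assuming the hair compute itself costs about the same at every resolution (it doesn't scale with pixel count):

```python
# Estimating the async-compute overlap from the numbers above, assuming the
# hair compute itself costs roughly the same at every resolution.

timings = {              # (hair off, hair max) frame times in ms, from the posts above
    "720p":      (3.9, 7.3),
    "1080p":     (5.8, 8.8),
    "3840x1600": (11.5, 14.0),
}

net_cost = {res: on - off for res, (off, on) in timings.items()}
baseline = net_cost["720p"]   # at 720p there's little other work to hide hair behind

for res, cost in net_cost.items():
    print(f"{res:>9}: net hair cost {cost:.1f} ms, "
          f"~{baseline - cost:.1f} ms apparently overlapped with other work")
```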
 