Understanding game performance through profiler trace graphs *spawn

Nobody said anything about rumors and conjecture about future unknowns. I think we've talked enough about how simple point-in-time profiler traces aren't useful, and we've given examples of what might be useful. Cwjs said nothing about offenses, and nobody in this thread mentioned posters needing a doctoral-level dissertation on GPU mechanics to make their posts.

You may not like the answers you've received, but you've received a lot of them. Instead of throwing strawmen about offenses and dissertations, find a better way to communicate and ask questions.
 
I think this entire conversation is somewhat silly and going nowhere. The best advice is to create a new thread and avoid posting in established threads where the OP has strong opinions about the content and direction they want the conversation to take. Unfortunately, this is not the first time this has happened in the UE 5 developer thread, and it likely will not be the last.
 

Agreed. That’s a more reasonable explanation.
 
Was it really inaccurate? It sounds to me like occupancy was probably the key difference, and NVIDIA just managed to improve their compiler.
NVIDIA didn't do anything, Bethesda patched the game and fixed the performance on NVIDIA and Intel GPUs.
 
The really curious thing is: what exactly did they "fix"? Was the improvement focused on a single, specific GPU call? Or was it a CPU-side optimization that changed how commands were issued to the GPU queue? Or maybe a new threading mechanism for dispatching certain calls? Maybe it was simply a reordering of commands? Maybe it was a broken shader, or more than one?

The reality is, they may not even know exactly why occupancy ends up where it does; they just see better performance when change (x) "magically" results in one particular stage of the GPU pipeline getting a few more instructions into its queue.
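For readers following the occupancy angle: occupancy is just the ratio of warps a shader can keep resident on a compute unit to the hardware maximum, and it is bounded by whichever resource runs out first: registers, shared memory, or warp slots. Here is a minimal back-of-the-envelope sketch; all the per-SM limits are illustrative defaults, not any vendor's actual spec:

```python
import math

def theoretical_occupancy(regs_per_thread, smem_per_block, threads_per_block,
                          max_warps_per_sm=48, reg_file=65536,
                          smem_per_sm=49152, warp_size=32):
    """Estimate occupancy: resident warps / max warps on one SM.

    Hardware limits here are made-up defaults for illustration only.
    """
    warps_per_block = math.ceil(threads_per_block / warp_size)
    # Each resident block consumes registers for every thread it holds.
    regs_per_block = regs_per_thread * warp_size * warps_per_block
    blocks_by_regs = reg_file // regs_per_block
    blocks_by_smem = (smem_per_sm // smem_per_block
                      if smem_per_block else float("inf"))
    blocks_by_warps = max_warps_per_sm // warps_per_block
    # The scarcest resource decides how many blocks stay resident.
    resident_blocks = min(blocks_by_regs, blocks_by_smem, blocks_by_warps)
    return (resident_blocks * warps_per_block) / max_warps_per_sm

# A compiler tweak that shaves register pressure can lift occupancy a lot:
print(theoretical_occupancy(96, 0, 256))  # register-bound
print(theoretical_occupancy(32, 0, 256))  # warp-slot-bound
```

This is why a patch can shift occupancy without any "single call" being changed: nobody edits these numbers directly, but a recompiled shader with lower register allocation moves the binding constraint.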
 
If brand X releases a card with N amount of RAM and brand Y releases a card with substantially more than N, then fans of brand X will argue that N is sufficient and more is not needed.

They will also find examples to confirm this bias.
 