Current Generation Games Analysis Technical Discussion [2024] [XBSX|S, PS5, PC]

We've had famous timespans for things like Pixar movies taking umpteen hours per frame. Are there no numbers out there? And yeah, I guess it is a moving target as once you get to a decent time per frame, you ramp up the quality, and some features like volumetrics will just tank path-tracing. Are massive render-farms still a thing? You'd have thought with the exponential increase in power plus GPGPU things would have gotten nippier!
I believe massive render-farms are still a thing. When I left, they started getting GPUs to render instead of CPUs, as you mentioned. That required a big change to the pipeline, as they wanted to use realtime apps like Unreal for artists' lighting iterations. I can ask one of my ex-coworkers who still works in the animation industry how different things have become since I left.
 
I believe massive render-farms are still a thing. When I left, they started getting GPUs to render instead of CPUs, as you mentioned. That required a big change to the pipeline, as they wanted to use realtime apps like Unreal for artists' lighting iterations. I can ask one of my ex-coworkers who still works in the animation industry how different things have become since I left.
I do see a lot more Unreal Engine movies. I think the latest one I saw was the Gundam Zero one.
 
I believe massive render-farms are still a thing. When I left, they started getting GPUs to render instead of CPUs, as you mentioned. That required a big change to the pipeline, as they wanted to use realtime apps like Unreal for artists' lighting iterations.
Nowadays, studios have moved back to mostly CPU farms, though. Weta FX (responsible for Marvel movies) started heavily using AMD CPUs (Threadripper and Epyc) to do most of their rendering. GPUs are relegated only to specific effects like heavy water, fluid, or weather simulations.
 
Nowadays, studios have moved back to mostly CPU farms, though. Weta FX (responsible for Marvel movies) started heavily using AMD CPUs (Threadripper and Epyc) to do most of their rendering. GPUs are relegated only to specific effects like heavy water, fluid, or weather simulations.
Curious here: did they provide the reasoning? Or is it just cheaper overall to go with CPU rendering?
 
Nowadays, studios have moved back to mostly CPU farms, though. Weta FX (responsible for Marvel movies) started heavily using AMD CPUs (Threadripper and Epyc) to do most of their rendering. GPUs are relegated only to specific effects like heavy water, fluid, or weather simulations.
Yes, I'm waiting for my friend who works at Blizzard to reply to my question. I wouldn't be surprised if they are still using faster CPUs, since programming those is easier and troubleshooting is infinitely easier. I'm taken aback at how there is no real way to dissect a problem in real-time using Nsight.
 
Nowadays, studios have moved back to mostly CPU farms, though. Weta FX (responsible for Marvel movies) started heavily using AMD CPUs (Threadripper and Epyc) to do most of their rendering. GPUs are relegated only to specific effects like heavy water, fluid, or weather simulations.
Why are they using CPUs when GPUs are an order of magnitude faster for 3D rendering?
 
Still, you should say what the main reason is. It is not path tracing.

Cyberpunk 2077 -
without auto exposure mod

with auto exposure mod


The fact that it only happens in his example with path tracing is mostly down to the image being much brighter in places with path tracing enabled.
Except that the noisiness is more apparent with PT on than with it off, which is the point of his video. If you turn off PT and still have auto exposure on, you get less noise in the image than you do with PT on. Sure, running a mod to disable or alter auto exposure might help, but that's not exactly an answer to the problem, which is that, in many examples, turning on PT can create a noisier image.

Really, none of the examples he showed were strictly RT/PT's fault. The implementations of ray tracing that we have right now are sort of hacky: they apply only to some surfaces or some effects, often at lower than native resolution, and they use simpler representations of the game world. What his video is really showing are some of the shortcomings of the techniques we have in games today, which is why we get low-res reflections, GI, or shadows in many games.

With this example in CP2077, what we are seeing is their PT implementation accumulating rays over several frames that are also being altered by their implementation of auto exposure. I agree that the root cause may be auto exposure changing the data every frame while the path tracer is trying to accumulate that same data. But it can also be argued that if the path tracer could generate enough data to be accurate every frame, without relying on several frames of history, the problem wouldn't exist either. At the end of the day it doesn't really matter to the end user what the actual root cause is if the act of enabling PT ends up generating a noisier image. You can likely get a more stable image by disabling auto exposure (which requires a mod), but you can also get a more stable image by disabling path tracing (available in the game's settings). Which, at the end of the day, is the point of the video: turning on RT/PT can result in a noisier image.
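To make the interaction concrete, here's a toy C++ sketch of temporal accumulation (an exponential moving average over frames). It is not CDPR's or NVIDIA's actual code; the names and blend factor are made up purely to illustrate why per-frame exposure changes fight the accumulator:

```cpp
// Toy model: the path tracer blends each frame's noisy samples into a running
// history. If auto exposure rescales the signal every frame, the new samples
// no longer match the history, so the accumulator effectively loses part of
// its history and more noise shows through.
struct Pixel {
    float history = 0.0f;  // radiance accumulated from previous frames
};

float accumulate(Pixel& p, float noisySample, float exposureScale, float alpha = 0.1f)
{
    // Auto exposure rescales this frame's sample before it is accumulated.
    float exposed = noisySample * exposureScale;

    // Small alpha = long history = smooth result, but only if 'exposed' is
    // statistically consistent from frame to frame. A changing exposureScale
    // breaks that consistency.
    p.history = (1.0f - alpha) * p.history + alpha * exposed;
    return p.history;
}
```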
 
curious here, did they provide the reasoning here? Or is it just cheaper overall to go with CPU rendering?
Why are they using CPUs when GPUs are an order of magnitude faster for 3D rendering?

There's a thread I read years ago detailing the reasons; basically it comes down to two things: memory and custom shaders. GPUs support limited amounts of memory compared to CPUs, and thus can't fit large or complex scenes in memory.

Disney released the data for rendering a single shot from Moana a couple of years ago. Uncompressed, it's 93 GB of render data, plus 130 GB of animation data if you want to render the entire shot instead of a single frame.

CPUs now have TBs of memory that can fit any scene, plus AMD has democratized many-core CPUs (24/32/64/96/128 cores), which means scaling render farms up to thousands of cores has become easier. The advent of AVX-512 also provided a big boost to performance.
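Purely as an illustration of why AVX-512 helps (this is not from any production renderer, and the function name is hypothetical): one core can evaluate 16 float lanes per instruction, which multiplied across 96+ cores adds up quickly.

```cpp
#include <immintrin.h>  // AVX-512F intrinsics; compile with -mavx512f

// Clamped N.L Lambert term for 16 rays at a time; 'count' is assumed to be a
// multiple of 16 and the inputs are structure-of-arrays unit vectors.
void lambert16(const float* nx, const float* ny, const float* nz,
               const float* lx, const float* ly, const float* lz,
               float* out, int count)
{
    for (int i = 0; i < count; i += 16) {
        __m512 d = _mm512_mul_ps(_mm512_loadu_ps(nx + i), _mm512_loadu_ps(lx + i));
        d = _mm512_fmadd_ps(_mm512_loadu_ps(ny + i), _mm512_loadu_ps(ly + i), d);
        d = _mm512_fmadd_ps(_mm512_loadu_ps(nz + i), _mm512_loadu_ps(lz + i), d);
        d = _mm512_max_ps(d, _mm512_setzero_ps());  // clamp negative N.L to zero
        _mm512_storeu_ps(out + i, d);
    }
}
```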

As for shaders, professional movies use complex custom shaders with complex BSDFs, which are a challenge to run on GPUs due to heavy divergence, which in turn means low GPU utilization.
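A toy way to see the divergence problem (the warp size and material count here are illustrative assumptions, not from any studio's renderer): a GPU "warp" of 32 lanes executes one code path at a time, so if its 32 rays need different BSDF branches, the branches run serially and lane utilization collapses to roughly 1 / (distinct branches per warp).

```cpp
#include <cstdio>
#include <random>
#include <set>

int main() {
    constexpr int kWarpSize = 32;
    constexpr int kNumBsdfs = 8;      // pretend the scene uses 8 distinct BSDFs
    constexpr int kNumWarps = 10000;

    std::mt19937 rng(42);
    std::uniform_int_distribution<int> pick(0, kNumBsdfs - 1);

    double totalUtil = 0.0;
    for (int w = 0; w < kNumWarps; ++w) {
        std::set<int> branches;
        for (int lane = 0; lane < kWarpSize; ++lane)
            branches.insert(pick(rng));          // which BSDF each lane needs
        totalUtil += 1.0 / branches.size();      // equal-cost branches assumed
    }
    std::printf("average warp utilization: %.1f%%\n",
                100.0 * totalUtil / kNumWarps);  // ~12-13% with 8 random BSDFs
    return 0;
}
```

With just 8 equally likely BSDFs, nearly every warp ends up running all 8 branches; production shader libraries have far more code paths than that.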

 
As for shaders, professional movies use complex custom shaders with complex BSDFs, which are a challenge to run on GPUs due to heavy divergence, which in turn means low GPU utilization.

Bingo. I wrote shaders all the time, and there is no tool (not even Nsight) that will let me see variable values at runtime within HLSL/GLSL shader code. To me it's useless for solving a bug in math computations. I would almost have to have an offline renderer running the same shader code to test against.
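To illustrate the workaround: mirror the suspect shader math in plain C++ so you can step through it in a debugger or print intermediates, which you can't do inside the shader itself. A hypothetical example (Schlick's Fresnel is just a stand-in for whatever function is misbehaving):

```cpp
#include <algorithm>
#include <cmath>
#include <cstdio>

// CPU mirror of the HLSL/GLSL function under suspicion.
float fresnelSchlick(float cosTheta, float f0)
{
    cosTheta = std::clamp(cosTheta, 0.0f, 1.0f);
    return f0 + (1.0f - f0) * std::pow(1.0f - cosTheta, 5.0f);
}

int main()
{
    // Feed it the same inputs captured from the GPU (e.g. dumped to a buffer)
    // and compare against what the frame actually shows.
    for (float c = 0.0f; c <= 1.0f; c += 0.25f)
        std::printf("cosTheta=%.2f  F=%.4f\n", c, fresnelSchlick(c, 0.04f));
    return 0;
}
```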

Also, since we are exposing a few of the secrets at these gaming studios: id Software has been working with an offline path tracer for years. It's their "ground truth" when making their games, so it's not surprising to me that Indiana Jones is so polished and optimized. MachineGames has a lot more headroom on the 4090 to run a local light-source loop for RT shadows inside interiors (likewise for the water surfaces). They should add that in a future patch, especially when the 5090 comes out in a few weeks.
 
So, I got curious and checked the latest updates to Pixar's RenderMan (Pixar's official renderer). It turns out the latest version adds GPU support, as well as a CPU + GPU hybrid path.

The GPU path retains the same problems: it runs out of VRAM quickly and crashes if the scene doesn't fit, and the performance is still not that great. A 4090 is only 40% faster than an AMD Ryzen 7950X CPU, so I think more powerful CPUs with more cores will actually tie or beat the 4090. Hybrid mode is useless and doesn't provide much of a performance gain over GPU-only.

 
The GPU path retains the same problems: it runs out of VRAM quickly and crashes if the scene doesn't fit, and the performance is still not that great. A 4090 is only 40% faster than an AMD Ryzen 7950X CPU, so I think more powerful CPUs with more cores will actually tie or beat the 4090. Hybrid mode is useless and doesn't provide much of a performance gain over GPU-only.
Many core CPUs are better? You know what that makes me think? :devilish:
 
Many core CPUs are better? You know what that makes me think? :devilish:
2008 would like a word:
 
What's good for raytracing might not be good for a games console. The interesting point for me is that we've looked at 'faster graphics' from the perspective of realtime and the 'inevitable' conclusion that very fast, very wide vector processors (GPUs) would be the way to crunch numbers, which meant Cell as a concept was outdated the moment it was conceived.

Yet here we hear how complex processors are more useful. Thus, is there real scope for a 'ray tracing' processor that straddles the CPU and GPU? Were Cell and Larrabee not really destined for realtime graphics, but the ideal for raytracing? I think that's what the SaarCOR RT accelerator was doing also.

It's obviously too limited a market to have a bespoke ray-tracing processor and all its software support requirements, etc., but in my favoured parallel universe, Cell is dominating once again!
 
What's good for raytracing might not be good for a games console. The interesting point for me is that we've looked at 'faster graphics' from the perspective of realtime and the 'inevitable' conclusion that very fast, very wide vector processors (GPUs) would be the way to crunch numbers, which meant Cell as a concept was outdated the moment it was conceived.

Yet here we hear how complex processors are more useful. Thus, is there real scope for a 'ray tracing' processor that straddles the CPU and GPU? Were Cell and Larrabee not really destined for realtime graphics, but the ideal for raytracing? I think that's what the SaarCOR RT accelerator was doing also.

It's obviously too limited a market to have a bespoke ray-tracing processor and all its software support requirements, etc., but in my favoured parallel universe, Cell is dominating once again!
Both Intel and NVIDIA have gone the route of "ray tracing cores" in their GPUs:
[attached: Intel and NVIDIA RT core block diagrams]

But they are moving "targets" if you compare NVIDIA's RT Cores over the generations:

Turing RT Cores:
[Turing RT Core diagram]

Ampere RT Cores (double the triangle-test throughput compared to Turing):
[Ampere RT Core diagram]

Lovelace RT Cores:
[Lovelace RT Core diagram]

I suspect Blackwell RT Cores will evolve again.

This video gives some good numbers/insight:

But the "target" is fast moving with RT cores, so I doubt we need a "dedicated RT processor" like a CPU/GPU...it is a parallel graphics compute problem, it belongs in the GPU IMHO.
 
Both Intel and NVIDIA have gone the route of "ray tracing cores" in their GPUs:

But the "target" is fast moving with RT cores, so I doubt we need a "dedicated RT processor" like a CPU/GPU...it is a parallel graphics compute problem, it belongs in the GPU IMHO.
We're talking about offline rendering and studios where cost matters are choosing to use CPUs, not GPUs, for their flexibility. Yet CPUs in their current form aren't designed to be ideal for RT workloads, thus the possibility there's scope for a different processor arch that provides a different balance of flexibility and performance.

In gaming where the target is offline movie quality in realtime, it's just not happening. If it takes a render farm of CPUs hours to produce a frame, you'd need O(10^5)/O(10^6) performance improvement in a consumer product.
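A quick back-of-the-envelope version (the two-hour figure is purely illustrative, not quoted from any studio): at two farm-hours per frame versus a 60 fps budget,

\[
\frac{2 \times 3600\,\text{s}}{1/60\,\text{s}} = 7200 \times 60 = 432{,}000 \approx 4 \times 10^{5},
\]

and that's before you account for the farm being thousands of cores rather than one consumer box.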

Therefore, realtime game graphics can't chase movies. They need to do their own thing. ML will likely help out, although we'll see what ML can contribute in offline renders long before we see it in games, so that'll define the targets. As an aside, I can't help but feel that future is really lacking in stability. There'll be blobby RT samples blobbed by ML into blobby shapes and blobby continuities, with blobby temporal smearing and blobby temporal AA, and it'll all look fabulous and weird and gorgeous and confusing.
 
We're talking about offline rendering and studios where cost matters are choosing to use CPUs, not GPUs, for their flexibility. Yet CPUs in their current form aren't designed to be ideal for RT workloads, thus the possibility there's scope for a different processor arch that provides a different balance of flexibility and performance.

In gaming where the target is offline movie quality in realtime, it's just not happening. If it takes a render farm of CPUs hours to produce a frame, you'd need O(10^5)/O(10^6) performance improvement in a consumer product.

Therefore, realtime game graphics can't chase movies. They need to do their own thing. ML will likely help out, although we'll see what ML can contribute in offline renders long before we see it in games, so that'll define the targets. As an aside, I can't help but feel that future is really lacking in stability. There'll be blobby RT samples blobbed by ML into blobby shapes and blobby continuities, with blobby temporal smearing and blobby temporal AA, and it'll all look fabulous and weird and gorgeous and confusing.
Wasn't it said that soon only 1/32nd of your frames or pixels will be rendered traditionally? ML is coming fast; the graphics industry will be impacted too.

I mean, when I started gaming, this was the norm:
[attached screenshot]

As always with change, it takes the majority by "surprise":

Go back 10 years and tell people that AI upscaling and RT would be the norm in games in a decade, and people would call you a "nugget" 🤷‍♂️
 
This assumes a scalability of the ML solutions that I don't feel has been demonstrated, but that discussion is happening here.

It's certainly true, IMO, that offline rendering will showcase what ML can do before it hits realtime. E.g. let's imagine you can render a simple image and have ML transform it into something that looks like a photograph. Offline rendering will adopt that with more complex ML models and more time per frame than realtime could afford. When we see movies rendered with 30 seconds to produce the base frame on a render farm plus 2 minutes of ML processing, instead of hours of render-farm rendering, and the results are movie quality, then we'll have an idea of what ML might accomplish. There's no way the movie industry is going to sit on any ML potential and not use it, waiting for games to get realtime photofication before thinking, "wow, we could use that to save a ton of money."

In short, what can the movie industry tell us about what games can and will accomplish? Presently, to my surprise, highest-tier rendering hasn't been massively accelerated for movies since the introduction of RTX, although I dare say it has been at the lower end. And it does also hint at alternative technologies, although the applicability of these to a console seems limited. How much is GPU RT impacted by the same issues facing offline renderers? Presumably not a lot, as games will be written to the GPU's capabilities, and shaders, if a bottleneck, will be simplified to fit the hardware.
 
Many core CPUs are better? You know what that makes me think? :devilish:
We should take into account that this is due to the unoptimized nature of the highly complex custom materials and shaders that Pixar and Weta FX use. These big studios still have a long way to go before they've optimized their massive libraries of custom shaders/materials for GPUs... they are doing it, but very slowly.

Other offline renderers show that with stock shaders, GPUs are massively faster than CPUs; you can see that with Blender, Octane, V-Ray, Arnold, Cinebench, etc., especially after the introduction of hardware-accelerated ray tracing.
 
I guess we want a different reference: something rendered in Blender that looks good, if not necessarily to the same standard as RenderMan, and to look at the render times on that. Are there any details on render times for an animation produced on these GPU-accelerated renderers?
 