Realtime AI content generation *spawn

Ignoring content generation for a moment: how exactly do you render an animated character inside a scene with neural descriptions?

For a character, do you give the animation engine a bunch of neural blobs for sections of the character, together with an animated skeleton, and let it try to condense them into a plenoptic neural model so it can be composed with the rest of the scene by a raytracer?

Or do you use a giant textual description for the scene and everything important in it, then every frame tell it what changed and have it make something up in 15 msec?
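To make the second option concrete, here's a minimal sketch of what such a per-frame loop might look like. Everything in it (the scene description object, describe_delta, the neural_renderer call) is hypothetical; the only real constraint is the frame budget.

```python
import time

FRAME_BUDGET_MS = 15.0  # the ~15 msec budget mentioned above


def run_frame_loop(scene_description, game_state, neural_renderer, num_frames):
    """Hypothetical loop: keep one persistent scene description and feed the
    generator only a per-frame delta, hoping inference fits the frame budget."""
    for _ in range(num_frames):
        start = time.perf_counter()

        # Everything that changed since the last frame (positions, poses, events),
        # serialized as a compact textual/structured delta.
        delta = game_state.describe_delta()

        # The generator is conditioned on the full description plus the delta
        # and has to produce the next frame inside the budget.
        frame = neural_renderer.generate(scene_description, delta)

        elapsed_ms = (time.perf_counter() - start) * 1000.0
        if elapsed_ms > FRAME_BUDGET_MS:
            print(f"missed frame budget: {elapsed_ms:.1f} ms")

        yield frame
```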
 
For non-important/secondary NPCs (pedestrians/common enemies) in open-world games, this tech is miles ahead of any voice-over trash we have in current open-world games. Right now, the best you can hope for when talking to a pedestrian is a couple of soulless voiced-over lines that repeat forever. With this tech, you will have vastly more immersive worlds that are more dynamic and responsive. It's still early days for the tech too (it's a demo after all); it's going to vastly improve with time, just like ChatGPT.

It definitely won't be long before GenAI provides really good 'filler' for the background chatter in games.

Think of just random pedestrians or passers-by chatting about things that have actually happened in the game world, or walking past TVs with dynamically generated news broadcasts about your exploits in the world so far.
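A rough sketch of how that background chatter could be driven, purely illustrative: collect a few recent world events and hand them to a text model as context. The generate_text call at the end is a stand-in for whatever local or hosted model a game would actually ship.

```python
import random


def ambient_chatter_prompt(recent_events, speaker_role="pedestrian"):
    """Build a prompt asking a text model for one throwaway line of ambient
    dialogue grounded in things that actually happened in the game world."""
    # Pick a couple of recent events so lines stay topical but varied.
    picked = random.sample(recent_events, k=min(2, len(recent_events)))
    facts = "\n".join(f"- {event}" for event in picked)
    return (
        f"You are a {speaker_role} in an open-world city.\n"
        f"Recent events you might have heard about:\n{facts}\n"
        "Say one short, casual line of background chatter (15 words max)."
    )


# Example with made-up world events:
events = [
    "a bank on 3rd street was robbed last night",
    "the player crashed a sports car into the harbour",
    "a storm knocked out power downtown",
]
prompt = ambient_chatter_prompt(events)
# line = generate_text(prompt)  # hypothetical call into the game's language model
```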


It's already starting to be really quite good at realistically riffing on whatever source material you give it.
 
Half-Life 2.


Tomb Raider.


More GTA IV.


It's already starting to be really quite good at realistically riffing on whatever source material you give it.

This is just crazy. I can see the low-hanging fruit being ML algorithms working on the final frame, adding particles, weather effects, hair, fur, physics and clothes. The game will be designed around such an algorithm of course, so the output will be clean and constrained within certain programmed parameters. Later, ML algorithms would augment global illumination and reflections, and maybe enhance assets with realistic textures, shaders and details. Not to mention greatly augmenting cutscenes. Seeing these videos above, I think all of this is very, very possible soon.
 
I wonder what implications this will have for game design should this ever (or maybe when it does) become usable in real time as post-processing on the user side, with at most a reasonable barrier to entry (in terms of cost and/or knowledge). The reason being the user could completely change the art direction based on their personal preference.

Also, as an aside, this implementation shows some interesting challenges in just going for realism here: there is a bit of a disconnect between how the characters move (very animated, video-game movement) and the aesthetics (realistic live action). You see this even with live-action movies today, where the movement sometimes ends up feeling uncanny with more assisted movement versus stunt work.
 
I wonder what implications this will have for game design
I expect "cinematic" " games to benefit greatly from this. Games such as Detroit: Become Human, Heavy Rains, Beyond Two Souls, Until Dawn and The Dark Pictures Anthology. These games already have so much fixed camera cutscenes and limited player movements, if designed around such ML algorithms, the results would be very very movie like indeed.

the user could completely change the art direction based on their personal preference
I guess the game itself could offer the user a set of options to choose from, just like the post-process filters that some games offer to the user (Doom, Resident Evil, Gears, etc.).
 
I expect "cinematic" " games to benefit greatly from this. Games such as Detroit: Become Human, Heavy Rains, Beyond Two Souls, Until Dawn and The Dark Pictures Anthology. These games already have so much fixed camera cutscenes and limited player movements, if designed around such ML algorithms, the results would be very very movie like indeed.


I guess the game itself could offer the user a set of options to choose from, just like the post-process filters that some games offer to the user (Doom, Resident Evil, Gears, etc.).

What I was more referring to is how much the developer's vision and design decisions would matter, and what the impact would be in terms of the approach to game development and even the business side.

Would most people end up just tailoring games to their own vision if they had the capability? If so, what would the response be on the developer side? How would they then approach game design? Would they restrict (or attempt to restrict) modifying the games in this way? What about broader business implications, such as MTX? Would MTX cosmetics be viable if users could just inject their own? Or if they knew nobody else would see them? What would the impact be on licensed properties? Would users just post-process in their own favorite franchise IPs?

Not saying it's for the better or worse should it (or when it) come to pass, just that I find the potential implications and disruption rather interesting.
 
Very impressive, but also a whole other type of uncanny valley! There's something particularly jarring about photorealistic people glitching. Will we ever get used to it? Or will games then need to take on realistic human-type movement limitations to prevent teleporting feet and impossible bone configurations?

Also highlights again how little intelligence there is in 'AI' because it has no idea what's going on and has zero context. There are no bone constraints to prevent limbs moving in impossible directions, or knowledge of hair to have it behave correctly, or knowledge of light to light and shade correctly. Where a true GI algorithm will light and shade correctly to great effect in all cases, ML will light and shade perfectly until it breaks, and then it's just a mess. It really is a remarkable short-cut.
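To make the "no bone constraints" point concrete, this is roughly the kind of hard limit a skeleton-based animation or IK system enforces every frame, and which a pure image model has no notion of. The joint names and ranges below are made up for illustration.

```python
# Illustrative only: clamp joint rotations to anatomically plausible ranges,
# the sort of hard constraint a skeletal animation system applies per frame.
JOINT_LIMITS_DEG = {
    "elbow_flex": (0.0, 150.0),   # elbows don't bend backwards
    "knee_flex": (0.0, 140.0),
    "neck_yaw": (-80.0, 80.0),    # heads don't spin 180 degrees
}


def clamp_joint(joint_name: str, angle_deg: float) -> float:
    lo, hi = JOINT_LIMITS_DEG[joint_name]
    return max(lo, min(hi, angle_deg))


print(clamp_joint("elbow_flex", -30.0))  # 0.0 -- an image model would happily render -30
```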
 
Very impressive, but also a whole other type of uncanny valley! There's something particularly jarring about photorealistic people glitching. Will we ever get used to it? Or will games then need to take on realistic human-type movement limitations to prevent teleporting feet and impossible bone configurations?
It's just the beginning, similar to how light bulbs eventually led to the invention of LEDs. At the pace AI is taking off, I don't doubt that any artistic or mechanical issues would likely be solved by another AI model designed to assist a human developer.
 
3D graphics started out primarily artist-driven, with artists manually creating models, materials, animations, and lighting to resemble reality to the human eye. Then, with PBR, photogrammetry, advanced animation systems, mocap, and ray tracing, there was a shift toward actually simulating reality. But now with AI there's no simulation; it's back to creating something that resembles reality to the human eye. Ironic.
 
Are these "reimaginations" based on actual gameplay or just recorded gameplay? It makes a world of a difference "realtime" or not
 
A new model could be custom built to handle game animations more gracefully.
Game animations are deliberately unnatural to provide responsiveness. The game chooses artificial physics first, and then maps a human-like avatar onto that.

Models will surely improve, but some aspects strike me as unsolvable. And actually, having had an early sample of 'photorealistic games', there's a suggestion that maybe non-photorealistic games will be better for the purposes of gaming, certainly for some (most?) genres. Kind of like how cartoon violence was possible in hand drawings and so featured in cartoons, but once the tech made photorealistic cartoons possible, it would probably just be hugely distasteful.

For a slow Quantic-Dream type game, this tech is perfect, but for third-person action games, maybe not. You could have a first-person game where the NPCs are limited to natural human movements and you don't get to see your limbs teleporting to different impossible positions due to the POV. And of course horror will be a perfect fit! ;)
 
For a slow Quantic-Dream type game, this tech is perfect .. You could have a first-person game where the NPCs are limited to natural human movements and you don't get to see your limbs teleporting to different impossible positions due to the POV. And of course horror will be a perfect fit! ;)
Yeah I agree.

Are these "reimaginations" based on actual gameplay or just recorded gameplay? It makes a world of a difference "realtime" or not
Recorded gameplay. The tech relies on offline rendering/inference currently, but I'm sure someone (NVIDIA) is working on a real-time version of it, to at least add post-process effects (particles, weather effects, hair, fur, physics, cloth simulation) to the final frame, and maybe even augment global illumination and reflections when the tech is mature enough. The next step would be enhancing assets (textures, models, shaders, etc.), and Jensen has already hinted at this.

NVIDIA also has a working prototype of this with RTX Remix: you enter a text prompt for a specific texture, and AI replaces the current texture with an AI-generated one matching your prompt in real time (timestamped in the video below).

 
Are these "reimaginations" based on actual gameplay or just recorded gameplay? It makes a world of a difference "realtime" or not
Recorded gameplay video used as an input for AI video generation, nowhere near realtime.

Generating a much crappier 720x480, 8 fps, 6-second video using the CogVideoX-5B model on an RTX 4090 takes several minutes, like 3-5 min, so a bit less than 1 minute of compute for every 1 second of video generated.
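Spelling that arithmetic out (numbers from above, so treat them as rough):

```python
# 3-5 minutes of compute for a 6-second clip, per the figures above.
clip_seconds = 6
for compute_minutes in (3, 5):
    ratio = (compute_minutes * 60) / clip_seconds
    print(f"{compute_minutes} min -> {ratio:.0f} s of compute per 1 s of video")
# 3 min -> 30 s of compute per 1 s of video
# 5 min -> 50 s of compute per 1 s of video
```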

Those game clips are generated with Gen-3, which has a max clip length of 10 sec. That's not great for temporal consistency, as every 10 seconds Lara Croft's shirt might turn from woolly to leather to whatever material.

If this type of architecture is ever to be used for visual effects in a real-time gameplay context, it has a lot of maturing ahead of it. The good thing is that game engines already have depth buffers, color, etc. to ground the video.
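A hand-wavy sketch of what that grounding could look like: pack the engine's existing per-frame buffers into the conditioning input of a (hypothetical) video model, so it doesn't have to hallucinate geometry it can read directly. The shapes and the function are assumptions, not any shipping API.

```python
import numpy as np


def build_conditioning(color_buffer: np.ndarray, depth_buffer: np.ndarray,
                       motion_vectors: np.ndarray) -> np.ndarray:
    """Hypothetical: stack the engine's per-frame buffers into one tensor a
    video model could be conditioned on, instead of guessing scene structure."""
    # Normalize depth to [0, 1] so it sits in the same range as color.
    depth = depth_buffer / max(float(depth_buffer.max()), 1e-6)
    # Channels-last stack: RGB (3) + depth (1) + 2D motion vectors (2) = 6 channels.
    return np.concatenate([color_buffer, depth[..., None], motion_vectors], axis=-1)


# Example with dummy 720x480 buffers:
h, w = 480, 720
cond = build_conditioning(
    color_buffer=np.random.rand(h, w, 3).astype(np.float32),
    depth_buffer=np.random.rand(h, w).astype(np.float32),
    motion_vectors=np.random.rand(h, w, 2).astype(np.float32),
)
print(cond.shape)  # (480, 720, 6)
```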
NVIDIA also has a working prototype of this with RTX Remix: you enter a text prompt for a specific texture, and AI replaces the current texture with an AI-generated one matching your prompt in real time (timestamped in the video below).
This is a completely different thing from those AI videos. It replaces an existing texture with an AI-generated texture within the game engine. It's not generating AI video.
 
This is a completely different thing from those AI videos. It replaces an existing texture with an AI-generated texture within the game engine. It's not generating AI video.
I never said they are the same; I am saying there is already a prototype for enhancing textures with AI-generated ones in real time.

If this type of architecture is ever to be used for visual effects in a real-time gameplay context, it has a lot of maturing ahead of it
Yes, of course, these are just the initial early examples.
 
I never said they are the same; I am saying there is already a prototype for enhancing textures with AI-generated ones in real time.
I see. Your words "real-time version of it" and "working prototype of this with Remix" made it sound like you were saying these are somehow related. But I must have misunderstood.

Recorded gameplay. The tech relies on offline rendering/inference currently, but I'm sure someone (NVIDIA) is working on a real-time version of it, to at least add post-process effects (particles, weather effects, hair, fur, physics, cloth simulation) to the final frame, and maybe even augment global illumination and reflections when the tech is mature enough. The next step would be enhancing assets (textures, models, shaders, etc.), and Jensen has already hinted at this.

NVIDIA also has a working prototype of this with RTX Remix.
Creating textures with AI image generators, like in the RTX Remix example, is not the next step; it precedes real-time generated gameplay AI video by several years, which in AI-time is a whole heckuva lot.
 
Put here since this is the dedicated thread on realtime AI graphical tools.
Additional UE 5 developer plugins now available.
For starters, the Audio2Face 3D Plugin is now available for both Unreal Engine 5 and Autodesk Maya. This tool enables AI-powered facial animations and lip syncing. It works by analysing audio, and then generating animations to best match what the character is expressing.

Then Nemotron-Mini 4B Instruct Plugin and Retrieval Augmented Generation (RAG) Plugin are also now available for Unreal Engine 5. These two tools provide response generation for interactive character dialogue and supply contextual information to enhance character interactions.

Finally, Nvidia has also announced that Epic's Unreal Pixel Streaming technology now supports Nvidia ACE, allowing developers to stream high-fidelity MetaHuman characters via Web Real-Time Communication. This is designed to help developers with bringing AI-powered digital humans to games and applications, with low-latency and minimal memory usage on Windows PCs.
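For the RAG plugin in particular, the general shape of retrieval-augmented dialogue is simple enough to sketch. None of this is the actual ACE/Nemotron API; it's just a generic illustration of pulling relevant game-world facts into the prompt before generating a reply.

```python
# Generic retrieval-augmented dialogue sketch (not NVIDIA's API): score stored
# game-world facts against the player's line, then prepend the best matches.
import re


def tokens(text: str) -> set:
    """Lowercase word set; a real system would use embeddings, not word overlap."""
    return set(re.findall(r"[a-z']+", text.lower()))


def score(fact: str, query: str) -> int:
    """Crude relevance score: count shared words between fact and query."""
    return len(tokens(fact) & tokens(query))


def build_npc_prompt(player_line: str, knowledge_base: list, top_k: int = 2) -> str:
    relevant = sorted(knowledge_base, key=lambda f: score(f, player_line), reverse=True)[:top_k]
    context = "\n".join(f"- {fact}" for fact in relevant)
    return (
        f"You are a blacksmith NPC. Known facts:\n{context}\n"
        f'Player says: "{player_line}"\n'
        "Reply in character, one or two sentences."
    )


facts = [
    "The old mine east of town collapsed last week.",
    "The player returned the mayor's stolen sword yesterday.",
    "Iron prices doubled after the mine collapsed.",
]
print(build_npc_prompt("Heard anything about the mine?", facts))
# reply = generate_text(...)  # hypothetical call into the local language model
```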
 
Recorded gameplay video used as an input for AI video generation, nowhere near realtime.
True, though based on the videos it's not too much of a stretch to see various areas where AI might be used to enhance existing or future game development.
It would be interesting if someone could create a Nexus Mods mod for an existing game using this technique to enhance realtime gameplay visual fidelity.
 