Unreal Engine 5, [UE5 Developer Availability 2022-04-05]

How does the frame budget of Nanite compare to that of the geometry stage(s) of a typical game today?

Apparently around the same budget as a 60fps title, at least on next gen machines.

The only problem is that in games you start noticing the same features repeated as they're reused, even across different games.

Repetition is really a problem, as is content creation. Spline mesh sets, which Source 2 supports wonderfully, seem to be part of a solution to this, as is "runtime" virtual texturing where you blend a lot of materials together without worrying about RAM. Unfortunately Nanite seems pretty predicated on kitbashing static items. That's just going to make repetition that much worse, and scanning isn't a solution at all either. Not everyone wants a photoreal game; actually most should probably avoid such. Not to mention it's not exactly a "creative" tool as such.
 
I doubt there will be no polygon or other asset budgets, simply because the game still has to fit onto a console's storage.
The recent demo - what, 10 minutes long? - is already 25 GB of cooked storage, which is about a quarter of what you'd realistically want your whole game to weigh in distribution.
In one of the videos Brian Karis or someone else from Epic said they are developing tools to reduce storage later in the project's life cycle. The goal is for artists to not need to worry about storage early in the project.
 
In one of the videos Brian Karis or someone else from Epic said they are developing tools to reduce storage later in the project's life cycle. The goal is for artists to not need to worry about storage early in the project.
later ...
The problem I see is: yes, the GPUs are now capable of using that much data, and the SSDs can deliver it, but storing the data is a whole other problem, especially if you have always-unique textures and models. All that costs storage. E.g. because of the SSD we can now load assets "file by file" (well, actually only the parts that are needed), but with that you lose much of the compression ratio because everything must be compressed on its own. So compression gets a bit less effective, but in return you don't need to build big packages with lots of duplicated data across them. So the loss of compression effectiveness might not count after all.
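As a toy sketch of that trade-off (zlib standing in for the console codecs, all data synthetic, asset names made up):

```python
import os
import zlib

# A reused asset with internal redundancy, plus per-level unique data.
shared   = os.urandom(16 * 1024) * 16    # 256 KB, compresses well
unique_a = os.urandom(128 * 1024)        # level-A-only data (incompressible here)
unique_b = os.urandom(128 * 1024)        # level-B-only data

# Old style: one package per level, each duplicating the shared asset,
# compressed as a single unit.
packaged = (len(zlib.compress(shared + unique_a, 9)) +
            len(zlib.compress(shared + unique_b, 9)))

# SSD style: "loose" files, every asset compressed on its own,
# the shared asset stored only once.
loose = sum(len(zlib.compress(x, 9)) for x in (shared, unique_a, unique_b))

print("per-level packages with duplicates:", packaged, "bytes")
print("deduplicated loose files:          ", loose, "bytes")
```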
BUT asset quality normally increases, and most of the time that means more polygons and higher-resolution textures => more storage needed.
A fun thing about the Engine reveal demo on PS5 was that they heavily reused assets all over the place, like the portal that was made of parts of buildings etc., just scaled down in size. So it really seems they are aware that all this new stuff has a cost in the form of storage requirements.

Btw, we also saw this in the new R&C. Yes, it is impressive how much you can zoom in on individual stuff, but e.g. each bullet and each bug is the same, and there is not that much variety in textures and assets. There is always some kind of limit. For the current gen, it seems it is design time and storage before everything else.

 
Yes. Predictors, and de-interleaving. You can compress an array of unrelated floats much much better than a regular universal compressor can.
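A minimal sketch of what that looks like in practice (synthetic smoothly varying floats, zlib standing in for the real codec):

```python
import math
import struct
import zlib

# Synthetic "vertex attribute" stream: smoothly varying 32-bit floats.
values = [math.sin(i * 0.01) * 100.0 for i in range(100_000)]
raw = struct.pack(f"<{len(values)}f", *values)

def xor_predict(buf: bytes) -> bytes:
    """Predict each 32-bit word with the previous one; store the XOR residual."""
    words = struct.unpack(f"<{len(buf) // 4}I", buf)
    out, prev = [], 0
    for w in words:
        out.append(w ^ prev)
        prev = w
    return struct.pack(f"<{len(out)}I", *out)

def deinterleave(buf: bytes, stride: int = 4) -> bytes:
    """Group byte 0 of every word, then byte 1, etc., so the highly
    repetitive sign/exponent bytes end up next to each other."""
    return b"".join(buf[i::stride] for i in range(stride))

print("plain zlib:             ", len(zlib.compress(raw, 9)))
print("predict + de-interleave:", len(zlib.compress(deinterleave(xor_predict(raw)), 9)))
```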

Is CPU decompression fast enough with these compressors?
 
There is always some kind of limit. For the current gen, it seems it is design time
It's 99% of the time due to the extra work asked of the artists.
E.g. WRT the PS5 R&C game, correct me if I'm wrong, but aren't its storage requirements LESS than the last PS4 R&C game's?
Not to mention, IRL don't all bugs/bullets look practically the same (unlike trees/buildings/people etc.)?
 
Is CPU decompression fast enough with these compressors?
On PC this is normally not a problem, as most cores idle or don't run at 100% most of the time. As consoles now have 7 cores / 14 threads (for games), this should translate into higher CPU requirements for PC games in the future.
Consoles don't do hardware decompression because it wouldn't be fast enough on the CPU; rather, they do it so the limited CPU resources (fixed for around 7-8 years) can be used as efficiently as possible. On the other hand, the consoles will stick to their compression tech even if for some data another compression configuration would be much better. Hardware decompression is always a compromise, as it is not very flexible and will therefore always lose some efficiency.

It's 99% of the time due to the extra work asked of the artists.
E.g. WRT the PS5 R&C game, correct me if I'm wrong, but aren't its storage requirements LESS than the last PS4 R&C game's?
Not to mention, IRL don't all bugs/bullets look practically the same (unlike trees/buildings/people etc.)?
There might be other "bugs" in the game (so far I don't have a PS5 ^^). And yes, the game has lower storage requirements, thanks to better compression and fewer/no duplicates. It makes quite a difference whether you package the data per level or keep "loose" data with no duplicates in it.
Another thing is the art style. It looks good, and the comic look doesn't contain too much tiny detail, so texture compression in particular should work quite well.
 
In one of the videos Brian Karis or someone else from Epic said they are developing tools to reduce storage later in the project's life cycle. The goal is for artists to not need to worry about storage early in the project.
I'm not sure that's possible with an engine aiming to provide pixel-level detail across the whole display. Sure, geometry can be compressed efficiently enough, but the size of all the assets will likely still be an issue - big enough to already be an issue during the design phase.
 
later ...
The problem I see is: yes, the GPUs are now capable of using that much data, and the SSDs can deliver it, but storing the data is a whole other problem, especially if you have always-unique textures and models. All that costs storage. E.g. because of the SSD we can now load assets "file by file" (well, actually only the parts that are needed), but with that you lose much of the compression ratio because everything must be compressed on its own. So compression gets a bit less effective, but in return you don't need to build big packages with lots of duplicated data across them. So the loss of compression effectiveness might not count after all.
BUT asset quality normally increases, and most of the time that means more polygons and higher-resolution textures => more storage needed.
A fun thing about the Engine reveal demo on PS5 was that they heavily reused assets all over the place, like the portal that was made of parts of buildings etc., just scaled down in size. So it really seems they are aware that all this new stuff has a cost in the form of storage requirements.

Btw, we also saw this in the new R&C. Yes, it is impressive how much you can zoom in on individual stuff, but e.g. each bullet and each bug is the same, and there is not that much variety in textures and assets. There is always some kind of limit. For the current gen, it seems it is design time and storage before everything else.
I'd say one approach that could help with (not solve) the storage issue would be to be less dependent on textures (or on high-res textures, for that matter). Higher polycounts should make textures less necessary in some places/cases, since you don't have to simulate microdetail and other stuff via textures. A little bit like what happens in Dreams; they use no textures, only "color" and material values.

As I said, not a global solution, but it could help.
 
You can later calculate what the player can see and start culling the stored data automatically from that. I think they discussed this very early on? Not sure.
 
I'd say one approach that could help with (not solve) the storage issue would be to be less dependent on textures (or on high-res textures, for that matter). Higher polycounts should make textures less necessary in some places/cases, since you don't have to simulate microdetail and other stuff via textures. A little bit like what happens in Dreams; they use no textures, only "color" and material values.

As I said, not a global solution, but it could help.
If your geometry is smaller than pixels, your geometry data gets larger than the textures.
To reduce the geometry data, tessellation can be used, but then you lose part of the control and may not be able to give things the right color without textures.
 
Consoles don't do hardware decompression because it wouldn't be fast enough on the CPU; rather, they do it so the limited CPU resources (fixed for around 7-8 years) can be used as efficiently as possible.

Benchmarks of zlib and Kraken throughput using software decompression directly contradict this statement.
Here are good comparisons made by Charles Bloom, who is a dev at RAD Game Tools (recently acquired by Epic, so expect all of these to be part of UE5's middleware):

https://cbloomrants.blogspot.com/2020/07/performance-of-various-compressors-on.html

CPU decoding of zlib tops out at ~800MB/s and Kraken at ~1.8GB/s. The Series X decompressor has a ~7GB/s maximum output with BCPack and the PS5's can go all the way up to 22GB/s, but with Kraken + Oodle Texture it supposedly averages over 15GB/s.

I do see that Mermaid and most of all Selkie can get very high throughputs on CPU decoding, but compression ratio hurts a bit. He's using a 16c/32t Ryzen 9 3950X at 3.4GHz. He seems to be getting around 4GB/s on one core for Selkie and 2GB/s for Mermaid.
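Quick back-of-the-envelope from those figures (single-core software rates from Bloom's post, hardware figures as quoted above):

```python
# All rates in GB/s; the software numbers are single-core Zen 2-class figures.
kraken_sw = 1.8    # Kraken, one core
selkie_sw = 4.0    # Selkie, one core
ps5_hw    = 15.0   # PS5 hardware, typical Kraken + Oodle Texture output
xsx_hw    = 7.0    # Series X hardware, peak output with BCPack

print(f"cores of software Kraken to match PS5 hw: {ps5_hw / kraken_sw:.1f}")
print(f"cores of software Selkie to match PS5 hw: {ps5_hw / selkie_sw:.1f}")
print(f"cores of software Kraken to match XSX hw: {xsx_hw / kraken_sw:.1f}")
```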
 
Spline mesh sets, which Source 2 supports wonderfully, seem to be part of a solution to this, as is "runtime" virtual texturing where you blend a lot of materials together without worrying about RAM.

Source 2 supports spline-based models? Is it in a custom format or some industry standard? Where can I read more about it?
 
"AI" techniques should really be a layer on top of assets:
  • for each class of object use AI generation to increase the quantity in the asset library
  • compression algorithms to convert game-downloaded assets into far larger locally stored assets. 100GB download turns into 1TB of local assets. Some generated on the fly by runtime AI techniques if need be.
In my opinion the primacy of "original captured assets" is just a phase. AI techniques are already good enough to generate faces, why are rocks special?...
 
Benchmarks of zlib and Kraken throughput using software decompression directly contradict this statement.
Here are good comparisons made by Charles Bloom, who is a dev at RAD Game Tools (recently acquired by Epic, so expect all of these to be part of UE5's middleware):

https://cbloomrants.blogspot.com/2020/07/performance-of-various-compressors-on.html

CPU decoding of zlib tops out at ~800MB/s and Kraken at ~1.8GB/s. The Series X decompressor has a ~7GB/s maximum output with BCPack and the PS5's can go all the way up to 22GB/s, but with Kraken + Oodle Texture it supposedly averages over 15GB/s.

I do see that Mermaid and most of all Selkie can get very high throughputs on CPU decoding, but compression ratio hurts a bit. He's using a 16c/32t Ryzen 9 3950X at 3.4GHz. He seems to be getting around 4GB/s on one core for Selkie and 2GB/s for Mermaid.
It is always a question of what you do with it: read from HDD, SSD, or memory, and how fast each part is. Btw, if I read this correctly, he only uses a single thread for the benches.
PC systems evolve, and this will become less and less of a problem in the future. Something like Nvidia's approach of using the GPU hardware is not bad either; it can also use much faster memory than the main memory the CPU has access to.
 
Is CPU decompression fast enough with these compressors?

Fast enough in which sense? A software decompressor is no match for a hardware decompressor, to put it simply. Most fast decompressors are memory limited, and effectively no amount of size reduction can make a "slow but good" algorithm compete with a "fast but bad" one (it naturally converges to memcpy).
The mission statement for Kraken was that it doesn't slow down on incompressible data. It's supposed to be a good universal (de)compressor. If you design an algorithm for data which is guaranteed to be compressible - and I would tend to say all meshes basically are - you can compress smaller than Kraken at the same or better speed.
In this case the question was more about size than speed. If you calculate the maximum amount of mesh data needed per second (or per frame, or whatever), design the decompressor to cap at that speed, and use the "free" time for a more complex algorithm that compresses more efficiently, then sure, you can do much better than a universal compressor like Kraken (or Selkie, Mermaid & Leviathan). Doing pre-processing and data transformation and then feeding those universal compressors isn't very efficient.
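A rough sketch of that budgeting idea (every number here is a made-up placeholder, not a measured UE5 figure):

```python
fps                = 60
clusters_per_frame = 2_000      # newly streamed geometry clusters per frame (assumed)
bytes_per_cluster  = 2_048      # compressed bytes per cluster (assumed)

needed = fps * clusters_per_frame * bytes_per_cluster
print(f"required decode throughput: {needed / 1e6:.0f} MB/s")

# Whatever a decoder can sustain above this rate is "free" time that a more
# complex, better-compressing mesh codec is allowed to spend.
decoder_rate = 400e6            # assumed single-core rate of a specialised codec, bytes/s
print(f"cores needed at that rate:  {needed / decoder_rate:.2f}")
```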
 
Unfortunately Nanite seems pretty predicated on kitbashing static items. That's just going to make repetition that much worse, and scanning isn't a solution at all either.

Nanite is definitely not predicated on kitbashing. The fine-grained culling provided by the visibility-buffer approach lets kitbashing be relatively performant, but it's absolutely the worst case for Nanite performance. All of the other properties of Nanite (extremely effective mesh compression, no GPU performance cost growth with polycount or number of unique items, low cost for streaming in geometry) make it the most conducive approach to asset variety in the industry right now.

Epic has Megascans to sell and only had a few months' production budget for these scenes, but the tech isn't built around rendering these scenes only.

I'm not sure that's possible with an engine aiming to provide pixel-level detail across the whole display. Sure, geometry can be compressed efficiently enough, but the size of all the assets will likely still be an issue - big enough to already be an issue during the design phase.

Every couple-million-triangle Nanite mesh you add in order to remove a 4K normal map will save (considerable) space. UE5 games will be perfectly shippable -- I don't think people realize how big textures are. A 4K normal map is 16 million vectors that are difficult to compress at all without ruining quality.
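For scale (standard texture arithmetic only; no assumption about Nanite's actual on-disk bytes per triangle):

```python
texels   = 4096 * 4096          # ~16.8 million texels in a 4K map
raw      = texels * 3           # uncompressed 8-bit XYZ normals
bc5      = texels * 1           # BC5 block compression: 1 byte per texel
bc5_mips = bc5 * 4 // 3         # a full mip chain adds roughly a third

print(f"4K normal map: raw {raw / 2**20:.0f} MiB, "
      f"BC5 {bc5 / 2**20:.0f} MiB, BC5 + mips {bc5_mips / 2**20:.0f} MiB")

# Break-even: how many on-disk bytes per triangle a 2M-triangle mesh could
# cost before it outweighs the BC5 map it replaces.
tris = 2_000_000
print(f"break-even budget: {bc5 / tris:.1f} bytes per triangle")
```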
 
Fast enough in which sense? A software decompressor is no match for a hardware decompressor, to put it simply. Most fast decompressors are memory limited, and effectively no amount of size reduction can make a "slow but good" algorithm compete with a "fast but bad" one (it naturally converges to memcpy).
The mission statement for Kraken was that it doesn't slow down on incompressible data. It's supposed to be a good universal (de)compressor. If you design an algorithm for data which is guaranteed to be compressible - and I would tend to say all meshes basically are - you can compress smaller than Kraken at the same or better speed.
In this case the question was more about size than speed. If you calculate the maximum amount of mesh data needed per second (or per frame, or whatever), design the decompressor to cap at that speed, and use the "free" time for a more complex algorithm that compresses more efficiently, then sure, you can do much better than a universal compressor like Kraken (or Selkie, Mermaid & Leviathan). Doing pre-processing and data transformation and then feeding those universal compressors isn't very efficient.

In the context of UE5, I meant decoding fast enough for Nanite's requirements when using a modern CPU core (Zen 2 / Ice Lake or later) at ~3GHz.

So far, most of the storage-volume and decoding-speed concerns have revolved around textures. With UE5, geometry is now a concern too.
 