Unreal Engine 5, [UE5 Developer Availability 2022-04-05]

A fun thing about the engine reveal demo on PS5 was that they heavily reused assets all over the place, like the portal that was made of parts of buildings scaled down in size. So it really seems that they are aware that all this new detail has a cost in the form of storage requirements.
Even if storage were unlimited, the artist time needed to create assets at higher detail than before increases (the advantage of no more LOD work aside), so it's not clear where the practical limit is. Technically it's storage; cost-wise it's still content creation.

In my opinion the primacy of "original captured assets" is just a phase. AI techniques are already good enough to generate faces, why are rocks special?...
ML can do 2D images of faces, but 3D models are still a problem AFAIK?
I have seen lots of good results on ML-based terrain generation. Those were 2D heightmaps as well.
I'm pretty convinced ML will help content creation, but so far I have not seen many practical tools, and doing it on the client to help the storage problem sounds very distant. Unpredictable results due to HW-dependent floating point behavior alone already sounds like a tough problem.
 
You can later calculate what the player can see and start culling the stored data automatically from that. I think they discussed this very early on? Not sure.
The SDF could help to accelerate this process.
But at least one instance of every model will always stay, so it can only help performance, not storage.
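For what it's worth, the way an SDF accelerates a visibility pass like that is sphere tracing: point-to-point occlusion queries become cheap, so you could batch-test assets against sampled player positions offline. A generic sketch of that test (the toy one-sphere SDF and all names here are made up for illustration, not anything Epic has described):

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };
static Vec3  Sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static Vec3  Add(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
static Vec3  Mul(Vec3 a, float s) { return {a.x * s, a.y * s, a.z * s}; }
static float Len(Vec3 a) { return std::sqrt(a.x * a.x + a.y * a.y + a.z * a.z); }

// Toy stand-in for the engine's global SDF: one sphere obstacle at (0,0,5).
static float SampleSDF(Vec3 p) {
    const Vec3 center{0.0f, 0.0f, 5.0f};
    return Len(Sub(p, center)) - 1.0f; // distance to nearest surface
}

// Sphere tracing: each step can safely jump by the distance to the nearest
// surface, so empty space is crossed in a handful of SDF samples.
// Returns true if 'target' is visible from 'eye'.
static bool IsVisible(Vec3 eye, Vec3 target, float eps = 0.01f) {
    Vec3 dir = Sub(target, eye);
    float total = Len(dir);
    dir = Mul(dir, 1.0f / total);
    for (float t = eps; t < total - eps; ) {
        float d = SampleSDF(Add(eye, Mul(dir, t)));
        if (d < eps) return false; // blocked before reaching the target
        t += d;                    // safe step: nothing is closer than d
    }
    return true;
}
```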
 
Nanite is definitely not predicated on kitbashing. The fine-grained culling provided by the visibility buffer approach lets kitbashing be relatively performant, but it's absolutely the worst case for Nanite performance. All of the rest of Nanite's properties (extremely effective mesh compression, no GPU performance cost growth with polycount or number of unique items, low cost for streaming in geometry) make it the most conducive approach to asset variety in the industry right now.
This is a great point to emphasize. While the Megascans kitbashing stuff is a neat way to flex, Nanite really is just a generally better way to render static meshes of any polygon density. This is particularly true for stuff like Virtual Shadow Maps and Lumen, where having even low-poly meshes be Nanite increases performance by 5x or more in most cases. The key point is that Nanite can do things like render the whole world at low resolution and/or many different views at once much more efficiently than the conventional graphics pipeline.

With non-Nanite static meshes you can fairly easily get into bad cases with Virtual Shadow Maps, for instance, where you are effectively asking it to scatter parts of the mesh across 20+ different views at various resolutions. There's just no efficient way to do this without at the very least breaking the mesh up into clusters, similar to what Nanite does.
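To make the cluster point concrete, here is a minimal sketch of per-cluster culling (the types, the bounding-sphere test, and the (view, cluster) output are illustrative assumptions, not Nanite's actual data structures): with a single whole-mesh bound, any overlap drags the entire mesh into a view; with small per-cluster bounds, each of those 20+ views receives only the clusters it can actually see.

```cpp
#include <utility>
#include <vector>

// Hypothetical types for illustration; not Nanite's actual structures.
struct Bounds { float cx, cy, cz, radius; };  // bounding sphere
struct View   { Bounds region; };             // stand-in for a real frustum

// Trivial sphere-vs-sphere overlap standing in for a real frustum test.
static bool Intersects(const Bounds& a, const Bounds& b) {
    float dx = a.cx - b.cx, dy = a.cy - b.cy, dz = a.cz - b.cz;
    float r = a.radius + b.radius;
    return dx * dx + dy * dy + dz * dz <= r * r;
}

struct Cluster { Bounds bounds; /* plus ~128 triangles of geometry */ };

// With one whole-mesh bound, any overlap forces the entire mesh into a view.
// With per-cluster bounds, each view receives only the clusters it can see.
std::vector<std::pair<int, int>> CullClusters(const std::vector<Cluster>& clusters,
                                              const std::vector<View>& views) {
    std::vector<std::pair<int, int>> visible; // (view index, cluster index)
    for (int v = 0; v < (int)views.size(); ++v)
        for (int c = 0; c < (int)clusters.size(); ++c)
            if (Intersects(clusters[c].bounds, views[v].region))
                visible.emplace_back(v, c);
    return visible;
}
```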

Thus I will reiterate: Nanite need not be synonymous with super highly detailed geometry. It enables that where it will benefit a project, but it provides a lot in all cases.
 
Thus I will reiterate: Nanite need not be synonymous with super highly detailed geometry. It enables that where it will benefit a project, but it provides a lot in all cases.
Personally I think it only has upsides, no downsides. I expect (and hope) everybody will adopt a similar system into their engines within the next few years.

Karis is the new Carmack, whooo!!! :D
 
Benchmarks on the throughput of zlib and Kraken using software decompression directly contradict this statement.
Here are good comparisons made by Charles Bloom, who is a dev at RAD Game Tools (recently acquired by Epic, so expect all of these to be part of UE5's middleware):

https://cbloomrants.blogspot.com/2020/07/performance-of-various-compressors-on.html

CPU decoding of zlib tops out at ~800MB/s and Kraken at ~1.8GB/s.

That's all single core. It's not impossible on modern CPUs; it just eats up an impractical number of cores on CPUs with core counts of 8 or fewer. Here are some quotes:

"Game developers are also using Oodle Texture and Oodle Kraken for PC games; most multi-platform devs will be using the same Oodle Texture encoding of their textures for all platforms, it's not platform-specific. On the PC you don't have hardware Kraken, so software Kraken is used on the CPU. To keep up with the fastest SSD speeds this requires several cores; luckily high end PC's also have lots of CPU cores!"

"In theory PC SSD's will keep getting faster, but you would need several CPU cores running software Kraken to match the decompressed bandwidth of the PS5 hardware Kraken."

The Series X decompressor has a ~7GB/s maximum output with BCPack and the PS5's can go all the way up to 22GB/s, but with Kraken + Oodle Texture it supposedly averages over 15GB/s.

11GB/s (i.e. the PS5's 5.5GB/s raw read at the ~2:1 average below). The 3.16:1 example given was just a single (presumably optimal) texture set from a single game.

"The result is that we expect the average compression ratio of games to be much better in the future, closer to 2 to 1"

On the 22GB/s: as pointed out in the past, it's the peak speed of the decompressor for extremely compressible corner cases; it does not represent an average or even near-average speed.

"the design was focused on maximizing minimum decode speed; that is, making sure that even for data that is pathologically slow to decode, the decoder keeps up with (or ideally outpaces) the peak SSD read speeds.

I'm happy the peak decompression speed came out the way it did [capable of 22GB/s] but we always knew it was going to be high and didn't worry about it much."


I also found this really interesting on the PS5's IO die. Seemingly nothing special there from a hardware perspective:

"Along the same lines, 2 helper processors in an IO block that has both a full Flash controller and the decompression/memory mapping/etc. units is not by itself remarkable. Every SSD controller has one. That's what processes the SATA/NVMe commands, does the wear leveling, bad block remapping and so forth. The special part is not that these processors exist, but rather that they run custom firmware that implements a protocol and feature set quite different from what you would get in an off-the-shelf SSD."
 
I also found this really interesting on the PS5's IO die. Seemingly nothing special there from a hardware perspective:

"Along the same lines, 2 helper processors in an IO block that has both a full Flash controller and the decompression/memory mapping/etc. units is not by itself remarkable. Every SSD controller has one. That's what processes the SATA/NVMe commands, does the wear leveling, bad block remapping and so forth. The special part is not that these processors exist, but rather that they run custom firmware that implements a protocol and feature set quite different from what you would get in an off-the-shelf SSD."

There are the coherency engine and GPU cache scrubbers on the PS5's GPU hardware side that aren't in an off-the-shelf SSD or other GPUs. The flash controller is custom too, and reading the patent, it is optimized for read speed. But the I/O complex itself, outside the hardware decompressor and coherency engine, is nothing special, just optimized on the firmware side, the OS I/O stack, and the API.
 
In the context of UE5, I meant decoding fast enough for Nanite's requirements when using a modern CPU core (Zen 2 / Ice Lake and later), at ~3GHz.

It's essential for good out-of-core streaming to prepare data ahead, predict what's coming next correctly, and not overcommit. E.g. a fast movement can trash all your memory, and you start to look silly after one revolution because you've thrown everything away. You don't need the bandwidth of the entire scene's geometry (or textures) as available peak performance. There's room for compromise, and for giving importance to other aspects.
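A minimal sketch of one such compromise (the budget, grace window, class names, and eviction policy are all made-up illustrative choices, not anything UE5 specifies): keep recently visible data resident for a grace period instead of evicting it the instant it leaves view, so a quick 360-degree spin finds its data still warm.

```cpp
#include <cstdint>
#include <unordered_map>

// Hypothetical streaming cache with eviction hysteresis; numbers are made up.
struct ResidentAsset { uint64_t bytes; uint32_t lastVisibleFrame; };

class StreamingCache {
    std::unordered_map<uint64_t, ResidentAsset> resident_;
    uint64_t usedBytes_ = 0;
    static constexpr uint64_t kBudgetBytes = 2ull << 30; // 2 GiB, illustrative
    static constexpr uint32_t kGraceFrames = 120;        // ~2 s at 60 fps

public:
    void MarkVisible(uint64_t id, uint64_t bytes, uint32_t frame) {
        auto [it, inserted] = resident_.try_emplace(id, ResidentAsset{bytes, frame});
        if (inserted) usedBytes_ += bytes;
        else it->second.lastVisibleFrame = frame;
    }

    // Evict only when over budget, oldest-unseen first, and never anything
    // seen within the grace window -- a fast spin then finds its data warm.
    void EvictIfNeeded(uint32_t frame) {
        while (usedBytes_ > kBudgetBytes) {
            auto victim = resident_.end();
            for (auto it = resident_.begin(); it != resident_.end(); ++it) {
                if (frame - it->second.lastVisibleFrame < kGraceFrames) continue;
                if (victim == resident_.end() ||
                    it->second.lastVisibleFrame < victim->second.lastVisibleFrame)
                    victim = it;
            }
            if (victim == resident_.end()) break; // everything is too recent
            usedBytes_ -= victim->second.bytes;
            resident_.erase(victim);
        }
    }
};
```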
 
Source 2 supports spline based models? Is it in a custom format or some industry standard? Where can I read more about it?

Below is a YouTube tutorial showing off some of the UV/spline tools in the world editor they made. It doesn't even show the craziest stuff they have, like auto-applied materials and seamlessly UV'd spline meshes, but you get the idea. They haven't really said how they did it; hopefully they'll give some sort of talk on it this year.

 
Below is a YouTube tutorial showing off some of the UV/spline tools in the world editor they made. It doesn't even show the craziest stuff they have, like auto-applied materials and seamlessly UV'd spline meshes, but you get the idea. They haven't really said how they did it; hopefully they'll give some sort of talk on it this year.

Reminds me of a thought I had recently, about the history of game levels:
First it was all procedurally in code, e.g. Atari games.
Then instances of blocks in a regular grid, like Super Mario.
Then free form, like Doom. No more grid, no instances. Angled walls too. Full 3D with Quake.
Not detailed enough - add some static meshes for decoration. Can use instances of them.
Yay, let's build the whole level from detailed models and modules! (At least we got rid of Super Mario's grid.)

It's like a cycle. But the primary problems remain open:
Free form sucks because tiling textures still add grid and right angle constraints, which this video shows really nicely.
Modules suck because to make them connectable we get the same grid and angle limitations.
Hiding repetition is difficult for both.

An alternative like Dreams does not share any of those problems, but it does not share the efficiency either. I've seen how people model wooden planks or rocks with tedious sculpting work. I'd classify this into the procedural Atari category.

So what's the solution? Supporting all of this crap in one engine? Or will we indeed figure out something better in a distant future, probably by getting rid of geometry / texture separation, but with smarter tools?
 
That's all single core.
Yes, it's what I wrote in the paragraph you left out of my quote.

He seems to be getting around 4GB/s on one core for Selkie and 2GB/s for Mermaid.


It's not impossible on modern CPUs; it just eats up an impractical number of cores on CPUs with core counts of 8 or fewer.
Impractical to the point of no one wanting to implement it, yes.
With Selkie that could change, though at the cost of some disk space. It would lead to ~30% larger installations than Kraken, from what I see.


11GB/s (i.e. the PS5's 5.5GB/s raw read at the ~2:1 average below). The 3.16:1 example given was just a single (presumably optimal) texture set from a single game.

"The result is that we expect the average compression ratio of games to be much better in the future, closer to 2 to 1"

The 3.16:1 definitely isn't optimal, because in an earlier post he showed an example of another texture set that reached a 3.99:1 compression ratio with Kraken + Oodle Texture, which on the PS5 would actually saturate the decompressor's maximum 22GB/s output (5.5GB/s raw × 3.99 ≈ 22GB/s).



"Close to 2:1" is what Kraken does without Oodle Texture, so saying Kraken + Oodle Texture only averages at 2:1 doesn't make much sense.
[image: oodle_seven_ratio_chart.png]
 
Below is a YouTube tutorial showing off some of the UV/spline tools in the world editor they made. It doesn't even show the craziest stuff they have, like auto-applied materials and seamlessly UV'd spline meshes, but you get the idea. They haven't really said how they did it; hopefully they'll give some sort of talk on it this year.



This is a good tool, but it seems pretty standard -- unless I'm missing something, people have been doing this (manually) since the mid '90s. Nothing incompatible with any other engine, just continuing Valve's tradition of well-chosen tools and polish in their editors.

My understanding of how this is done is that the UVs are laid out as a straight strip, and the geometry is dense enough around the corners that the distortions are not visible. This is how roads, trails, and VFX are textured in almost every videogame.

Typically that's done either manually, or automatically on topology that is just a regular strip. What Valve seems to bring to the table here is something more flexible and workflow friendly: the level designer can easily do it in the editor without worrying about much.
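For reference, the straight-strip arc-length mapping described above might look like this (a generic sketch, not Valve's actual implementation; the function names and tilesPerMeter parameter are made up): U runs across the strip, V accumulates distance along the centerline, so the texture tiles evenly through corners as long as the sampling is dense there.

```cpp
#include <cmath>
#include <vector>

struct Vec3 { float x, y, z; };
struct Vertex { Vec3 pos; float u, v; };

static float Dist(const Vec3& a, const Vec3& b) {
    float dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
    return std::sqrt(dx * dx + dy * dy + dz * dz);
}

// Texture a strip of quads laid along a sampled spline: U goes 0..1 across
// the strip, V accumulates arc length so the texture tiles evenly, even
// around corners, provided the sampling is dense enough there.
std::vector<Vertex> BuildStrip(const std::vector<Vec3>& left,
                               const std::vector<Vec3>& right,
                               float tilesPerMeter) {
    std::vector<Vertex> out;
    float v = 0.0f;
    for (size_t i = 0; i < left.size() && i < right.size(); ++i) {
        if (i > 0) {
            // Advance V by the centerline distance since the previous row.
            Vec3 prevMid{(left[i-1].x + right[i-1].x) * 0.5f,
                         (left[i-1].y + right[i-1].y) * 0.5f,
                         (left[i-1].z + right[i-1].z) * 0.5f};
            Vec3 mid{(left[i].x + right[i].x) * 0.5f,
                     (left[i].y + right[i].y) * 0.5f,
                     (left[i].z + right[i].z) * 0.5f};
            v += Dist(prevMid, mid) * tilesPerMeter;
        }
        out.push_back({left[i], 0.0f, v});
        out.push_back({right[i], 1.0f, v});
    }
    return out;
}
```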
 
That looks so good. I can imagine this being in something like Hellblade 2. Which reminds me, in the MS extended conference today, Ninja Theory mentioned that they've been collaborating with Epic Games. I really can't wait to see what they can do with UE5.

Regards,
SB


If the latest Final Fantasy by Ninja Theory is anything to go by, I have some bad news for you..


EDIT: Ninja Theory != Team Ninja.
I'm old and confused, sorry.
 