Unreal Engine 5, [UE5 Developer Availability 2022-04-05]

Sorry, can I ask a really basic question (well, one I think is basic)?!

"How much bandwidth does it take to fill a 1440K resolution with 8K textures for a whole second?" I ask this because we're told this demo had 8K textures only and was running at 1440K...or am I totally misunderstanding?

Also, there are too many pages to trawl through, but has everyone seen The Cherno's video about this? I found it interesting... nicely explained so even I understand lol
Texture resolution is independent of screen resolution.

The smallest pebble or arrowhead can use an 8K texture (or a whole landscape can).


Virtual texturing like that used in the demo allows loading only the areas of textures which are visible, and only at the detail level needed.
So a pebble with an 8K texture which never takes up more than 50 pixels on screen only has small parts of it loaded.

It is also good to remember that loading can be distributed across multiple frames.
So for a 1440p image, the amount of data could be around that of a single 4K image if the scenery is not changing.

Another advantage is that the amount of memory used can be constant, independent of the number of textures or their resolutions.
http://silverspaceship.com/src/svt/
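
As a rough sketch of that idea (the tile size and numbers here are just assumptions for illustration, not how UE5 actually does it):

/* Rough sketch: how little of an 8K texture a small on-screen object needs
   with virtual texturing. Tile size and numbers are assumed, not from the demo. */
#include <math.h>
#include <stdio.h>

int main(void)
{
    const int tex_size  = 8192;   /* 8K source texture, square edge        */
    const int screen_px = 50;     /* pebble never covers more than ~50 px  */
    const int tile_size = 128;    /* assumed virtual-texture page edge     */

    /* Pick the mip level where one texel maps to roughly one screen pixel. */
    int mip      = (int)floor(log2((double)tex_size / (double)screen_px));
    int mip_edge = tex_size >> mip;                       /* texels per edge at that mip */
    int tiles    = (mip_edge + tile_size - 1) / tile_size;
    tiles       *= tiles;                                 /* tiles needed to cover it    */

    printf("mip %d (%dx%d texels), ~%d resident tile(s) instead of the full %dx%d\n",
           mip, mip_edge, mip_edge, tiles, tex_size, tex_size);
    return 0;
}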
 
I am very LTTP due to some stuff happening, but I was only just now able to watch the demo; I decided I did not want to watch it for the first time until I had access to my main screen again...

And maybe reading people's reactions to it built my expectations up way too much, but... I don't think it's very mind-blowing at all?
The mind-blowing aspect is the geometry detail in Nanite. For Lumen, the baked light maps used in the past can give similar lighting. It's the dynamic nature of this lighting, lighting this immense geometry, which you don't get with light maps, that is mind-blowing.
 
Sorry, can I ask a really basic question (well, one I think is basic)?!

"How much bandwidth does it take to fill a 1440K resolution with 8K textures for a whole second?" I ask this because we're told this demo had 8K textures only and was running at 1440K...or am I totally misunderstanding?
UE5 uses virtual textures. Each object has 8K source textures on disk, but only the parts of those textures that are visible and drawn are required. In a perfect VT engine, you need one texel per pixel, so 1440p would be 3.7 million pixels needing 3.7 million texels. I can't remember what maps Epic said they were using, so for illustration let's say 10 bits per pixel for textures. That'd be 37 megabits, or roughly 5 MB per frame. At 60 fps, if you were refreshing every single texel, that's 300 MB/s tops. However, texels are largely shared across frames, so the reality is far less than that. You also store whole tiles rather than just the texels actually used, so RAM requirements are higher than just 5 MB, but streaming requirements are much lower.
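
For anyone who wants the arithmetic spelled out, here's a quick sketch (the 10 bits per texel is only the illustrative figure above, not what Epic actually used):

/* Spelling out the texel-per-pixel estimate above. The 10 bits/texel figure
   is just illustrative, not what the demo actually uses. */
#include <stdio.h>

int main(void)
{
    const double pixels     = 2560.0 * 1440.0;   /* ~3.7 million at 1440p        */
    const double bits_texel = 10.0;              /* assumed compressed footprint */
    const double fps        = 60.0;

    double mb_per_frame = pixels * bits_texel / 8.0 / 1e6;
    double mb_per_sec   = mb_per_frame * fps;    /* worst case: every texel refreshed */

    printf("%.1f MB per frame, ~%.0f MB/s if every texel changed every frame\n",
           mb_per_frame, mb_per_sec);
    return 0;
}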

The game Trials HD used VT at 720p. The calculations for that are on this forum, and Sebbbi stated IIRC 7 MB/s. 1440p is 4x that, so 28 MB/s doing the same thing Trials was.

This is the joy of virtualisation. The old way of rendering graphics was keeping it all in RAM and only using a tiny part of that dataset per frame. Virtualisation allows you to keep in RAM just the parts necessary for what you see (and are about to see). It introduces a huge efficiency in data requirements by reducing the working set. The upside is a far, far smaller RAM footprint, allowing more variety etc. The cost of this is higher streaming requirements, but that dataflow isn't particularly bandwidth intensive for the amount of geometry you get to see.

Conceptually, it's like foveated rendering. There's no point drawing the bits of the screen in high fidelity if your eye can't resolve them, so render just the smallest portion your eye is looking at in high quality and render the rest in low. The result is exactly the same, but you reduce rendering requirements to a tiny fraction. Here, don't bother storing all the vertices (or texels) if you aren't using that data anytime soon. The future of rendering is moving towards efficiency, which is one of the key arguments some of us have raised over the raw TF comparisons between next-gen and current-gen consoles. Better ways of using the flops means doing more work with the same raw resources. We're working smarter, not harder, and that evolution is happening all round in the computing space.
 
And maybe reading people's reactions to it built my expectations up way too much, but... I don't think it's very mind-blowing at all?
Yeah, it looks really good but nothing *sits* in the environment? And nothing pops, either?
I felt the environmental lighting was pretty good, although the character isn't well lit or connected. I think one of the issues is the reuse of the same assets, and basically it's a whole lot of stone and rock. Not terribly exciting. Not even banners or flags or stuff. But the underlying principle is 'infinite geometry', solving a key problem with artwork creation. If Epic had infinite storage and didn't care about artistic integrity, they could have thrown thousands of unique items into the mix. As a tech, Nanite deserves the interest.

And the lighting I think is very effective and 'next-gen', even if it's leaning heavily on current-gen techniques (SDF shadow casters, for example). It gives a very coherent lighting model in the environments. It's not ground-breaking insomuch as we've seen that level of lighting in games this gen, such as Uncharted 4's home environments, but those are very static. Or The Tomorrow Children, where the environments were super simple. That level of lighting in realtime in complex environments is a clear step up.
 
The mind-blowing aspect is the geometry detail in Nanite. For Lumen, the baked light maps used in the past can give similar lighting. It's the dynamic nature of this lighting, lighting this immense geometry, which you don't get with light maps, that is mind-blowing.
Yeah, but 500 of the exact same statue and using lots of triangles and high res textures to make jagged brown rocks even more jaggedy, browner and rockier is super redundant after a point, not very interesting to look at and also not a great showcase for what actual games could/will be like using the engine.

Yeah, the lighting being dynamic is impressive but it just doesn't look *THAT* good? Maybe if it was 60fps...
 
Yeah, but 500 of the exact same statue and using lots of triangles and high res textures to make jagged brown rocks even more jaggedy, browner and rockier is super redundant after a point, not very interesting to look at and also not a great showcase for what actual games could/will be like using the engine.

Yeah, the lighting being dynamic is impressive but it just doesn't look *THAT* good? Maybe if it was 60fps...



60 fps is the target, early tech, early devkit, early library...
 
Yeah, but 500 of the exact same statue and using lots of triangles and high res textures to make jagged brown rocks even more jaggedy, browner and rockier is super redundant after a point, not very interesting to look at and also not a great showcase for what actual games could/will be like using the engine.

Yeah, the lighting being dynamic is impressive but it just doesn't look *THAT* good? Maybe if it was 60fps...
Well, the scene content that you are not impressed by is more art direction than technology. Although I do think the scene suits the technology and shows it at its best.

Seeing 500 copies of that same statue isn't that impressive. But dynamic GI with bounced lighting for that scene geometry density really is impressive. If I had camera control, and I was playing something like a Tomb Raider game, I would zoom in on a close-up of that statue and marvel at its geometry detail and lighting, which the engine should scale to with a 1:1 triangles-to-pixels ratio when you use REYES.

Or the last flyby scene at extreme speeds, where I can imagine a Spider-Man, Mirror's Edge, or Gravity Rush game flying around an immensely detailed cityscape.
 
Yeah, but 500 of the exact same statue and using lots of triangles and high res textures to make jagged brown rocks even more jaggedy, browner and rockier is super redundant after a point, not very interesting to look at and also not a great showcase for what actual games could/will be like using the engine.
Exactly, but we have to admit we've been doing the same thing forever with texture reuse. Geometry is just the new texture, it seems. Artists will deal with repetition, but the issue is there.
 
Texture resolution is independent of screen resolution.

The smallest pebble or arrowhead can use an 8K texture (or a whole landscape can).


Virtual texturing like that used in the demo allows loading only the areas of textures which are visible, and only at the detail level needed.
So a pebble with an 8K texture which never takes up more than 50 pixels on screen only has small parts of it loaded.

It is also good to remember that loading can be distributed across multiple frames.
So for a 1440p image, the amount of data could be around that of a single 4K image if the scenery is not changing.

Another advantage is that the amount of memory used can be constant, independent of the number of textures or their resolutions.
http://silverspaceship.com/src/svt/

UE5 uses virtual textures. Each object has 8K source textures on disk, but only the parts of those textures that are visible and drawn are required. In a perfect VT engine, you need one texel per pixel, so 1440p would be 3.7 million pixels needing 3.7 million texels. I can't remember what maps Epic said they were using, so for illustration let's say 10 bits per pixel for textures. That'd be 37 megabits, or roughly 5 MB per frame. At 60 fps, if you were refreshing every single texel, that's 300 MB/s tops. However, texels are largely shared across frames, so the reality is far less than that. You also store whole tiles rather than just the texels actually used, so RAM requirements are higher than just 5 MB, but streaming requirements are much lower.

The game Trials HD used VT at 720p. The calculations for that are on this forum, and Sebbbi stated IIRC 7 MB/s. 1440p is 4x that, so 28 MB/s doing the same thing Trials was.

This is the joy of virtualisation. The old way of rendering graphics was keeping it all in RAM and only using a tiny part of that dataset per frame. Virtualisation allows you to keep in RAM just the parts necessary for what you see (and are about to see). It introduces a huge efficiency in data requirements by reducing the working set. The upside is a far, far smaller RAM footprint, allowing more variety etc. The cost of this is higher streaming requirements, but that dataflow isn't particularly bandwidth intensive for the amount of geometry you get to see.

Conceptually, it's like foveated rendering. There's no point drawing the bits of the screen in high fidelity if your eye can't resolve them, so render just the smallest portion your eye is looking at in high quality and render the rest in low. The result is exactly the same, but you reduce rendering requirements to a tiny fraction. Here, don't bother storing all the vertices (or texels) if you aren't using that data anytime soon. The future of rendering is moving towards efficiency, which is one of the key arguments some of us have raised over the raw TF comparisons between next-gen and current-gen consoles. Better ways of using the flops means doing more work with the same raw resources. We're working smarter, not harder, and that evolution is happening all round in the computing space.
Cheers, I can’t believe my understanding of how this is working was pretty spot on lol.

I suppose I was trying to work out how this tech would work on an HDD, but I'm still a bit perplexed by that. But at least it's slowly sinking in. I'm guessing this might be where the PS5 cache scrubbers could help out with all the swapping out of textures. Also, I recall people were worried about SSDs overheating, and surely this will mean they are constantly chugging away.
 
From a technological point of view, I think next generation will probably be more interesting than current-gen outside of titles like Dreams, Claybook, No Man's Sky or Inside. Very fast storage, but there is a size problem, and innovation will come from that. I am happy to see a REYES-like renderer, but I want to know what the solution is for keeping game sizes reasonable.

The second point is triangle RT or GI approximation; that will be interesting too.
 
I think the biggest change Microsoft could make is having one driver for all of the XSX hardware, or just having a trusted driver model so each 'driver' can share data without IRPs (I/O Request Packets), which are how drivers usually communicate with other drivers and the Windows kernel. This would work on Series X because Microsoft is the only driver supplier and its drivers can't be replaced by the user. This cannot work on Windows because drivers come from all sources and IRPs serve a resilience and security need.

Cheers for the explanation!

And this is the point. Doing anything in Windows (or macOS or Linux) has a tremendous amount of overhead. Not in terms of your PC using a significant amount of CPU time, although that can happen, but folks think of their PC as one machine when the reality is that Windows holds together a whole bunch of different components that pass data to and from each other with absolutely no understanding of what the other components are, or even what the data is. And this is the only way the PC hardware ecosystem can work.

So really, the tremendous flexibility, legacy-mindedness and citadel-style multi-layered security of the PC can create a kind of "death by a thousand cuts" for some I/O operations.

Whelp, let's hope Unreal 5 makes pooling of assets in RAM easy!

But seriously, gold star for trying to dig into this stuff. It's a lot to take in but it also makes you appreciate Windows a bit more! :yes:

Makes me appreciate how much I'm unable to properly appreciate. :runaway:

@DSoup @function what steps are bypassed when using unbuffered reads like Carmack mentioned?

I wish I were qualified to say!

Looking at his use of "unbuffered IO", and at some of the other pages on the MS Hardware Dev Centre site DSoup linked to, I think Carmack might be talking about different driver IO modes. How he's able to select between these types of access I dunno. Maybe he's tinkering with drivers, or maybe the drivers expose certain parts of their functionality that he's able to tap into.

The MS page for "Using Direct I/O" (as opposed to buffered) lays out some of the advantages and disadvantages of direct I/O:

https://docs.microsoft.com/en-us/windows-hardware/drivers/kernel/using-direct-i-o

"Drivers for devices that can transfer large amounts of data at a time should use direct I/O for those transfers. Using direct I/O for large transfers improves a driver's performance, both by reducing its interrupt overhead and by eliminating the memory allocation and copying operations inherent in buffered I/O.

Generally, mass-storage device drivers request direct I/O for transfer requests, including lowest-level drivers that use direct memory access (DMA) or programmed I/O (PIO), as well as any intermediate drivers chained above them."

...

"Drivers must take steps to maintain cache coherency during DMA and PIO transfers."

Someone else is going to have to say whether I'm barking up the right tree or not!
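
For what it's worth, on the application side the nearest knob I know of is opening a file with FILE_FLAG_NO_BUFFERING, which bypasses the Windows file cache; the kernel-level direct I/O described above is a related but separate mechanism. A minimal sketch, assuming a 4 KB sector size and a made-up file name, with error handling mostly omitted:

/* Minimal sketch of a cache-bypassing read on Windows (user-mode side).
   FILE_FLAG_NO_BUFFERING requires the buffer, file offset and read size to
   be multiples of the volume sector size; 4096 is assumed here, and
   "asset.bin" is just a made-up file name. */
#include <windows.h>
#include <malloc.h>
#include <stdio.h>

int main(void)
{
    const DWORD align = 4096;        /* assumed sector size                 */
    const DWORD bytes = 64 * 1024;   /* must be a multiple of the alignment */

    HANDLE h = CreateFileA("asset.bin", GENERIC_READ, FILE_SHARE_READ, NULL,
                           OPEN_EXISTING, FILE_FLAG_NO_BUFFERING, NULL);
    if (h == INVALID_HANDLE_VALUE) return 1;

    void *buf  = _aligned_malloc(bytes, align);  /* sector-aligned destination */
    DWORD read = 0;
    BOOL  ok   = ReadFile(h, buf, bytes, &read, NULL);

    printf("ok=%d, read %lu bytes with no file-cache copy in between\n",
           ok, (unsigned long)read);

    _aligned_free(buf);
    CloseHandle(h);
    return 0;
}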
 
Well, the scene content that you are not impressed by is more art direction than technology. Although I do think the scene suits the technology and shows it at its best.

Seeing 500 copies of that same statue isn't that impressive. But dynamic GI with bounced lighting for that scene geometry density really is impressive. If I had camera control, and I was playing something like a Tomb Raider game, I would zoom in on a close-up of that statue and marvel at its geometry detail and lighting, which the engine should scale to with a 1:1 triangles-to-pixels ratio when you use REYES.

Or the last flyby scene at extreme speeds, where I can imagine a Spider-Man, Mirror's Edge, or Gravity Rush game flying around an immensely detailed cityscape.

Will we have that kind of density without it being 500 of the exact same statue? There isn't some sort of discounted duplication/reuse thing going on?
And the same goes for the environment in the speed section: will that amount of density at high travelling speeds be possible when most assets are not just some variation on jagged brown rock?

That is probably being unfairly suspicious, I guess, though I do wish the demo had dispelled some of those thoughts.
 
Yeah, the lighting being dynamic is impressive but it just doesn't look *THAT* good? Maybe if it was 60fps...

Don't know if it was just me but that final scene where she flies certainly didn't look like 30fps vsync. It looked like it was 50+fps and very smooth. Like a movie. :)
 
Looks like I won't have to upgrade for next gen (2080 Ti / 3950X / NVMe) to play game ports at next-gen console quality; I guess most will be UE5.
 
@DSoup @function what steps are bypassed when using unbuffered reads like Carmack mentioned?


Unbuffered I/O. There are a few different options for using unbuffered I/O in Windows, and I don't read Carmack as suggesting it as a viable option; he said "quibble" as in a technicality, i.e. you can drive most cars with just 3 wheels on, but you wouldn't want to.

The issue with unbuffered I/O is that if you don't respond to I/O reads/writes fast enough then data is lost, which can be catastrophic for the storage system. It's not all about reads; there are writes too. Unless he says more about how he would do it and mitigate data loss, we're left guessing.
 
Pretty good difference between the normal and high model; just look at the self-shadowing.

Incidentally, the detail added by better self-shadowing could conceptually be achieved without micro-geometry. It is possible to trace a ray, or do some cheap approximation of that, in texture space against a heightmap of the model and integrate that with the shadow-mapping results. The inner surface bumps can also be approximated with POM or its cousin algos. It's really the silhouettes that are harder to replace actual geometry with (although even that is not impossible).
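
A toy CPU sketch of that texture-space idea follows; the heightmap, step count and light setup are all made up purely for illustration:

/* Toy texture-space self-shadow test: from a surface texel, march towards
   the light in UV space and check whether any heightmap sample rises above
   the shadow ray. Heightmap and light setup are invented for illustration. */
#include <math.h>
#include <stdio.h>

#define N 64                            /* toy heightmap resolution */
static float height[N][N];

static float sample(float u, float v)   /* nearest-neighbour fetch */
{
    return height[(int)(v * (N - 1))][(int)(u * (N - 1))];
}

/* Returns 1 if texel (u,v) sees the light, 0 if some bump occludes it.
   (ldu, ldv) is the light direction projected into UV space and lh is how
   fast the shadow ray gains height per unit of UV travel. */
static int lit(float u, float v, float ldu, float ldv, float lh)
{
    float h0 = sample(u, v);
    for (int i = 1; i <= 32; ++i) {
        float t  = (i / 32.0f) * 0.25f;               /* march a quarter of the tile */
        float su = u + ldu * t, sv = v + ldv * t;
        if (su < 0 || su > 1 || sv < 0 || sv > 1) break;
        if (sample(su, sv) > h0 + lh * t) return 0;   /* occluded by a bump */
    }
    return 1;
}

int main(void)
{
    /* One Gaussian bump in the middle of the tile stands in for the micro-geometry. */
    for (int y = 0; y < N; ++y)
        for (int x = 0; x < N; ++x) {
            float dx = (x - N / 2) / (float)N, dy = (y - N / 2) / (float)N;
            height[y][x] = expf(-40.0f * (dx * dx + dy * dy));
        }

    /* Light comes from +u, so the texel behind the bump ends up shadowed. */
    printf("facing the light: lit=%d, behind the bump: lit=%d\n",
           lit(0.75f, 0.5f, 1.0f, 0.0f, 0.3f),
           lit(0.25f, 0.5f, 1.0f, 0.0f, 0.3f));
    return 0;
}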
 