Virtual Texture Issues and Limitations (Gen 9/PC)

I really hope that's the case, but I'm not that optimistic. You can't just load the individual texels you need, because the performance would be very bad. But if you instead load at least part of the texture (say, in 4 KB blocks), then the amount of data you might need can be very large. Not to mention that main memory is not that good at random access: if you read randomly from a large data set, you'll be lucky to get 10% (or even less) of the theoretical memory bandwidth.
From what I've seen, VT is not something that just "takes programmer time." It's likely to need a huge amount of engineering effort, and in the end you might still get random stuttering when the stars don't align.

Doom Eternal uses (non-Megatexture) VT and ray tracing and is possibly the most performant game out there. That would suggest it's 'just' an engineering challenge? Unless you're using UE5, where both streaming and real-time generation pipelines are available for 'free'.
 
I'm rather pessimistic too.
It also depends a lot on scene complexity. If the scene is fuzzy, e.g. trees, fences, lots of holes like windows, etc., you may only fetch a few texels per block, but you still need a huge number of blocks.
VT helps to minimize the problem and should be done, but I don't expect it to be a silver bullet.
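
A rough back-of-envelope sketch of the concern above (all numbers are assumptions for illustration, not measurements from any engine): even at 4 KB page granularity, a "fuzzy" frame that touches many pages while sampling only a few texels from each runs into both data volume and random-access efficiency.

```cpp
// Illustrative only: how much data a frame might pull if it touches many 4 KB pages
// but uses only a small fraction of each, and what that costs when random reads
// from main memory reach only a fraction of peak bandwidth. All constants are
// hypothetical assumptions chosen to mirror the argument above.
#include <cstdio>

int main() {
    const double page_bytes        = 4.0 * 1024.0;
    const double pages_touched     = 30000.0;   // assumed unique pages touched in one "fuzzy" frame
    const double peak_bw_bytes_s   = 50e9;      // assumed peak main-memory bandwidth (50 GB/s)
    const double random_efficiency = 0.10;      // ~10% of peak for scattered reads, as argued above

    const double bytes_per_frame = pages_touched * page_bytes;
    const double transfer_ms     = bytes_per_frame / (peak_bw_bytes_s * random_efficiency) * 1000.0;

    printf("data touched per frame : %.1f MB\n", bytes_per_frame / 1e6);
    printf("transfer time          : %.2f ms (frame budget at 60 fps: 16.67 ms)\n", transfer_ms);
    return 0;
}
```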
 
Doom Eternal uses (non-Megatexture) VT and ray tracing and is possibly the most performant game out there. That would suggest it's 'just' an engineering challenge? Unless you're using UE5, where both streaming and real-time generation pipelines are available for 'free'.

Doom Eternal's levels are quite closed (and don't have too many decorations) compared to current open-world games, so while it's faster I wouldn't consider it a good indicator of how such techniques will perform in more recent games.
 
I really hope that's the case, but I'm not that optimistic. You can't just load the individual texels you need, because the performance would be very bad. But if you instead load at least part of the texture (say, in 4 KB blocks), then the amount of data you might need can be very large. Not to mention that main memory is not that good at random access: if you read randomly from a large data set, you'll be lucky to get 10% (or even less) of the theoretical memory bandwidth.
From what I've seen, VT is not something that just "takes programmer time." It's likely to need a huge amount of engineering effort, and in the end you might still get random stuttering when the stars don't align.

Of course it goes without saying that if a texel is visible on screen, its neighbors are more likely to be on screen than not. The real question is: in real-world scenes, what is the typical overhead? Sebbbi provided figures based on his shipped games; he said it was between 3x and 4x. That is still less than other, naive methods.
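
For intuition on where a 3x-4x figure can come from, here is a toy model (not Sebbbi's actual methodology; page size and utilization are assumptions): if only a fraction of each resident page ends up being sampled, the resident-to-sampled texel ratio is roughly the inverse of that utilization.

```cpp
// Toy model of VT overhead: resident texels vs. texels actually sampled on screen.
// The page size and utilization figure are assumptions for illustration.
#include <cstdio>

int main() {
    const double sampled_texels   = 1920.0 * 1080.0; // ~1 sampled texel per pixel at the ideal mip
    const double page_texels      = 128.0 * 128.0;   // assumed 128x128 texel pages
    const double page_utilization = 0.30;            // assumed: 30% of each touched page is sampled

    const double pages_resident  = sampled_texels / (page_texels * page_utilization);
    const double resident_texels = pages_resident * page_texels;

    printf("pages resident : %.0f\n", pages_resident);
    printf("overhead       : %.1fx resident vs. sampled texels\n",
           resident_texels / sampled_texels); // ~3.3x for 30% utilization
    return 0;
}
```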
 
I'm rather pessimistic too.
It also depends a lot on scene complexity. If the scene is fuzzy, e.g. trees, fences, lots of holes like windows, etc., you may only fetch a few texels per block, but you still need a huge number of blocks.
VT helps to minimize the problem and should be done, but I don't expect it to be a silver bullet.

As compared to what solution for the same scene? Naively having all mips of all textures of all models on screen resident (and potentially for multiple LODs of all models on screen)?

For whatever worst-case scenario you can conjure up for VT, not having VT will fare even worse.
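
For scale, a hypothetical comparison (every count and size below is a made-up assumption): keeping every mip of every material in view resident versus a fixed VT physical page pool.

```cpp
// Made-up numbers to compare memory footprints: naive "everything resident" vs. a
// fixed-size VT physical cache. Layer counts, material counts, and formats are assumptions.
#include <cstdio>

int main() {
    // Naive: N materials in view, each with a few 4K BC-compressed layers (~1 byte/texel);
    // a full mip chain adds ~33%.
    const double materials      = 500.0;
    const double layers_per_mat = 3.0;     // e.g. albedo, normal, roughness
    const double texels_4k      = 4096.0 * 4096.0;
    const double naive_bytes    = materials * layers_per_mat * texels_4k * 1.0 * 4.0 / 3.0;

    // VT: one physical cache per layer, e.g. 8192x8192 texels at 1 byte/texel.
    const double cache_texels   = 8192.0 * 8192.0;
    const double vt_bytes       = layers_per_mat * cache_texels * 1.0;

    printf("naive residency : %.1f GB\n", naive_bytes / 1e9);
    printf("VT page pool    : %.1f MB\n", vt_bytes / 1e6);
    return 0;
}
```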
 
Doom Eternal's levels are quite closed (and don't have too many decorations) compared to current open-world games, so while it's faster I wouldn't consider it a good indicator of how such techniques will perform in more recent games.

Epic strongly recommends VT as best practice for Nanite, so I think it's a reasonably safe bet that the Matrix demo uses the VT streaming pipeline. That's as modern open world as it comes, so far at least.
 
Of course it goes without saying that if a texel is visible on screen, its neighbors are more likely to be on screen than not. The real question is: in real-world scenes, what is the typical overhead? Sebbbi provided figures based on his shipped games; he said it was between 3x and 4x. That is still less than other, naive methods.

Yes, but you can't just load, let's say, 10 texels from a texture and expect to achieve max throughput at the same time. So in the end you either have to load much more of the texture or accept much lower throughput.
That's why I think VT might help in some cases, such as reducing latency in some situations, but you can't expect VT to let you stream everything from main memory. For example, in an indoor scene you might want to keep most indoor textures in VRAM but keep outdoor textures in main memory, so you can stream the outdoor textures that some outdoor pixels may need while the majority of the textures for indoor pixels are still readily available in VRAM.
Now, some might ask whether it's possible to make a caching scheme that achieves this automatically. It could be, but simple schemes are likely to cause cache thrashing in some situations, which can be difficult to avoid.

[EDIT] One thing to clarify: page-based VT is always a useful feature, because it's obviously better to be able to load just part of a texture instead of the full data.
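
To make the "load just part of a texture" point concrete, here is a minimal sketch (not any specific engine's implementation; structure names and sizes are assumed) of the usual page-based indirection: a virtual texel coordinate is mapped through a page table to a location in a physical texel cache.

```cpp
// Minimal page-table indirection sketch for page-based VT. Real systems also handle
// mips, anisotropy, page borders, compressed formats, etc.
#include <cstdint>
#include <cstdio>

constexpr int kPageTexels   = 128;   // assumed page size in texels
constexpr int kVirtualPages = 512;   // virtual texture is 512x512 pages (65536 texels wide)

struct PageEntry {
    bool     resident = false;
    uint16_t cacheX = 0, cacheY = 0;  // page slot in the physical cache
};

PageEntry pageTable[kVirtualPages][kVirtualPages];

// Translate a virtual texel coordinate into a physical-cache texel coordinate.
// Returns false if the page is not resident (the caller would fall back / request it).
bool virtualToPhysical(int vx, int vy, int& px, int& py) {
    const PageEntry& e = pageTable[vy / kPageTexels][vx / kPageTexels];
    if (!e.resident) return false;
    px = e.cacheX * kPageTexels + vx % kPageTexels;
    py = e.cacheY * kPageTexels + vy % kPageTexels;
    return true;
}

int main() {
    pageTable[10][20] = {true, 3, 5};            // pretend one page was streamed in
    int px, py;
    if (virtualToPhysical(20 * kPageTexels + 7, 10 * kPageTexels + 9, px, py))
        printf("virtual texel -> cache texel (%d, %d)\n", px, py);
    return 0;
}
```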
 
As compared to what solution for the same scene? Naively having all mips of all textures of all models on screen resident (and potentially for multiple LODs of all models on screen)?

For whatever worst-case scenario you can conjure up for VT, not having VT will fare even worse.
IIRC, I meant I'm pessimistic about the idea of determining the data needed, then streaming it, then rendering it, all in one frame at minimal memory requirements.
I'd rather pre-cache stuff based on speculation, accepting that we might not use all of it, and make sure we still have a fallback if it's not ready.

But I agree some form of VT is mandatory.
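
A minimal sketch of the fallback idea (assumed data layout, not any particular engine): if the requested page isn't resident yet, walk up the mip chain in the page table until a resident ancestor is found, and sample that blurrier data in the meantime.

```cpp
// Fallback lookup sketch: when the ideal mip's page is missing, climb to coarser mips
// that are already resident. The structures here are hypothetical; only the idea matters.
#include <cstdio>
#include <vector>

struct MipPageTable {
    int pagesPerSide;                 // pages along one axis at this mip
    std::vector<bool> resident;       // resident flag per page, row-major
    bool isResident(int px, int py) const { return resident[py * pagesPerSide + px]; }
};

// Returns the finest mip index >= `wanted` whose page covering (px, py at mip `wanted`)
// is resident, or -1 if not even the coarsest mip is loaded.
int findResidentMip(const std::vector<MipPageTable>& mips, int wanted, int px, int py) {
    for (int m = wanted; m < (int)mips.size(); ++m) {
        if (mips[m].isResident(px, py)) return m;
        px >>= 1;  // one page at mip m+1 covers a 2x2 block of pages at mip m
        py >>= 1;
    }
    return -1;
}

int main() {
    // Two-mip toy setup: mip 0 has 4x4 pages (none resident), mip 1 has 2x2 (all resident).
    std::vector<MipPageTable> mips = {
        {4, std::vector<bool>(16, false)},
        {2, std::vector<bool>(4, true)},
    };
    printf("use mip %d as fallback\n", findResidentMip(mips, 0, 3, 2));  // prints "use mip 1"
    return 0;
}
```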
 
IIRC, I meant I'm pessimistic about the idea of determining the data needed, then streaming it, then rendering it, all in one frame at minimal memory requirements.
I'd rather pre-cache stuff based on speculation, accepting that we might not use all of it, and make sure we still have a fallback if it's not ready.

But I agree some form of VT is mandatory.

Some amount of speculative prediction is obviously necessary if you want to avoid showing ugly low mips during sudden, large camera/object movements.

Interestingly, for next-gen games one is bound to need more than what is in the direct view frustum for GI, so a system that predicts which surfaces might contribute to lighting indirectly, through reflections and light bounces, has to exist regardless. The preemptive caching might just hitch a ride on that...
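
A sketch of what the simplest form of speculative pre-caching could look like (purely illustrative; all names, including requestPagesForView, are hypothetical placeholders): extrapolate the camera a little into the future and request pages for that predicted view as well, accepting that some of those requests will be wasted.

```cpp
// Simplest possible speculation: request pages not only for the current camera but also
// for an extrapolated future camera. The feedback/request machinery is abstracted away.
#include <cstdio>

struct Vec3 { float x, y, z; };

Vec3 extrapolate(const Vec3& pos, const Vec3& vel, float seconds) {
    return { pos.x + vel.x * seconds, pos.y + vel.y * seconds, pos.z + vel.z * seconds };
}

// Placeholder: in a real system this would rasterize a feedback buffer or walk the
// scene to collect the set of (texture, mip, page) IDs the given view needs.
void requestPagesForView(const Vec3& cameraPos, const char* tag) {
    printf("requesting pages for %s view at (%.1f, %.1f, %.1f)\n",
           tag, cameraPos.x, cameraPos.y, cameraPos.z);
}

int main() {
    Vec3 camPos = { 0.0f, 1.8f, 0.0f };
    Vec3 camVel = { 6.0f, 0.0f, 0.0f };    // player moving along +x

    requestPagesForView(camPos, "current");
    // Speculate ~0.25 s ahead so requested pages have a few frames of streaming latency to arrive.
    requestPagesForView(extrapolate(camPos, camVel, 0.25f), "predicted");
    return 0;
}
```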
 
I really hope that's the case, but I'm not that optimistic. You can't just load the individual texels you need, because the performance would be very bad. But if you instead load at least part of the texture (say, in 4 KB blocks), then the amount of data you might need can be very large. Not to mention that main memory is not that good at random access: if you read randomly from a large data set, you'll be lucky to get 10% (or even less) of the theoretical memory bandwidth.
From what I've seen, VT is not something that just "takes programmer time." It's likely to need a huge amount of engineering effort, and in the end you might still get random stuttering when the stars don't align.

Trials ran at a smooth 60fps generations ago, so it's fine, especially with NVMe drives. Multiple things actually run some scale of virtualized texturing already: both Assassin's Creed and the (defunct?) recent Tom Clancy open-world games ran landscape virtualized texturing off hard drives, and no one noticed. I'd hardly be surprised if Snowdrop and the updated AC engine both have full virtualized texturing. Heck, AFAIK the Matrix City UE5 demo runs full VT, and it works on a Series S.
 
Interestingly, for next-gen games one is bound to need more than what is in the direct view frustum for GI, so a system that predicts which surfaces might contribute to lighting indirectly, through reflections and light bounces, has to exist regardless. The preemptive caching might just hitch a ride on that...
Yeah. It's my GI system that motivates me to explore all this unique-detail / procedural material composition stuff.
If it works, it would compensate for the downside of requiring a global parametrization, with the related offline processing time hurting production iteration times.
But now I regret having jumped into this rabbit hole so naively... <: )

However, I can answer which surfaces contribute to indirect lighting: all of them. \:D/
(I'm always puzzled by GI solutions that get away with using only partial scenes.)
 
Trials ran at a smooth 60fps generations ago, so it's fine, especially with NVMe drives. Multiple things actually run some scale of virtualized texturing already: both Assassin's Creed and the (defunct?) recent Tom Clancy open-world games ran landscape virtualized texturing off hard drives, and no one noticed. I'd hardly be surprised if Snowdrop and the updated AC engine both have full virtualized texturing. Heck, AFAIK the Matrix City UE5 demo runs full VT, and it works on a Series S.

They are definitely not loading all of their textures directly from main memory (except on Xbox, where main memory and VRAM are the same thing), not to mention storage.
 
They are definitely not loading all of their textures directly from main memory (except on Xbox, where main memory and VRAM are the same thing), not to mention storage.

They are generating the final textures at runtime, and possibly streaming in base textures from the HDD.
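
A hedged sketch of what "generating the final textures at runtime" can mean for a landscape runtime virtual texture (this is not the actual Assassin's Creed or Snowdrop implementation; all types and helpers below are hypothetical): when a page is requested, blend streamed base layers by a splat mask into the physical cache, so only low-resolution sources plus the mask need to come off disk.

```cpp
// Toy runtime page composition: blend two tiling base layers by a splat weight.
// In a real engine this runs on the GPU into the VT physical cache; here it is a
// CPU sketch with stubbed-in samplers, just to show the data flow.
#include <cstdint>
#include <cstdio>
#include <vector>

constexpr int kPage = 128;  // assumed page size in texels

struct Texel { uint8_t r, g, b; };

// These stand in for streamed base textures and the splat map.
Texel sampleGrass(int, int) { return {60, 120, 40}; }
Texel sampleRock (int, int) { return {110, 105, 100}; }
float sampleSplat(int x, int y) { return (x + y) % kPage / float(kPage); }  // fake blend mask

std::vector<Texel> composePage(int pageX, int pageY) {
    std::vector<Texel> out(kPage * kPage);
    for (int y = 0; y < kPage; ++y)
        for (int x = 0; x < kPage; ++x) {
            int wx = pageX * kPage + x, wy = pageY * kPage + y;   // world-space texel
            float w = sampleSplat(wx, wy);
            Texel a = sampleGrass(wx, wy), b = sampleRock(wx, wy);
            out[y * kPage + x] = {                                 // lerp(a, b, w)
                uint8_t(a.r + (b.r - a.r) * w),
                uint8_t(a.g + (b.g - a.g) * w),
                uint8_t(a.b + (b.b - a.b) * w) };
        }
    return out;
}

int main() {
    auto page = composePage(3, 7);
    printf("composed %zu texels for page (3, 7)\n", page.size());
    return 0;
}
```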
 
Wasn't it texture decompression that was being accelerated? That had nothing to do with streaming in itself; correct me if I'm mistaken.
 
Yes. GPU transcoding. I don't recall it having much actual benefit in practice.
If you had a lopsided system with a weaker CPU it would aid performance. The issue is, the game was limited to 60 FPS max, so you would never really see it. With mods unlocking the frame rate it would be more apparent.

IMO idTech 5 is pretty bad in terms of design goals: driven fully by Carmack's fixations and the desire for 60 FPS on console no matter how awful the game looked.
 
Senior technical artist at CD Projekt Red discussing his interest in implementing some of the ideas discussed here in UE5.


There are lots of names to describe this: Voronoi with custom texture, tile sampler (in Substance Designer), grid 2D structure, etc. The user inputs an atlas of shapes or textures, and they get scattered over a parameterized space.
The idea here is that we assume the landscape gets rendered into a runtime virtual texture, so we can hopefully get away with a complex, runtime procedurally textured material. The overall cost is unclear at this point.
In theory the user could scatter ground detail like stones, sticks, grass blades, leaves, etc., in an almost (or fully) non-repeating manner.
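
A rough sketch of the scattering idea being described (terminology and constants are assumptions, not the actual UE5 or Substance Designer node): hash each cell of a grid over the parameterized space, pick a random atlas entry and a jittered placement per cell, and use the resulting local coordinates to sample the chosen atlas shape at the shading point.

```cpp
// Grid-cell scattering sketch: each cell deterministically picks one atlas tile and a
// jittered placement from a hash, so detail effectively never repeats while staying
// cheap to evaluate per texel. The atlas lookup itself is left out.
#include <cstdint>
#include <cstdio>
#include <cmath>

// Small integer hash -> [0,1). Any decent hash works; this one is illustrative.
float hash01(uint32_t x, uint32_t y, uint32_t salt) {
    uint32_t h = x * 0x8da6b343u ^ y * 0xd8163841u ^ salt * 0xcb1ab31fu;
    h ^= h >> 13; h *= 0x165667b1u; h ^= h >> 16;
    return (h & 0xffffffu) / float(0x1000000u);
}

struct ScatterSample { int atlasIndex; float localU, localV; };

// uv: position in the parameterized space; cellSize: scatter density; atlasCount: shapes available.
ScatterSample scatter(float u, float v, float cellSize, int atlasCount) {
    uint32_t cx = uint32_t(std::floor(u / cellSize));
    uint32_t cy = uint32_t(std::floor(v / cellSize));
    // Per-cell random choice of shape and jittered center.
    int   index   = int(hash01(cx, cy, 1) * atlasCount);
    float centerU = (cx + hash01(cx, cy, 2)) * cellSize;
    float centerV = (cy + hash01(cx, cy, 3)) * cellSize;
    // Local coordinates inside the scattered shape; 0.5 = shape center.
    return { index, (u - centerU) / cellSize + 0.5f, (v - centerV) / cellSize + 0.5f };
}

int main() {
    ScatterSample s = scatter(12.34f, 56.78f, 0.25f, 16);
    printf("atlas tile %d, local uv (%.2f, %.2f)\n", s.atlasIndex, s.localU, s.localV);
    return 0;
}
```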
 
Yes, but you can't just load, let's say, 10 texels from a texture and expect to achieve max throughput at the same time. So in the end you either have to load much more of the texture or accept much lower throughput.
That's why I think VT might help in some cases, such as reducing latency in some situations, but you can't expect VT to let you stream everything from main memory. For example, in an indoor scene you might want to keep most indoor textures in VRAM but keep outdoor textures in main memory, so you can stream the outdoor textures that some outdoor pixels may need while the majority of the textures for indoor pixels are still readily available in VRAM.
Now, some might ask whether it's possible to make a caching scheme that achieves this automatically. It could be, but simple schemes are likely to cause cache thrashing in some situations, which can be difficult to avoid.

[EDIT] One thing to clarify: page-based VT is always a useful feature, because it's obviously better to be able to load just part of a texture instead of the full data.
I don't think a virtual texturing system will ever read just 10 texels. It will read an entire page of texels, which is at least 4 KB. I expect systems then pull the data into VRAM and read it from there until a heuristic results in the page being evicted from VRAM. I'm sure there are multiple ways to implement virtual texturing, but this is how I understand it to work.
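
A minimal sketch of the eviction heuristic described above (assumed structures, not a specific engine's code): an LRU list over physical cache slots, where touching a page moves it to the front and the least-recently-used page is recycled when a new page has to come in.

```cpp
// Minimal LRU page cache sketch: a fixed number of physical slots, with pages keyed by a
// (texture, mip, page) id packed into one integer. Illustrative only.
#include <cstdint>
#include <cstdio>
#include <list>
#include <unordered_map>

class PageCache {
public:
    explicit PageCache(size_t slots) : capacity_(slots) {}

    // Returns true if the page was already resident; otherwise "loads" it, evicting
    // the least-recently-used page when the cache is full.
    bool touch(uint64_t pageId) {
        auto it = lookup_.find(pageId);
        if (it != lookup_.end()) {                       // hit: move to front
            lru_.splice(lru_.begin(), lru_, it->second);
            return true;
        }
        if (lru_.size() == capacity_) {                  // miss: evict LRU page
            printf("evict page %llu\n", (unsigned long long)lru_.back());
            lookup_.erase(lru_.back());
            lru_.pop_back();
        }
        lru_.push_front(pageId);                         // stream the new page in
        lookup_[pageId] = lru_.begin();
        return false;
    }

private:
    size_t capacity_;
    std::list<uint64_t> lru_;                                        // front = most recently used
    std::unordered_map<uint64_t, std::list<uint64_t>::iterator> lookup_;
};

int main() {
    PageCache cache(2);
    cache.touch(1); cache.touch(2);
    cache.touch(1);          // hit: page 1 becomes most recent
    cache.touch(3);          // miss: evicts page 2
    return 0;
}
```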
 
I don't think a virtual texturing system will ever read just 10 texels. It will read an entire page of texels, which is at least 4 KB. I expect systems then pull the data into VRAM and read it from there until a heuristic results in the page being evicted from VRAM. I'm sure there are multiple ways to implement virtual texturing, but this is how I understand it to work.

Yes, of course. I just wanted to point out that expecting a virtual texturing system to provide textures for an entire scene in real time is unrealistic.
 