Virtual Texture Issues and Limitations (Gen 9/PC)

I really hope that's the case, but I'm not that optimistic. You can't just load the individual texels you need, because the performance would be very bad. But if you instead load at least part of the texture (say, in 4 KB blocks), then the amount of data you might need can be very large. Not to mention that main memory is not that good at random access: if you read randomly from a large data set, you'll be lucky to get 10% (or even less) of the theoretical memory bandwidth.
From what I've seen, VT is not something that just "takes programmer time." It's likely to need a huge amount of engineering effort, and in the end you might still get random stuttering when the stars don't align.

Doom Eternal uses (non-Megatexture) VT and ray tracing and is possibly the most performant game out there. That would suggest it's 'just' an engineering challenge? Unless you're using UE5, where both streaming and real-time generation pipelines are available for 'free'.
 
I'm rather pessimistic too.
It also depends a lot on scene complexity. If the scene is fuzzy, e.g. trees, fences, lots of holes like windows, etc., you may only fetch a few texels per block, but you still need a huge number of blocks.
VT helps to minimize the problem and should be done, but I don't expect it to be a silver bullet.
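
A rough back-of-envelope sketch of the concern above (all numbers are assumptions for illustration, not measurements from any engine): even at 4 KB page granularity, a "fuzzy" frame that touches many pages while sampling only a few texels from each runs into both data volume and random-access efficiency.

```cpp
// Illustrative only: how much data a frame might pull if it touches many 4 KB pages
// but uses only a small fraction of each, and what that costs when random reads
// from main memory reach only a fraction of peak bandwidth. All constants are
// hypothetical assumptions chosen to mirror the argument above.
#include <cstdio>

int main() {
    const double page_bytes        = 4.0 * 1024.0;
    const double pages_touched     = 30000.0;   // assumed unique pages touched in one "fuzzy" frame
    const double peak_bw_bytes_s   = 50e9;      // assumed peak main-memory bandwidth (50 GB/s)
    const double random_efficiency = 0.10;      // ~10% of peak for scattered reads, as argued above

    const double bytes_per_frame = pages_touched * page_bytes;
    const double transfer_ms     = bytes_per_frame / (peak_bw_bytes_s * random_efficiency) * 1000.0;

    printf("data touched per frame : %.1f MB\n", bytes_per_frame / 1e6);
    printf("transfer time          : %.2f ms (frame budget at 60 fps: 16.67 ms)\n", transfer_ms);
    return 0;
}
```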
 
Doom Eternal uses (non-Megatexture) VT and ray tracing and is possibly the most performant game out there. That would suggest it's 'just' an engineering challenge? Unless you're using UE5, where both streaming and real-time generation pipelines are available for 'free'.

Doom Eternal's levels are quite closed (and don't have too many decorations) compared to current open-world games, so while it's faster I wouldn't consider it a good indicator of how such techniques will perform in more recent games.
 
I really hope that's the case, but I'm not that optimistic. You can't just load the individual texels you need, because the performance would be very bad. But if you instead load at least part of the texture (say, in 4 KB blocks), then the amount of data you might need can be very large. Not to mention that main memory is not that good at random access: if you read randomly from a large data set, you'll be lucky to get 10% (or even less) of the theoretical memory bandwidth.
From what I've seen, VT is not something that just "takes programmer time." It's likely to need a huge amount of engineering effort, and in the end you might still get random stuttering when the stars don't align.

Of course it goes without saying that if a texel is visible on screen, its neighbors are more likely to be on screen than not. The real question is: in real-world scenes, what is the typical overhead? Sebbbi provided figures based on his shipped games; he said it was between 3x and 4x. That is still less than other, naive methods.
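
For intuition on where a 3x-4x figure can come from, here is a toy model (not Sebbbi's actual methodology; page size and utilization are assumptions): if only a fraction of each resident page ends up being sampled, the resident-to-sampled texel ratio is roughly the inverse of that utilization.

```cpp
// Toy model of VT overhead: resident texels vs. texels actually sampled on screen.
// The page size and utilization figure are assumptions for illustration.
#include <cstdio>

int main() {
    const double sampled_texels   = 1920.0 * 1080.0; // ~1 sampled texel per pixel at the ideal mip
    const double page_texels      = 128.0 * 128.0;   // assumed 128x128 texel pages
    const double page_utilization = 0.30;            // assumed: 30% of each touched page is sampled

    const double pages_resident  = sampled_texels / (page_texels * page_utilization);
    const double resident_texels = pages_resident * page_texels;

    printf("pages resident : %.0f\n", pages_resident);
    printf("overhead       : %.1fx resident vs. sampled texels\n",
           resident_texels / sampled_texels); // ~3.3x for 30% utilization
    return 0;
}
```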
 
I'm rather pessimistic too.
It also depends a lot on scene complexity. If the scene is fuzzy, e.g. trees, fences, lots of holes like windows, etc., you may only fetch a few texels per block, but you still need a huge number of blocks.
VT helps to minimize the problem and should be done, but I don't expect it to be a silver bullet.

As compared to what solution for the same scene? Naively having all mips of all textures of all models on screen resident (and potentially for multiple LODs of all models on screen)?

For whatever worst-case scenario you can conjure up for VT, not having VT will fare even worse.
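
For scale, a hypothetical comparison (every count and size below is a made-up assumption): keeping every mip of every material in view resident versus a fixed VT physical page pool.

```cpp
// Made-up numbers to compare memory footprints: naive "everything resident" vs. a
// fixed-size VT physical cache. Layer counts, material counts, and formats are assumptions.
#include <cstdio>

int main() {
    // Naive: N materials in view, each with a few 4K BC-compressed layers (~1 byte/texel);
    // a full mip chain adds ~33%.
    const double materials      = 500.0;
    const double layers_per_mat = 3.0;     // e.g. albedo, normal, roughness
    const double texels_4k      = 4096.0 * 4096.0;
    const double naive_bytes    = materials * layers_per_mat * texels_4k * 1.0 * 4.0 / 3.0;

    // VT: one physical cache per layer, e.g. 8192x8192 texels at 1 byte/texel.
    const double cache_texels   = 8192.0 * 8192.0;
    const double vt_bytes       = layers_per_mat * cache_texels * 1.0;

    printf("naive residency : %.1f GB\n", naive_bytes / 1e9);
    printf("VT page pool    : %.1f MB\n", vt_bytes / 1e6);
    return 0;
}
```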
 
Doom Eternal's levels are quite closed (and don't have too many decorations) compared to current open-world games, so while it's faster I wouldn't consider it a good indicator of how such techniques will perform in more recent games.

Epic strongly recommends VT as best practice for Nanite, so I think it's a reasonably safe bet that the Matrix demo uses the VT streaming pipeline. That's as modern open world as it comes, so far at least.
 
Of course it goes without saying that if a texel is visible on screen, its neighbors are more likely to be on screen than not. The real question is: in real-world scenes, what is the typical overhead? Sebbbi provided figures based on his shipped games; he said it was between 3x and 4x. That is still less than other, naive methods.

Yes, but you can't just load, let's say, 10 texels from a texture and expect to achieve max throughput at the same time. So in the end you either have to load much more of the texture or accept much lower throughput.
That's why I think VT might help in some cases, such as reducing latency in some situations, but you can't expect VT to let you stream everything from main memory. For example, in an indoor scene you might want to keep most indoor textures in VRAM but keep outdoor textures in main memory, so you can stream the outdoor textures that some outdoor pixels may need while the majority of the textures for indoor pixels are still readily available in VRAM.
Now, some might ask whether it's possible to make a caching scheme that achieves this automatically. It could be, but simple schemes are likely to cause cache thrashing in some situations, which can be difficult to avoid.

[EDIT] One thing to clarify: page-based VT is always a useful feature, because it's obviously better to be able to load just part of a texture instead of the full data.
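
To make the "load just part of a texture" point concrete, here is a minimal sketch (not any specific engine's implementation; structure names and sizes are assumed) of the usual page-based indirection: a virtual texel coordinate is mapped through a page table to a location in a physical texel cache.

```cpp
// Minimal page-table indirection sketch for page-based VT. Real systems also handle
// mips, anisotropy, page borders, compressed formats, etc.
#include <cstdint>
#include <cstdio>

constexpr int kPageTexels   = 128;   // assumed page size in texels
constexpr int kVirtualPages = 512;   // virtual texture is 512x512 pages (65536 texels wide)

struct PageEntry {
    bool     resident = false;
    uint16_t cacheX = 0, cacheY = 0;  // page slot in the physical cache
};

PageEntry pageTable[kVirtualPages][kVirtualPages];

// Translate a virtual texel coordinate into a physical-cache texel coordinate.
// Returns false if the page is not resident (the caller would fall back / request it).
bool virtualToPhysical(int vx, int vy, int& px, int& py) {
    const PageEntry& e = pageTable[vy / kPageTexels][vx / kPageTexels];
    if (!e.resident) return false;
    px = e.cacheX * kPageTexels + vx % kPageTexels;
    py = e.cacheY * kPageTexels + vy % kPageTexels;
    return true;
}

int main() {
    pageTable[10][20] = {true, 3, 5};            // pretend one page was streamed in
    int px, py;
    if (virtualToPhysical(20 * kPageTexels + 7, 10 * kPageTexels + 9, px, py))
        printf("virtual texel -> cache texel (%d, %d)\n", px, py);
    return 0;
}
```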
 
As compared to what solution for the same scene? Naively having all mips of all textures of all models on screen resident (and potentially for multiple LODs of all models on screen)?

For whatever worst-case scenario you can conjure up for VT, not having VT will fare even worse.
IIRC, I meant I'm pessimistic about the idea of determining the data needed, then streaming it, then rendering it, all in one frame at minimal memory requirements.
I'd rather pre-cache stuff based on speculation, accepting that we might not use all of it, and make sure we still have a fallback if it's not ready.

But I agree some form of VT is mandatory.
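
A minimal sketch of the fallback idea (assumed data layout, not any particular engine): if the requested page isn't resident yet, walk up the mip chain in the page table until a resident ancestor is found, and sample that blurrier data in the meantime.

```cpp
// Fallback lookup sketch: when the ideal mip's page is missing, climb to coarser mips
// that are already resident. The structures here are hypothetical; only the idea matters.
#include <cstdio>
#include <vector>

struct MipPageTable {
    int pagesPerSide;                 // pages along one axis at this mip
    std::vector<bool> resident;       // resident flag per page, row-major
    bool isResident(int px, int py) const { return resident[py * pagesPerSide + px]; }
};

// Returns the finest mip index >= `wanted` whose page covering (px, py at mip `wanted`)
// is resident, or -1 if not even the coarsest mip is loaded.
int findResidentMip(const std::vector<MipPageTable>& mips, int wanted, int px, int py) {
    for (int m = wanted; m < (int)mips.size(); ++m) {
        if (mips[m].isResident(px, py)) return m;
        px >>= 1;  // one page at mip m+1 covers a 2x2 block of pages at mip m
        py >>= 1;
    }
    return -1;
}

int main() {
    // Two-mip toy setup: mip 0 has 4x4 pages (none resident), mip 1 has 2x2 (all resident).
    std::vector<MipPageTable> mips = {
        {4, std::vector<bool>(16, false)},
        {2, std::vector<bool>(4, true)},
    };
    printf("use mip %d as fallback\n", findResidentMip(mips, 0, 3, 2));  // prints "use mip 1"
    return 0;
}
```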
 
IIRC, I meant I'm pessimistic about the idea of determining the data needed, then streaming it, then rendering it, all in one frame at minimal memory requirements.
I'd rather pre-cache stuff based on speculation, accepting that we might not use all of it, and make sure we still have a fallback if it's not ready.

But I agree some form of VT is mandatory.

Some amount of speculative prediction is obviously necessary if you want to avoid showing ugly low mips during sudden, large camera/object movements.

Interestingly, for next-gen games one is bound to need more than what is in the direct view frustum for GI, so a system that predicts which surfaces might contribute to lighting indirectly, through reflections and light bounces, has to exist regardless. The preemptive caching might just hitch a ride on that...
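
A sketch of what the simplest form of speculative pre-caching could look like (purely illustrative; all names, including requestPagesForView, are hypothetical placeholders): extrapolate the camera a little into the future and request pages for that predicted view as well, accepting that some of those requests will be wasted.

```cpp
// Simplest possible speculation: request pages not only for the current camera but also
// for an extrapolated future camera. The feedback/request machinery is abstracted away.
#include <cstdio>

struct Vec3 { float x, y, z; };

Vec3 extrapolate(const Vec3& pos, const Vec3& vel, float seconds) {
    return { pos.x + vel.x * seconds, pos.y + vel.y * seconds, pos.z + vel.z * seconds };
}

// Placeholder: in a real system this would rasterize a feedback buffer or walk the
// scene to collect the set of (texture, mip, page) IDs the given view needs.
void requestPagesForView(const Vec3& cameraPos, const char* tag) {
    printf("requesting pages for %s view at (%.1f, %.1f, %.1f)\n",
           tag, cameraPos.x, cameraPos.y, cameraPos.z);
}

int main() {
    Vec3 camPos = { 0.0f, 1.8f, 0.0f };
    Vec3 camVel = { 6.0f, 0.0f, 0.0f };    // player moving along +x

    requestPagesForView(camPos, "current");
    // Speculate ~0.25 s ahead so requested pages have a few frames of streaming latency to arrive.
    requestPagesForView(extrapolate(camPos, camVel, 0.25f), "predicted");
    return 0;
}
```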
 
I really hope that's the case, but I'm not that optimistic. You can't just load the individual texels you need, because the performance would be very bad. But if you instead load at least part of the texture (say, in 4 KB blocks), then the amount of data you might need can be very large. Not to mention that main memory is not that good at random access: if you read randomly from a large data set, you'll be lucky to get 10% (or even less) of the theoretical memory bandwidth.
From what I've seen, VT is not something that just "takes programmer time." It's likely to need a huge amount of engineering effort, and in the end you might still get random stuttering when the stars don't align.

Trials ran at a smooth 60fps generations ago, so it's fine, especially with NVMe drives. Multiple things actually run some scale of virtualized texturing already: both Assassin's Creed and the (defunct?) recent Tom Clancy open-world games ran landscape virtualized texturing off hard drives, and no one noticed. I'd hardly be surprised if Snowdrop and the updated AC engine both have full virtualized texturing. Heck, AFAIK the Matrix City UE5 demo runs full VT, and it works on a Series S.
 
Interestingly, for next-gen games one is bound to need more than what is in the direct view frustum for GI, so a system that predicts which surfaces might contribute to lighting indirectly, through reflections and light bounces, has to exist regardless. The preemptive caching might just hitch a ride on that...
Yeah. It's my GI system that motivates me to explore all this unique-detail / procedural material composition stuff.
If it works, it would compensate for the downside of requiring a global parametrization, with the related offline processing time hurting production iteration times.
But now I regret having jumped into this rabbit hole so naively... <: )

However, I can answer which surfaces contribute to indirect lighting: all of them. \:D/
(I'm always puzzled by GI solutions that get away with using only partial scenes.)
 
Trials ran at a smooth 60fps generations ago, so it's fine, especially with NVMe drives. Multiple things actually run some scale of virtualized texturing already: both Assassin's Creed and the (defunct?) recent Tom Clancy open-world games ran landscape virtualized texturing off hard drives, and no one noticed. I'd hardly be surprised if Snowdrop and the updated AC engine both have full virtualized texturing. Heck, AFAIK the Matrix City UE5 demo runs full VT, and it works on a Series S.

They are definitely not loading all of their textures directly from main memory (except on Xbox, where main memory and VRAM are the same thing), not to mention storage.
 
They are definitely not loading all of their textures directly from main memory (except on Xbox, where main memory and VRAM are the same thing), not to mention storage.

They are generating the final textures at runtime, and possibly streaming in base textures from the HDD.
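
A hedged sketch of what "generating the final textures at runtime" can mean for a landscape runtime virtual texture (this is not the actual Assassin's Creed or Snowdrop implementation; all types and helpers below are hypothetical): when a page is requested, blend streamed base layers by a splat mask into the physical cache, so only low-resolution sources plus the mask need to come off disk.

```cpp
// Toy runtime page composition: blend two tiling base layers by a splat weight.
// In a real engine this runs on the GPU into the VT physical cache; here it is a
// CPU sketch with stubbed-in samplers, just to show the data flow.
#include <cstdint>
#include <cstdio>
#include <vector>

constexpr int kPage = 128;  // assumed page size in texels

struct Texel { uint8_t r, g, b; };

// These stand in for streamed base textures and the splat map.
Texel sampleGrass(int, int) { return {60, 120, 40}; }
Texel sampleRock (int, int) { return {110, 105, 100}; }
float sampleSplat(int x, int y) { return (x + y) % kPage / float(kPage); }  // fake blend mask

std::vector<Texel> composePage(int pageX, int pageY) {
    std::vector<Texel> out(kPage * kPage);
    for (int y = 0; y < kPage; ++y)
        for (int x = 0; x < kPage; ++x) {
            int wx = pageX * kPage + x, wy = pageY * kPage + y;   // world-space texel
            float w = sampleSplat(wx, wy);
            Texel a = sampleGrass(wx, wy), b = sampleRock(wx, wy);
            out[y * kPage + x] = {                                 // lerp(a, b, w)
                uint8_t(a.r + (b.r - a.r) * w),
                uint8_t(a.g + (b.g - a.g) * w),
                uint8_t(a.b + (b.b - a.b) * w) };
        }
    return out;
}

int main() {
    auto page = composePage(3, 7);
    printf("composed %zu texels for page (3, 7)\n", page.size());
    return 0;
}
```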
 
Wasn't it texture decompression that was being accelerated? That had nothing to do with streaming in itself; correct me if I'm mistaken.
 
Yes. GPU transcoding. I don't recall it having much actual benefit in practice.
If you had a lopsided system with a weaker CPU it would aid performance. The issue is, the game was limited to 60 FPS max, so you would never really see it. With mods unlocking the frame rate it would be more apparent.

IMO idTech 5 is pretty bad in terms of design goals: driven fully by Carmack's fixations and the desire for 60 FPS on console no matter how awful the game looked.
 
Senior technical artist at CD Projekt Red discussing his interest in implementing some of the ideas discussed here in UE5.


There are lots of names to describe this: Voronoi with custom texture, tile sampler (in Substance Designer), grid 2D structure, etc. The user inputs an atlas of shapes or textures, and they get scattered over a parameterized space.
The idea here is that we assume the landscape gets rendered into a runtime virtual texture, so we can hopefully get away with a complex, runtime procedurally textured material. The overall cost is unclear at this point.
In theory the user could scatter ground detail like stones, sticks, grass blades, leaves, etc., in an almost (or fully) non-repeating manner.
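
A rough sketch of the scattering idea being described (terminology and constants are assumptions, not the actual UE5 or Substance Designer node): hash each cell of a grid over the parameterized space, pick a random atlas entry and a jittered placement per cell, and use the resulting local coordinates to sample the chosen atlas shape at the shading point.

```cpp
// Grid-cell scattering sketch: each cell deterministically picks one atlas tile and a
// jittered placement from a hash, so detail effectively never repeats while staying
// cheap to evaluate per texel. The atlas lookup itself is left out.
#include <cstdint>
#include <cstdio>
#include <cmath>

// Small integer hash -> [0,1). Any decent hash works; this one is illustrative.
float hash01(uint32_t x, uint32_t y, uint32_t salt) {
    uint32_t h = x * 0x8da6b343u ^ y * 0xd8163841u ^ salt * 0xcb1ab31fu;
    h ^= h >> 13; h *= 0x165667b1u; h ^= h >> 16;
    return (h & 0xffffffu) / float(0x1000000u);
}

struct ScatterSample { int atlasIndex; float localU, localV; };

// uv: position in the parameterized space; cellSize: scatter density; atlasCount: shapes available.
ScatterSample scatter(float u, float v, float cellSize, int atlasCount) {
    uint32_t cx = uint32_t(std::floor(u / cellSize));
    uint32_t cy = uint32_t(std::floor(v / cellSize));
    // Per-cell random choice of shape and jittered center.
    int   index   = int(hash01(cx, cy, 1) * atlasCount);
    float centerU = (cx + hash01(cx, cy, 2)) * cellSize;
    float centerV = (cy + hash01(cx, cy, 3)) * cellSize;
    // Local coordinates inside the scattered shape; 0.5 = shape center.
    return { index, (u - centerU) / cellSize + 0.5f, (v - centerV) / cellSize + 0.5f };
}

int main() {
    ScatterSample s = scatter(12.34f, 56.78f, 0.25f, 16);
    printf("atlas tile %d, local uv (%.2f, %.2f)\n", s.atlasIndex, s.localU, s.localV);
    return 0;
}
```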
 
Yes, but you can't just load, let's say, 10 texels from a texture and expect to achieve max throughput at the same time. So in the end you either have to load much more of the texture or accept much lower throughput.
That's why I think VT might help in some cases, such as reducing latency in some situations, but you can't expect VT to let you stream everything from main memory. For example, in an indoor scene you might want to keep most indoor textures in VRAM but keep outdoor textures in main memory, so you can stream the outdoor textures that some outdoor pixels may need while the majority of the textures for indoor pixels are still readily available in VRAM.
Now, some might ask whether it's possible to make a caching scheme that achieves this automatically. It could be, but simple schemes are likely to cause cache thrashing in some situations, which can be difficult to avoid.

[EDIT] One thing to clarify: page-based VT is always a useful feature, because it's obviously better to be able to load just part of a texture instead of the full data.
I don't think a virtual texturing system will ever read just 10 texels. It will read an entire page of texels, which is at least 4 KB. I expect systems then pull the data into VRAM and read it from there until a heuristic results in the page being evicted from VRAM. I'm sure there are multiple ways to implement virtual texturing, but this is how I understand it to work.
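
A minimal sketch of the eviction heuristic described above (assumed structures, not a specific engine's code): an LRU list over physical cache slots, where touching a page moves it to the front and the least-recently-used page is recycled when a new page has to come in.

```cpp
// Minimal LRU page cache sketch: a fixed number of physical slots, with pages keyed by a
// (texture, mip, page) id packed into one integer. Illustrative only.
#include <cstdint>
#include <cstdio>
#include <list>
#include <unordered_map>

class PageCache {
public:
    explicit PageCache(size_t slots) : capacity_(slots) {}

    // Returns true if the page was already resident; otherwise "loads" it, evicting
    // the least-recently-used page when the cache is full.
    bool touch(uint64_t pageId) {
        auto it = lookup_.find(pageId);
        if (it != lookup_.end()) {                       // hit: move to front
            lru_.splice(lru_.begin(), lru_, it->second);
            return true;
        }
        if (lru_.size() == capacity_) {                  // miss: evict LRU page
            printf("evict page %llu\n", (unsigned long long)lru_.back());
            lookup_.erase(lru_.back());
            lru_.pop_back();
        }
        lru_.push_front(pageId);                         // stream the new page in
        lookup_[pageId] = lru_.begin();
        return false;
    }

private:
    size_t capacity_;
    std::list<uint64_t> lru_;                                        // front = most recently used
    std::unordered_map<uint64_t, std::list<uint64_t>::iterator> lookup_;
};

int main() {
    PageCache cache(2);
    cache.touch(1); cache.touch(2);
    cache.touch(1);          // hit: page 1 becomes most recent
    cache.touch(3);          // miss: evicts page 2
    return 0;
}
```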
 
I don't think a virtual texturing system will ever read just 10 texels. It will read an entire page of texels, which is at least 4 KB. I expect systems then pull the data into VRAM and read it from there until a heuristic results in the page being evicted from VRAM. I'm sure there are multiple ways to implement virtual texturing, but this is how I understand it to work.

Yes, of course. I just wanted to point out that expecting a virtual texturing system to provide textures for an entire scene in real time is unrealistic.
 