DirectStorage GPU Decompression, RTX IO, Smart Access Storage

The tweet could have expressed this better. These posts are showing noise from the lack of texture mip filtering: the sampling across mip levels uses dithering, which introduces noise into textures that, until now, have had none. As pointed out in some of the comments, we are taking scenes that are already noisy from low sample counts and now adding noise to the textures on top. There won't be a stable pixel on the screen!

The question is how to get around this. Can mip levels be introduced to the encoding?
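
To make the dithering point concrete, here's a minimal C++ sketch (all names are mine, nothing from the actual implementation) of the difference between a deterministic blend across mip levels and a dithered single-level pick that is only correct on average:

#include <cmath>

// Toy point sample of one mip level so the sketch stands alone.
float sampleMip(int level, float u, float v) {
    return std::sin(u / float(1 << level)) * std::cos(v / float(1 << level));
}

// Conventional filtering across levels: a deterministic, noise-free blend.
float blendedAcrossLevels(float lod, float u, float v) {
    int   lo = int(std::floor(lod));
    float f  = lod - float(lo);
    return (1.0f - f) * sampleMip(lo, u, v) + f * sampleMip(lo + 1, u, v);
}

// Dithered: pick ONE level per pixel using a dither value in [0,1).
// The average over many pixels/frames matches the blend above, but each
// individual pixel now carries noise that something like DLSS has to
// integrate away over time.
float ditheredAcrossLevels(float lod, float u, float v, float dither) {
    int   lo = int(std::floor(lod));
    float f  = lod - float(lo);
    return sampleMip(dither < f ? lo + 1 : lo, u, v);
}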
This is something I have already expressed concerns about earlier. That tweet is not entirely right, by the way: it does use mip-maps, but it just point-samples them instead of filtering linearly/trilinearly. It does use a sampling pattern that converges to cubic interpolation under DLSS, which in theory is better than linear, so that is nice. But DLSS does not get rid of all the noise, especially under magnification.

If your question is whether you can introduce linear sampling like with normal texture sampling, then the answer is yes, but you have to decode multiple texel values before you interpolate, because you can't interpolate the parameters before they go into the neural network. That will increase the cost significantly, probably making it unusable. But IMHO the noise makes it unusable anyway. I think the original paper downplays the amount of noise you need to get rid of. Yes, it works in these simple scenes, but especially in engines like UE, where artists can author custom materials, you get situations with material aliasing that will add even more noise.
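
For reference, the "decode first, interpolate after" order looks roughly like this in C++ (toy decoder and latent fetch, purely illustrative): four decoder evaluations for one bilinearly filtered sample instead of one, and proportionally more for trilinear.

#include <algorithm>
#include <array>
#include <cmath>

constexpr int kLatentDim = 8;
using Latent = std::array<float, kLatentDim>;
using Rgb    = std::array<float, 3>;

// Toy latent grid so the sketch stands alone.
Latent fetchLatent(int x, int y) {
    Latent z{};
    for (int i = 0; i < kLatentDim; ++i)
        z[i] = std::sin(0.7f * x + 1.3f * y + i);
    return z;
}

// Stand-in for the trained decoder: one hidden ReLU layer with fixed toy
// weights. The point is only that it is nonlinear, so it does NOT commute
// with linear interpolation of its inputs.
Rgb decode(const Latent& z) {
    float h[4] = {};
    for (int j = 0; j < 4; ++j) {
        for (int i = 0; i < kLatentDim; ++i)
            h[j] += z[i] * 0.1f * float((i + j) % 3 - 1);
        h[j] = std::max(h[j], 0.0f);                     // ReLU
    }
    Rgb c{};
    for (int k = 0; k < 3; ++k)
        for (int j = 0; j < 4; ++j)
            c[k] += 0.25f * h[j];
    return c;
}

// Bilinear filtering done in the only valid order: decode all four
// contributing texels, THEN blend the decoded colors. 4x the cost of a
// single point sample; trilinear doubles it again.
Rgb filteredDecode(float u, float v) {
    int x = int(std::floor(u)), y = int(std::floor(v));
    float fx = u - float(x), fy = v - float(y);
    Rgb c00 = decode(fetchLatent(x,     y));
    Rgb c10 = decode(fetchLatent(x + 1, y));
    Rgb c01 = decode(fetchLatent(x,     y + 1));
    Rgb c11 = decode(fetchLatent(x + 1, y + 1));
    Rgb out{};
    for (int i = 0; i < 3; ++i)
        out[i] = (1 - fy) * ((1 - fx) * c00[i] + fx * c10[i])
               + fy       * ((1 - fx) * c01[i] + fx * c11[i]);
    return out;
}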
 
does use mip-maps, but it just point-samples them instead of filtering linearly/trilinearly
It's sampling the latent feature mip-maps at two different levels, 4 points each.

That does permit the network to effectively do trilinear / bicubic filtering, but it's not using the mip-maps in the way they're intended, which is to provide a cutoff in frequency space as close to the required frequency as possible. Instead it's only using them as a reference point to ensure that features 2-3 octaves lower don't get completely missed. They're also only used as network inputs, so the chance that the trained network happens to actually do proper filtering is low.
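
For what it's worth, that input scheme reads roughly like this (my naming, schematic, not the paper's code). Note that nothing here computes a filtered value; the network just receives enough data that it could in principle learn to filter:

#include <array>
#include <cmath>
#include <vector>

constexpr int kLatentDim = 4;
using Latent = std::array<float, kLatentDim>;

// Toy latent fetch at (level, x, y) so the sketch stands alone.
Latent fetchLatent(int level, int x, int y) {
    Latent z{};
    for (int i = 0; i < kLatentDim; ++i)
        z[i] = std::sin(0.5f * x + 0.9f * y + level + i);
    return z;
}

// Gather 2 levels x 4 footprint texels = 8 point-sampled latents, plus the
// fractional position, and hand everything to the network as raw inputs.
std::vector<float> gatherNetworkInputs(float u, float v, float lod) {
    std::vector<float> in;
    int l0 = int(std::floor(lod));
    for (int level = l0; level <= l0 + 1; ++level) {
        float su = u / float(1 << level);   // uv in this level's texel grid
        float sv = v / float(1 << level);
        int x = int(std::floor(su)), y = int(std::floor(sv));
        for (int dy = 0; dy <= 1; ++dy)
            for (int dx = 0; dx <= 1; ++dx) {
                Latent z = fetchLatent(level, x + dx, y + dy);
                in.insert(in.end(), z.begin(), z.end());
            }
    }
    // Fractional offsets the network would need in order to blend; whether
    // the trained weights actually implement a proper filter is up to
    // training, since nothing in the architecture enforces it.
    in.push_back(u - std::floor(u));
    in.push_back(v - std::floor(v));
    in.push_back(lod - float(l0));
    return in;                               // 8 * kLatentDim + 3 floats
}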
Then the answer is yes, but you have to decode multiple texel values before you interpolate, because you can't interpolate the parameters before they go into the neural network. That will increase the cost significantly, probably making it unusable.
All good if you can batch that properly. The cost is not in the arithmetic (not the tensor evaluation, let alone the activation function), nor in the register pressure from the hidden layers. It's primarily in streaming the tensor between layers.

So if you can reuse the tensor before it's gone from the cache, the overhead is acceptable.
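
Schematically, the reuse looks like this (toy, scalar C++; a real version would be a shader with tensor ops, but the loop order is the point): the weights for a layer are streamed once per batch and then reused across every texel in it.

#include <vector>

// One fully connected layer over a batch of texels:
// out[b] = relu(W * in[b] + bias). With the batch loop on the inside, each
// weight row is streamed once and then reused `batch` times while it is
// still in cache; with batch == 1 the whole tensor is re-streamed per texel.
void layerBatched(const std::vector<float>& W,     // rows * cols weights
                  const std::vector<float>& bias,  // rows
                  const std::vector<float>& in,    // batch * cols inputs
                  std::vector<float>& out,         // batch * rows outputs
                  int rows, int cols, int batch) {
    out.assign(size_t(batch) * rows, 0.0f);
    for (int r = 0; r < rows; ++r) {               // stream each weight row once
        for (int b = 0; b < batch; ++b) {          // ...reuse it across the batch
            float acc = bias[r];
            for (int c = 0; c < cols; ++c)
                acc += W[size_t(r) * cols + c] * in[size_t(b) * cols + c];
            out[size_t(b) * rows + r] = acc > 0.0f ? acc : 0.0f;  // ReLU
        }
    }
}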

It would not help, though. If you look closely, the noise dominates so badly in the upper 1-2 octaves that only at the point where a feature is close to being supported by the "latent feature" mip-map does it turn stable enough that a reasonably low sample count could produce an improvement.

The high-frequency features appear to be incorrect in both frequency and phase; only the amplitude is somewhat correct. Only the features up to one octave above the support points of the mip-map appear to have been reconstructed reasonably well.
 