HDR comparison, I16 vs FP16

Humus said:
Not really. It's not going to be 100% equivalent to what a bilinear filter would return on FP16, but that's not the point either. A bilinear filter isn't the most optimal way to reconstruct the underlying signal from the samples, so comparing error to that isn't that useful.
Replace bilinear with any other filtering mechanism, say sinc, and you will still get errors because of the maths rather than failures in the filtering method.

Kombatant said:
What makes you so certain?
If they can't do it for the full screen, how is it any more likely they can do it per triangle?
 
bloodbob said:
Replace bilinear with any other filtering mechanism, say sinc, and you will still get errors because of the maths rather than failures in the filtering method.

Errors or not, it doesn't matter except from a purely theoretical point of view. Bilinear has errors vs. the underlying signal too. What matters is whether it works well in practice and, in particular, whether it looks smooth, which it does.
 
Kombatant said:
What makes you so certain?
Because it requires a change in how pixels are generated for shading. If the support isn't there now, it's exceedingly unlikely that new drivers could possibly add the support.
 
For a PS3.0 card, selective supersampling is a pretty pointless feature anyway. You can achieve the same thing with much more fine-grained control from the application side by simply using flow control and derivatives.
 
Humus said:
For a PS3.0 card, selective supersampling is a pretty pointless feature anyway. You can achieve the same thing with much more fine-grained control from the application side by simply using flow control and derivatives.
Perhaps, if flow control is free, but even then I'd venture to guess that a hardware-accelerated form would be a fair amount faster.

That said, it'd be nice to see some hardware support for what amounts to selective supersampling within the shader. For example, the developer could define certain variables within the shader that would need to be calculated per-sample, and the shader compiler would calculate everything leading up to those variables per-sample as well.
 
Chalnoth said:
Because it requires a change in how pixels are generated for shading. If the support isn't there now, it's exceedingly unlikely that new drivers could possibly add the support.

Since (from what I gather from your post) you are not sure if the support is there or not, you can't make definitive statements, can you? Hence I was curious.
 
Humus said:
For a PS3.0 card, selective supersampling is a pretty pointless feature anyway. You can achieve the same thing with much more fine-grained control from the application side by simply using flow control and derivatives.

That assumes developers write shaders which don't alias, but most games today *aren't* PS3.0 games, and even bleeding-edge games don't dedicate spare GPU cycles to eliminating aliasing. The fact is, selective supersampling is damn useful for current games, so I'd say that "pointless" is a pretty harsh statement.


Kombatant,
To be correct and artifact-free, NVidia's technique requires a modification to centroid sampling in the presence of multisample masks, which prior NV cards didn't have. I don't know about ATI's cards.
 
It was a long time ago (back at the beginning of the R300) that there were discussions about ATI's centroid sampling, and as far as I remember, they can switch/change it in the driver. I don't know if it could be changed the way needed to support this feature, though.

But selective supersampling, multisampling, or no sampling at all should definitely be possible in the driver. If the framebuffer is set up for a certain sample count, there's no real difference anymore; the rest can be changed in software.

But possibly there's some tiny restriction somewhere which rules out the possibility :(

Well, well, we'll see how sampling routines evolve..



This topic is actually about the HDR stuff, though.


Humus, what you created is a sort of RGBE16, isn't it? The normal spec for RGBE is 8 bits per component, and yours is 16, but otherwise it sounds like it's equivalent..
 
davepermen said:
Humus, what you created is a sort of RGBE16, isn't it? The normal spec for RGBE is 8 bits per component, and yours is 16, but otherwise it sounds like it's equivalent..
I was thinking the same thing. However, Humus' version is more amenable to filtering, since the "mantissa" of the dominant colour component(s) will be at or near 1.0 throughout gradients. Using a bilinear filter on true RGBE textures may be a bit funky.

I think one way to get really nice results would be to use a regular RGBA8 texture, use the alpha channel to do a filtered lookup into a small 1D I16 exponential ramp texture (possibly tailored to each texture?), then multiply (rough sketch below).

What do you think, Humus?
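[Editor's note: a minimal CPU-side sketch of Chalnoth's ramp idea, in C. The 16-entry base-2 ramp and the texel values are made-up example parameters, and a real implementation would of course do the filtered lookup on the GPU.]

#include <math.h>
#include <stdio.h>

#define RAMP_SIZE 16

static float ramp[RAMP_SIZE];

/* Fill the ramp with an exponential curve; tailoring it per texture
   would just mean baking a different curve here. */
static void build_ramp(void)
{
    for (int i = 0; i < RAMP_SIZE; ++i)
        ramp[i] = powf(2.0f, (float)i); /* 1 .. 32768 */
}

/* Linearly filtered 1D lookup, emulating what the texture unit would do. */
static float sample_ramp(float a) /* alpha in [0, 1] */
{
    float x = a * (RAMP_SIZE - 1);
    int   i = (int)x;
    float f = x - (float)i;
    if (i >= RAMP_SIZE - 1)
        return ramp[RAMP_SIZE - 1];
    return ramp[i] * (1.0f - f) + ramp[i + 1] * f;
}

int main(void)
{
    build_ramp();
    /* Decode one RGBA8 texel: RGB scaled by the ramp value from alpha. */
    float r = 200 / 255.0f, g = 128 / 255.0f, b = 64 / 255.0f;
    float scale = sample_ramp(0.5f);
    printf("HDR texel: %f %f %f\n", r * scale, g * scale, b * scale);
    return 0;
}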
 
DemoCoder said:
That assumes developers write shaders which don't alias, but most games today *aren't* PS3.0 games, and even bleeding-edge games don't dedicate spare GPU cycles to eliminating aliasing. The fact is, selective supersampling is damn useful for current games, so I'd say that "pointless" is a pretty harsh statement.

Current games don't use selective supersampling either. And if games aren't going to spend spare GPU cycles to eliminate aliasing, how is it "damn useful"?

What I'm saying is that with flow control and derivatives you can easily implement supersampling yourself in the shader and control it from the application (including the exact sample points), meaning you can also set it to use only one sample if that's what the user wants, and you can select the level of supersampling dynamically across a surface as you see fit.

Chalnoth said:
Perhaps, if flow control is free, but even then I'd venture to guess that a hardware-accelerated form would be a fair amount faster.

Only in naive cases where you translate it directly. In the majority of cases you only need to supersample a subset of the computations. Say, for instance, you have a lighting computation: maybe the specular component introduces some aliasing, but the diffuse, ambient, lightmap, shadow mask projection, etc. do not. In that case you only need to supersample the specular part and do the rest only once per pixel, which would be a helluva lot faster than supersampling the entire shader, even with special hardware support.
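[Editor's note: to make the structure Humus describes concrete, here is a hedged sketch in plain C: the non-aliasing terms are evaluated once per pixel, and only the specular term is looped over an application-controlled number of subsamples. The lighting functions and the sample grid are hypothetical stand-ins, not anyone's actual shader.]

#include <math.h>
#include <stdio.h>

/* Hypothetical stand-ins for the lighting terms; a real shader would
   sample textures and normals here. */
static float diffuse_term(float x, float y)
{
    (void)x; (void)y;
    return 0.5f; /* smooth, doesn't alias */
}

static float specular_term(float x, float y)
{
    float s = sinf(40.0f * x) * sinf(40.0f * y); /* rapidly varying */
    return s > 0.0f ? s : 0.0f;
}

/* Shade one pixel at (x, y). 'samples' could be an application-set
   uniform, so one sample means supersampling is effectively off. */
static float shade_pixel(float x, float y, int samples)
{
    /* The terms that don't alias are computed once per pixel. */
    float base = diffuse_term(x, y);

    /* Only the specular term is evaluated on a subsample grid inside
       the pixel; in a real shader, derivatives (ddx/ddy) would give
       the footprint to place the samples in. */
    float spec = 0.0f;
    for (int i = 0; i < samples; ++i)
        for (int j = 0; j < samples; ++j)
            spec += specular_term(x + (i + 0.5f) / (float)samples,
                                  y + (j + 0.5f) / (float)samples);
    spec /= (float)(samples * samples);

    return base + spec;
}

int main(void)
{
    printf("1 sample: %f\n", shade_pixel(10.0f, 20.0f, 1));
    printf("4x4 grid: %f\n", shade_pixel(10.0f, 20.0f, 4));
    return 0;
}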
 
Humus said:
Only in naive cases where you translate it directly. In the majority of cases you only need to supersample a subset of the computations. Say, for instance, you have a lighting computation: maybe the specular component introduces some aliasing, but the diffuse, ambient, lightmap, shadow mask projection, etc. do not. In that case you only need to supersample the specular part and do the rest only once per pixel, which would be a helluva lot faster than supersampling the entire shader, even with special hardware support.
Well, right, hence my statements after what you quoted. That is to say, when doing supersampling, many of the computations can realistically be made at very low precision, such as subsample position computation and sample weighting. So it'd be far more efficient to leverage the existing multisample hardware for these computations.
 
davepermen said:
Humus, what you created is a sort of RGBE16, isn't it? The normal spec for RGBE is 8 bits per component, and yours is 16, but otherwise it sounds like it's equivalent..

It's similar, except that RGBE stores an exponent in alpha while this stores a linear range; maybe you could call it RGBR[ange] or RGBL[inear]. RGBE unfortunately doesn't work too well with filtering, while this method works quite well. RGBE looks OK in most parts of the image, but has enough aliasing in high-contrast areas to be useless in practice. RGBE requires more instructions to decode, though it needs less memory bandwidth.
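[Editor's note: for concreteness, a minimal side-by-side sketch in C of the two decodes being compared, assuming a Radiance-style RGBE layout (shared exponent in alpha, bias 128) versus a linear scale in alpha as Humus describes; MAX_RANGE is a made-up encoding parameter, not a value from his demo.]

#include <math.h>
#include <stdio.h>

#define MAX_RANGE 64.0f /* arbitrary upper bound chosen when encoding */

/* Classic RGBE: rgb scaled by 2^(e - 128). Interpolating the stored
   bytes across texels blends mantissas against different exponents,
   which is what breaks filtering. */
static void decode_rgbe(const unsigned char p[4], float out[3])
{
    float scale = ldexpf(1.0f, (int)p[3] - 128);
    for (int i = 0; i < 3; ++i)
        out[i] = p[i] / 255.0f * scale;
}

/* Linear-range variant: alpha is just a scale, so bilinearly filtering
   all four channels stays close to filtering the true HDR values. */
static void decode_rgb_range(const unsigned char p[4], float out[3])
{
    float scale = p[3] / 255.0f * MAX_RANGE;
    for (int i = 0; i < 3; ++i)
        out[i] = p[i] / 255.0f * scale;
}

int main(void)
{
    unsigned char texel[4] = { 255, 128, 32, 132 };
    float a[3], b[3];
    decode_rgbe(texel, a);
    decode_rgb_range(texel, b);
    printf("RGBE:      %f %f %f\n", a[0], a[1], a[2]);
    printf("RGB+range: %f %f %f\n", b[0], b[1], b[2]);
    return 0;
}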
 
Chalnoth said:
Well, right, hence my statements after what you quoted. That is to say, when doing supersampling, many of the computations can realistically be made at very low precision, such as subsample position computation and sample weighting. So it'd be far more efficient to leverage the existing multisample hardware for these computations.

Ah, I'm a bit too quick this morning; I see what you're saying now. Well, I'm not sure how easy that would be to implement, but it feels like squeezing more fixed function into a programmable pipeline to do something you can already do programmatically, so I'm not quite convinced that it's worth it.
 
Awesome work Humus. :) You're the Man!! :D

So when will we R3xx and R4xx owners see the advantage of this in the drivers?

Thx
US
 
Humus said:
Ah, I'm a bit too quick this morning; I see what you're saying now. Well, I'm not sure how easy that would be to implement, but it feels like squeezing more fixed function into a programmable pipeline to do something you can already do programmatically, so I'm not quite convinced that it's worth it.
Well, I just felt that it'd be nice to have access to the existing fixed-function hardware within the pipeline. Would also be nice for texture filtering, so that you could, for example, do anisotropic texture filtering on the result of specular lighting off of a bump map.
 
I'd prefer the discussion to stay on topic if you don't mind, because I find it very interesting.

As for TSAA, for the moment there's nothing that could convince me that it's "pointless"; what I definitely consider pointless are agendas.
 
Unknown Soldier said:
Awesome work Humus. :) You're the Man!! :D

So when will we R3xx and R4xx owners see the advantage of this in the drivers?

Thx
US

Everything discussed here so far has been application-side stuff, including the off-topic stuff. The driver can't just override FP16 with I16.
 