Custom resolve demo

Texture filtering and mipmapping are only concerned with producing a correct HDR image. What happens to that image later on is irrelevant to how you produce the correct HDR image.
[...]
Then you can of course tonemap it to garbage if you like, but you can do the same with any HDR photo too.
I understand what you're saying and agree to an extent. My concern is really that a "bad" tone mapping operator could screw up texture "anti-aliasing" by introducing frequencies above the band limit imposed during the pre-filtering process.

For instance, consider a tone mapping operator that is a step function. Applying this to the HDR image will naturally produce aliasing due to the infinite spatial frequencies it introduces. While this is certainly a "garbage" tone mapping function, it is the kind of thing I mean by an "aggressive" function... exponentials will approach it at high exposures, for instance.
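As a silly illustration, something like this (hypothetical HLSL, not anything from the demo):

float3 StepToneMap(float3 hdr, float threshold)
{
    // 0 below the threshold, 1 above it, per channel: a hard edge with
    // unbounded spatial frequencies wherever a smooth gradient crosses it.
    return step(threshold, hdr);
}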

So my point is that if we consider the ideal to be super-sampling (doing everything, tone mapping the super-sampled buffer, then finally down-sampling the resulting post-tone-mapped image), tone mapping is still in the wrong place in the process... i.e. it will not give the same result and indeed can introduce new spatial frequency content into the image.

Anyways I'm not 100% sure of my thinking here (just brainstorming) and I'm pretty sure that for most "reasonable" tone mapping functions there won't be a big problem as I stated earlier.

That said, it does seem to me like a poor tone mapping function could spell aliasing hell onto the underlying image. I guess Humus you're saying that that's purely a problem with the tone mapping function and that's fine, but it doesn't change the underlying warnings about highly non-linear tone mapping functions.

Now of course you can also argue that geometry already introduces infinite frequencies into the image, although indeed MSAA is designed to mitigate that somewhat :) In the same way that putting tone mapping in the "wrong spot" relative to MSAA can cause problems with high contrast regions and non-linear tone mapping, so can putting it in the wrong place with respect to texture filtering (consider rendering black and white checkerboard *geometry* vs. *texture*).

Again, I'm not sure that it's a huge issue in practice, as you see extremely bright, high-contrast lit textures (maybe snow?) less often than objects rendered against the sky. Similarly, however, I don't see any edge aliasing problems in the demo - even with the normal MSAA resolve - except for objects against the bright sky. In any case where the pre-MSAA-resolve tone map would make a difference with one piece of geometry overlaid on another, I'm pretty sure that a pre-texture-filter tone map would make an equivalent difference.

Anyways just food for thought...
 
The rendering of the scene is the exact same between the two. It's still standard multisampled rendering, so the pixel shader is only executed once per pixel, rather than per sample. The only thing that differs is the resolve: the traditional way calls Resolve() and then passes the result through a tonemap shader, while the custom resolve samples the render target and tonemaps it sample by sample.
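Roughly, the custom resolve pass does something like this (a simplified sketch, not the demo's actual source; the buffer names, exposure parameter and 4x sample count are assumptions):

Texture2DMS<float4, 4> HdrBuffer;     // assumed 4x multisampled HDR render target
float Exposure;                       // assumed exposure constant

float3 ToneMap(float3 hdr)
{
    float3 c = hdr * Exposure;
    return c / (1.0 + c);             // simple photographic-style curve, for illustration only
}

float4 ResolvePS(float4 pos : SV_Position) : SV_Target
{
    int2 coord = int2(pos.xy);
    float3 sum = 0;
    // Tonemap each sample individually, then average the LDR results.
    for (int i = 0; i < 4; i++)
        sum += ToneMap(HdrBuffer.Load(coord, i).rgb);
    return float4(sum * 0.25, 1.0);
}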
Thanks - I was just baffled by the enormous performance difference with nothing in view except the background texture. :)
 
I think Andrew's point is that tonemapping can introduce the same kind of aliasing we see in specularity (which is, in a way, strikingly like a "strong tonemapping"). Specularity isn't, in itself, the cause of the problem.

The only solution is to oversample at the time of computing the specularity/tonemapping.

Normal mapping is, sort of, a mild version of specularity in this sense, which might be why there is normal-map aliasing in the demo (something Andrew reports - I can't run it).

Jawed
 
I think Andrew's point is that tonemapping can introduce the same kind of aliasing we see in specularity (which is, in a way, strikingly like a "strong tonemapping"). Specularity isn't, in itself, the cause of the problem.
Yes indeed... sorry if I got to that in a roundabout way, but the problem is not with texture filtering in and of itself, it's that tone mapping can introduce aliasing, since it violates some of the anti-aliasing assumptions that are made earlier in the pipeline. Technically any post-process can do this (the threshold-based bloom techniques are notorious...), but tone mapping certainly has a lot of potential to do it as you're mapping huge ranges down to small ranges.

Normal mapping is, sort of, a mild version of specularity in this sense, which might be why there is normal-map aliasing in the demo (something Andrew reports - I can't run it).
Yeah speaking of specularity, there was a clever paper a while ago that ran through the texture filtering math and figured out a good way to adjust the intensity/exponent of specular highlights based on how denormalized the normal map vector had become by mipmapping/filtering. Certainly a special case (only applies to Phong illumination), but it'd be interesting to see if something similar could apply to tone mapping... I've seen mention on GameDev.net of local tone mapping operators that dynamically decrease the exposure near high-contrast features, which seems to be directly attacking the problem of which I'm speaking.
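From memory, the gist of that trick is roughly the following (a sketch only, not the paper's exact math, and all names are mine):

float SpecularWithFilteredNormal(float3 filteredN, float3 halfVec, float specPower)
{
    // The shorter the filtered normal, the more the underlying normals
    // disagreed, so broaden and dim the specular highlight accordingly.
    float len      = length(filteredN);                     // < 1 where mipmapping averaged disagreeing normals
    float ft       = len / (len + specPower * (1.0 - len)); // "roughness" factor, roughly
    float newPower = ft * specPower;                        // reduced exponent in rough regions
    return ft * pow(saturate(dot(normalize(filteredN), halfVec)), newPower);
}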

So while I certainly like the custom MSAA resolve approach (it's cheap and looks much better!), I think down the road we may need to look at the problem more generally, and we may have to bite the bullet and look at local tone mapping operators more seriously.
 
Since these kinds of aliasing are a particular problem during movement, it would be nice to see what happens when rendering techniques cache prior-frame data for re-use in the current frame. By re-using prior frame samples and comparing them against current frame samples you can get a reduced-cost supersampling - the cost being associated with this caching. You could then put adaptivity into the algorithm, such as aliasing-detection (edge-detection) and multiple-frame caching, I guess.
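Very roughly, something like this per pixel (purely hypothetical names and weights, just to illustrate the idea):

float3 TemporalResolve(sampler2D currentTex, sampler2D historyTex, sampler2D motionTex,
                       float2 uv, float threshold)
{
    float3 current = tex2D(currentTex, uv).rgb;
    float2 motion  = tex2D(motionTex,  uv).xy;                            // assumed per-pixel motion vectors
    float3 history = tex2D(historyTex, uv - motion).rgb;                  // previous frame, reprojected
    float  reuse   = length(history - current) < threshold ? 0.75 : 0.0;  // reject disocclusions etc.
    return lerp(current, history, reuse);                                 // also written back as next frame's history
}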

Jawed
 
Since these kinds of aliasing are a particular problem during movement, it would be nice to see what happens when rendering techniques cache prior-frame data for re-use in the current frame. By re-using prior frame samples and comparing them against current frame samples you can get a reduced-cost supersampling - the cost being associated with this caching. You could then put adaptivity into the algorithm, such as aliasing-detection (edge-detection) and multiple-frame caching, I guess.

Jawed

It seems to me that such methods could really break AFR scaling, or am I missing something?
 
It seems to me that such methods could really break AFR scaling, or am I missing something?
I suppose it depends on whether the GPUs working in AFR have a uniform memory system with decent bandwidth (is this feasible?). Certainly AFR that we've seen so far would make this kind of technique "incompatible".

It'll be sad if AFR-incompatibility seriously constrains the types of algorithms used for rendering (algorithms that work fine on single GPUs). Have any developers made any significant comments on this subject? All we seem to have seen so far is bluster from the IHVs. It seems they're so keen on AFR that they're willing to sing from the same hymnbook, quite literally...

Also, I can't help wondering if AFR-incompatibility for rendering techniques would make developers more keen on "CPU-rendering" - would we see AFR as a technique used to enable the scaling of "multiple CPUs" bolted together? Seems doubtful to me, I would expect the "CPU"s (say 2 or more Larrabees) to be indistinguishable from one CPU.

Stuff for another thread I think...

Jawed
 
I suppose it depends on whether the GPUs working in AFR have a uniform memory system with decent bandwidth (is this feasible?). Certainly AFR that we've seen so far would make this kind of technique "incompatible".
The trade-off would be that per-GPU bandwidth and performance would suffer, just as it would in a multi-socket CPU setup.
If the implementation became less accurate, there are a number of knobs that could be turned to increase performance.
Perhaps the implementation would only look at frames more than two or three back, or would try to combine data from multiple frames into a more compact, though possibly lossy form.

The latter would help with the bandwidth problem, while the former would lessen the serialization penalty that AFR enforces. If the programmer assumes that nobody will notice that the blur data is a few frames out of date (and hey, it's a blur), it becomes significantly more likely that the needed data will be ready, since the timing margins are more slack.

It'll be sad if AFR-incompatibility seriously constrains the types of algorithms used for rendering (algorithms that work fine on single GPUs). Have any developers made any significant comments on this subject? All we seem to have seen so far is bluster from the IHVs. It seems they're so keen on AFR that they're willing to sing from the same hymnbook, quite literally...
AFR does make the GPU makers' job somewhat easier when it comes to showing high multi-chip scaling. It's like making every workload a SPECrate benchmark.

I read in an interview, however, that Vista currently caps the number of frames that can be rendered ahead. The ATI guy in the interview seemed rather wistful that Vista didn't let them render way ahead of themselves.

Also, I can't help wondering if AFR-incompatibility for rendering techniques would make developers more keen on "CPU-rendering" - would we see AFR as a technique used to enable the scaling of "multiple CPUs" bolted together? Seems doubtful to me, I would expect the "CPU"s (say 2 or more Larrabees) to be indistinguishable from one CPU.
I don't believe any of the problems outlined for multi-GPU setups are any different from those of any other multi-socket NUMA CPU setup.
If 2 Larrabees are supposed to be indistinguishable from a single CPU without various forms of memory allocation gymnastics such as duplicated memory and NUMA-aware policies, they will likely be indistinguishable from a single, but oddly slower, unit.

Larrabee on its own board in a PCI-E slot would be pretty much in the same boat as any other card setup without some additional interconnect like the SLI or Crossfire cables (once again something that's been done for decades in other industries that gamers rarely encounter).
 
So my point is that if we consider the ideal to be super-sampling (doing everything, tone mapping the super-sampled buffer, then finally down-sampling the resulting post-tone-mapped image), tone mapping is still in the wrong place in the process... i.e. it will not give the same result and indeed can introduce new spatial frequency content into the image.

Not sure what you mean with "wrong place". Where would you want to put it? Like in your "ideal" case? Then that actually matches exactly what this demo does, except it's multisampled rather than supersampled.

That said, it does seem to me like a poor tone mapping function could spell aliasing hell onto the underlying image.

Surely. And the same could happen to a photo. Like for instance this stupid example of a tonemap operator: result = color + frand(). ;)

I guess Humus you're saying that that's purely a problem with the tone mapping function and that's fine, but it doesn't change the underlying warnings about highly non-linear tone mapping functions.

Agreed. I like the photographic tonemapping because it's very stable and predictable and generally produces good results, and is quite cheap too. There are more advanced tonemap operators that look at histograms and build custom curves, but I'm not convinced they are better. I wonder how well they perform on non-typical input, where the curve might not end up very smooth.
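For reference, a common form of the photographic operator looks roughly like this (written from memory, not lifted from the demo's source; key, white and avgLum are the usual user/scene parameters):

float3 PhotographicToneMap(float3 hdr, float avgLum, float key, float white)
{
    float  Lw = dot(hdr, float3(0.2126, 0.7152, 0.0722));    // pixel luminance
    float  L  = key * Lw / avgLum;                           // scale by the "key" and scene average
    float  Ld = L * (1.0 + L / (white * white)) / (1.0 + L); // compress, burning out above 'white'
    return hdr * (Ld / max(Lw, 1e-6));                       // rescale colour by the luminance ratio
}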
 
Thanks - I was just baffled by the enormous performance difference with nothing in view except the background texture. :)

That's when the difference becomes the biggest. With very little actual rendering, the cost of the resolve becomes the main performance determinant. With heavy rendering the difference diminishes as rendering cost becomes the bottleneck.
 
Not sure what you mean with "wrong place". Where would you want to put it? Like in your "ideal" case? Then that actually matches exactly what this demo does, except it's multisampled rather than supersampled.
Indeed, I'm considering the "ideal" to be super-sampling... i.e. not even doing texture filtering, but dealing with that via screen-space super-sampling. This isn't particularly viable, but it would effectively address any "aliasing" that the tone mapping introduced, as well as texture filtering. I guess my point is more that "great, moving it before the MSAA resolve solves the problem for geometry edges, but not for texture edges". Admittedly, the latter isn't exactly an easy problem to solve.

Surely. And the same could happen to a photo. Like for instance this stupid example of a tonemap operator: result = color + frand(). ;)
Hehe indeed... *runs off to try the brilliant new tone mapping operator, copyright Humus 2008* ;)

Agreed. I like the photographic tonemapping because it's very stable and predictable and generally produces good results, and is quite cheap too. There are more advanced tonemap operators that look at histograms and build custom curves, but I'm not convinced they are better. I wonder how well they perform on non-typical input, where the curve might not end up very smooth.
Yeah I agree, and it'll be an interesting space to explore. It seems like there might be some utility in looking at neighbourhoods and local tone mapping, but I'll leave that to the experts to figure out :) In the meantime, I do like the results of the tone mapping operator that you chose... many others that I've played with tend to desaturate or over-darken certain regions.
 
I've seen mention on GameDev.net of local tone mapping operators that dynamically decrease the exposure near high-contrast features, which seems to be directly attacking the problem of which I'm speaking.

So while I certainly like the custom MSAA resolve approach (it's cheap and looks much better!), I think down the road we may need to look at the problem more generally, and we may have to bite the bullet and look at local tone mapping operators more seriously.

Absolutely. In Photoshop you can tonemap with local adjustments. You can often produce better-looking images this way, especially if the general contrast ratio in the image is large. It allows you to keep an appealing contrast across the entire image, without blowing any part of the image into the blacks or whites. I would guess it could be tricky to estimate good input parameters for it, though. It's perhaps something I should look into. :)
 
That's when the difference becomes the biggest. With very little actual rendering, the cost of the resolve becomes the main performance determinant. With heavy rendering the difference diminishes as rendering cost becomes the bottleneck.

*doh* [slaps himself]
Right. Right.

It's a shame that I cannot think of any decent excuse for not seeing this myself. :)
 
Absolutely. In Photoshop you can tonemap with local adjustments. You can often produce better-looking images this way, especially if the general contrast ratio in the image is large. It allows you to keep an appealing contrast across the entire image, without blowing any part of the image into the blacks or whites. I would guess it could be tricky to estimate good input parameters for it, though. It's perhaps something I should look into. :)

Humus, if you are going to do this, i.e. local contrast adaptation, here is how to do it on a GPU:

1.) Do a custom mipmap reduction of the entire image, except instead of computing the average of 4 pixels, compute maximums and minimums (2 MRTs). This gives you the contrast range at different resolutions in the image. I would test using non-MRTs here and doing 2 interleaved passes (non-dependent draw calls might be faster). A rough sketch of this reduction pass is at the end of this post.

2.) Slightly smooth each mipmap level (or do inline in the next step).

3.) Do a custom reverse mipmap expansion (going from the smallest to the largest mipmap), taking a weighted average of the current mipmap level and the previously processed smaller level, for both the min and max chains. This will result in two screen-size textures, one for min and one for max, where each pixel is a weighted average of the contrast range at different resolutions. Choosing different weightings will vary the "localization" of the effect: give more accumulated weight to the smaller mipmap levels and you expand the localization; give more accumulated weight to the larger mipmap levels and the contrast range will adapt more to smaller details.

4.) Build a tonemap curve per pixel from a lookup in the min and max textures. The obvious (but ugly) method is to simply do:

float minValue = tex2D(minTex, offset).x;   // local minimum at this pixel
float maxValue = tex2D(maxTex, offset).x;   // local maximum at this pixel
// Rescale the pixel into its local contrast range
output.rgb = (input.rgb - minValue) * (1.0 / (maxValue - minValue));

Which will result in the HDR graying + halo look...

You could easily do something really nice here as well; the method I described will always produce "halos". A better approach is to use traditional photographic methods and stick to a single large gradient across the entire image...
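A rough sketch of the reduction pass from step 1, for reference (a single-output variant that packs min into .x and max into .y instead of using 2 MRTs; all names are assumptions):

sampler2D prevLevel;   // previous (larger) level; .x = min, .y = max
                       // (for the very first pass, point it at scene luminance with min = max)
float4 halfTexel;      // (+du, +dv, -du, -dv): half-texel offsets of the previous level

float4 ReduceMinMaxPS(float2 uv : TEXCOORD0) : COLOR0
{
    // Read the 2x2 block of the previous level that this texel covers.
    float2 a = tex2D(prevLevel, uv + halfTexel.xy).xy;
    float2 b = tex2D(prevLevel, uv + halfTexel.zy).xy;
    float2 c = tex2D(prevLevel, uv + halfTexel.xw).xy;
    float2 d = tex2D(prevLevel, uv + halfTexel.zw).xy;
    float lo = min(min(a.x, b.x), min(c.x, d.x));   // running minimum
    float hi = max(max(a.y, b.y), max(c.y, d.y));   // running maximum
    return float4(lo, hi, 0.0, 0.0);
}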
 
Absolutely. In Photoshop you can tonemap with local adjustments. You can often produce better-looking images this way, especially if the general contrast ratio in the image is large. It allows you to keep an appealing contrast across the entire image, without blowing any part of the image into the blacks or whites. I would guess it could be tricky to estimate good input parameters for it, though. It's perhaps something I should look into. :)
Dodge and burn for the win. Hmm, actually I tend to dislike it in photography because it's usually heavy-handed.

Shouldn't it be artist-controlled, in the end?

Jawed
 