Quick question about HDR: RELOADED

JaylumX

Newcomer
I have another question for you 3D gurus out there. Before I start: I in no way have the knowledge that the majority of you guys have, so if my question sounds a bit off, please be gentle. Also, I am aware of the multiple threads here about fixed point versus floating point, but I hope I have asked this question from a different perspective.

Here it is [with the obligatory introduction, of course].

The question relates to HDR, and particularly ATI's implementation of HDR, or as I am beginning to understand, LDR, which uses some sort of "ping pong" method and some weird arse stuff with the pixel shaders [I am aware that the HDR technique Nvidia uses, FP blending with the licensed OpenEXR image format, is 16 bit per channel - 64 bit].

Now, currently ATI's implementation stores brightness information [or some bollocks] using fixed point textures rather than floating point textures, yada, yada, yada, which gives a clipped dynamic range in comparison to floating point, but still offers better range than the bog standard 8 bit. Now, theoretically speaking, if ATI's F-Buffer was exposed in Direct3D, could the HDR information be stored in it, considering that the F-Buffer deals with colour channels of 32 bit, or is this an impossibility? Even if ATI only exposed the F-Buffer in OpenGL [which I am aware is more plausible than it being exposed in Direct3D, since M$ don't support specific hardware features outside of DX], the HDR brightness information could be stored in there. If possible, could that present comparable image quality to FP blending, and if so, would it take a performance hit of numerous magnitudes? I ask this because the "ping pong" method uses the pixel shaders for post processing, and considering the F-Buffer's primary concern with colour channels, is this a viable solution?

Cheers

JaylumX
 
AFAIK, OPENEXR is a storage format. It isn't a feature of NVidia cards or drivers, since it has no support in DirectX or OpenGL. It is simply a way of storing data, which must then be decoded and used via standard API functionality.

Anyways, I don't think that the F-Buffer could help since it doesn't necessarily store fullscreen data. Also, there is nothing to indicate that the data can be directly manipulated outside of the shader program it was created for.
 
The OpenEXR name covers several things, which causes no end of confusion. OpenEXR could mean any of the following:

* The 16 bit floating point datatype (though this is really called half; people mix the names up)
* The image file format, which among other things can contain pixels of the half datatype (which is the default)
* The library to read/write said files
* The whole library, including related math functions, that you get if you download the source distribution

The one most pertinent to real-time applications is the half datatype. It's the exact same format as fp16, which is to say 1 sign bit / 5 exponent bits / 10 mantissa bits. Whenever you create a texture or render target of 16 bit data, or create a half in a shader, that's what you are working with. It's just a spec. No one owns it per se, just as no one owns the IEEE 32 bit floating point format. It just exists, and everyone uses it. To my knowledge, it was developed originally by ILM, then filtered into GPU land.
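If you want to see that bit layout for yourself, here's a minimal sketch using Python's standard library, whose struct module supports the same IEEE 754 binary16 ("half") format via the 'e' pack code:

```python
import struct

def half_bits(value: float) -> str:
    """Pack a float as IEEE 754 half precision and split out its bit fields."""
    (raw,) = struct.unpack("<H", struct.pack("<e", value))  # 'e' = binary16
    bits = f"{raw:016b}"
    sign, exponent, mantissa = bits[0], bits[1:6], bits[6:]
    return f"{sign} {exponent} {mantissa}"

# 1.0 encodes as sign 0, biased exponent 01111 (15), mantissa all zeros.
print(half_bits(1.0))   # 0 01111 0000000000
print(half_bits(-2.0))  # 1 10000 0000000000
```

The 1 / 5 / 10 split is the whole spec in a nutshell; everything GPUs call fp16 or half is exactly this encoding.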

No graphics card company implements HDR directly. They can't. It's not an algorithm or a feature. It's a concept whose use is enabled by the existence of other features. In the most naive description, HDR requires a storage and processing datatype that is capable of representing dynamic ranges (ratio of largest to smallest value) that exceed that of an 8-bit unsigned integer (the current standard pixel format). ATI has supported those since the 9700. Every card since then has fp16 and fp32 texture formats, and the shader arithmetic is floating point as well.
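To put a rough number on that "ratio of largest to smallest value," here's a back-of-the-envelope sketch comparing an 8-bit unsigned integer against the fp16 format (using the largest finite half, 65504, and the smallest positive normal half, 2^-14):

```python
# Largest-to-smallest representable ratio: 8-bit unsigned vs. fp16 normals.
u8_range = 255 / 1                    # 8-bit integer: 255:1
fp16_max = 65504.0                    # largest finite half value
fp16_min_normal = 2.0 ** -14          # smallest positive normal half
fp16_range = fp16_max / fp16_min_normal

print(f"8-bit : {u8_range:.0f}:1")
print(f"fp16  : {fp16_range:.0f}:1")  # on the order of a billion to one
```

That gap (255:1 versus roughly 10^9:1, and even wider once denormals count) is the whole reason floating point formats come up at all in HDR discussions.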

Now, for some applications, being able to do that is sufficient to say that the card supports HDR (again, in this nebulous concept form). When we begin talking about games, we start making other assumptions that impose more features. What all current ATI cards are missing is blending on floating point render targets. Meaning that after you rasterize a pixel, you can set some arithmetic operation to combine that pixel value with the value already in memory. This is set through API state flags and happens after the pixel shader. Anything with particle systems, glass, water, clouds, multipass rendering, or numerous other effects that games are expected to have needs this feature.
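A small sketch of why the render target format matters for that blend stage. This simulates an additive blend (the kind particle systems use) into a fixed-point 8-bit buffer versus a floating point one; the function names and the "ten particles" scenario are illustrative, not any real API:

```python
def blend_add_u8(dst: int, src: int) -> int:
    """Fixed-point additive blend: the result clamps at the 8-bit ceiling."""
    return min(dst + src, 255)

def blend_add_float(dst: float, src: float) -> float:
    """Floating-point additive blend: bright values keep accumulating."""
    return dst + src

dst_u8, dst_fp = 0, 0.0
for _ in range(10):            # ten overlapping bright particles at one pixel
    dst_u8 = blend_add_u8(dst_u8, 200)
    dst_fp = blend_add_float(dst_fp, 200.0 / 255.0)

print(dst_u8)  # 255 -- saturated after the second particle, brightness lost
print(dst_fp)  # ~7.84 -- still proportional to the total incoming light
```

With the 8-bit buffer, everything past the clamp is indistinguishable white; the floating point buffer preserves the actual intensity for later tone mapping. That's what blending on floating point render targets buys you, and it's exactly the post-shader stage ATI's hardware couldn't do in floating point.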

The ping-pong refers to developers manually doing that blending. Render to RT1, then on the next pass set RT1 as a texture, blend in your shader, and output to RT2. Then swap RT1 and RT2 and do it again, and again for every object you need to blend. Again, this is not ATI's implementation of HDR (which isn't something you can directly implement); it's how clever developers have achieved blending of floating point render targets without having specific API support for it.
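The loop structure of that scheme can be sketched in a few lines; here two plain Python lists stand in for the render targets, and each "pass" reads the previous result and writes the blend into the other buffer (buffer names and sizes are made up for illustration):

```python
# Two float buffers standing in for RT1 and RT2. Each pass binds the
# previous result as a texture and writes the blended output to the other.
WIDTH = 4
rt = [[0.0] * WIDTH, [0.0] * WIDTH]
read, write = 0, 1

layers = [[0.5] * WIDTH, [0.25] * WIDTH, [1.5] * WIDTH]  # blended objects
for layer in layers:
    for x in range(WIDTH):
        # The "shader" does the blend itself: previous RT value + fragment.
        rt[write][x] = rt[read][x] + layer[x]
    read, write = write, read          # swap RT1 and RT2 for the next pass

print(rt[read])  # [2.25, 2.25, 2.25, 2.25] -- sum of all three layers
```

One full render-target switch per blended object is the key cost to notice here; hardware blending does the same accumulation without any of those switches.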

The problem with this ping-pong is that it's really slow. Changing render targets and shaders are relatively heavy operations for the 3D runtime; they stall the pipeline a lot. Which is why they aren't really usable for games of any reasonable scene complexity, like an FPS.

Now, regarding the F-buffer, this doesn't solve the current problem. The F-buffer was created to get around the maximum lengths of pixel shaders. A long pixel shader could be broken down into parts small enough to be run on the current hardware. Each part would be able to read from a block of memory where results from the previous pass would be stored and write to a block of memory for the next pass. From a D3D-level API standpoint, nothing changes except shaders that wouldn't compile because they were too long now do. Everything would be handled internally.

This doesn't really gain us anything because the drivers would still have to effectively ping-pong in order to guarantee the right results. If you've ever tried to do a gaussian blur reading/writing the same image, you'll understand that the pixels already done will affect new pixels being operated on. This gets even messier with texture caches and the like.
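The read-while-writing hazard is easy to demonstrate with a tiny 3-tap box blur; this hypothetical example runs it once against a snapshot of the input and once directly against the buffer being written:

```python
def box_blur(pixels, in_place: bool):
    """3-tap box blur; reading the buffer you are writing corrupts results."""
    src = pixels if in_place else list(pixels)  # out-of-place: snapshot first
    for i in range(1, len(pixels) - 1):
        pixels[i] = (src[i - 1] + src[i] + src[i + 1]) / 3.0
    return pixels

data = [0.0, 0.0, 3.0, 0.0, 0.0]
print(box_blur(list(data), in_place=False))  # [0.0, 1.0, 1.0, 1.0, 0.0]
print(box_blur(list(data), in_place=True))   # already-blurred pixels feed back in
```

In the in-place run, pixel 1's freshly written value leaks into pixel 2's average, and so on down the line, which is exactly why the driver would have to fall back to two buffers (i.e., ping-pong internally) to guarantee correct results.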

So what does this all mean? Well, since the F-buffer would have to ping-pong internally due to previous results affecting latter results, we still have the same problem. The existence of the F-buffer, a construct to allow arbitrarily long shaders, has no implications on whether or not blending on floating point surfaces is possible. So, we are stuck in the same place.

Hope this helps.
 
That was an excellent reply, sw, and one that should be posted to just about every SM3/HDR thread in existence in some fora. Cheers!
 
Thanks. It's a 10,000 foot view of stuff. I keep meaning to write up some decent articles on what it takes to be HDR in any respectable sense, cut through the marketing blah, and cover what exactly the implications of it are (not only for games, but beyond). That's the part that still has me impressed. It goes way beyond just having blooms around the lights.

It's such a nebulous thing, it's hard for me to figure out where to start. It's a blanket term that covers everything from acquisition, processing, and color calibration to relations to LDR, display, and much more. And I don't really have much time, as I'm doing a masters and working for Sunnybrook Technologies.

But, if people have any specific questions, feel free to ask. I'll do my best to answer what I can.
 
You can do HDR with ping-pong; it doesn't become LDR. It's just that, well, if you have 100 overlapped transparent triangles (as in 100 triangles on the same pixel), it's going to require at least 100 passes, and you're probably going to get like 1 FPS. And it's a pain trying to separate the triangles into screen space layers, so you'll most likely end up with an excessive number of passes anyway.

If you do ping-pong via brute force, you have to do a pass for every triangle. Now imagine what that would be like with 10K particle quads.
 
Would multiple render targets help the performance of the ping-pong method, or is that totally unrelated to HDR?
 
If you could set things up such that polygons were drawn in batches whose size equals the number of render targets, one could potentially send each polygon to a separate render target and then blend all of them in a single pass. Of course, all this would do is reduce the number of passes. It would still remain very inefficient.
 
Pete said:
That was an excellent reply, sw, and one that should be posted to just about every SM3/HDR thread in existence in some fora. Cheers!

Hehe. I'm going to paste it everywhere and claim it as my own!!! Excellent post man. :)
 
Ping-ponging is only needed when blending. Anything that is completely opaque doesn't need it. Ideally, the number of ping-pongs is equal to the depth complexity of blended objects in the scene. In most cases, this number is actually quite low. I think I recall that, with all its culling help, Unreal Tournament's depth complexity was 6 or so. The only problem is that it is very hard to figure out the ordering exactly in terms of draw calls, so you end up having to do a lot more to be safe.

Multiple render targets are mostly unrelated. Whether you have 1, 2 or more render targets has no impact on whether or not you can do blending on any of them. Though, I know for a fact that on ATI cards, it's faster to output to 2 2-channel render targets in some cases than 1 4-channel render target.
 