HDR+AA...Possible with the R520?

DemoCoder said:
I think FP10 blending is going to be artifact central. You already get artifacts with 8-bit precision, and going from 8 to 7 bits makes things much worse. Remember, we used to harp about artifacts with 16-bit color when the green channel had 5-6 bits of precision. With FP16 shaders, we were seeing artifacts with just a 10-bit mantissa on *short* shaders, which closely resemble what a few HDR framebuffer blends will give you in terms of accumulated error. Moreover, without a sign bit, you can't use an FP buffer for RTT operations that create vertices or particles (well, you can, but you have to bias yourself in the vertex shader). I would hope that both FP10 and FP16 are supported. FP10 is a hack right now because of bandwidth issues: it trades off precision for range, but both are needed for next-gen titles. It is not a good tradeoff like FP32->FP24 was, IMHO.
Also, FP10 is apparently s5e3, with a maximum of 32. This makes the minimum representable number 2^-3, or 1/8 (besides zero, of course). I just don't understand how the format can be really useful without being capable of resolving dim colors.
 
How many actual blends are we going to see on any given surface in a typical scene?

3? 10? 50?

The argument against FP16 and FP24 in pixel shaders was primarily about multiple passes as far as I can tell.

Jawed
 
Jawed said:
How many actual blends are we going to see on any given surface in a typical scene?
Well, it really depends upon the scene. Some worst-case scenarios would include scenes with lots of foliage, or lots of particle effects. Particle effects in particular would be really bad, since you could have thousands of particles on the screen at once, and you'd have to swap buffers between each one if you want it to render correctly.
 
Just look at Call of Duty 2 for an example.

Jawed said:
The argument against FP16 and FP24 in pixel shaders was primarily about multiple passes as far as I can tell.

No, that would make little sense, since FP16/FP24/FP32 values are truncated to 8 bits during multipass without HDR (which no one was using back when those arguments were made).

No, the artifacts in FP16 PS come from ill-conditioned expressions yielding too much accumulated error, especially in dependent texturing, where a calculated value is used to look up a texture. But if you think a 10-bit mantissa is bad, try truncating that to 7 bits using FP10 intermediate storage.
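
To get a feel for what that truncation costs, here's a minimal sketch (a plain round-to-nearest stand-in, not a model of actual hardware rounding) that quantizes a value to a 10-bit versus a 7-bit mantissa:

[code]
#include <math.h>
#include <stdio.h>

/* Round x to the nearest value representable with 'mbits' mantissa bits
 * (implicit leading one). Just an illustration of storage precision;
 * it ignores exponent range limits and denormals entirely. */
static float quantize_mantissa(float x, int mbits)
{
    if (x == 0.0f) return 0.0f;
    int e;
    float m = frexpf(x, &e);               /* x = m * 2^e, 0.5 <= m < 1 */
    float scale = ldexpf(1.0f, mbits + 1);
    return ldexpf(roundf(m * scale) / scale, e);
}

int main(void)
{
    float x = 0.7f;                        /* an arbitrary colour value */
    float q10 = quantize_mantissa(x, 10);  /* FP16-style mantissa */
    float q7  = quantize_mantissa(x, 7);   /* FP10-style mantissa */
    printf("original: %.7f\n", x);
    printf("10-bit:   %.7f (error %.7f)\n", q10, fabsf(q10 - x));
    printf(" 7-bit:   %.7f (error %.7f)\n", q7,  fabsf(q7 - x));
    return 0;
}
[/code]

The spacing between representable values at 7 bits is eight times coarser than at 10 bits, and that initial error is what every subsequent blend or dependent lookup then amplifies.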
 
DemoCoder said:
Just look at Call of Duty 2 for an example.

Which, apparently, doesn't use HDR...

No, that would make little sense, since FP16/FP24/FP32 values are truncated to 8 bits during multipass without HDR (which no one was using back when those arguments were made).

My bad. Rather than "multipass" I should have said "multiple calculations", e.g. with the error building up from lots of divides.

Jawed
 
Jawed said:
DemoCoder said:
Just look at Call of Duty 2 for an example.

Which, apparently, doesn't use HDR...

I use it as an example of a game with a next-gen particle system, a system which wouldn't work very well if it used FP10.


My bad. Rather than "multipass" I should have said "multiple calculations", e.g. with the error building up from lots of divides.

There is no difference between multi-pass and multiple calculations in terms of error buildup. A+B in FP16 won't have more error than A+B in FP10 via two passes. That's why I said we have seen artifacts in *short* FP16 shaders. Division (reciprocal, really; there is no DIV operator) is not the problem anyway; it's subtraction that's the really dangerous operation.

FP10 is useful if you're going to stay away from a lot of blending, or if you're not going to use it as RTT input to another shader pass. But if you do lots of blending, you'll end up with the same problem as FP16 in shaders. Certainly, a mega-blended particle system with HDR is not going to look good, or rather, it won't look as good as 8-bit.
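
As a rough illustration of how that plays out with blending, here's a sketch that repeatedly adds a small contribution into a framebuffer value, re-quantizing to a 10-bit or 7-bit mantissa after every blend. The quantizer is a simple round-to-nearest stand-in (not real blend hardware), and the 0.01 contribution over 1000 blends is an arbitrary example:

[code]
#include <math.h>
#include <stdio.h>

/* Round x to the nearest value representable with 'mbits' mantissa bits
 * (implicit leading one); ignores exponent range and denormals. */
static float quantize_mantissa(float x, int mbits)
{
    if (x == 0.0f) return 0.0f;
    int e;
    float m = frexpf(x, &e);
    float scale = ldexpf(1.0f, mbits + 1);
    return ldexpf(roundf(m * scale) / scale, e);
}

int main(void)
{
    const int   blends  = 1000;    /* e.g. many overlapping particles */
    const float contrib = 0.01f;   /* per-particle contribution       */
    double reference = 0.0;        /* high-precision reference sum    */
    float fb10 = 0.0f, fb7 = 0.0f; /* 10-bit vs 7-bit mantissa buffer */

    for (int i = 0; i < blends; ++i) {
        reference += contrib;
        fb10 = quantize_mantissa(fb10 + contrib, 10);
        fb7  = quantize_mantissa(fb7  + contrib, 7);
    }
    printf("reference: %f\n", reference);
    printf("10-bit:    %f (error %f)\n", fb10, fabs(fb10 - reference));
    printf(" 7-bit:    %f (error %f)\n", fb7,  fabs(fb7 - reference));
    return 0;
}
[/code]

The 7-bit buffer stalls as soon as the step between representable values grows past the per-particle contribution, so a heavily blended particle system simply stops getting brighter; the 10-bit buffer keeps accumulating with far smaller error.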
 
I don't understand why subtraction would be more dangerous than division, at least not if done in linear space.

But apart from that, significant errors are going to arise from multiplication/division (where you add the error per term per operation) or powers (where you multiply the error per term per operation).

Jawed
 
Jawed said:
I don't understand why subtraction would be more dangerous than division, at least not if done in linear space.

But apart from that, significant errors are going to arise from multiplication/division (where you add the error per term per operation) or powers (where you multiply the error per term per operation).

Jawed

For addition and subtraction, absolute errors add. For mul/div, relative errors add. But for subtraction, you can get cancellation, and cancellation can eliminate almost all the accuracy in your result. The fewer bits you have to represent numbers, the greater the proportionate damage from cancellation, since it is the most significant bits that cancel, leaving only the least significant (and least accurate) ones behind.

Now, you can argue that it is unlikely people will do subtractive blends. But that's a different argument. Of course, cancellation is just the worst case you want to avoid. But you still get error accumulation via additive/multiplicative blends.
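
A quick sketch of the cancellation case, using the same kind of round-to-nearest quantizer as a stand-in for storage precision (1.002 and 1.001 are just an arbitrary pair of nearly-equal inputs):

[code]
#include <math.h>
#include <stdio.h>

/* Round x to the nearest value representable with 'mbits' mantissa bits
 * (implicit leading one); ignores exponent range and denormals. */
static float quantize_mantissa(float x, int mbits)
{
    if (x == 0.0f) return 0.0f;
    int e;
    float m = frexpf(x, &e);
    float scale = ldexpf(1.0f, mbits + 1);
    return ldexpf(roundf(m * scale) / scale, e);
}

int main(void)
{
    /* Two nearly-equal values whose difference is the interesting quantity. */
    const float a = 1.002f, b = 1.001f;

    float d10 = quantize_mantissa(a, 10) - quantize_mantissa(b, 10);
    float d7  = quantize_mantissa(a, 7)  - quantize_mantissa(b, 7);

    printf("true difference: %f\n", a - b);
    printf("10-bit inputs:   %f\n", d10);  /* survives, small relative error */
    printf(" 7-bit inputs:   %f\n", d7);   /* both inputs round to the same
                                              value, so the difference is 0 */
    return 0;
}
[/code]

At 7 bits the two inputs collapse to the same representable number, so the difference, the only information that mattered, is gone completely; at 10 bits it survives with a couple of percent relative error.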
 
Floating-point numbers with 10-bit precision...

Puuhh... I remember you guys flaming NVIDIA's FX12 precision, which is about 8 times more precise :)

And a 7-bit mantissa is even worse than plain 8-bit RGB encoding! Plus, we have tone mapping, lots of scaling, etc. Do you really think 7 bits are enough?
 
Xmas said:
Chalnoth said:
Also, FP10 is apparently s5e3, with a maximum of 32. This makes the minimum representable number 2^-3, or 1/8 (besides zero, of course). I just don't understand how the format can be really useful without being capable of resolving dim colors.
http://www.beyond3d.com/forum/viewtopic.php?p=539541#539541
Right, we talked about this, and I thought about it some more, and I'm pretty sure this is correct (although it should be s6e3, yes).

With 3 exponent bits, you have 8 distinguishable exponents. For 32 to be the maximum of the range, you need the largest number to be roughly 2*2^4. So the smallest representable number would be 1*2^-3, or 1/8.
 
Zengar said:
Puuhh... I remember you guys flaming NVIDIA's FX12 precision, which is about 8 times more precise :)

Nobody was flaming the fact that NVIDIA supported FX12; they flamed the fact that it replaced code that used FP32 with code that used FX12, and the fact that IQ was reduced.

FP10 blending offers another option for developers; they might not use it or even find it useful. But as long as it is not forced on anyone, it is really not a problem.
 
DemoCoder said:
FP10 is useful if you're going to stay away from a lot of blending, or if you're not going to use it as RTT input to another shader pass. But if you do lots of blending, you'll end up with the same problem as FP16 in shaders. Certainly, a mega-blended particle system with HDR is not going to look good, or rather, it won't look as good as 8-bit.
Er, rather, if you do lots of blending, FP10 is going to behave much like an old-style 16-bit framebuffer.
 
Chalnoth said:
Right, we talked about this, and I thought about it some more, and I'm pretty sure this is correct (although it should be s6e3, yes).

With 3 exponent bits, you have 8 distinguishable exponents. For 32 to be the maximum of the range, you need the largest number to be roughly 2*2^4. So the smallest representable number would be 1*2^-3, or 1/8.
That's without denorms. With denorms, the smallest representable number is 2^-6 * 2^-2 or 1/256.
 
Ah, hrm, when I looked up denorms, the description I found said that all zeroes for the number means the number is zero (as opposed to 1*2^0). But I can see how that implementation would be useful.
 
Chalnoth said:
Ah, hrm, when I looked up denorms, the description I found said that all zeroes for the number means the number is zero (as opposed to 1*2^0). But I can see how that implementation would be useful.
With denorms, an exponent of all zeroes means that there is no implicit one, and the exponent is one higher.
So in this case, 001 is exponent -2 with implicit one, and 000 is exponent -2 without implicit one.
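
Putting that together, here's a small decoder for a hypothetical s6e3 channel using exactly that convention (field 001 means exponent -2 with the implicit one, 000 means a denormal at exponent -2). It's a sketch of the format as discussed in this thread, not a claim about what the actual hardware does:

[code]
#include <math.h>
#include <stdio.h>

/* Decode one channel of a hypothetical s6e3 value: 1 sign bit, 3 exponent
 * bits, 6 mantissa bits. Exponent fields 001..111 map to exponents -2..4
 * with an implicit one; field 000 is a denormal at exponent -2. */
static float decode_s6e3(unsigned bits)
{
    unsigned sign = (bits >> 9) & 0x1;
    unsigned exp  = (bits >> 6) & 0x7;
    unsigned man  =  bits       & 0x3F;
    float value;

    if (exp == 0)                                   /* denormal, no implicit one */
        value = (man / 64.0f) * 0.25f;              /* mantissa * 2^-2           */
    else                                            /* normal, implicit one      */
        value = (1.0f + man / 64.0f) * ldexpf(1.0f, (int)exp - 3);

    return sign ? -value : value;
}

int main(void)
{
    printf("largest value:     %f\n", decode_s6e3(0x1FF)); /* 0 111 111111 -> ~31.75 */
    printf("smallest normal:   %f\n", decode_s6e3(0x040)); /* 0 001 000000 -> 0.25   */
    printf("smallest denormal: %f\n", decode_s6e3(0x001)); /* 0 000 000001 -> 1/256  */
    return 0;
}
[/code]

With this particular bias the maximum comes out just under 32 and the smallest normal is 1/4 rather than 1/8, but the denormals extend the bottom end down to the 1/256 figure above.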
 
DemoCoder said:
I think FP10 blending is going to be artifact central. You already get artifacts with 8-bit precision, and going from 8 to 7 bits makes things much worse. Remember, we used to harp about artifacts with 16-bit color when the green channel had 5-6 bits of precision. With FP16 shaders, we were seeing artifacts with just a 10-bit mantissa on *short* shaders, which closely resemble what a few HDR framebuffer blends will give you in terms of accumulated error. Moreover, without a sign bit, you can't use an FP buffer for RTT operations that create vertices or particles (well, you can, but you have to bias yourself in the vertex shader). I would hope that both FP10 and FP16 are supported. FP10 is a hack right now because of bandwidth issues: it trades off precision for range, but both are needed for next-gen titles. It is not a good tradeoff like FP32->FP24 was, IMHO.

You should be able to dynamically adjust between FP10 and FP16 depending on the need.

So while in some situations you might get artifacting with FP10, you can use FP16 instead.

Honestly, if it can provide me with good HDR 90% of the time and high levels of FSAA over FP16, it's great.
 
Could someone (Dave, Chalnoth?) please write an article about FP16, FP24, FP32 for shaders and FP10, FP16 for HDR? I'm still confused about all the different terminologies and would like a basic outline of what's what.

I'm pretty sure other people would appreciate it too.

US
 
I'd prefer an RGBE framebuffer over such crappy FP formats all the way. If I need the framebuffer for math accumulation, I'd use an FP16 or FP32 buffer (support for them should be there, even if slow due to bandwidth, IMHO, so we can really use the hw for all sorts of apps).

If I need the framebuffer just for ordinary use, meaning the scene image and accumulating some blended meshes, then an RGBE buffer would definitely suffice. That's 32 bits for RGB, i.e. roughly 11 bits per component, but each one effectively gets its own 16 bits (more or less, depending on how you split the mantissa and exponent :D).

In my software renderers I use this, and banding isn't really visible.
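
For reference, a minimal shared-exponent encode/decode in the spirit of Ward's RGBE (8-bit mantissa per channel plus one shared 8-bit exponent, bias 128); it's the textbook scheme rather than exactly the split I described above:

[code]
#include <math.h>
#include <stdio.h>
#include <string.h>

/* Pack a linear RGB triple into 4 bytes: three 8-bit mantissas sharing one
 * 8-bit exponent (bias 128), along the lines of Ward's RGBE format. */
static void rgb_to_rgbe(const float rgb[3], unsigned char rgbe[4])
{
    float maxc = fmaxf(rgb[0], fmaxf(rgb[1], rgb[2]));
    if (maxc < 1e-32f) { memset(rgbe, 0, 4); return; }

    int e;
    float scale = frexpf(maxc, &e) * 256.0f / maxc; /* maps maxc to just under 256 */
    rgbe[0] = (unsigned char)(rgb[0] * scale);
    rgbe[1] = (unsigned char)(rgb[1] * scale);
    rgbe[2] = (unsigned char)(rgb[2] * scale);
    rgbe[3] = (unsigned char)(e + 128);
}

static void rgbe_to_rgb(const unsigned char rgbe[4], float rgb[3])
{
    if (rgbe[3] == 0) { rgb[0] = rgb[1] = rgb[2] = 0.0f; return; }
    float f = ldexpf(1.0f, (int)rgbe[3] - (128 + 8));
    rgb[0] = rgbe[0] * f;
    rgb[1] = rgbe[1] * f;
    rgb[2] = rgbe[2] * f;
}

int main(void)
{
    float hdr[3] = { 5.0f, 0.25f, 0.1f }, back[3];
    unsigned char packed[4];
    rgb_to_rgbe(hdr, packed);
    rgbe_to_rgb(packed, back);
    printf("in:  %f %f %f\n", hdr[0], hdr[1], hdr[2]);
    printf("out: %f %f %f\n", back[0], back[1], back[2]);
    return 0;
}
[/code]

The obvious catch is the shared exponent: a channel much dimmer than the brightest one loses most of its mantissa bits, which is why this works nicely for a scene image but isn't a great target for arbitrary accumulation.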

Well, well, we'll see.
 