FP blending in R420?

We'll just have to wait and see. I'm just suggesting there is conflicting information available in other channels.
 
Maybe, but there were a sufficiently large number of people there, some of whom had direct involvement in the technology's implementation, who would gladly have said in front of 90 editors that it would support FSAA if it could.

However, regardless - in this instance bandwidth would become even more of an issue, as would memory footprint.
 
Only if the game isn't shader bound (is HL2 really bandwidth bound?). You're talking about a 2x bandwidth and memory increase (assuming framebuffer compression still works), which is equivalent to bumping up the resolution or AA.
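The 2x figure follows directly from the pixel sizes; a back-of-the-envelope check (the resolution is just an example):

```python
# Rough framebuffer footprint comparison: 8-bit integer vs FP16 color,
# ignoring compression, Z/stencil, and multiple render targets.

def framebuffer_bytes(width, height, bytes_per_pixel):
    """Raw color buffer size in bytes."""
    return width * height * bytes_per_pixel

w, h = 1600, 1200
int8_buf = framebuffer_bytes(w, h, 4)   # X8R8G8B8: 4 bytes/pixel
fp16_buf = framebuffer_bytes(w, h, 8)   # A16B16G16R16F: 8 bytes/pixel

print(fp16_buf / int8_buf)              # 2.0: double the footprint
# Every read-modify-write blend also moves twice the bytes, so raw
# bandwidth demand doubles the same way.
```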

Some people might prefer to run at 1024x768 with HDR and/or 2xAA if it gets rid of a lot of the banding artifacts which appear at 1600x1200 with 4xAA.

Are you telling me that HDR on the R300 and R420 is practically useless, and that all of the whining about its lack in the NV3x was misplaced?

I think the point of HDR is to be used in render to texture in most cases (as I said before). Valve did an HDR demo and it seemed to run just fine BTW.
 
DemoCoder said:
I think the point of HDR is to be used in render to texture in most cases (as I said before). Valve did an HDR demo and it seemed to run just fine BTW.

I know - they didn't use a floating point frame-buffer, which is what this topic is about. (And a render to texture would still operate with MSAA as well)
 
Render to texture doesn't do AA. MSAA rendertargets aren't supported nor is FP texture filtering.

Thus, if you render a scene to a render target and then, in a second pass, fetch that FP texture and blend it with your integer frame buffer, you will effectively be point sampling the previous non-FSAA frame buffer and rendering it into an integer MSAA buffer.

You'd have to render a 4x super-sampled FP texture, then do bilinear/box filtering yourself in the shader to avoid aliasing.

FP texture filtering and FP frame buffer blending is a *performance optimization* to Valve's multipass method, that's all.
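The manual filtering described above can be sketched like this, with Python standing in for the shader arithmetic (the tiny 2x2 texture and sample position are purely illustrative):

```python
# Manual bilinear filtering of a point-sampled FP texture, as a shader
# would have to do it on hardware without FP texture filtering.
# The "texture" here is a plain 2D list of float luminance values.

def texel(tex, x, y):
    """Point sample with edge clamping -- the only fetch FP formats allow."""
    h, w = len(tex), len(tex[0])
    x = min(max(x, 0), w - 1)
    y = min(max(y, 0), h - 1)
    return tex[y][x]

def bilinear(tex, u, v):
    """Four point samples plus a weighted average, done by hand."""
    x0, y0 = int(u), int(v)
    fx, fy = u - x0, v - y0
    top = texel(tex, x0, y0) * (1 - fx) + texel(tex, x0 + 1, y0) * fx
    bot = texel(tex, x0, y0 + 1) * (1 - fx) + texel(tex, x0 + 1, y0 + 1) * fx
    return top * (1 - fy) + bot * fy

hdr = [[0.0, 4.0],
       [8.0, 12.0]]
print(bilinear(hdr, 0.5, 0.5))  # 6.0: the average of all four texels
```

Four texture fetches plus the lerp math per output pixel is exactly the cost being traded against hardware FP filtering.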
 
Mintmaster said:
Does anyone know if R420 supports:
- FP16 blending?
- FP32 (well actually, FP24 internally) blending?
- Blending with other formats, like I16?
- FP filtering?
I'm sure the R420 only supports the pixel formats the R3xx supported.

And by the way, FP framebuffer blending is required for complete support of HDR rendering. If you want to blend while doing HDR rendering otherwise, you would need to write to a separate texture each time a draw call is made (and you'd have to guarantee that your draw call doesn't overlap itself), reading that texture into the next pass. The performance hit wouldn't be remotely bearable for such a situation.

The only way around this is to severely limit yourself as to what you can render when doing HDR rendering.
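The copy-per-draw scheme described above can be sketched in Python (one float per pixel for brevity; the function names are illustrative, not any real API):

```python
# Emulating "dst = src*a + dst*(1-a)" without framebuffer blending:
# the shader can only read a snapshot of the target, so every blended
# draw call needs its own copy-then-draw round trip.

def blended_draw(target, src, alpha):
    # Pass 1: copy the current target into a texture the shader can read.
    snapshot = list(target)
    # Pass 2: draw, fetching the snapshot in place of a hardware blend.
    for i, s in enumerate(src):
        target[i] = s * alpha + snapshot[i] * (1 - alpha)

framebuffer = [0.0, 1.0, 2.0]
blended_draw(framebuffer, [4.0, 4.0, 4.0], 0.5)
print(framebuffer)  # [2.0, 2.5, 3.0]
```

The copy in pass 1 is also why the draw call must not overlap itself: two overlapping triangles in one call would both read the stale snapshot.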
 
DemoCoder said:
Render to texture doesn't do AA. MSAA rendertargets aren't supported nor is FP texture filtering.

No, the render-to-texture operation doesn't, but the rest of the operation can. For instance, the rthdribl demo uses render to texture when FP render targets are supported and still utilises MSAA.
 
You're missing the point. After you've rendered your HDR texture, you need a pass to blend it into the regular backbuffer. The R300 only supports point sampling on HDR textures. This means you've lost any AA on the HDR texture from texture filtering or from MSAA, because your pixel shader will point sample one HDR texel, and write the same color value into all 4 MSAA samples.

This means you've got to super-sample the HDR image and do box or bilinear filtering in your pixel shader, which will eat into performance.
 
DaveBaumann said:
I know - they didn't use a floating point frame-buffer, which is what this topic is about. (And a render to texture would still operate with MSAA as well)

I think there is a misconception here. As far as I know, there is no such thing as a "floating point frame buffer" in any of the cards we're talking about. When doing HDR, you have to set up a high precision offscreen buffer and render to it. It is this buffer that you will now be able to blend into on the 6800.

After you have your scene rendered into your high precision buffer, you then have to do a final fragment program pass to map the 16-bit floats into the 8-bit fixed range of the frame buffer. This is usually referred to as "exposure" or "tone mapping".

It would be great to have additional precision in the frame buffer, but I don't think normal computer displays can handle 16 bits of dynamic range.
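A minimal sketch of that final exposure pass; the exponential operator below is just one common tone-mapping choice, not necessarily what any shipping demo uses:

```python
# Exposure/tone-mapping pass: map an HDR float value into the 0..255
# range of an 8-bit framebuffer channel. The exponential curve is one
# common operator; real renderers pick the exposure per scene.
import math

def tone_map(hdr_value, exposure=1.0):
    """Map [0, inf) HDR luminance to an 8-bit channel value."""
    ldr = 1.0 - math.exp(-hdr_value * exposure)   # compress to [0, 1)
    return int(ldr * 255 + 0.5)

print(tone_map(0.0))    # 0: black stays black
print(tone_map(100.0))  # 255: very bright values saturate cleanly
```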
 
Zeno said:
It would be great to have additional precision in the frame buffer, but I don't think normal computer displays can handle 16 bits of dynamic range.

Older-style monitors are analog, so the number of bits isn't a problem; the range is, though.
 
DaveBaumann said:
No, the render to texture operation doesn't, but the rest of the operation can. For instance the rthdribl demo used render to texture if FP render targets are supported and will still utilise MSAA.

I think you're mixing concepts here.

Rthdribl doesn't do multisampling in the sense of hardware multisampling.

As far as I know, the ARGB buffer is only needed for the final presentation (with tone mapping and blurring), and the fact that it can do multisampling (as in D3D multisample quality) doesn't count.
 
bloodbob said:
Older-style monitors are analog, so the number of bits isn't a problem; the range is, though.

It's not necessarily the number of bits that's the problem, is it? Fundamentally, monitors (TFT and CRT) are pretty limited in the dynamic range they can display. There's only so "off" they can be, and only so "on" they can be.

With modern TFTs the ratio of these two numbers is about 500:1, I think; presumably with CRTs it's somewhat greater, but still in the thousands:1, I'd guess. So that's effectively, what, 10-11 bits (integer) of dynamic range. Neither can display "OMFG :oops: that's bright, pass the sunglasses" kind of bright.

What's the dynamic range the human eye can deal with in a single scene - for example, standing outside in bright sunshine looking in through the door of a darkened building? You can still resolve details of the building's interior. I'll wager the dynamic range of lighting in this situation exceeds anything any monitor is capable of displaying, analogue or otherwise.
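The bit-depth figures above follow from taking log2 of the contrast ratio (the sunlit-scene ratio below is a rough order-of-magnitude assumption):

```python
# Effective integer bits of dynamic range implied by a display's
# contrast ratio: log2(white / black).
import math

def bits_of_range(contrast_ratio):
    return math.log2(contrast_ratio)

print(round(bits_of_range(500)))     # ~9 bits for a typical TFT
print(round(bits_of_range(2000)))    # ~11 bits for an optimistic CRT
print(round(bits_of_range(100000)))  # ~17 bits: sun outside, dark room
# The sunlit-exterior/dark-interior scene is assumed at ~10^5:1, which
# is why it's far beyond either kind of display.
```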
 
Mintmaster said:
Does anyone know if R420 supports:
- Blending with other formats, like I16?

Well since R300 supports that I assume R420 does too.
Both the HDR launch demo and Valve's HDR implementation are using A16R16G16B16 buffers - NOT float buffers.

And MSAA support for HDR targets requires two things:
- actual MSAA support for those formats (R300 misses this point...)
- StretchRect support for MSAA buffer -> plain texture conversion

And yes, there's a cap for every pixel format in D3D which can be queried to find out if there's blending support for that format - so yes, D3D supports FP16 blending.
 
WaltC said:
Mintmaster said:
It seems everyone has been caught up in either performance numbers or shader models. However, even Chalnoth was saying that FP-blending is likely the most important feature to be introduced this generation.

Does anyone know if R420 supports:
- FP16 blending?
- FP32 (well actually, FP24 internally) blending?
- Blending with other formats, like I16?
- FP filtering?

If they are supported, has anyone tested them?

But then I worry that nVidia had no official demos lined up to illustrate all of this amazingly important stuff like PS3.0 and fp blending, so that reviewers could run them, analyze them, and praise them, if deserving of it. I mean, if it's so "important" and so forth, then WHERE ARE THE IN-HOUSE nVIDIA DEMOS to illustrate it???? I think a bit of empirical evidence is definitely in order.


This is a point I thought of too whilst watching Nvidia's presentation.

If you've touted SM3.0 as the next big thing and spent many juicy engineering hours supporting it from the ground up, why not show us the money?

They did, by comparing SM1.x vs SM3.0.

It makes me suspect that there is no significant qualitative or performance advantage using SM3.0 over SM2.0 at this point.

Games need to actually run at decent frame rates to sell and SM2.0 is just starting to fledge.

It would make sense to get SM2.0 right, rather than tout SM3.0 to soothe the ego.

I see SM3.0 more of a nVidia PR exercise at this point.
 
Tic_Tac said:
It makes me suspect that there is no significant qualitative or performance advantage using SM3.0 over SM2.0 at this point.

I don't like the "SM" approach at all. We are talking two different things:

PS2.0 vs PS3.0
Don't expect any visual difference here. PS3.0 is more flexible, and it will be easier to develop on PS3.0-capable hardware - but at the end of the day (read: at the optimization stage) it might turn out that most of the shaders run faster with the ps_2_a profile.
So it won't affect consumers that much - if at all.

VS2.0 vs VS3.0
VS3.0 has very real advantages that allow better effects to be produced - but like every new thing it will probably take 1-2 years to be adopted, especially since only one of the vendors supports it now.

So my prediction is:
You'll see almost immediate PS3.0 adoption, but no visual difference, and there won't be any VS3.0 adoption in the lifetime of this generation of cards.
 
Wildcat Realizm supports direct output of an FP16 framebuffer, and programmable blending. With SuperScene AA, I guess.
 
Hyp-X said:
Mintmaster said:
Does anyone know if R420 supports:
- Blending with other formats, like I16?

Well since R300 supports that I assume R420 does too.
Are you sure? I read many times that you couldn't. I know I16 filtering is supported, but I didn't think blending was.

I remember reading this thread. It seems blending might be supported, but maybe not, and for some reason there's a discrepancy between OpenGL and DX9.

I also asked some ATI reps at ATI Mojo Day Reloaded (Montreal version), and they told me I16 blending wasn't supported. However, they were all DX people, so maybe they were commenting on the driver's current abilities and not the hardware's.

Both the HDR launch demo and Valve's HDR implementation are using A16R16G16B16 buffers - NOT float buffers.
Yeah, I knew that, but I didn't know what they were doing for blending. I assumed they just settled on a compromise, like doing alpha blending afterwards into the final X8R8G8B8 display buffer. You can't get HDR reflections (i.e. with glare, bloom, lens flares, etc.) off transparent glass, as one example of a drawback. You could counter this by drawing only the reflections into another I16 (or FP) buffer, but now we're getting into nasty ad-hoc solutions which Valve might be willing to go through, but most wouldn't.

How sure are you that I16 blending is supported? As I mentioned in a previous post, I believe you can do some very usable HDR with I16, so long as you don't sacrifice low bits for more range like in NVidia's FP16 vs I16 HDR screenshots (page 28 ). You'll sacrifice some high end range vs. FP16, but it'll really just mean less variation in bloom sizes and highlights, which is not easily perceptible IMHO if the art is done well.

OpenGL guy, can you help us figure out the blending capabilities of R300 and maybe even R420?
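The range-versus-precision trade-off Mintmaster mentions can be made concrete; the I16 fixed-point scale below (256 steps per unit of luminance) is an illustrative choice, not what any particular demo used:

```python
# Comparing an unsigned 16-bit fixed-point HDR encoding against FP16.
import math

I16_SCALE = 256.0
i16_max = 65535 / I16_SCALE     # brightest representable value
i16_step = 1 / I16_SCALE        # uniform precision everywhere

fp16_max = 65504.0              # largest finite FP16 value

def fp16_step_at(v):
    # FP16 spacing at magnitude v: 10 significand bits per binade.
    return 2.0 ** (math.floor(math.log2(v)) - 10)

print(i16_max)               # ~256: plenty of headroom for blooms
print(fp16_max / i16_max)    # ~256x more range for FP16
print(fp16_step_at(256.0))   # 0.25: far coarser than i16_step up there
```

So FP16 buys enormous extra range at the top end, while the fixed-point format keeps its fine, uniform steps everywhere - which is the basis of the claim that well-scaled I16 can look close to FP16 in practice.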
 
joe emo said:
"This sucks. R300->R420 is almost as bad as GF3->GF4."

I'm not so sure that's a bad thing. As I recall, when the NV30 was launched, the general consensus was that NVIDIA had hit one out of the park with the GF4 but struck out with the GF-FX. The GF4 was a damn fine product for its time -- the Radeon 8500, while supporting more features, was slower than the GF4 and therefore did not get the same "warm" market response the GF4 did.
Agreed, but I'm not asking for PS 3.0. VS 3.0 would be nice, but so long as some sort of uberbuffers type of capability (allowing render to vertex array) is standardized, I'll be happy.

It's just that FP blending is such an easy feature to implement, relatively speaking. Just two or four FP24 blending units is all I want. We're talking about a 16-pipe architecture capable of up to 32 FP24 vector math ops per clock, so the die cost would be absolutely minimal.
 
DaveBaumann said:
FP frame buffers. With NV40 fill-rate and bandwidth will be halved. There is no AA either. I suspect that these will be the long term limiting factors.
So what? Why do we need such insane framerates or super high resolutions anyway? They're a luxury because we have nothing else to do with our graphics cards. I want the best image possible at 60 fps, but that doesn't necessarily mean 1600x1200 at 6xAA. 800x600 is already better than DVD quality. I'll play a game at 1024x768, 800x600 or even 640x480 if it looks as good as the rthdribl demo.

AA isn't a big problem because all you have to do is draw the scene a second time into a normal X8R8G8B8 buffer, and it won't impact performance too much since there will be low bandwidth requirements. I don't know if this is what rthdribl does, but its AA is fine.

Finally, alpha blending isn't used everywhere. It may have relatively low performance due to high bandwidth requirements of 64-bit blends, but at least the capability is there. Half performance isn't bad at all with these monsters. You also have to remember that a lot of the time spent rendering HDR is in the blooms and flares, so low alpha blending perf isn't going to be that important to performance.

Performance is the lamest excuse ever for not implementing HDR.
 
So what? Why do we need such insane framerates or super high resolutions anyway?

Look at the performance of something like GT2 in 3DMark03 - it is not insane, and it is still bandwidth-limited enough to show large performance drops with FSAA. And that is only a relatively complex scene using DX7/DX8.1 features.
 