Does Radeon 9700 Support 128Bit FP Frame Buffers?

Dave Baumann

This is one element I've not been clear on since the release of the card - some have made vague references to it, others have not. This is what the ATI docs say:

RADEON 9700 Technology White Paper said:
The RADEON 9700 supports a new, high precision 10-bit per color channel frame buffer format enabled by DirectX 9.0. This enhancement of the standard 32-bpp color format (which supports just 8 bits per color channel) is capable of representing over one billion distinct colors, resulting in sharper, clearer images with more faithful color reproduction.

ATi DirectX 9.0 White Paper said:
32-bit color can combine 256 different levels of red, green, and blue to re-produce virtually any color the human eye can perceive. However, this format only allows for 256 different levels of brightness to be represented. The human eye can perceive millions of different levels of brightness.

DX9 supports a new high precision 32-bit per pixel frame buffer format that can represent four times as many levels of brightness. The added range makes a big difference in scenes that have widely varying levels of brightness. With too few levels of brightness, 3D images can look either over-exposed or under-exposed, while high precision images look much more natural.
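For reference, the two numeric claims in those quotes ("over one billion distinct colors" and "four times as many levels of brightness") can be checked with some simple bit counting. This is only a sketch of the arithmetic in Python; the function and variable names are mine, not anything from the ATI documents.

```python
# Bit-counting sketch for the 8:8:8:8 and 10:10:10:2 32-bpp formats.
# All names here are illustrative; nothing is taken from ATI's documents.

def levels(bits: int) -> int:
    """Number of distinct values an unsigned channel of `bits` bits can hold."""
    return 1 << bits

levels_8 = levels(8)        # per-channel levels in standard 32-bpp color
levels_10 = levels(10)      # per-channel levels in the 10-bit format

colors_8 = levels_8 ** 3    # distinct RGB colors at 8 bits per channel
colors_10 = levels_10 ** 3  # distinct RGB colors at 10 bits per channel

print("levels per channel:", levels_8, "vs", levels_10)
print("ratio of levels:", levels_10 // levels_8)
print("distinct colors:", colors_8, "vs", colors_10)
```

So 10 bits per channel gives 1024 levels instead of 256 (the "four times"), and 2^30 is a little over one billion colors.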

It would seem that the R300 chip can support this format as the FIREGL X1 specs say:
128-bit floating point precision frame buffer

However, that is curiously absent from the Radeon 9700 spec sheet.

So, is this just a typo, or are they not supporting a 128-bit FP frame buffer on the retail version of R300? If not, is this only down to performance?
 
I think they are talking about a 128-bit FP backbuffer and a 10:10:10:2 32-bit frontbuffer, to give a few extra bits of precision to the DAC during scanout.
 
DemoCoder said:
I think they are talking about a 128-bit FP backbuffer and a 10:10:10:2 32-bit frontbuffer, to give a few extra bits of precision to the DAC during scanout.

The main uses of high precision formats would be to pass full precision data between runs of different shaders, or to provide high precision and dynamic range information in the form of a texture map.

128 bit FP buffers could be used as textures and intermediate renderable buffers, but are not necessarily a displayable format. For the final pass of rendering you would expect to render to a displayable format that could be flipped to the front surface (e.g. 10:10:10:2).

I think this is where the confusion creeps in. Not all formats in use on a graphics card can be interpreted by the DAC (for example DXTC textures...)
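To make the "final pass to a displayable format" step concrete, here is a rough Python sketch of converting one floating point RGBA pixel into a packed 10:10:10:2 word. The bit layout (alpha in the top 2 bits, then red, green, blue) is my assumption, modeled on the DirectX A2R10G10B10 format; the function name is hypothetical, and real hardware would do this at write-out rather than in software.

```python
# Sketch: quantize a float RGBA pixel (one float per component) into a
# 32-bit 10:10:10:2 word. The layout below is an assumption, not a
# statement about what the R300 does internally.

def quantize(value: float, bits: int) -> int:
    """Clamp to [0, 1] and round to the nearest code of the given width."""
    max_code = (1 << bits) - 1
    clamped = min(max(value, 0.0), 1.0)
    return int(clamped * max_code + 0.5)

def pack_a2r10g10b10(r: float, g: float, b: float, a: float = 1.0) -> int:
    """Pack as: alpha bits 31-30, red 29-20, green 19-10, blue 9-0."""
    return (
        (quantize(a, 2) << 30)
        | (quantize(r, 10) << 20)
        | (quantize(g, 10) << 10)
        | quantize(b, 10)
    )

# Out-of-range float values are clamped, so any dynamic range beyond
# [0, 1] in the float buffer is lost at this step.
print(hex(pack_a2r10g10b10(2.5, -1.0, 0.5)))
```

The clamping is the important part: the float buffer can hold values well outside 0 to 1, but the displayable surface cannot.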
 
That's exactly what I said. 128-bit backbuffer, 10:10:10:2 frontbuffer for display. I never said the 128-bit backbuffer is a display format.
 
DemoCoder said:
I think they are talking about a 128-bit FP backbuffer and a 10:10:10:2 32-bit frontbuffer, to give a few extra bits of precision to the DAC during scanout.

That's the point though - I can't actually see anywhere that states the Radeon 9700 supports the 128-bit FP back buffer. It appears the FIREGL variant does, but I can't see it for the Radeon 9700.
 
Personally I'd love to see a 10:10:10:32 format (assuming the 128 bit format has a large performance penalty). It would be a nice alternative if you just wanted a high precision intermediate value for depth in volume calculations. If anyone is going to bother to question ATI about the 128 bit FP support, could you try to ask about this too? :)
 
Well, if OpenGL guy or one of the other ATi peeps around can't clear it up then I may have to go through their PR channels (a.k.a. Black Hole) to get an answer!
 
It's rather a tenuous link but ZDNet have it, in a small news piece, as:

By the integration of a Floatingpoint of pixel processor, which can fall back to 128 bits floating POINT Framebuffer, reach ATI with the computation of 3D-Objekten a higher accuracy, which provides again for the representation quality of the Radeon 9700.

If that reads rather poorly, it's because of an online translation of the German ZDNet site! The original can be read here:

http://news.zdnet.de/zdnetde/news/story/0,,t101-s2119345,00.html

Your guess is as good as mine as to the accuracy of this :-?
 
Hi.

The 9700 supports writing a full 128b pixel to a non-displayable buffer (it's 32b float per component). It supports up to 10:10:10:2 for displayable buffers.

The 128b (4xSPFP) format is available as a texture format, and can be read in and used in the pixel shader. This does allow for any length pixel shader to be run, since intermediate results can be written to the FB and then read back to execute the next section of code.

But, the format is also available for someone wanting to supply any 128b per texel (4 float) texture map to the shader (i.e. Binormal vectors, etc...), for advanced & high precision shading effects.
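As a toy illustration of the multipass point (write intermediate results to the FB, read them back, run the next section of code), here is a small Python sketch. It models a "shader" as a list of per-pixel operations and splits it into passes. Everything here is hypothetical, and Python floats are 64-bit rather than 32-bit SPFP, so this only shows the idea, not the hardware behavior.

```python
# Sketch: splitting a long per-pixel computation into several passes with
# an intermediate buffer between them. Names and data are made up.
from typing import Callable, List, Sequence

Op = Callable[[float], float]

def run_pass(ops: Sequence[Op], buffer: Sequence[float]) -> List[float]:
    """Apply every op in order to every value in the buffer (one pass)."""
    out = []
    for value in buffer:
        for op in ops:
            value = op(value)
        out.append(value)
    return out

def keep_float(value: float) -> float:
    """Float intermediate buffer: the value is stored unchanged."""
    return value

def store_8bit(value: float) -> float:
    """8-bit intermediate buffer: clamp to [0, 1] and quantize to 256 levels."""
    clamped = min(max(value, 0.0), 1.0)
    return round(clamped * 255) / 255

def run_multipass(ops: Sequence[Op], buffer: Sequence[float],
                  max_ops_per_pass: int,
                  store: Callable[[float], float] = keep_float) -> List[float]:
    """Run ops in chunks, storing results between chunks via `store`."""
    current = list(buffer)
    for start in range(0, len(ops), max_ops_per_pass):
        chunk = ops[start:start + max_ops_per_pass]
        current = [store(v) for v in run_pass(chunk, current)]
    return current

long_shader = [lambda v: v * 0.001, lambda v: v * 1000.0, lambda v: v * v]
pixels = [0.25, 0.5, 1.0]

single = run_pass(long_shader, pixels)
split_float = run_multipass(long_shader, pixels, 1)
split_8bit = run_multipass(long_shader, pixels, 1, store_8bit)
print(single, split_float, split_8bit)
```

With a float intermediate the split version matches the single pass exactly, while the 8-bit intermediate rounds the small values in the first pass down to zero.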

Hope that answers your question.
 
Now what happens if you want to output to more than one frame buffer?
Can you still get 128b per frame buffer, or is there a 128b total limit?

In the DX9 doc from Meltdown2001 it said that DX9 can't do filtering on the floating point inputs. Is that true for R300? (Filtering is of course not needed when you just want intermediate values for multipass.)
 
Hi.

I assume you mean for the multiple-render-targets? If so, then yes, the 9700 can write a separate 128b pixel to each of the multiple render targets. The 9700 supports up to 4 render targets simultaneously.

If, instead, you mean writing multiple buffers, each one available as a separate texture, then yes also. You would render the first texture, then render the second, and so on.

The 9700 only supports point sampling on floating point inputs. So, no, you can't "filter" floating point textures in the texture unit. Of course, you could do some filtering in the shader, by multi-passing the texture and blending the passes into a new filtered texture.

Hope that answers those questions too.
 
The 9700 only supports point sampling on floating point inputs.
What does that mean? Why is sampling needed despite the ability to handle FP?
Does this mean the 9700 cannot read an FP texture as it is?
Please feel free to correct me if I misunderstood. ;)
 
It means no bilinear, trilinear, or anisotropic filtering of 128-bit textures. Which is a reasonable tradeoff considering the performance and logic implications of trying to fetch 128 taps of a 128-bit texel.

With the ability to read a texture register multiple times, I suppose you could do the sampling in the pixel shader itself by messing with the texture coordinates. By knowing the gradients and z values, you could code your own anisotropic filtering. On the NV30, you could probably take an unlimited number of point samples.

Or, you could bind the same texture into 4 different registers with an offset and do bilinear.
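Roughly, the "4 point samples with an offset" idea amounts to doing the bilinear weighting yourself. Here is a small CPU-side sketch in Python, not shader code: the texture is a list of rows of 4-float texels, texel centers are assumed to sit at integer + 0.5, and edges are clamped. All names are hypothetical.

```python
# Sketch: bilinear filtering built from four point samples, done "by hand"
# the way a pixel shader would have to if the texture unit only supports
# point sampling for float formats.
import math
from typing import List, Tuple

Texel = Tuple[float, float, float, float]
Texture = List[List[Texel]]

def point_sample(tex: Texture, x: int, y: int) -> Texel:
    """Fetch one texel, clamping coordinates to the texture edge."""
    height, width = len(tex), len(tex[0])
    x = min(max(x, 0), width - 1)
    y = min(max(y, 0), height - 1)
    return tex[y][x]

def lerp(a: Texel, b: Texel, t: float) -> Texel:
    """Component-wise linear interpolation between two texels."""
    return tuple(ca + (cb - ca) * t for ca, cb in zip(a, b))

def bilinear(tex: Texture, u: float, v: float) -> Texel:
    """u, v are in texel units; texel centers sit at integer + 0.5."""
    x, y = u - 0.5, v - 0.5
    x0, y0 = math.floor(x), math.floor(y)
    fx, fy = x - x0, y - y0
    top = lerp(point_sample(tex, x0, y0), point_sample(tex, x0 + 1, y0), fx)
    bottom = lerp(point_sample(tex, x0, y0 + 1),
                  point_sample(tex, x0 + 1, y0 + 1), fx)
    return lerp(top, bottom, fy)

tex = [
    [(0.0, 0.0, 0.0, 1.0), (1.0, 0.0, 0.0, 1.0)],
    [(0.0, 1.0, 0.0, 1.0), (1.0, 1.0, 0.0, 1.0)],
]
print(bilinear(tex, 1.0, 1.0))  # halfway between all four texels
```

That is four fetches plus three lerps per filtered lookup, which is where the extra texture reads and instruction slots go.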
 
DemoCoder said:
It means no bilinear, trilinear, or anisotropic filtering of 128-bit textures. Which is a reasonable tradeoff considering the performance and logic implications of trying to fetch 128 taps of a 128-bit texel.
The point of the format is for use as an intermediate stage in multistage rendering: You don't filter because you don't want to disturb the results.
On the NV30, you could probably take an unlimited number of point samples.
You people crack me up. Have fun with your infinite loop :LOL:
 
OpenGL guy said:
DemoCoder said:
It means no bilinear, trilinear, or anisotropic filtering of 128-bit textures. Which is a reasonable tradeoff considering the performance and logic implications of trying to fetch 128 taps of a 128-bit texel.
The point of the format is for use as an intermediate stage in multistage rendering: You don't filter because you don't want to disturb the results.

No, that is not the only point of the format. Why would I restrict my other lookup tables to 32-bit? If I have a light map, I want high dynamic range representation of the lighting. I sure as hell don't want 8 bits per color.

Maybe for ATI this is the only point, because your pixel shaders aren't long enough to do complex filtering and procedural shading, so you must multipass. But if I am starting out with high quality lightfield maps, I want to procedurally filter them, and I don't necessarily have to multipass to do it on other architectures (3DLabs P10, NV30).


On the NV30, you could probably take an unlimited number of point samples.
You people crack me up. Have fun with your infinite loop :LOL:

Unlimited means it is under the control of the developer, not the hardware or limited by the language.

I want to perform interpolation on a 128-bit FP normal map. I'll want 4-8 taps, so there goes about 8-16 slots out of my 160 pixel shader slots just for filtering the one texture. For every texture, I must burn more texture stages and instruction slots.

Of course you don't want to disturb the data, but point sampling a normal map!?!? Point sampling a light map? Certainly, if I am doing procedural texturing, I'll want more complex filtering, doing antialiasing in the shader, and I might be starting with a 128-bit FP texture.

I was actually *defending* ATI in my original post by saying fixed function filtering isn't needed for 128-bit textures. However, you seem to object to the idea that some of the other architectures are powerful enough to do complex filtering in the pixel shader itself. Hell, 3DLabs' architecture is so powerful they can do wavelets in their shaders!


Higher precision texture formats are NOT only useful for multipass.
 
OK, the Radeon 9700 says it has support for DDR II modules, and the core was designed with that in mind. Would it be possible for manufacturers such as Hercules to put the new memory on the card without a core update from ATi, just like what happened with the first GeForces, when manufacturers proceeded to implement DDR RAM?
 
DemoCoder said:
OpenGL guy said:
DemoCoder said:
It means no bilinear, trilinear, or anisotropic filtering of 128-bit textures. Which is a reasonable tradeoff considering the performance and logic implications of trying to fetch 128 taps of a 128-bit texel.
The point of the format is for use as an intermediate stage in multistage rendering: You don't filter because you don't want to disturb the results.

No, that is not the only point of the format. Why would I restrict my other lookup tables to 32-bit? If I have a light map, I want high dynamic range representation of the lighting. I sure as hell don't want 8 bits per color.

Maybe for ATI this is the only point, because your pixel shaders aren't long enough to do complex filtering and procedural shading, so you must multipass. But if I am starting out with high quality lightfield maps, I want to procedurally filter them, and I don't necessarily have to multipass to do it on other architectures (3DLabs P10, NV30).
Where did I say I speak for ATI? Also, I am a driver developer, not an application developer and, for fun, I am a game player.

I don't like lightmaps, never did. Now that we have normal maps, why on earth would you want to use light maps?

Our shaders are so short that no one has implemented a shader that uses all of the instructions. Doom 3 will use the longest shaders of any D3D/OpenGL program, and yet still won't use the full capacity of our hardware. In other words, there's still plenty of room for other computations.
You people crack me up. Have fun with your infinite loop :LOL:

Unlimited means it is under the control of the developer, not the hardware or limited by the language.
You don't say? :rolleyes: I guess you didn't see the smiley? No one here has a sense of humor?

What I don't like is people taking a company's marketing documents for an unreleased product and then comparing everyone else to it and claiming they are lacking. Maybe you'll be more impressed by R400 specs? I mean, if we're gonna talk about vaporware, let's be fair. Radeon 9700 has a release date in the not-so-distant future, I can't say the same for some other parts of which I've heard.

And I don't see any limitations to the Radeon 9700 design. You want loops? Run it through the shader a few times. You want "unlimited" code execution? Run it through the shader. Don't worry, if you're too lazy or don't understand how to do such things yourself, there are tools coming out to make your life easier. Tools such as DX9 HLSL, OpenGL 2.0 HLSL, and Rendermonkey will take care of all the multipassing that is required. Then certain game designers will finally get what they want: Another CPU :LOL:

For people who care about performance, keeping the shader (assuming that's a main portion of their computations) small will be a goal. For people who don't care about performance (i.e. non-real-time rendering), they'll be taken care of as well.
I want to perform interpolation on a 128-bit FP normal map. I'll want 4-8 taps, so there goes about 8-16 slots out of my 160 pixel shader slots just for filtering the one texture. For every texture, I must burn more texture stages and instruction slots.

Of course you don't want to disturb the data, but point sampling a normal map!?!? Point sampling a light map? Certainly, if I am doing procedural texturing, I'll want more complex filtering, doing antialiasing in the shader, and I might be starting with a 128-bit FP texture.
I don't think you need such precision for a normal or light map. First off, light maps are a hack and shouldn't be used. Secondly, for a normal map, why not just increase the resolution of the map and use 32 bpp? Then you get filtering for free. Plus, you can even add mipmaps, which you cannot do with the 128 bpp format.

Anyway, as I stated above, if you want such filtering, add it to your HLSL code and let the multipassing begin!
I was actually *defending* ATI in my original post by saying fixed function filtering isn't needed for 128-bit textures. However, you seem to object to the idea that some of the other architectures are powerful enough to do complex filtering in the pixel shader itself. Hell, 3DLabs' architecture is so powerful they can do wavelets in their shaders!
How does the Radeon 9700's architecture prevent you from doing the same thing?
Higher precision texture formats are NOT only useful for multipass.
I never said it was, but I seriously doubt many people are going to supply such a source texture format. 16 bytes per texel is very expensive.
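To put rough numbers on "16 bytes per texel is very expensive", here is a quick footprint calculation in Python. The 1024x1024 and 2048x2048 sizes are just examples I picked, and the helper names are hypothetical; this counts raw texel storage only and ignores any alignment or tiling a real driver would add.

```python
# Sketch: raw memory footprint of a 128 bpp float texture versus a 32 bpp
# texture, with and without a full mipmap chain. Sizes are examples only.

def texture_bytes(width: int, height: int, bytes_per_texel: int) -> int:
    """Storage for a single mip level."""
    return width * height * bytes_per_texel

def mip_chain_bytes(width: int, height: int, bytes_per_texel: int) -> int:
    """Storage for the full chain, halving each dimension down to 1x1."""
    total = 0
    while True:
        total += texture_bytes(width, height, bytes_per_texel)
        if width == 1 and height == 1:
            break
        width = max(1, width // 2)
        height = max(1, height // 2)
    return total

MIB = 1024 * 1024
fp128_1k = texture_bytes(1024, 1024, 16)        # 4 floats per texel
rgba32_1k = texture_bytes(1024, 1024, 4)        # 8 bits per channel
rgba32_2k = texture_bytes(2048, 2048, 4)        # double the resolution
rgba32_1k_mips = mip_chain_bytes(1024, 1024, 4)

print(fp128_1k // MIB, "MiB for 1024x1024 at 128 bpp")
print(rgba32_1k // MIB, "MiB for 1024x1024 at 32 bpp")
print(rgba32_2k // MIB, "MiB for 2048x2048 at 32 bpp")
print(rgba32_1k_mips, "bytes for 1024x1024 at 32 bpp with mipmaps")
```

In other words, a 128 bpp map costs the same memory as a 32 bpp map at twice the resolution in each dimension, which is the tradeoff behind the "just increase the resolution" suggestion.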

My whole point was that even though Radeon 9700 doesn't support 1024 PS instructions (which some unreleased product is supposed to), you can still achieve the same results. And, please, don't go off about how slow multipass is: Anyone using such long shaders doesn't care about time. The point to all this precision is quality, and the Radeon 9700 provides that.
 