Ben, whatdya think?

jjayb

Hey Ben, nice pictures you posted at Rage3D. So what do you think about the 9700? Was it what you expected? Better? Worse? You seem to be pretty well informed about NVIDIA's upcoming tech. Do you think they may have been a little surprised by ATI's release?
 
What do I think? I think DX9 hardware will help bring many effects never before seen on video cards. While NVIDIA did 12 pixel shader operations in a pass on a GeForce3, and the 8500 does 22 ops per pass, the 9700 does 160 per pass. You can do a lot of things with 160 ops per pass. Further, you can multipass the shaders and run shaders with thousands of ops. Second, 128-bit color precision can show a dramatic difference; that is more precision than in Toy Story.
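To put a toy number on that precision point, here's a minimal Python sketch (mine, not anything ATI has shown) of the banding you get when an intermediate result is squeezed through 8-bit integers instead of carried in floats:

```python
def roundtrip_int8(v, factor=32):
    dark = v // factor             # quantized 8-bit integer intermediate
    return min(dark * factor, 255)

def roundtrip_float(v, factor=32):
    dark = v / factor              # float intermediate keeps the fraction
    return min(dark * factor, 255.0)

for v in (10, 37, 128, 200):
    print(v, "->", roundtrip_int8(v), "vs", roundtrip_float(v))
# 10 -> 0 vs 10.0, 37 -> 32 vs 37.0, 200 -> 192 vs 200.0
```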

The 9700 does extremely well in today's games when AA and anisotropic filtering are enabled. Both trilinear and bilinear anisotropic filtering are now possible for the first time in the Radeon series. Performance without AA and aniso enabled is mostly CPU limited.

ATI showed a RenderMan shader running in real time on the 9700. That would have been impressive by itself, but it was running at 50 fps.

Microsoft isn't ready with DX9 at the moment. While there is a beta around, ATI won't ship the 9700 with DX9 betas.

As to NVIDIA, wait and see :D
 
Ben6 said:
What do I think? I think DX9 hardware will help bring many effects never before seen on video cards. While NVIDIA did 12 pixel shader operations in a pass on a GeForce3, and the 8500 does 22 ops per pass, the 9700 does 160 per pass. You can do a lot of things with 160 ops per pass. Further, you can multipass the shaders and run shaders with thousands of ops. Second, 128-bit color precision can show a dramatic difference; that is more precision than in Toy Story.

Um, isn't it 64 bits of precision, while the pipeline's bitness is considered 128-bit because of the two 64-bit texture units?

ATI showed a RenderMan shader running in real time on the 9700. That would have been impressive by itself, but it was running at 50 fps.

Hrm, this is interesting; one wonders how fast the 9700 could render FF:SW. It'd be neat if they did that like NVIDIA did.

As to NVIDIA, wait and see

Any more dick teasing and I'll hunt you down and cut yours off. ;)
 
Saem said:
Um, isn't it 64 bits of precision, while the pipeline's bitness is considered 128-bit because of the two 64-bit texture units?
R300 pixel shaders always operate at 128-bit float precision and clamp the result at the end if needed. That's got nothing to do with the number of TMUs.
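A toy Python sketch of what clamping only at the end buys you; the numbers are made up for illustration, not an actual shader:

```python
def clamp(x):
    return max(0.0, min(1.0, x))

base, light, attenuation = 0.9, 3.0, 0.25

# Clamp after every operation: the overbright 2.7 intermediate is lost.
per_op = clamp(clamp(base * light) * attenuation)   # -> 0.25

# Float pipeline, clamp once at the end: the intermediate survives.
at_end = clamp(base * light * attenuation)          # -> 0.675

print(per_op, at_end)
```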

By the way, Ben: if it's not too big a secret, which part of the industry do you work in? :)
 
Could you break down those 128/96 bits into their components, mantissa and exponent, please?

Clarification:
Ben6:
By 96-bit, do you mean 3x32f (no alpha in there), to compare it to 24-bit color? Or do you mean 4x24f?

MDolenc:
Do you know for sure that it's 128-bit all the time, or was that an assumption? It looks like an awful waste of gates to have it on all the time.
 
Basic said:
Clarification:
Ben6:
By 96-bit, do you mean 3x32f (no alpha in there), to compare it to 24-bit color? Or do you mean 4x24f?
It's 24 bits per channel times 4 channels. Because of alignment, it's represented as 32 bits per channel if written to the frame buffer, hence 128 bit.
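In other words, just the arithmetic (assuming 8 bits of padding per channel):

```python
channels, real_bits, stored_bits = 4, 24, 32
print(channels * real_bits)    # 96 bits of actual precision
print(channels * stored_bits)  # 128 bits as aligned in the frame buffer
```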
 
Thanks.
Now that it's a bit smaller, it's more believable that it's on all the time. Is that correct? (Sorry, MDolenc, if I seem to neglect you, but since I know that OpenGL guy works at ATI, I know I can trust his technical info as fact.) Can you say how the bits are split between exponent and mantissa?

[Edit]: Added reference to who I said I was sorry to.
 
That seems much more believable.

Does the R300 support 128-bit source art? And is it all in a format that is, in actuality, 96-bit?

Additionally, as we see with 24-bit z and 8-bit stencil buffers, are the other 32 bits of information used for anything?

Update:

It seems conceivable that such a pipeline would be capable of handling up to around 64-bit integer color buffers. Can you disclose the pixel formats that the R300's current drivers support? I suppose it doesn't mean a whole lot to many people, but I would find it interesting.
 
OpenGL guy said:
It's 24 bits per channel times 4 channels. Because of alignment, it's represented as 32 bits per channel if written to the frame buffer, hence 128 bit.

Just to be absolutely clear, it's not the IEEE 32-bit format of 1 sign, 8 exponent, and 23 mantissa bits, but a 24-bit format of, for example, 1 sign, 7 exponent, and 16 mantissa bits. Is that a fair assessment?

So the dynamic range and accuracy would be a teensy-tiny bit reduced from the 32-bit floats we're all used to, but still a lot better than anything else out there.
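For concreteness, a small Python sketch comparing the two layouts; the s1e7m16 split is this thread's guess rather than a published spec, and IEEE-style conventions (implicit leading 1, reserved top exponent) are assumed:

```python
# Properties of a float with a given exponent/mantissa split, assuming
# IEEE-style conventions (bias = 2**(e-1) - 1, top exponent reserved).
def float_props(exp_bits, man_bits):
    bias = 2 ** (exp_bits - 1) - 1
    max_exp = (2 ** exp_bits - 2) - bias           # largest unbiased exponent
    max_val = (2 - 2 ** -man_bits) * 2.0 ** max_exp
    eps = 2.0 ** -man_bits                         # spacing just above 1.0
    return max_val, eps

for name, e, m in [("IEEE single, s1e8m23", 8, 23),
                   ("guessed fp24, s1e7m16", 7, 16)]:
    max_val, eps = float_props(e, m)
    print(f"{name}: max ~{max_val:.3g}, epsilon ~{eps:.3g}")

# IEEE single, s1e8m23: max ~3.4e+38, epsilon ~1.19e-07
# guessed fp24, s1e7m16: max ~1.84e+19, epsilon ~1.53e-05
```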
 
I'd really like to know just how programmers might use the added precision of 32-bit over 24-bit. I still don't see a real reason for it.

This would best be supplemented by demos and screenshots, of course.
 
RussSchultz said:
There will never be a need for more than 640k

That's fundamentally different.

Yes, higher precision numbers are necessary for other calculations, but I wonder why they are necessary for pixel shader calculations.

High-precision numbers are generally most useful for highly recursive algorithms (going over and over the data many times), but 24 bits per channel should be more than enough for any 12-bit DAC, provided that there is some error handling in the math (such as higher-bit-depth calculations and centering all errors around zero, as opposed to doing all calculations at 24 bits and allowing all errors to be additive).
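A toy Python model of that last parenthetical (the gains and the 8-bit fixed-point intermediate are made up for illustration): truncation makes every rounding error one-sided, so they pile up, while round-to-nearest keeps them centered near zero.

```python
import math

SCALE = 256  # 8-bit fixed-point intermediate

def chain(x, gains, requantize):
    # Apply each gain, then squeeze the result back into 1/256 steps.
    for g in gains:
        x = requantize(x * g * SCALE) / SCALE
    return x

gains = [0.97, 1.03] * 50  # 100 multiplies, net gain slightly under 1

exact = 0.5
for g in gains:
    exact *= g

print("exact:           ", exact)                          # ~0.478
print("truncated:       ", chain(0.5, gains, math.floor))  # drifts well below
print("round-to-nearest:", chain(0.5, gains, round))       # stays close
```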
 
Chalnoth said:
High-precision numbers are generally most useful for highly recursive algorithms (going over and over the data many times), but 24 bits per channel should be more than enough for any 12-bit DAC, provided that there is some error handling in the math (such as higher-bit-depth calculations and centering all errors around zero, as opposed to doing all calculations at 24 bits and allowing all errors to be additive).

Pixel shaders need up to 32 bits for the same reason that a general-purpose CPU does. Even though the end result of a pixel shader calculation will be a color (typically written to an 8-bit-per-channel frame buffer), there are many possible reasons that an intermediate calculation will require 32 bits of precision. Not in all cases--16-bit half precision floats are sufficient in many cases--but there will be some.

Consider dependent texture lookups. Suppose that you have a 4096x1 texture that you're using as a 1-D lookup table. With a 16-bit float, there are only 10 bits for the mantissa, so you can't even address every texel after 1024. Bump up the mantissa to 16 bits. Now you can address all texels provided your texture coordinate doesn't exceed 16 (16 * 4096 = 65536, or 2^16, above which not all integers are representable exactly).

Or, consider a simple pixel shader that takes the x coordinate of the pixel in world space and computes the alpha as sin(x). You'd like this to work without obvious artifacts from roundoff over a 'reasonable' range of world space coordinates. 16 bits and even 24 bits won't get you there.

The list of examples goes on, but the idea is that there are a lot more places than you think where a lack of numerical precision will bite you, even when the end result is an 8-bit color component.

--Grue
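The addressing claim is easy to poke at with numpy's half type; a quick sketch (counting the implicit leading bit, integers actually stay exact up to 2^11 = 2048):

```python
import numpy as np

# Push every texel coordinate of a 4096-wide lookup table through a
# 16-bit half float and count how many distinct values come back.
coords = np.arange(4096, dtype=np.float32)
through_half = coords.astype(np.float16).astype(np.float32)
print(len(np.unique(through_half)), "of 4096 coordinates survive")
# ~3072: integers are exact up to 2^11 = 2048, then only every other one

# The same mantissa limit hits the sin(x) example: near x = 1000 the
# spacing between adjacent half floats is already 0.5.
print(np.spacing(np.float16(1000.0)))  # 0.5
```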
 
Think trig functions.

How much of an arc-angle do you think a single pixel occupies?

Remember, textures aren't just textures anymore.
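Some rough back-of-the-envelope numbers (the 90 degree FOV, 1600-pixel width, and x = 100 world coordinate are my assumptions, and fov/width is only an approximation for a perspective view):

```python
import math

# Roughly fov/width of arc per pixel (it varies a bit across the
# screen, but this is the right order of magnitude).
fov = math.radians(90.0)
width = 1600
per_pixel = fov / width
print(per_pixel)  # ~0.00098 rad (~0.056 degrees)

# Spacing of a 16-bit-mantissa float near a world coordinate of x = 100:
x = 100.0
spacing = 2.0 ** math.floor(math.log2(x)) * 2.0 ** -16
print(spacing)    # ~0.00098 rad: already a full pixel's worth of angle
```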
 
Grue said:
Not in all cases--16-bit half precision floats are sufficient in many cases--but there will be some. Consider dependent texture lookups.

I still think they'll be enough for color data, though. This is provided, of course, that there are methods of controlling accumulation errors.

Suppose that you have a 4096x1 texture that you're using as a 1-D lookup table. With a 16-bit float, there are only 10 bits for the mantissa, so you can't even address every texel after 1024. Bump up the mantissa to 16 bits. Now you can address all texels provided your texture coordinate doesn't exceed 16 (16 * 4096 = 65536, or 2^16, above which not all integers are representable exactly).

Would you actually want to use just one component as a lookup into a 2D texture? Yes, you've certainly shown that 16 bits per channel is not enough for textures larger than 1024 (and usually you can expect some error in the tail end of the mantissa, so it probably wouldn't be acceptable for anything past around a 256-size texture).

I would think that for 2D or 3D texture lookups, you'd want to use the additional channels (i.e. use two of the four available floats for a lookup into a 2D texture; and it may be possible to just use 16-bit floats, with two combined for one lookup, for additional accuracy).

Anyway, this should make 24-bit floats adequate for lookup tables into 2D or 3D textures, provided you use the different channels for different dimensions (and don't attempt to use just one channel for the entire lookup).
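A toy integer sketch of that channel-splitting idea (plain Python, not shader code): view the 4096-entry table as a 64x64 grid and carry the high and low bits in separate channels, each small enough for even a 16-bit float to hold exactly.

```python
FINE_BITS = 6  # 4096-entry table viewed as a 64 x 64 grid

def split_index(i):
    coarse = i >> FINE_BITS              # high bits -> one channel
    fine = i & ((1 << FINE_BITS) - 1)    # low bits  -> another channel
    return coarse, fine

def join_index(coarse, fine):
    return (coarse << FINE_BITS) | fine

for i in (0, 1023, 2049, 4095):
    c, f = split_index(i)
    assert join_index(c, f) == i
    print(i, "->", (c, f))  # each channel stays in 0..63, exact in fp16
```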

Or, consider a simple pixel shader that takes the x coordinate of the pixel in world space and computes the alpha as sin(x). You'd like this to work without obvious artifacts from roundoff over a 'reasonable' range of world space coordinates. 16 bits and even 24 bits won't get you there.

I'm not entirely sure why this would be much of a problem, as long as the algorithm for computing the sine was accurate enough.

The list of examples goes on, but the idea is that there are a lot more places than you think where a lack of numerical precision will bite you, even when the end result is an 8-bit color component.

--Grue

Well, I do know bump mapping is one example where 8-bit integer values are far from adequate... but I don't see why 24-bit floats would be inadequate.

I also know that 32-bit floats, when expressed as decimals, will usually end up with errors past about 4-5 decimal places after a few recursive operations on a modern CPU, which is a few bits worse than the mantissa size alone would seem to indicate.

But there's little to no reason why this must be the case for GPUs. GPUs should use higher-precision internal calculations, as well as algorithms that keep errors centered around zero (rather than always additive). It's certain that today's GPUs already do similar things; otherwise there would be a noticeable loss in color depth from enabling trilinear filtering, anisotropic filtering, and FSAA.
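That 4-5 decimal place loss is easy to reproduce; here's a small numpy sketch using a repeated sqrt-then-square round trip as the "recursive operations" (my choice of workload, purely illustrative):

```python
import numpy as np

def sqrt_square_roundtrip(x, n, dtype):
    # Take the square root n times, then square n times; ideally a no-op.
    x = dtype(x)
    for _ in range(n):
        x = np.sqrt(x)
    for _ in range(n):
        x = x * x
    return x

print(sqrt_square_roundtrip(2.0, 20, np.float32))  # visibly off from 2.0
print(sqrt_square_roundtrip(2.0, 20, np.float64))  # ~2.0 to ~8 digits
```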
 