He sure did.
and of course for Shader Model 3, the required precision is FP32, so you don't get any artifacts that might have been due to partial precision. You can still get access to partial precision, but now anything less than FP32 becomes partial precision. Essentially, the required precision for Shader Model 3 is FP32. What do gamers get out of this? Well, they're going to get titles or content that either looks better or runs faster or both.
Decoded:
Partial precision gives you artifacts & fp24 is partial precision in sm3 --> fp24 = artifacts.
But if the hardware can't run the shader fast enough in fp32, you can still use fp16 which, as we've been saying for the last year, is actually good enough for anything that you might need to do with sm2. In fact, who needs sm2 when you've got ps1.4? I mean, Doom 3 only does ps1.4 effects & both you and I know that The Carmack is the man.
Meanwhile in reality: fp16, as used almost 100% of the time by nv3x (& currently used lots by nv40 too, thanks to nv40's support of fp16 & Nv's instruction to developers for the last year to use _pp everywhere), is totally not good enough & there is bugger all difference in quality between fp24 & fp32.
and
TR: What about some examples of shaders where FP32 precision produces correct results and FP24 produces visible artifacts?
Tamasi: You don't have to listen to me, you can listen to the statements by Tim Sweeney. They've got a number of lighting algorithms that produce artifacts with FP24. In general, what you're going to find is that the more complex the shader gets, the more complex the lighting model gets, the more likely you are to see precision issues with FP24. Typically, if you do shaders that actually manipulate depth values, then again you might see issues with FP24.
Decoded: There are all manner of different possible algorithms that can be written which will be affected by fp precision & Tim Sweeney, our TWIMTBP lead spokesman, says you gotta have fp32, so you gotta have fp32.
Meanwhile in reality: Any developer with half a brain actually uses algorithms that don't lose precision between passes & even if they did use precision-affected algorithms, fp32 really isn't that much better than fp24 on even those types of algorithm.
But generally, the more complex the lighting algorithm, or they actually manipulate depth, the more likely you are to run into precision issues with FP24.
Decoded: Fp24 is not good enough for complex lighting or depth algorithms.
Meanwhile in reality: The fp16 used by nv3x really is not good enough for complex lighting & depth algorithms, while fp24 is good enough so far & there are few cases where fp32 is a substantial improvement.
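To put rough numbers on the fp16 / fp24 / fp32 gap, here is a minimal Python sketch. It assumes the commonly cited bit layouts (fp16 = s10e5 as on nv3x, fp24 = s16e7 as on ATI's r3xx/r4xx, fp32 = s23e8) and only models mantissa width: exponent range, denormals and the hardware's real rounding behaviour are ignored. The names MANTISSA_BITS, quantize and step_near are made up for illustration.

```python
import math

# Explicit mantissa bits, not counting the implicit leading 1.
# Assumed layouts: fp16 = s10e5, fp24 = s16e7, fp32 = s23e8.
MANTISSA_BITS = {"fp16": 10, "fp24": 16, "fp32": 23}


def quantize(x: float, mantissa_bits: int) -> float:
    """Round x to the nearest value representable with this mantissa width.

    Simplification: exponent range and denormals are not modelled.
    """
    if x == 0.0:
        return 0.0
    m, e = math.frexp(x)              # x == m * 2**e with 0.5 <= |m| < 1
    scale = 2 ** (mantissa_bits + 1)  # +1 for the implicit leading bit
    return math.ldexp(round(m * scale) / scale, e)


def step_near(x: float, mantissa_bits: int) -> float:
    """Spacing between neighbouring representable values around x."""
    _, e = math.frexp(x)
    return math.ldexp(1.0, e - (mantissa_bits + 1))


if __name__ == "__main__":
    coord = 0.7  # e.g. a texture coordinate or a normalised depth value
    for name, bits in MANTISSA_BITS.items():
        err = abs(quantize(coord, bits) - coord)
        print(f"{name}: step near {coord} = {step_near(coord, bits):.3e}, "
              f"rounding error = {err:.3e}")
```

Under those assumptions the step between neighbouring values in the range 0.5 to 1.0 is 2^-11 for fp16, 2^-17 for fp24 and 2^-24 for fp32. So at fp16 a texture coordinate in the far half of a 2048-wide texture has no sub-texel precision left at all, while fp24 still has 6 bits to spare.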
And I think lastly, the big issue is that there is no standard for FP24, quite honestly. There is a standard for FP32. It's been around for about 20 years. It's IEEE 754.
The fp format standard any coder should be coding to is the fp format used by the hardware you are coding on/for (which should be as broad a variety as possible in order not to lose substantial market share).
Who gives a toss if you're using IEEE standard fp32 if ATI's fp24 or nv3x's entirely non-standard fp16 is adequate for what a graphics coder needs?
Do Nv really have full IEEE standard fp32?
Or just the bits that are adequate for what a graphics coder needs?
I do more or less agree with him about how to count transistors.
Though, in terms of a 12 active pipe chip on a die designed for up to 16 pipes, it is quite fair to not count the inactive transistors (160mil), but wasn't the story that ATI was only counting logic & not any cache or ancillary non-core-functionality transistors?
Anyways, I would be surprised if the count of all the transistors on a 16 pipe r420 came out as anything other than pretty close to the nv40's 220 million odd.