I've been playing around a lot with an R300 and an NV34, and something that has stayed at the back of my mind is the apparently murky difference between DX9's FP "specs" and IEEE 32-bit floating point (IEEE-32).
The thing that primarily stood out to me was the R300's 24-bit FP internal pixel pipeline vs IEEE-32. The more I think about it, the more I favour the thought that the R300's designers were aiming for DX9 compliance, even though they definitely know what IEEE is. Maybe it's a combination of transistor budget, a trusted process technology they felt they had to go with and, well, performance. The R300 should've been IEEE-32, not merely DX9-minimum-spec compliant, IMHO.
Why?
IEEE defines all basic arithmetic operations (add, subtract, multiply, divide, square root) as producing the properly rounded 32-bit version of the (theoretical) exact result. No problems here with DX9.
But when you get into compound operations like multiply-add (mad), dp4, etc., the DX9 spec doesn't specify an "order of operations", so the result may differ depending on (for example) whether a multiply-add is computed as round(a+round(b*c)) or round(a+b*c), etc. Further, the trig operations aren't well-defined at all, so implementations may differ. In this day and age, when much of the functionality you'd expect (such as blending of floating-point textures) remains unimplemented, this is not at all a limiting factor. But in a few years, when the more dire problems have been solved, this one will start cropping up as a problem. Eventually, people will expect basic hardware operations to produce precisely-defined results, regardless of hardware manufacturer, control panel settings, time of day, etc.
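To make the round(a+round(b*c)) vs round(a+b*c) point concrete, here is a rough sketch in Python. It is only an illustration, not how any particular chip works: the helper names are made up, float32 is emulated by round-tripping through `struct`, and Python's double arithmetic stands in for the "exact" intermediate result (which happens to be exact for these hand-picked inputs, but not in general).

```python
import struct

def round_f32(x: float) -> float:
    """Round a Python float (a double) to the nearest IEEE 32-bit float."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def mad_two_roundings(a: float, b: float, c: float) -> float:
    """Hypothetical multiply-add that rounds the product first, then the sum."""
    return round_f32(a + round_f32(b * c))

def mad_one_rounding(a: float, b: float, c: float) -> float:
    """Hypothetical multiply-add that rounds only once, after the sum.

    Assumption: the double-precision intermediate is exact for the inputs
    used below; that is not true for arbitrary inputs.
    """
    return round_f32(a + b * c)

# Inputs chosen so the exact product sits halfway between two float32 values.
b = c = round_f32(1.0 + 2.0 ** -12)   # b*c = 1 + 2^-11 + 2^-24 exactly
a = round_f32(-(1.0 + 2.0 ** -11))    # cancels the leading bits of the product

separate = mad_two_roundings(a, b, c)
fused = mad_one_rounding(a, b, c)

print("round(a + round(b*c)) =", separate)
print("round(a + b*c)        =", fused)
```

With these inputs the two-rounding version loses the 2^-24 term when the product is rounded and returns exactly zero, while the single-rounding version keeps it. Both answers are defensible, which is exactly why a spec needs to say which one it wants.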
Obviously, the R300 is a good part, and ATi probably felt that going with FP24 internally was a good choice. I wouldn't really disagree, because it is the first available "DX9 part". Hey, we gotta start somewhere! However, one has to wonder why they didn't go with full FP32 internally and let the drivers do some, er, optimizations depending on the scenario, leaving the options open.
Am I asking for too much from the R300 given the current timeframe? Am I being a bit too critical of DX9 in this respect?