FP16 and market support

Rugor · Dec 22, 2003

Well, the fact that Nvidia chose to vastly exceed DX9 specs in some areas while barely meeting them in others is not ATI's fault either.

The simple truth is that ATI has a better DX9 hardware implementation than Nvidia. This doesn't mean they do everything better, but they do do DX9 better. That and AA are their two big advantages right now.

Nvidia's strengths are OGL and developer support. They also perform very well in DX8.x regimes.

radar1200gs · Dec 22, 2003

First of all, FP32 > FP24, so nVidia can meet the full DX9 spec when required.

Secondly, as I've said before in the thread, developers will use PP to enable their games to run well on the majority of hardware installed in the real world, i.e. their customers.

If Futuremark wants 3DMark03 to be an indicator of future game performance they should follow the developers' lead...

Neeyik · Dec 22, 2003

radar1200gs said:
First of all, FP32 > FP24, so nVidia can meet the full DX9 spec when required.

As can all the other DX9-level products out there.

Secondly, as I've said before in the thread, developers will use PP to enable their games to run well on the majority of hardware installed in the real world, i.e. their customers.

If Futuremark wants 3DMark03 to be an indicator of future game performance they should follow the developers' lead...

Except that modifying all the appropriate shaders with partial precision is no guarantee of the game running well. Anyway, Futuremark asked the devs what they thought the future would be about while developing 3DMark03, so what does that tell you?

Dave Baumann · Dec 22, 2003

radar1200gs said:
First of all, FP32 > FP24, so nVidia can meet the full DX9 spec when required.

The statement "at least FP24" in no way negated that; in other words, this is what I said.

If Futuremark wants 3DMark03 to be an indicator of future game performance they should follow the developers' lead...

Again, if NVIDIA didn't educate them on the use of PP then how are they supposed to? I'd guess 3DMark03 won't be the final 3DMark either, so there is still opportunity to add it.

Rugor · Dec 22, 2003

Yes Nvidia's FP32 not only meets but exceeds the DX9 spec. Yes you can use FP16 in some cases with the DX9 "Partial Precision Hint." However, you cannot claim DX9 compatibility if all your card can do is FP16, because the spec requires at least FP24 support. The problem for Nvidia is that they exceed the spec with FP32, but drop down to FP16 for performance in most circumstances. The cards would perform much better if they could hold to FP16 all the time, but then they would not be meeting the DX9 spec.

FP32 is more than they need, but FP16 isn't always enough, and that's an example of an inelegant design. You don't need either to meet the spec, so if you're designing a card for games built around the DX9 spec why use them? A pure FP32 design would have had no problems with the spec, but Nvidia couldn't get acceptable speed from it, so they had to go FP16.

FP16 is a kludge, that's all it is: a way to get limited DX9 support when you can't get usable performance in full DX9 precision. No API requires it, and if FP24 is likely to become useless soon, then FP16 will only precede it into irrelevance.

jvd · Dec 22, 2003

Rugor said:
Yes Nvidia's FP32 not only meets but exceeds the DX9 spec. Yes you can use FP16 in some cases with the DX9 "Partial Precision Hint." However, you cannot claim DX9 compatibility if all your card can do is FP16, because the spec requires at least FP24 support. The problem for Nvidia is that they exceed the spec with FP32, but drop down to FP16 for performance in most circumstances. The cards would perform much better if they could hold to FP16 all the time, but then they would not be meeting the DX9 spec.

FP32 is more than they need, but FP16 isn't always enough, and that's an example of an inelegant design. You don't need either to meet the spec, so if you're designing a card for games built around the DX9 spec why use them? A pure FP32 design would have had no problems with the spec, but Nvidia couldn't get acceptable speed from it, so they had to go FP16.

FP16 is a kludge, that's all it is: a way to get limited DX9 support when you can't get usable performance in full DX9 precision. No API requires it, and if FP24 is likely to become useless soon, then FP16 will only precede it into irrelevance.

It's great to exceed something, but what is the point if doing so makes the program unplayable? Sure, you can turn on 6x FSAA in every game with ATI, but it's not playable in all of them, so what's the point? FP32 isn't playable in Half-Life 2 on an Nvidia card. Therefore ATI has the best design. The same is true for all the DX9 games that I know of.

KimB · Dec 22, 2003

radar1200gs said:
Except that PP (that "substandard" part of DX9) is part of the DX9 spec.

Tell me. Does it make sense for a person who only browses the web to have a 3GHz processor with 1GB of RAM? No?

Bigger numbers aren't always better.

KimB · Dec 22, 2003

Rugor said:
The problem for Nvidia is that they exceed the spec with FP32, but drop down to FP16 for performance in most circumstances.

Your argument totally ignores the idea that perhaps FP24 will not be enough precision for some operations (it may not have enough precision for accurate texture filtering, for example).

euan · Dec 22, 2003

Chalnoth said:
Your argument totally ignores the idea that perhaps FP24 will not be enough precision for some operations (it may not have enough precision for accurate texture filtering, for example).

That's a completely pointless argument. Something is always, at some point in time, not going to be enough for something else. Why continue to argue the same pointless fact? :?

If ATI upgrades to full 32-bit FP, it will probably run at full speed and be completely transparent to the legacy 24-bit hardware. Isn't it true that the 24-bit hardware is in essence the 32-bit format chopped at the knees?

This thread is so pointless I feel angry for reading it all.

Ostsol · Dec 22, 2003

Even if texture filtering were handled by the fragment pipeline, ATI's FP24 and even NVidia's FP16 is more than enough. Consider, after all, that NVidia's texture interpolation is handled with 8 bit accuracy. Of course, one could always try and perform linear interpolation manually in a fragment shader. In that case, Humus' Mandelbrot Set demo shows us exactly what the result would be like.

Rugor · Dec 22, 2003

Chalnoth said:
Rugor said:

The problem for Nvidia is that they exceed the spec with FP32, but drop down to FP16 for performance in most circumstances.

Your argument totally ignores the idea that perhaps FP24 will not be enough precision for some operations (it may not have enough precision for accurate texture filtering, for example).

Well, anything FP24 isn't enough for will rule out FP16 too, so it becomes irrelevant, and having FP32 without enough performance to use it isn't a huge leap beyond that either.

I am sure FP24 won't be enough precision forever, it's doubtless possible to write shaders that it can't handle, but those shaders aren't going to be DX9.0b shaders. By the time we need more than FP24 we will have moved on to the next generation. However that's beside the point, so long as the spec calls for FP24 that's all you need, and nothing else. Anything else is either beyond or beneath the spec and not called for. Should a new spec be brought out requiring FP32 then Nvidia's current hardware will be able to claim compliance with that spec, and ATI's won't. However, that's irrelevant for this spec. The current spec calls for FP24, and therefore no other precision is necessary. FP32 may be desirable, but you don't need it. What you need for a fast DX9.0b accelerator is really good FP24 performance; not FP16 or FP32 support, but fast FP24. Anything else is working towards a different spec.

ATI did a much better job than Nvidia in designing a card for high performance at the DX9 spec. We can go back and forth over which IHV does what better, but ATI's cards are better for pure DX9 because they are closer to the spec.

Sxotty · Dec 22, 2003

I find it amazing how, when a very simple logical argument is laid at people's feet, they just say no because they have no desire to think, and go on with their argument, ignoring what others have written. If you do not desire to discuss things then just quit writing the same post over and over in different word orders.

OpenGL guy · Dec 22, 2003

Ostsol said:
Even if texture filtering were handled by the fragment pipeline, ATI's FP24 and even NVidia's FP16 is more than enough. Consider, after all, that NVidia's texture interpolation is handled with 8 bit accuracy. Of course, one could always try and perform linear interpolation manually in a fragment shader. In that case, Humus' Mandelbrot Set demo shows us exactly what the result would be like.

Texture addressing is the big issue with FP16, not filtering.
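To put a rough number on the addressing point, here is a minimal sketch. It assumes a normalized texture coordinate passes through a half-precision register before the lookup (as a coordinate computed in a partial-precision shader would), and it uses Python's `struct` format `e` as a stand-in for FP16. The helper names `to_half` and `addressable_texels` are made up for the illustration and the texture widths are arbitrary.

```python
import struct

def to_half(x: float) -> float:
    """Round a Python float to the nearest IEEE half-precision (FP16) value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

def addressable_texels(width: int) -> int:
    """Count the distinct texels still reachable along one axis when every
    texel-centre coordinate is rounded to FP16 before the lookup."""
    hit = set()
    for i in range(width):
        u = (i + 0.5) / width              # normalized centre of texel i
        texel = int(to_half(u) * width)    # texel selected after FP16 rounding
        hit.add(min(texel, width - 1))     # clamp in case u rounded up to 1.0
    return len(hit)

for width in (256, 1024, 4096):
    print(width, addressable_texels(width))
```

Small textures come through intact, but once the texel pitch drops below the FP16 spacing near 1.0 (1/2048), neighbouring texels at the high end of the coordinate range collapse onto each other. That is before any fractional bits are spent on filtering.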

Ostsol · Dec 22, 2003

OpenGL guy said:
Ostsol said:

Even if texture filtering were handled by the fragment pipeline, ATI's FP24 and even NVidia's FP16 is more than enough. Consider, after all, that NVidia's texture interpolation is handled with 8 bit accuracy. Of course, one could always try and perform linear interpolation manually in a fragment shader. In that case, Humus' Mandelbrot Set demo shows us exactly what the result would be like.

Texture addressing is the big issue with FP16, not filtering.

Hmm. . . I should have quoted the post I was replying to. . . Anyway, I realize where precision has an impact. I was replying to this post:

Chalnoth said:
Rugor said:

The problem for Nvidia is that they exceed the spec with FP32, but drop down to FP16 for performance in most circumstances.

Your argument totally ignores the idea that perhaps FP24 will not be enough precision for some operations (it may not have enough precision for accurate texture filtering, for example).

rwolf · Dec 22, 2003

chalnoth said:
I think the main reason that the R3xx performed better from the start is that it seems to be designed more simply.

The R3xx architecture has fewer transistors, lower clock speeds, less memory bandwidth, slower memory, uses less power, runs cooler on a smaller board and yet it still runs faster.

I don't think this is because the design is "simple".

The DeltaChrome is an example of an 8-pipe "simple" design, and it is slower than the 9600 with fewer quality features.

Rugor · Dec 22, 2003

I think elegant is a better word for the R300 design than simple. ATI did a good job of balancing their transistor usage for the features they needed, and the performance followed.

Dio · Dec 22, 2003

Ostsol said:
Even if texture filtering were handled by the fragment pipeline, ATI's FP24 and even NVidia's FP16 is more than enough. Consider, after all, that NVidia's texture interpolation is handled with 8 bit accuracy. Of course, one could always try and perform linear interpolation manually in a fragment shader. In that case, Humus' Mandelbrot Set demo shows us exactly what the result would be like.

Ah, but you have to generate the coordinates that sample the texture and then the fractional precision to interpolate between them. It's not just a case of the colour value returned.

ATI's mathemagicians showed FP24 was the correct minimum bound to ensure this works correctly, and Microsoft agreed.
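A back-of-envelope version of that budget, as a sketch only: this is not ATI's actual analysis. It assumes the usual layouts (10 explicit mantissa bits for FP16, 23 for FP32) and takes FP24 to have 16 mantissa bits, which is an assumption here. `subtexel_bits` is a hypothetical helper and 2048 is just an example texture width.

```python
# Explicit mantissa bits per format. The FP24 figure is an assumption.
MANTISSA_BITS = {"FP16": 10, "FP24": 16, "FP32": 23}

def subtexel_bits(mantissa_bits: int, width: int) -> int:
    """Fractional bits left inside a texel for a normalized coordinate near 1.0.

    Values in [0.5, 1) carry mantissa_bits + 1 bits after the binary point;
    log2(width) of them are spent selecting the texel. width must be a power
    of two. A negative result means not every texel can even be addressed.
    """
    texel_bits = width.bit_length() - 1    # log2(width) for a power of two
    return (mantissa_bits + 1) - texel_bits

for name, bits in MANTISSA_BITS.items():
    print(name, subtexel_bits(bits, 2048))
```

Whatever is left over after picking the texel is all the interpolation weight gets, which is why the coordinate precision matters and not just the colour precision.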

radar1200gs · Dec 22, 2003

Dio said:
Ostsol said:

Even if texture filtering were handled by the fragment pipeline, ATI's FP24 and even NVidia's FP16 is more than enough. Consider, after all, that NVidia's texture interpolation is handled with 8 bit accuracy. Of course, one could always try and perform linear interpolation manually in a fragment shader. In that case, Humus' Mandelbrot Set demo shows us exactly what the result would be like.

Ah, but you have to generate the coordinates that sample the texture and then the fractional precision to interpolate between them. It's not just a case of the colour value returned.

ATI's mathemagicians showed FP24 was the correct minimum bound to ensure this works correctly, and Microsoft agreed.

That is a case where FP16 clearly is inadequate. As I said before in the thread, anywhere there is a lot of recursion (and mandelbrots are recursive by their very nature) happening, FP16 is likely to be inadequate. That is why FP32 is there: for the cases that FP16 can't handle.
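A small sketch of how rounding error behaves when the same operation is fed back into itself. It runs the Mandelbrot recurrence restricted to the real axis (x -> x*x + c), once in double precision and once with every intermediate rounded to FP16 through Python's `struct` format `e`. The function names and the value c = -1.9 are chosen for the illustration (c is picked so the orbit stays inside [-2, 2]); none of this comes from an actual shader.

```python
import struct

def to_half(x: float) -> float:
    """Round a Python float to the nearest IEEE half-precision (FP16) value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

def orbit(c: float, steps: int, rnd) -> list:
    """Iterate x -> x*x + c from x = 0, applying rnd() after each operation."""
    c = rnd(c)
    x = 0.0
    out = []
    for _ in range(steps):
        x = rnd(rnd(x * x) + c)
        out.append(x)
    return out

def drift(c: float, steps: int) -> list:
    """Absolute gap between the double-precision and FP16 orbits at each step."""
    full = orbit(c, steps, lambda v: v)
    half = orbit(c, steps, to_half)
    return [abs(a - b) for a, b in zip(full, half)]

# c = -1.9 is an illustrative value, not taken from any real workload.
for step, gap in enumerate(drift(-1.9, 30), start=1):
    print(step, gap)
```

The gap starts at the size of one FP16 rounding of c and is already larger after the second step, because each pass squares a value that carries the previous pass's error.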

I'd like the critics to show precisely where in a game over the next 18 months or so you would employ a shader with the complexity of a mandelbrot generator.

Dave Baumann · Dec 22, 2003

a.) Dio's example isn't even a complex case, it's basic texturing, so we don't need to look very far at all.

b.) Fairly simple lighting shaders have been shown to have errors with FP16 precision. Remember, this can be lower precision than the FX12 modes available in the original NV30.

Dio · Dec 22, 2003

radar1200gs said:
(and mandelbrots are recursive by their very nature)

Grumble, moan, O(2^n), whinge, complain
