X800 using SM3.0 path in FarCry

pat777 said:
Partial precision is not bad. In Far Cry there's no difference between FP16 and FP24. I will admit, if there's a difference in IQ in future games then FP24 over FP16 will make more of a difference than FP32 over FP24.

But if FP16 is not bad, why is Nvidia publicly telling everyone that FP24 is bad and FP32 is necessary, while privately getting developers to use FP16?

NV40 may be quite capable at FP32, but if Nvidia pays developers to only use FP16, then for all intents and purposes you'll never see FP32 out of NV40 - it will act as an FP16-only card.
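As a rough illustration of the gaps being argued over, the step size around 1.0 for each format can be worked out from the mantissa widths. This is a minimal C++ sketch, assuming the commonly quoted widths: 10 explicit mantissa bits for FP16, 16 for ATI's FP24, 23 for FP32.

Code:
// Minimal sketch: relative step size (ulp at 1.0) for FP16, FP24 and FP32,
// assuming 10, 16 and 23 explicit mantissa bits respectively.
#include <cmath>
#include <cstdio>

int main() {
    const int   bits[]  = { 10, 16, 23 };
    const char* names[] = { "FP16", "FP24", "FP32" };

    for (int i = 0; i < 3; ++i) {
        double ulp = std::ldexp(1.0, -bits[i]);  // step at 1.0 = 2^-mantissa_bits
        std::printf("%s: step at 1.0 = %g (~%.1f decimal digits)\n",
                    names[i], ulp, bits[i] * std::log10(2.0));
    }
    return 0;
}

That works out to roughly 9.8e-4 for FP16, 1.5e-5 for FP24 and 1.2e-7 for FP32; whether the middle gap or the upper gap matters more in practice is exactly what the thread is arguing about.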
 
Bouncing Zabaglione Bros. said:
pat777 said:
Partial precision is not bad. In Far Cry there's no difference between FP16 and FP24. I will admit, if there's a difference in IQ in future games then FP24 over FP16 will make more of a difference than FP32 over FP24.

But if FP16 is not bad, why is Nvidia publicly telling everyone that FP24 is bad and FP32 is necessary, while privately getting developers to use FP16?

NV40 may be quite capable at FP32, but if Nvidia pays developers to only use FP16, then for all intents and purposes you'll never see FP32 out of NV40 - it will act as an FP16-only card.
It's just marketing. I'll admit nVIDIA is lying. FP24 is enough for most/all games this year. I'm not sure about FP16.

BTW, nVIDIA told developers to use FP16 wherever possible, not everywhere. I think nVIDIA expected this generation to have effects where FP32 is absolutely necessary over FP24, while other effects are fine with FP16.
 
g__day said:
"Nvidia encouraged Crytek to use partial precision in the shaders wherever possible. This means that many of the shaders will run in 16-bit precision, not the 32-bit precision (a requirement of SM3.0) that they are touting to the press. Ironically, while promoting the 32-bit precision of SM3.0 as a "must have", Nvidia is asking developers to use less precision than was available in SM1.0 - that is, all the way back in DirectX8! Even more ironically, ATI hardware will run most of the FarCry shaders more accurately (ATI hardware runs all shaders in 24-bit precision). Microsoft, the owner of DirectX, defines 24-bit and greater as "full precision" and 16-bit as "partial precision", Nvidia has claimed that ATI paid Crytek to delay the patch and include ATI features (the figure mentioned was $500k!)."

I needed to explain to the author of that where he was rather incorrect. :LOL:
 
DaveBaumann said:
g__day said:
"Nvidia encouraged Crytek to use partial precision in the shaders wherever possible. This means that many of the shaders will run in 16-bit precision, not the 32-bit precision (a requirement of SM3.0) that they are touting to the press. Ironically, while promoting the 32-bit precision of SM3.0 as a "must have", Nvidia is asking developers to use less precision than was available in SM1.0 - that is, all the way back in DirectX8! Even more ironically, ATI hardware will run most of the FarCry shaders more accurately (ATI hardware runs all shaders in 24-bit precision). Microsoft, the owner of DirectX, defines 24-bit and greater as "full precision" and 16-bit as "partial precision", Nvidia has claimed that ATI paid Crytek to delay the patch and include ATI features (the figure mentioned was $500k!)."

I needed to explain to the author of that where he was rather incorrect. :LOL:
Let me guess, FP16 uses less precision than SM 1.0.
 
Well, he was suggesting that the actual usable number of bits in FP16 was less than DX8 precisions - but there are no actual set precisions in DX8 (he was thinking of FX12 on NV3x). He also states that FP16 is defined as partial precision whilst FP24 is full, however that statement gets a little fudged when you start thinking about the differences in SM3.0.
 
Even FX12 is never more precise than FP16. And neither was the 13-bit format in Rampage, since he mentions SM1.0.
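For anyone wondering why FX12 never beats FP16: if FX12 is taken as a 12-bit fixed-point format covering roughly [-2, 2) (the usual description of NV3x's integer path; an assumption here), its step size is a constant 2^-10, while FP16's step shrinks as values get smaller. A small C++ sketch:

Code:
// Sketch: constant step of a 12-bit fixed-point format over [-2, 2) versus
// the value-dependent step of FP16 (assumes FX12 = sign + 1.10 fixed point).
#include <cmath>
#include <cstdio>

// Approximate FP16 step at value x: 2^(floor(log2|x|) - 10), denormals ignored.
double fp16_step(double x) {
    int e;
    std::frexp(std::fabs(x), &e);          // |x| = m * 2^e, with m in [0.5, 1)
    return std::ldexp(1.0, (e - 1) - 10);
}

int main() {
    const double fx12_step = 4.0 / 4096.0; // [-2, 2) in 2^12 steps = 2^-10
    const double samples[] = { 1.5, 0.75, 0.1, 0.01 };

    for (double x : samples)
        std::printf("x = %-5g  FX12 step = %g  FP16 step = %g\n",
                    x, fx12_step, fp16_step(x));
    return 0;
}

Within FX12's range the FP16 step is never larger, which is the point being made above.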
 
Ruined said:
Trying to run the new shaders on x800 probably causes some major graphical glitches, missing graphics, or some other anomaly ixbt didn't detect... If a Russian website could do it, obviously CryTek could have done it too.

And the evidence is where?
OR AT LEAST SOMETHING TO SUPPORT YOUR CLAIM?!!??!?!
 
Unit01 said:
Ruined said:
Trying to run the new shaders on x800 probably causes some major graphical glitches, missing graphics, or some other anomaly ixbt didn't detect... If a Russian website could do it, obviously CryTek could have done it too.

And the evidence is where?
OR AT LEAST SOMETHING TO SUPPORT YOUR CLAIM?!!??!?!
Hey, it's Ruined...it's not like he needs "evidence" or anything, he just knows! ;)
 
DaveBaumann said:
Well, he was suggesting that the actual usable number of bits in FP16 was less than DX8 precisions - but there are no actual set precisions in DX8 (he was thinking of FX12 on NV3x). He also states that FP16 is defined as partial precision whilst FP24 is full, however that statement gets a little fudged when you start thinking about the differences in SM3.0.

I did think that these statements were rather odd.

Isn't Pixel Shader 1.4 defined by an integer16 format, but not an FP16 format? Or is that how ATI does it?
 
ChrisRay said:
Isn't Pixel Shader 1.4 defined by an integer16 format, but not an FP16 format? Or is that how ATI does it?
PS1.4 defines a minimum range ([-8, 8]) for certain values, but no precision (although 256 values from 0 to 1 is generally considered minimum).
 
Xmas said:
ChrisRay said:
Isn't Pixel Shader 1.4 defined by an integer16 format, but not an FP16 format? Or is that how ATI does it?
PS1.4 defines a minimum range ([-8, 8]) for certain values, but no precision (although 256 values from 0 to 1 is generally considered minimum).

Maybe I'm confused, but at least on the FX5800 didn't Nvidia use INT12 for PS1.4, which in their implementation only had a range of -2 to 2, thus violating the only requirement you put on it? :? :p
 
dan2097 said:
Xmas said:
ChrisRay said:
Isn't Pixel Shader 1.4 defined by an integer16 format, but not an FP16 format? Or is that how ATI does it?
PS1.4 defines a minimum range ([-8, 8]) for certain values, but no precision (although 256 values from 0 to 1 is generally considered minimum).

Maybe I'm confused, but at least on the FX5800 didn't Nvidia use INT12 for PS1.4, which in their implementation only had a range of -2 to 2, thus violating the only requirement you put on it? :? :p

No, I believe you are correct. The 5800 did Shader 1.4 like that, unless that's been changed with later driver revisions. I think the 5900 does Shader 1.4 in FP16.
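The practical effect of those range limits is simple to picture: anything outside the representable range just gets clamped before the next operation sees it. A toy C++ sketch (3.5 is an arbitrary overbright intermediate, not taken from any real shader):

Code:
// Toy sketch: the same intermediate value clamped to the PS1.4 minimum range
// [-8, 8] versus the narrower [-2, 2] range discussed above.
#include <algorithm>
#include <cstdio>

double clamp_to(double x, double lo, double hi) {
    return std::min(std::max(x, lo), hi);
}

int main() {
    double intermediate = 3.5;  // e.g. several light contributions added together

    double wide   = clamp_to(intermediate, -8.0, 8.0); // survives intact
    double narrow = clamp_to(intermediate, -2.0, 2.0); // clamped to 2.0

    std::printf("[-8, 8]: %g   [-2, 2]: %g\n", wide, narrow);
    return 0;
}

Any math that follows sees 2.0 instead of 3.5 on the narrow-range path, which is the kind of difference that can become visible.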
 
Oh come on people!

Do you work with C or C++?! You have float and double. You use float most of the time and double when you need precision. NVIDIA's hardware works exactly like that, but with a half and a float. ATI only offers one format (something in between half and float). You use half on NVIDIA's hardware exactly like you use float in C: the precision is enough and it is faster. If the precision is not enough, use double (or float on the GPU...). What is the big deal about NVIDIA telling developers to use FP16? Isn't it obvious? It runs faster, so why not use it if it has enough precision?
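For what it's worth, the C-side analogy looks something like this (a contrived sketch: a one-off blend where float is plenty, next to a long accumulation where double visibly helps):

Code:
// Contrived sketch of the float/double trade-off being described:
// a single blend is fine in float; a long accumulation shows float drifting.
#include <cstdio>

int main() {
    // One-off operation: float precision is plenty for an 8-bit result.
    float a = 0.25f, b = 0.75f, t = 0.5f;
    float blended = a * (1.0f - t) + b * t;   // exactly 0.5

    // Accumulating a million small steps: float drifts, double (visibly) does not.
    float  sum_f = 0.0f;
    double sum_d = 0.0;
    for (int i = 0; i < 1000000; ++i) {
        sum_f += 0.1f;
        sum_d += 0.1;
    }

    std::printf("blend      = %f\n", blended);
    std::printf("float sum  = %f  (expected 100000)\n", sum_f);
    std::printf("double sum = %f\n", sum_d);
    return 0;
}

The GPU argument is the same shape: pick the cheaper type when the result can't show the difference.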
 
Sigma said:
Oh come on people!

Do you work with C or C++?! You have float and double. You use float most of the time and double when you need precision. NVIDIA's hardware works exactly like that, but with a half and a float. ATI only offers one format (something in between half and float). You use half on NVIDIA's hardware exactly like you use float in C: the precision is enough and it is faster. If the precision is not enough, use double (or float on the GPU...). What is the big deal about NVIDIA telling developers to use FP16? Isn't it obvious? It runs faster, so why not use it if it has enough precision?

I tend to use whichever data type performs best on whichever CPU I'm compiling code for, whether it be 8-bit PICs, 68ks, SH4 (love floats, hate doubles), or x86, and then make the algorithm fit within the limits of the chosen precision as best as possible.

I can't see how that would work too well when one vendor has two formats and the other has one sitting in between. I guess we needed another big vendor to help decide a majority winner. :LOL:
 
Sigma said:
Oh come on people!

Do you work with C or C++?! You have float and double. You use float most of the time and double when you need precision. NVIDIA's hardware works exactly like that, but with a half and a float. ATI only offers one format (something in between half and float). You use half on NVIDIA's hardware exactly like you use float in C: the precision is enough and it is faster. If the precision is not enough, use double (or float on the GPU...). What is the big deal about NVIDIA telling developers to use FP16? Isn't it obvious? It runs faster, so why not use it if it has enough precision?

Not quite a correct analogy. For it to be correct you'd have to take into account the following, with respect to DX9 ShaderModel 2.x:

1) Microsoft specified float to be less than FP24.
2) Microsoft specified double to be at least FP24.
3) ATI offers float as FP24.
4) ATI offers double as FP24, same performance as float.
5) Nvidia offers float as FP16, not as accurate as ATI's float, sometimes not even as fast as ATI's float.
6) Nvidia offers double as FP32, more accuracy than ATI's float/double, but significantly slower than ATI's float/double.
7) Nvidia attempts to pass FX16 (FX12?) off as float or double.

Why does everyone gravitate towards Nvidia's solution being the 'correct' one, when the specs do not indicate it as such? If anyone is offering oddball formats, in regards to DX 9 SM2.x, then it would be Nvidia, not ATI.
 
float is a storage optimization. On CPUs, you use float to optimize memory and disk footprint, but speed-wise, the precision will be whatever the FPU runs fastest at (usually double nowadays).

On GPUs, float serves two purposes:

1) it frees up memory buffers on chip so that more context state can be saved per pixel (e.g. # of temporary registers)

2) some operations are float-specific (i.e. low precision only), for example the "free norm" operation. Since normalization doesn't really need more than FP16, this is fine.

Nvidia is correct IMHO to say "use the minimum precision necessary to do the job". Rules for floating point operations are very strict, and the compiler has its hands tied unless the developer can assist it. This is the same as in ANSI C/Fortran. Any extra semantic information given by the programmer to the compiler is information that can be used for optimization.

For numerical programming, it's good to have strongly typed languages IMHO.
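On the "normalization doesn't really need more than FP16" point, a quick C++ sketch: the inputs and intermediates are squeezed through a crude half-precision rounding (mantissa cut to roughly 11 significant bits; exponent range and denormals ignored) and the normalized result is compared against a double-precision reference.

Code:
// Sketch: normalize a vector with values rounded to ~FP16 precision and
// compare against a double-precision normalize.
#include <cmath>
#include <cstdio>

// Round x to roughly 11 significant bits as a stand-in for half precision
// (exponent range and denormals ignored).
double round_to_fp16ish(double x) {
    if (x == 0.0) return 0.0;
    int e;
    std::frexp(x, &e);                       // |x| = m * 2^e, with m in [0.5, 1)
    double scale = std::ldexp(1.0, 11 - e);  // keep bits down to 2^(floor(log2|x|) - 10)
    return std::round(x * scale) / scale;
}

int main() {
    const double v[3] = { 0.3, -0.7, 0.648 };

    // Double-precision reference length.
    double len = std::sqrt(v[0]*v[0] + v[1]*v[1] + v[2]*v[2]);

    // Same thing with the inputs and intermediates rounded to ~FP16
    // (the final divide is left in double for simplicity).
    double h[3], hlen2 = 0.0;
    for (int i = 0; i < 3; ++i) { h[i] = round_to_fp16ish(v[i]); hlen2 += h[i] * h[i]; }
    double hlen = round_to_fp16ish(std::sqrt(round_to_fp16ish(hlen2)));

    for (int i = 0; i < 3; ++i)
        std::printf("component %d: %+.6f (fp64)  %+.6f (~fp16)\n",
                    i, v[i] / len, h[i] / hlen);
    return 0;
}

The per-component error lands well under 1/255, i.e. below anything an 8-bit framebuffer can show, which is why a low-precision "free" normalize is an easy win.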
 
BRiT said:
Not quite a correct analogy. For it to be correct you'd have to take into account the following, with respect to DX9 ShaderModel 2.x:

1) Microsoft specified float to be less than FP24.
2) Microsoft specified double to be at least FP24.
3) ATI offers float as FP24.
4) ATI offers double as FP24, same performance as float.
5) Nvidia offers float as FP16, not as accurate as ATI's float, sometimes not even as fast as ATI's float.
6) Nvidia offers double as FP32, more accuracy than ATI's float/double, but significantly slower than ATI's float/double.
7) Nvidia attempts to pass FX16 (FX12?) off as float or double.

Why does everyone gravitate towards Nvidia's solution being the 'correct' one, when the specs do not indicate it as such? If anyone is offering oddball formats, in regards to DX 9 SM2.x, then it would be Nvidia, not ATI.

Well, the documentation is quite clear: the half type is a 16-bit floating point type, the float type is a 32-bit floating point type, and the double type is a 64-bit floating point type (see "Scalar Types" under "HLSL -> Data Types").

Secondly, it says the following:

MSDN said:
Not all target platforms have native support for half or double values. If the target platform does not, these will be emulated using float. Intermediate results of floating point expressions may be evaluated at a precision higher than the operands or the result.

So floats should be 32 bits, halves can be upgraded to 32 bits if the hardware has no native support, and doubles can be downgraded to 32 bits if the hardware has no native support.

Intermediate results may be at a higher precision than requested.

So, the documentation says that type float may not be less than 32 bits, so FP24 doesn't cut it according to the documentation.

BUT, Microsoft seems to be very unclear in their documentation. I have never found the minimum required precisions (without hints) in PS2.0, for instance. The FP24 minimum comes from an e-mail as far as I know.



DemoCoder said:
float is a storage optimization. On CPUs, you use float to optimize memory and disk footprint, but speed-wise, the precision will be whatever the FPU runs fastest at (usually double nowadays).
You are right about the storage/bandwidth optimization of using floats instead of doubles. However, operations on a float tend to take fewer cycles than on a double. So there are two optimizations, not one as you say.
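The "intermediate results may be evaluated at a precision higher than the operands" clause quoted above has a direct C analogue. A small sketch; whether a compiler actually widens intermediates like this depends on the target and flags, so the explicit casts force the issue here:

Code:
// Sketch of "intermediate results may be evaluated at a higher precision":
// the same float expression, once with a float intermediate product and once
// with the product widened to double before the single final rounding.
#include <cstdio>

int main() {
    float a = 1.0e-4f, b = 3.0e4f, c = -3.0f;

    float all_float = a * b + c;                            // product typically rounded to float first
    float widened   = (float)((double)a * (double)b + c);   // one rounding at the very end

    std::printf("float intermediates : %.9g\n", all_float);
    std::printf("double intermediate : %.9g\n", widened);
    return 0;
}

The storage type at the end is float either way; the only freedom is in how precisely the middle of the expression is carried, which is all the MSDN note is reserving.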
 
The original PS2.0 specification stated FP24 as minimum for full precision. Interestingly enough, though, the HLSL documentation lists three floating point types and their precisions: half (FP16), float (FP32), and double (FP64).
 
Ostsol said:
The original PS2.0 specification stated FP24 as minimum for full precision. Interestingly enough, though, the HLSL documentation lists three floating point types and their precisions: half (FP16), float (FP32), and double (FP64).
That is indeed what I just typed (in a longer version ;)). The problem with the MSDN documentation is that if a programmer naively uses that information and depends on float being 32 bits, it will become a problem on ATI hardware. They should have made clear in the documentation that a float is AT LEAST 24 bits.
 