X800 texture shimmering: FarCry video

DaveBaumann said:
kihon said:
So what was the conclusion of this? Is only the X800 showing artifacts, or are they there in the 3xx series too? What about the NV40 (running DX9 path)?

Has this been submitted to ATI yet?

AFAIK ATI have looked at Far Cry and this appears to be an application specific issue. As a result of tEd's other thread they have identified an issue with Max Payne and I believe it is being addressed.

Good to hear - but are there any links to show this? (i.e. any posts from ATI?)

Thanks again.
 
radar1200gs said:
However, in the case of a DX9 pixel shader without _PP hinting, the data enters the chip in FP32 format and is truncated in hardware by the GPU to FP24.

No. The "data" is internally generated. Unless you are dealing with float textures, the data going in is a bunch of vertex information - the pixel information is generated internally and the command is only giving a hint as to which precision it should be calculated.
 
radar1200gs said:
Just for your information, DX9's native precision isn't FP24 - data is handed to R300 as FP32 if the dev hasn't explicitly specified another format, and the R300 then internally truncates FP32 to FP24 (in my book that's full-time _PP with no user choice).

And no, FX chips do not replace FP24 (or any other format) shaders with FP16 unless they are preceded by a _PP hint.

And guess what, the _PP (partial precision) hint is a valid, compliant part of DX9.

FYI, fp24 is the baseline in DX9 for full precision. Anything fp24 or > is "full precision" under the API. Anything fp below fp24 is considered by the API to be partial precision.

When nVidia encodes its drivers to knock everything back to fp16 in 3d gaming, that's no less "truncating" than ATi running fp24--except that fp24 is a higher rendering precision than fp16, right?

Putting it another way, what's the purpose for fp32 and fp16? I mean, why doesn't nVidia simply offer fp32 to the exclusion of fp16, as ATi offers fp24 to the exclusion of fp16?

Answer: nVidia's fp32 implementation doesn't perform well enough to allow nVidia to drop support for fp16 in 3d gaming; unlike the case for ATi's fp24, which performs at least as well as nVidia's fp16, but is higher precision at the same time (and needs no lower-precision fallback.)

You've written in many posts how you think fp16 is just fine, and that there's nothing wrong with it. Obviously, then, you simply cannot argue with fp24 as your own statements undercut any basis for doing so...;)
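To make the truncation comparison concrete, here's a rough Python sketch - a toy model of my own, not any vendor's actual hardware path - that just rounds a value to the mantissa widths usually quoted for these formats (10 bits for fp16, 16 for ATi's fp24, 23 for fp32) and compares the error:

Code:
# Toy rounding model, ignoring exponent range and denormals.
import math

def round_to_mantissa(x, mantissa_bits):
    """Round x to 1 + mantissa_bits significant binary digits."""
    if x == 0.0:
        return 0.0
    exp = math.floor(math.log2(abs(x)))   # exponent of the leading bit
    scale = 2.0 ** (exp - mantissa_bits)  # value of the last kept mantissa bit
    return round(x / scale) * scale

value = 0.123456789
for name, bits in (("fp16", 10), ("fp24", 16), ("fp32", 23)):
    q = round_to_mantissa(value, bits)
    print(f"{name}: {q:.9f}  error = {abs(q - value):.2e}")

The fewer mantissa bits you keep, the bigger the rounding step - which is all that "knocking back to fp16" versus "running at fp24" amounts to in this comparison.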
 
Then why would the ATi employees explain it differently?

(the explanation i'm thinking of is by one of them made in this forum a fair while back now, probably shortly after the 3dmark03 cheating issue blew up, don't remember the exact thread though).
 
radar1200gs said:
Then why would the ATi employees explain it differently?

(the explanation i'm thinking of is by one of them made in this forum a fair while back now, probably shortly after the 3dmark03 cheating issue blew up, don't remember the exact thread though).

They didn't. What I suspect is that you understood it differently....;)
 
FYI, fp24 is the baseline in DX9 for full precision. Anything fp24 or > is "full precision" under the API. Anything fp below fp24 is considered by the API to be partial precision.

The baseline for full precision in DX9 is dependent on the Shader Model. In SM2.0, fp16 is considered partial with fp24 and fp32 being full precision. In SM3.0, however, only fp32 is considered full precision and fp16 and fp24 are both partial. Thus, in ATI's implementation of it (SM2.0), fp24 is always full precision.
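If it helps, that rule can be written down in a couple of lines of Python - purely a restatement of the paragraph above, not anything pulled from the DirectX headers:

Code:
# Restating the SM2.0 / SM3.0 full-precision thresholds described above.
def precision_class(shader_model, fp_bits):
    full_threshold = 24 if shader_model == "2.0" else 32   # SM3.0 raises the bar to fp32
    return "full" if fp_bits >= full_threshold else "partial"

for sm in ("2.0", "3.0"):
    for bits in (16, 24, 32):
        print(f"SM{sm} fp{bits}: {precision_class(sm, bits)}")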
 
ninelven said:
FYI, fp24 is the baseline in DX9 for full precision. Anything fp24 or > is "full precision" under the API. Anything fp below fp24 is considered by the API to be partial precision.

The baseline for full precision in DX9 is dependent on the Shader Model. In SM2.0, fp16 is considered partial with fp24 and fp32 being full precision. In SM3.0, however, only fp32 is considered full precision and fp16 and fp24 are both partial. Thus, in ATI's implementation of it (SM2.0), fp24 is always full precision.

Right. So what's the problem? ATI doesn't support SM3.0; only nVidia does.
 
ninelven said:
No problem, I thought it was self explanatory... just a clarification.
Okay :) I may have read it wrong. I thought you were trying to say that under SM3.0 it's fp32 and ATI doesn't do it. But I was like, um, ATI doesn't do SM3.0.
 
ninelven said:
The baseline for full precision in DX9 is dependent on the Shader Model. In SM2.0, fp16 is considered partial with fp24 and fp32 being full precision. In SM3.0, however, only fp32 is considered full precision and fp16 and fp24 are both partial. Thus, in ATI's implementation of it (SM2.0), fp24 is always full precision.

So much simpler to just do away with fp16 completely, seems to me...;) Much more logical to make ps3.0 "full precision only," from fp24 up, just like with sm2.0, but drop pp (fp16) precision completely from sm3.0. That way your baseline for full precision would remain constant in the DX9 version of the API, and all of its SM's. Sounds unduly complex as you've described it.
 
Well Walt, hopefully in DX10 or DX Next or whatever, fp32 is the minimum. Not for nothing, but fp24 and fp16 pp hints are nice and all, and were most likely needed last gen and this gen; hopefully the R500 and NV50 won't need such a crutch.
 
WaltC said:
So much simpler to just do away with fp16 completely, seems to me...;)
FP16 with its 10-bit mantissa precision is still useful for indexing and traversing textures up to 1024x1024, or even counting up to 1024. If there is some performance advantage to using FP16 over FP24/32, it's still useful to have.
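As a rough check of that figure (using IEEE-754 half via numpy's float16, which may not match the hardware FP16 formats of the day exactly), whole texel indices do round-trip cleanly in that range:

Code:
import numpy as np

def survives_roundtrip(i):
    # does the whole texel index i come back unchanged from a half float?
    return int(np.float16(i)) == i

print(all(survives_roundtrip(i) for i in range(1025)))      # True: 0..1024 are exact
print(survives_roundtrip(2048), survives_roundtrip(2049))   # True False: exactness ends at 2^11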

WaltC said:
Much more logical to make ps3.0 "full precision only," from fp24 up, just like with sm2.0, but drop pp (fp16) precision completely from sm3.0. That way your baseline for full precision would remain constant in the DX9 version of the API, and all of its SM's. Sounds unduly complex as you've described it.

My head is hurting now. Is FP24 full or partial precision under PS3.0?
 
jvd said:
Well Walt, hopefully in DX10 or DX Next or whatever, fp32 is the minimum. Not for nothing, but fp24 and fp16 pp hints are nice and all, and were most likely needed last gen and this gen; hopefully the R500 and NV50 won't need such a crutch.

The thing is though that I don't see how fp32 is any more of a maximum ceiling than fp24 or fp16 or i24 & a8, in terms of color precision in the pipeline. I know we're all in love with "32" as a number....;) But I don't see it as being any less of a "crutch" at some point than people consider fp16 to be today. I don't see fp24 as a crutch, but rather a "sweet spot" in terms of fp color precision that allows ATi to avoid fp16. It's right smack in the center of fp16 and fp32, providing precision benefits > fp16 but performance very similar to fp16, while not suffering from the performance limits of fp32 technology implementations to date. I.e., it's because nV40 doesn't have fp24 support that it requires fp16--or, rather, that nVidia feels fp16 support is necessary.

My point was that for the DX*9* version of D3d, maintaining the fp24 baseline for full precision throughout the SM's, but dropping off pp for SM3.0, just would seem to be more logically consistent throughout the structure of the API. Of course I'd expect to see the definition of "full precision" change with DX10--and maybe change again with DX11, and so forth, as the hardware gets better and better. I mean, if for 3d-gaming nV40's going to have to drop to fp16 to maintain adequate performance levels with 3.0, then I can't see the SM3.0 point in terms of color precision, as SM 2.0b+ running at fp24 is still going to offer more of it.
 
Well, we can't stay with fp16 pp forever. We need to move away from that. fp24 was great for this generation, but we know its limits come much sooner than fp32's. Of course higher is going to be better, and I think fp32 is a good baseline for DX Next, with higher precisions being wanted and welcome.

I'm hoping that with the next gen the NV50 and R500 are once again 3 or 4 times faster in all aspects, like this gen was over the previous.

Thus, with 4x the performance of the R420 and NV40, we shouldn't need to use fp16_pp. We shouldn't need fp24 as a middle ground. It should be full fp32 at all times, if not greater.
 
Putting it another way, what's the purpose for fp32 and fp16? I mean, why doesn't nVidia simply offer fp32 to the exclusion of fp16, as ATi offers fp24 to the exclusion of fp16?

Answer: nVidia's fp32 implementation doesn't perform well enough to allow nVidia to drop support for fp16 in 3d gaming; unlike the case for ATi's fp24, which performs at least as well as nVidia's fp16, but is higher precision at the same time (and needs no lower-precision fallback.)

NV uses mixed precision modes because it is advisable to use as low a precision as possible as long as no visual anomalies or errors are present. NV's decision to use mixed FP16/FP32 precision makes sense in this light. Also, I believe that there are some instances where FP24 will result in errors vs FP32. Keep in mind that the entire industry is moving towards FP32. FP24 will not be the standard moving forward for DirectX 9.0c and beyond. I believe that FP24 was a very practical choice for the R3xx cards given the general performance limitations of the hardware in that generation, but in the future FP24 will be considered partial precision. Apparently, the NV40 should have FP32 performance that approaches the FP16 performance once the compiler is more up to speed. But again, in any situation where the higher precision is not needed and no visual errors are present, it makes sense to use lower precision.
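For anyone wondering where those FP24-vs-FP32 errors come from: rounding error compounds over a chain of dependent operations. A quick toy Python model (not real GPU arithmetic - it just re-rounds after every multiply) shows the drift growing as the mantissa shrinks:

Code:
import math

def round_to_mantissa(x, mantissa_bits):
    # round x to 1 + mantissa_bits significant binary digits
    if x == 0.0:
        return 0.0
    exp = math.floor(math.log2(abs(x)))
    scale = 2.0 ** (exp - mantissa_bits)
    return round(x / scale) * scale

def chained_mul(mantissa_bits, steps=100):
    # multiply repeatedly, rounding to the given mantissa width after every op
    x = 0.9137
    for _ in range(steps):
        x = round_to_mantissa(x * 1.0173, mantissa_bits)
    return x

reference = chained_mul(52)   # effectively double precision
for name, bits in (("fp16", 10), ("fp24", 16), ("fp32", 23)):
    print(f"{name}: drift from fp64 = {abs(chained_mul(bits) - reference):.2e}")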
 
WaltC said:
The thing is though that I don't see how fp32 is any more of a maximum ceiling than fp24 or fp16 or i24 & a8, in terms of color precision in the pipeline.
I'm not sure I get your drift here, as FP128 > FP96. We're not so much talking about color precision as we're talking about rounding errors.

I know we're all in love with "32" as a number....;) But I don't see it as being any less of a "crutch" at some point than people consider fp16 to be today.
That's a bit of a stretch. The whole point is that FP32 delays getting its crutches longer than FP16 or FP24.

I don't see fp24 as a crutch, but rather a "sweet spot" in terms of fp color precision that allows ATi to avoid fp16. It's right smack in the center of fp16 and fp32, providing precision benefits > fp16 but performance very similar to fp16, while not suffering from the performance limits of fp32 technology implementations to date. I.e., it's because nV40 doesn't have fp24 support that it requires fp16--or, rather, that nVidia feels fp16 support is necessary.
From what I've read, FP24 doesn't so much provide precision and performance benefits over FP16 as it does save transistor usage and engineering headaches over FP32.

It would appear that the move to FP32 as a standard, as well as the move to loops and branches, is to make GPU programming more similar to CPU programming. I don't see where FP24 fits into that.
 
Well, I'm fairly certain that under SM3.0 FP24 is still full precision to all the end users. In a couple of months it probably won't be, because DX9.0c will be out.
 
Evildeus said:
No, SM3.0 requires FP32 as full precision. It would be interesting to see when ATi makes the move.

Can you show me a single document from MS prior to DX9.0c that says this? Anyone using the DX9.0b SDK or DDK wouldn't have been forced to conform to that 32-bit FP.
 