x800 texture shimmering: FarCry video

No I don't (and don't want to search); now can you provide me a link where FP24 is the baseline for SM 3.0 in DX9.0a/b? ;)
 
Pete said:
That's a bit of a stretch. The whole point is that FP32 delays getting its crutches longer than FP16 or FP24.

The whole point of fp32 is simply that it's today's accepted standard for all sorts of computation. Of course we can go higher: 64- and 80-bit fp are supported, and others are doable manually, but fp32 is the standard numerical type on any PC, and on most other systems too.
As is the 32-bit integer.

Some of that is changing with 64-bit systems, but fp in general stays at 32 bits. It's what we play games with, and what we do most of the precalculation of level data with. fp32 is good enough; otherwise, we would have huge graphical glitches in all games.
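
To put a number on "fp32 is the standard numerical type": in C on a typical PC, float is that 32-bit format and double the 64-bit one. A minimal check, assuming an IEEE 754 platform (which is essentially every PC):

```c
#include <stdio.h>
#include <float.h>

int main(void)
{
    /* On an IEEE 754 platform, float is the 32-bit single-precision format
       and double the 64-bit one; FLT_DIG/DBL_DIG give the roughly-safe
       number of decimal digits each can carry. */
    printf("float:  %u bits, ~%d decimal digits, epsilon %g\n",
           (unsigned)(sizeof(float) * 8), FLT_DIG, (double)FLT_EPSILON);
    printf("double: %u bits, ~%d decimal digits, epsilon %g\n",
           (unsigned)(sizeof(double) * 8), DBL_DIG, DBL_EPSILON);
    return 0;
}
```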
 
Evildeus said:
No I don't (and don't want to search); now can you provide me a link where FP24 is the baseline for SM 3.0 in DX9.0a/b? ;)

Unfortunately I don't have access to the DDK, or else I'd show you the baseline for full precision in pixel shaders in DX9.0 (which wasn't changed in a or b).
 
davepermen said:
Pete said:
That's a bit of a stretch. The whole point is that FP32 delays getting its crutches longer than FP16 or FP24.

The whole point of fp32 is simply that it's today's accepted standard for all sorts of computation. Of course we can go higher: 64- and 80-bit fp are supported, and others are doable manually, but fp32 is the standard numerical type on any PC, and on most other systems too.
As is the 32-bit integer.

Some of that is changing with 64-bit systems, but fp in general stays at 32 bits. It's what we play games with, and what we do most of the precalculation of level data with. fp32 is good enough; otherwise, we would have huge graphical glitches in all games.

Not only that, but FP16 & FP32 (like words and longwords) are easy for computer systems to transport from place to place and easy to manipulate in registers. This is why, in 2D, 32 bits per pixel became preferred over 24 bits despite the overhead: it's easier to shift and operate on one longword (or one word for high color) than it is to fiddle with 3 bytes.
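
To put the "3 bytes vs. one longword" point in concrete terms, here's a rough C sketch (hypothetical framebuffer accessors, not anybody's actual driver code):

```c
#include <stdint.h>

/* 32bpp surface: pixel i is one aligned 32-bit load. */
uint32_t get_pixel_32bpp(const uint32_t *fb, int i)
{
    return fb[i];
}

/* 24bpp surface: pixel i takes three byte loads plus shifts to reassemble,
   and half of the pixels straddle a natural 32-bit boundary. */
uint32_t get_pixel_24bpp(const uint8_t *fb, int i)
{
    const uint8_t *p = fb + i * 3;
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) | ((uint32_t)p[2] << 16);
}
```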
 
radar1200gs said:
Not only that, but FP16 & FP32 (like words and longwords) are easy for computer systems to transport from place to place and easy to manipulate in registers. This is why, in 2D, 32 bits per pixel became preferred over 24 bits despite the overhead: it's easier to shift and operate on one longword (or one word for high color) than it is to fiddle with 3 bytes.

That's just so wrong. It completely depends on what the architecture is targeted at and what you want to do with it.
 
It isn't wrong. Name me almost any consumer-oriented computer system from the Altair on up and you will see that data is mainly transported as single bytes, words, or longwords, whichever happens to fit the need at hand best.
 
You're both right. In a general-purpose environment, 32-bit alignment (or even higher) is a general performance win.

PCs all like power-of-two alignment, which is why we run in 32-bit modes all the time (Windows doesn't really bother with the alpha part..).

On the other hand, dedicated hardware can of course define its alignment however it wants, so internally ATI can use 24-bit alignment at full speed.

But at the interface level, they have to fall back to the standards of the PC => 32-bit aligned values.
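
As a purely illustrative sketch of what "falling back to 32 bits at the interface" can mean, here's a widening of an FP24 bit pattern (assuming the commonly cited 1-sign/7-exponent/16-mantissa layout with bias 63) into a standard IEEE 754 FP32 pattern. The function name is made up, special cases are skipped, and this is not ATI's actual hardware path:

```c
#include <stdint.h>

/* Widen a hypothetical FP24 bit pattern (s1.e7.m16, exponent bias 63)
   to an IEEE 754 FP32 bit pattern (s1.e8.m23, exponent bias 127).
   Zero, denormals, Inf and NaN are ignored for brevity. */
uint32_t fp24_to_fp32_bits(uint32_t fp24)
{
    uint32_t sign = (fp24 >> 23) & 0x1u;
    uint32_t exp  = (fp24 >> 16) & 0x7Fu;   /* 7-bit exponent  */
    uint32_t mant =  fp24        & 0xFFFFu; /* 16-bit mantissa */

    uint32_t exp32 = exp + (127u - 63u);    /* rebias 63 -> 127 */
    return (sign << 31) | (exp32 << 23) | (mant << 7);
}
```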
 
radar1200gs said:
It isn't wrong. Name me almost any consumer-oriented computer system from the Altair on up and you will see that data is mainly transported as single bytes, words, or longwords, whichever happens to fit the need at hand best.

I wasn't saying that you were wrong about the data being transported in bytes, just that on a graphics card it's better to be aligned on 16- or 32-bit boundaries. However, if you really wanted to, you could create a custom architecture with a byte size of 10 bits; it wouldn't be compatible with much, but you could.

The reason 2D cards went to 32-bit was to add a full-colour 24-bit mode with an 8-bit alpha channel (ARGB8888). There is no speed benefit internally from going from 24-bit to 32-bit; all that would be required is a memory controller optimised for that specific byte alignment.
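
For reference, a quick sketch of the ARGB8888 packing being described, pulling the four channels out of one 32-bit word (names are illustrative):

```c
#include <stdint.h>

/* ARGB8888: alpha in the top byte, then red, green, blue. */
void unpack_argb8888(uint32_t px,
                     uint8_t *a, uint8_t *r, uint8_t *g, uint8_t *b)
{
    *a = (uint8_t)(px >> 24);
    *r = (uint8_t)(px >> 16);
    *g = (uint8_t)(px >>  8);
    *b = (uint8_t)(px);
}
```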
 
jimmyjames123 said:
NV uses mixed precision modes because it is advisable to use as low a precision as possible as long as no visual anomalies or errors are present. NV's decision to use mixed FP16/FP32 precision makes sense in this light. Also, I believe that there are some instances where FP24 will result in errors vs FP32.

Just as there are instances where fp16 will show rendering errors when fp24 will not. My point is that while fp16 makes sense for nVidia's architecture, because its fp32 isn't fast enough, ATi's fp24 implementation is at least as fast as nV's fp16, which is why the ATi architecture doesn't need an fp16 fallback. IE, fp24 kills two birds with one stone--precision (more of it than fp16) and performance (at least equal to fp16, and faster than fp32).
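
A contrived but concrete example of that kind of case, simulating each format's mantissa width in plain C. Assumptions: fp16 carries 10 explicit mantissa bits, fp32 carries 23, and fp24 carries 16 (the layout commonly attributed to R3x0); exponent range is ignored and the helper name is made up. A value such as a texel coordinate of 2048.5 survives fp24 and fp32 rounding but not fp16:

```c
#include <stdio.h>
#include <math.h>

/* Round x to a float format with 'mant_bits' explicit mantissa bits
   (plus the implicit leading 1). Exponent range and denormals are ignored,
   so this only models the mantissa, which is what matters here. */
static double round_to_mantissa(double x, int mant_bits)
{
    int e;
    double m = frexp(x, &e);                  /* x = m * 2^e, 0.5 <= m < 1 */
    double scale = ldexp(1.0, mant_bits + 1); /* +1 for the implicit bit   */
    return ldexp(floor(m * scale + 0.5) / scale, e);
}

int main(void)
{
    double v = 2048.5;  /* e.g. a texel coordinate into a large texture */
    printf("exact: %.1f\n", v);
    printf("fp16 (10-bit mantissa): %.1f\n", round_to_mantissa(v, 10)); /* 2048.0 */
    printf("fp24 (16-bit mantissa): %.1f\n", round_to_mantissa(v, 16)); /* 2048.5 */
    printf("fp32 (23-bit mantissa): %.1f\n", round_to_mantissa(v, 23)); /* 2048.5 */
    return 0;
}
```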

Keep in mind that the entire industry is moving towards FP32.

What I'm saying is that at some point in the future fp32 will most likely be exceeded by fp48, fp64, etc. Color rendering precision in the pipeline is a moving target--there is no absolute ceiling for it--fp32 is no more that ceiling than fp24 is. The target has been moving since 3dfx shipped the V1. Current ceilings are dictated by architecture, which is in turn shaped by yield projections based on current manufacturing processes, and by how those factors are weighed by the individual IHVs.

FP24 will not be the standard moving forwards for DirectX 9.0c and beyond.

But for DX9, which is what I'm talking about, fp24 is the baseline for full precision, regardless of how that baseline may move up in DX10, 11, etc., in the future. If for DX9.0b the baseline for sm2.0 full precision is fp24, then in DX9.0c the "partial" precision fp24 in sm3.0 is the same level of precision considered "full" in DX9.0c sm2.0, which strikes me as an inconsistency--if that in fact is the way that ps3.0 will be implemented by M$ in DX9.0c. I just don't see any point whatsoever in changing the terminology--because that's what you'd be doing, and that's all you'd be doing.

IE, it isn't relevant to ps3.0 under DX9 to change the baseline of "full precision" in the shader model from fp24 to fp32, since the distinguishing features of sm3.0 under DX9 compared to sm2.0 are features *other* than color precision in the pipeline (M$ has already established fp24 as the baseline for full precision in DX9 sm2.0, which ATi meets with fp24 and nVidia meets with fp32; of course, all of this will change with DX10 and beyond, no question). So the actual hardware color precision levels are still the same for nV40 and R420 under all of DX9, regardless of sm, as they were for R3x0 and nV3x under sm2.0--fp16/32 for nVidia, fp24 for ATi. I can only see that we're *calling* them something different under 3.0 than we did under 2.0, which strikes me as inconsistent. But that's just my opinion...;)

I believe that FP24 was a very practical choice for the R3xx cards given the general performance limitations of the hardware in that generation, but in the future FP24 will be considered partial precision.

Even though I'm trying to stay somewhat in the present with DX9, I can only say that it's also possible that at some point in the future fp32 will be considered "partial precision." I don't think it is any wiser to cap color precision absolutely at fp32 than it is to cap it at fp24...;)

Apparently, the NV40 should have FP32 performance that approaches the FP16 performance once the compiler is more up to speed. But again, in any situation where the higher precision is not needed and no visual errors are present, then it makes sense to use lower precision.

OK, but if, as you say, "NV40 should have FP32 performance that approaches the FP16 performance once the compiler is more up to speed," then fp16 really won't be needed anymore, for the same reason ATi doesn't need it with fp24...;) As I said, nVidia obviously thinks they will need it, and I don't expect to see it dropped from nV's architecture until they think they no longer do.
 
Pete said:
I'm not sure I get your drift here, as FP128 > FP96. We're not so much talking about color precision as we're talking about rounding errors.

Rounding errors relative to color precision, though. fp16 is R16,G16,B16,A16, or roughly "64-bit," as compared to the older integer "32-bit" scale of R8,G8,B8,A8, which in turn compares with i16 as R4,G4,B4,A4 = 16 bits, etc. FP24 is R24,G24,B24,A24, or roughly "96-bit" on the integer scale. It's not exact, but close, and the whole point of fp precision in the pixel pipe is color precision for the RGB&A pixel elements, imo.

Why should R32,G32,B32,A32, which is fp32, represent the ultimate ceiling in 3d pixel pipe color precision? I think it is currently a glass ceiling as opposed to an absolute ceiling, and that it will be shattered on down the line as technology and necessity dictate, just as fp24 exceeds fp16, and fp32 exceeds fp24, and all fp precision exceeds i precision. For years, people stated over and over again that "32-bits" is quite enough, but it wasn't, was it?....;)
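
For what it's worth, the per-pixel totals behind those labels are just four channels times the per-channel width; a trivial check:

```c
#include <stdio.h>

int main(void)
{
    const int channels = 4;              /* R, G, B, A       */
    const int widths[] = { 16, 24, 32 }; /* fp16, fp24, fp32 */
    for (int i = 0; i < 3; i++)
        printf("fp%d x RGBA = %d bits per pixel\n",
               widths[i], channels * widths[i]); /* 64, 96, 128 */
    return 0;
}
```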


It would appear that the move to FP32 as a standard, as well as the move to loops and branches, is to make GPU programming more similar to CPU programming. I don't see where FP24 fits into that.

I choose not to confuse my 3d gpus with my cpus...;) Neither is in danger of replacing the other anytime soon as far as I can see. Also, phrases like "loops and branches," while colorful metaphors promising meaning, really aren't very descriptive of specifics, are they?...:D Sometimes "looping and branching" can be valuable, sometimes not. It all depends on what you want to do.
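
To make "loops and branches" slightly less of a metaphor, here's the sort of thing SM3.0-style dynamic flow control allows, sketched in plain C with made-up names; whether the branch actually pays off depends on the hardware and on how coherent the branch is across neighbouring pixels:

```c
#include <math.h>

/* Toy falloff function standing in for an expensive per-light computation. */
static float falloff(float dist)
{
    return 1.0f / (1.0f + dist * dist);
}

/* Per-pixel lighting with a loop over lights and an early-out branch that
   skips lights too far away to contribute; pre-SM3.0 shaders would typically
   have to unroll the loop and evaluate every light regardless. */
float shade_pixel(const float light_dist[], int num_lights, float max_range)
{
    float result = 0.0f;
    for (int i = 0; i < num_lights; i++) {
        if (light_dist[i] > max_range)   /* dynamic branch */
            continue;
        result += falloff(light_dist[i]);
    }
    return result;
}
```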
 
jimmyjames123 said:
But again, in any situation where the higher precision is not needed and no visual errors are present, then it makes sense to use lower precision.
Funny how things slip right back into the same exact category as trilinear optimizations, isn't it? :) Though please let us not say "visual errors" but "visual differences."
 
But again, in any situation where the higher precision is not needed and no visual errors are present, then it makes sense to use lower precision.

I don't see fp16/fp32 as any better than fp24.

As a matter of fact, I would have liked SM 3.0 to require fp32/fp24 and exclude fp16.

I'd also like to see fp64 as full precision and fp32 as partial in DX Next.
 
jvd said:
As a matter of fact, I would have liked SM 3.0 to require fp32/fp24 and exclude fp16.

Why exclude FP16? It's optional for the developer to use when you only need that much precision.

I'd also like to see fp64 as full precision and fp32 as partial in DX Next.

AFAIK, FP64 is used very scarcely even in today's CGI rendering, so moving to FP64 as full would be rather unnecessary.

How about half = FP16, full = FP32, double = FP64?
 