Dawn FP32 figures

Code:
Name  Type                Range                                         Versions
cn    Constant register   -1 to +1                                      All versions
rn    Temporary register  -MaxPixelShaderValue to +MaxPixelShaderValue  All versions
tn    Texture register    -MaxPixelShaderValue to +MaxPixelShaderValue  1_1 to 1_3
tn    Texture register    -MaxTextureRepeat to +MaxTextureRepeat        1_4
vn    Color register                                                    1_4

For pixel shader version 1_1 to 1_3, MaxTextureRepeat must be a minimum of one. For 1_4, MaxTextureRepeat must be a minimum of eight.

That means texture registers need to cover at least [-8, +8], which requires FP16, while temporary registers only need [-2, +2] (FX12).
 
demalion said:
...using temporary registers as texture coordinates (first phase) would work with fp16...
There is no need for FP16 in PS_1_4, as the only place where FP16 is needed is the texture registers. But they can be clamped to [-2, 2] (FX12 numbers) when copied to temporary registers. They only need the high range/precision ([-8, 8] minimum) when sampling a texture directly (i.e. texld r0, t0.xyz), but in that case the values come directly from the texture coordinate iterators (FP32 in GFFX?), so they probably do not involve any registers!
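To make the clamping step concrete, here is a minimal Python sketch of copying a texture coordinate into an FX12 temporary. It assumes FX12 is a 12-bit two's-complement fixed-point value with 10 fraction bits, i.e. [-2, 2) in steps of 1/1024; the thread itself only states the [-2, +2] range. The function names are made up for illustration and say nothing about how the hardware actually does it.

```python
# Assumed FX12 layout (not stated in the thread): 12-bit two's complement,
# 10 fraction bits, so representable values are raw / 1024 with raw in [-2048, 2047].
FX12_FRACTION_BITS = 10
FX12_SCALE = 1 << FX12_FRACTION_BITS
FX12_MIN_RAW = -(1 << 11)
FX12_MAX_RAW = (1 << 11) - 1


def to_fx12(value: float) -> float:
    """Quantize to the assumed FX12 grid and clamp to its range."""
    raw = round(value * FX12_SCALE)
    raw = max(FX12_MIN_RAW, min(FX12_MAX_RAW, raw))
    return raw / FX12_SCALE


def copy_texcoord_to_temp(coord):
    """Hypothetical 'texture register -> temporary' copy: clamp every component."""
    return tuple(to_fx12(c) for c in coord)


# A coordinate inside the PS_1_4 texture range [-8, 8] loses its range in a temp.
print(copy_texcoord_to_temp((7.5, -3.0, 0.25)))
```

Anything outside roughly [-2, 2) saturates, which is why a direct texld from t0 would still need the wider range while arithmetic on the copied value would not.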
 
Evildeus said:
Performance in anything, quality in anything?
Ante P said:
Evildeus said:
Are there any differences with the 44.61 drivers?

when it comes to what?

3DMark03 performance is up and quality is maintained.
"3DMurk" is removed.

When nVidia sent me the drivers they said that "these are the first new drivers we've released since the 3DMark03 debacle".


Haven't seen any changes anywhere else.
 
Evildeus said:
Ok thx, and concerning PS2.0, no change?

The PS2.0 test in 3DMark03 still differs from the reference and ATI's ones. I wouldn't say that it looks worse, but the wooden base has a different pattern on nVidia boards.
That is still present with 44.61.

PS2.0 in general doesn't seem to have changed much, though of course I would need more tests to draw any general conclusions.
 
Uttar said:
The NV3x *only* got FP32 registers. But it can divide them into FP16 registers.

How about the possibility to fit even more FX temps into a FP32 register?
 
ram said:
How about the possibility to fit even more FX temps into a FP32 register?
Yeah, how? An FX12 register is 48 bits wide. How will you fit 48 bits into 128 bits more than twice?
 
MDolenc said:
Yeah, how? An FX12 register is 48 bits wide. How will you fit 48 bits into 128 bits more than twice?

I was unclear, I should have written "FP32". I don't see why the registers must necessarily be exactly 128 bits and can't be 144 bits.
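The packing arithmetic behind this exchange can be sketched in a few lines of Python. The widths are the ones quoted in the posts (a 4-component FX12 temp is 48 bits, a physical register is 128 bits, or 144 bits in the hypothetical case); the helper name is invented for the example.

```python
COMPONENTS = 4                    # x, y, z, w
FX12_TEMP_BITS = COMPONENTS * 12  # 48 bits, as stated in the thread
FP16_TEMP_BITS = COMPONENTS * 16  # 64 bits


def temps_per_register(register_bits: int, temp_bits: int) -> int:
    """How many whole temporaries fit into one physical register."""
    return register_bits // temp_bits


print(temps_per_register(128, FX12_TEMP_BITS))  # 2 (32 bits left unused)
print(temps_per_register(128, FP16_TEMP_BITS))  # 2
print(temps_per_register(144, FX12_TEMP_BITS))  # 3 (the hypothetical 144-bit case)
```

So with 128-bit registers FX12 packs no better than FP16; only a wider physical register would change that.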
 
I'm sorry but all this FX12 temps make no sense.
If nVidia had any such capacity, they would obviously expose it via their extensions in OpenGL to potentially increase performance. And they obviously aren't.
So maybe they could do some things manually for PS1.1 and the like, but I'd be VERY surprised if that was true.


Uttar
 
So does the lack of FX12 registers mean that with FX12 the GeforceFX receives the same penalties for register usage as FP16? Every four registers means a performance hit?
 
Ostsol said:
So does the lack of FX12 registers mean that with FX12 the GeforceFX receives the same penalties for register usage as FP16? Every four registers means a performance hit?

This wouldn't be the case for PS1.1 - PS1.3, as there are only 2 temporaries available. PS1.4 already offers 6, though, which would result in a performance drop, as only 4 temps would fit in two full-speed 128-bit registers.
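As a rough check of that reasoning, the following Python sketch compares the temporary counts mentioned above against the "two full-speed 128-bit registers" budget. The budget and the two-temps-per-register packing are taken from the posts in this thread, not from any vendor documentation, and the names are illustrative only.

```python
FULL_SPEED_REGISTERS = 2       # assumption taken from the post above
TEMPS_PER_REGISTER = 128 // 48 # two 48-bit FX12 temps per 128-bit register

# Temporary register counts as quoted in the thread.
TEMPS_BY_VERSION = {"ps_1_1 to ps_1_3": 2, "ps_1_4": 6}


def fits_at_full_speed(temp_count: int) -> bool:
    """True if all temporaries fit into the full-speed register budget."""
    return temp_count <= FULL_SPEED_REGISTERS * TEMPS_PER_REGISTER


for version, temps in TEMPS_BY_VERSION.items():
    print(version, temps, fits_at_full_speed(temps))
```

A PS1.4 shader that touches all 6 temporaries would exceed the 4-temp budget under these assumptions; one that uses 4 or fewer would not.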

If nVidia had any such capacity, they would obviously expose it via their extensions in OpenGL to potentially increase performance. And they obviously aren't.

:?:

Can you explain to me why they have to expose the way they store temporaries in the chip as an OpenGL extension?
 