That's weird: PS 2.0 precision test with the new ATI drivers.

engall

Newcomer
PixelShader 2.0 precision test. Version 1.3
Copyright (c) 2003 by ReactorCritical / iXBT.com
Questions, bug reports send to: clootie@ixbt.com

Device: RADEON 9500
Driver: ati2dvag.dll
Driver version: 6.14.10.6378

Registers precision:
Rxx = s23e7 (temporary registers)
Cxx = s23e7 (constant registers)
Txx = s16e7 (texture coordinates)

Registers precision in partial precision mode:
Rxx = s23e7 (temporary registers)
Cxx = s23e7 (constant registers)
Txx = s16e7 (texture coordinates)


s23e7, what's that??
 
engall said:
s23e7, what's that??
Some sort of app. bug? Unless you found the secret registry key that enables 31-bit float support ;)
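
For anyone puzzling over the notation: a rough sketch of how the tags in the report can be read, assuming "s" stands for one sign bit, the number after it is the mantissa width and the number after "e" is the exponent width. That reading is an assumption, but it matches the "31-bit float" remark above. The helper names are made up for illustration and are not part of the test tool.

```python
import re

def parse_precision(tag):
    """Split a tag such as 's23e7' into (sign, mantissa, exponent) bit counts."""
    match = re.fullmatch(r"s(\d+)e(\d+)", tag)
    if match is None:
        raise ValueError(f"unrecognised precision tag: {tag!r}")
    mantissa, exponent = (int(group) for group in match.groups())
    return 1, mantissa, exponent  # assumption: exactly one sign bit

def total_bits(tag):
    """Total storage width implied by the tag under the assumption above."""
    return sum(parse_precision(tag))

for tag in ("s23e7", "s16e7"):
    print(tag, "->", total_bits(tag), "bits")
```

Under that reading s23e7 adds up to 31 bits and s16e7 to 24 bits.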
 
OpenGL guy said:
engall said:
s23e7, what's that??
Some sort of app. bug? Unless you found the secret registry key that enables 31-bit float support ;)
I have no idea.
How could ATI manage that?
 
Actually, it's supposed to be 124bpp. I read ages ago that it would be implemented.

I think this one is a bug though. :)
 
I get the same as he does except for Rxx = s0e7


====================================================================
PixelShader 2.0 precision test. Version 1.3
Copyright (c) 2003 by ReactorCritical / iXBT.com
Questions, bug reports send to: clootie@ixbt.com

Device: RADEON 9700 PRO
Driver: ati2dvag.dll
Driver version: 6.14.10.6378

Registers precision:
Rxx = s0e7 (temporary registers)
Cxx = s23e7 (constant registers)
Txx = s16e7 (texture coordinates)

Registers precision in partial precision mode:
Rxx = s23e7 (temporary registers)
Cxx = s23e7 (constant registers)
Txx = s16e7 (texture coordinates)
 
mine:

====================================================================
PixelShader 2.0 precision test. Version 1.3
Copyright (c) 2003 by ReactorCritical / iXBT.com
Questions, bug reports send to: clootie@ixbt.com

Device: RADEON 9700 PRO
Driver: ati2dvag.dll
Driver version: 6.14.10.6378

Registers precision:
Rxx = s0e3 (temporary registers)
Cxx = s0e3 (constant registers)
Txx = s0e3 (texture coordinates)

Registers precision in partial precision mode:
Rxx = s0e3 (temporary registers)
Cxx = s0e3 (constant registers)
Txx = s0e3 (texture coordinates)

the app needs an upgrade :LOL:
 
Code:
;### Rxx_Mantissa
ps_2_0
def c0, 0.0f, 1.0f, %sf, 2.0f
mov r0, c0.y     // mov r0, 1
mov r1, c0.z
add_pp r0, r0, r1 // + BIG NUMBER (but itself lower precision)
sub_pp r0, r0, r1 // - BIG NUMBER (but itself lower precision)
mov_pp oC0, r0

Maybe the new driver optimizes out the add+sub instructions?
 
The compiler is getting smarter all the time.

It's easy to see in that code that the r0 result is actually just arithmetic based on a constant, and so the value could be precalculated. C compilers do this kind of thing all the time.

Conversely, it could notice that the add and sub are cancelling each other out, and so skip them.

I would say that more advanced testing methods are required...
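
To make the point concrete, here is a small CPU-side sketch of the add-then-subtract probe. It assumes the tool substitutes increasing powers of two for %sf (a guess, not something confirmed here), and it emulates a reduced mantissa in software. round_to_mantissa, probe_mantissa and the folded flag are illustrative names, not anything from the real app.

```python
import math

def round_to_mantissa(x, bits):
    """Round x to a float with `bits` stored mantissa bits plus a hidden leading 1."""
    if x == 0.0:
        return 0.0
    frac, exp = math.frexp(x)        # x == frac * 2**exp, 0.5 <= |frac| < 1
    scale = 2.0 ** (bits + 1)        # stored bits plus the hidden bit
    return math.ldexp(round(frac * scale) / scale, exp)

def probe_mantissa(bits, folded=False, max_k=40):
    """Count how many powers of two survive (1 + big) - big == 1."""
    k = 0
    while k < max_k:
        big = 2.0 ** (k + 1)
        if folded:
            # An optimizer that cancels the add/sub pair hands back 1 untouched.
            result = 1.0
        else:
            total = round_to_mantissa(1.0 + big, bits)
            result = round_to_mantissa(total - big, bits)
        if result != 1.0:
            break
        k += 1
    return k

print(probe_mantissa(16))               # emulated 16-bit mantissa
print(probe_mantissa(23))               # emulated 23-bit mantissa
print(probe_mantissa(16, folded=True))  # probe defeated, runs to max_k
```

With the pair folded away the loop never sees a loss of precision, so whatever number comes out says nothing about the hardware.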
 
Dio said:
I would say that more advanced testing methods are required...

Hmm.
The constants shouldn't be in the shader, but set from outside.
The changing value should be passed twice and not reused so the optimizer wouldn't know the add/sub can be optimized.

That should do it, shouldn't it?
 
It's certainly better. There are other side-issues.

- the exact order of evaluation isn't guaranteed for commutative operands. It would be better to break up the commutativity by using mad r0, r0, c0 (1), c1 (constant value), to ensure the two constant adds don't get folded and applied out of order, but even then there could potentially be reordering.

- on some architectures 'recently used' intermediates could have higher precision than 'not recently used' intermediates (For example, one could imagine this would be the case on an x87 FPU implementation). I can't think of a simple way around this one, but I don't know if it would be a problem anyway.

- 'variable' precision in general is an issue.

- there are probably other things I can't think of right now

Maybe a completely different methodology might be better.
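
A quick illustration of why the order-of-evaluation point matters, using ordinary 32-bit floats on the CPU as a stand-in for shader registers (that substitution is an assumption, and f32/add32 are just illustrative helpers). The same three operands give different answers depending on which add happens first.

```python
import struct

def f32(x):
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def add32(a, b):
    """Single-precision add: round both inputs and the sum to 32-bit floats."""
    return f32(f32(a) + f32(b))

one, big = 1.0, 2.0 ** 25

as_written = add32(add32(one, big), -big)   # (1 + big) - big
reordered = add32(one, add32(big, -big))    # 1 + (big - big)

print(as_written, reordered)
```

The first form loses the 1 when it is added to the big value, the second keeps it, so a compiler that reorders the adds changes what the test measures.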
 
It seems the new ATi driver is indeed doing some constant propagation (not sure of the English term). A quick fix is to add the shaders quoted below to the start of the PShaders.txt configuration file. It is also a good idea to remove the original shaders.
Code:
;### Rxx_Mantissa
ps_2_0
def c0, 0.0f, 1.0f, %sf, 2.0f
mov r0, c0.y     // mov r0, 1
mov r1, c0.z
; break ATi optimization - Start
rcp r3, r0.x
add r1, r1, r3
; break ATi optimization - End
add_pp r0, r0, r1 // + BIG NUMBER (but itself lower precision)
sub_pp r0, r0, r1 // - BIG NUMBER (but itself lower precision)
mov_pp oC0, r0

;### Cxx_Mantissa
ps_2_0           // def c0, 0.0f, 1.0f, pp, pp+1
def c0, 0.0f, 1.0f, %sf, %sf
mov_pp r0, c0.w
mov_pp r1, c0.z
; break ATi optimization - Start
rcp r3, c0.y
add r0, r0, r3
; break ATi optimization - End
sub r0, r0, r1
mov oC0, r0
The commented snippets in the code above are what break the optimization in the driver's shader compiler.

After this modification:
Code:
====================================================================
PixelShader 2.0 precision test. Version 1.3
Copyright (c) 2003 by ReactorCritical / iXBT.com
Questions, bug reports send to: clootie@ixbt.com

Device: RADEON 9500
Driver: ati2dvag.dll
Driver version: 6.14.10.6378

Registers precision:
Rxx = s16e7 (temporary registers)
Cxx = s16e7 (constant registers)
Txx = s16e7 (texture coordinates)

Registers precision in partial precision mode:
Rxx = s16e7 (temporary registers)
Cxx = s16e7 (constant registers)
Txx = s16e7 (texture coordinates)
 
Dio said:
I would suggest you make some more of the changes discussed above as well, though. ;)
Ok, better solution, still without changing the executable:
Code:
;### Rxx_Mantissa
ps_2_0
def c0, 0.0f, 1.0f, %sf, 2.0f
dcl v0
mov r0, c0.y       // mov r0, 1
mov r1, c0.z       // mov r1, 512
mul r2, c0.z, v0.x // r2 = 512*1 (v0.x presumably 1)
add_pp r0, r0, r1  // + BIG NUMBER (but itself lower precision)
sub_pp r0, r0, r2  // - BIG NUMBER (but itself lower precision)
mov_pp oC0, r0

;### Cxx_Mantissa
ps_2_0             // def c0, 0.0f, 1.0f, pp, pp+1
def c0, 0.0f, 1.0f, %sf, %sf
dcl v0
mov_pp r0, c0.w
mov_pp r1, c0.z
mul r1, r1, v0.x
sub r0, r0, r1
mov oC0, r0
 