#1 |
|
Registered
Join Date: Jul 2003
Posts: 5
|
I've been doing some pixel shader work recently on the ATI Radeon and GeForceFX series. Other than the 8500 series being slightly buggy with a few operations (note: I'm still researching this), ATI is generally quite stable, especially the 9500+ series, which I can't actually find a problem with. nVidia isn't too bad either, but my biggest concern with them is that the range is only -1 to +1, even on the GeForceFX, which is the "best" card that they have. As a result, quite a few of my weird shaders have underflow and overflow problems only on this hardware, whereas ATI and the reference rasterizer display them 100% fine.
This is with the pixel shader 1.1-1.4 series. I was basically wondering why the decision was made to give such high-end cards, capable of such large ranges (32-bit float?!), such a crappy range of -1 to +1. I can see having that on a GeForce3, but why on the FX too? Are they not using 32-bit precision on the FX in DX8? If not, what is it, 16-bit? And isn't that capable of better than -1 to +1 anyway? |
|
|
|
|
|
#2 |
|
Senior Member
Join Date: Feb 2002
Location: gjethus, Norway
Posts: 1,256
|
There have been some rather lengthy discussions here at B3D on this subject. It appears that most of the GeForceFXes (except possibly the NV35) have separate functional units for FP32 and fixed-point operation, with the fixed-point units being much faster and more numerous than the FP32 units. The fixed-point units, of course, offer much less precision and range than the FP32 units, and are used to implement PS1.1 shaders. IIRC, the fixed-point units can do 12 bits of precision.
AFAIK, it is not really clear at this point why Nvidia made the architectural decision to include both low-precision fixed-point units and high-precision FP32 units. Possible reasons are backwards compatibility (if you program for GeForce3, you may have come to expect, and even use for certain effects, the [-1,+1] clamping, although I don't know what kinds of effects those might be), optimization for the 'common case' (DX7/DX8-class features, which is what most present games use), saving transistors (too many fast FP units means a large transistor count, while adding a lot of fixed-point units is very cheap in comparison), and possibly other reasons as well. |
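To make the "12 bits of precision" point concrete, here is a minimal Python sketch of what a 12-bit signed fixed-point value might do to a shader intermediate. The bit layout (1 sign bit, 1 integer bit, 10 fraction bits, giving a [-2, 2) range) is an assumption chosen to match the figures quoted in this thread, not a confirmed description of the hardware, and `to_fixed` is a hypothetical helper name.

```python
def to_fixed(value, frac_bits=10, int_bits=1):
    """Quantize a float to signed fixed point, saturating at the range limits.

    Assumed layout: 1 sign bit + int_bits + frac_bits (12 bits by default).
    """
    scale = 1 << frac_bits                        # 1024 steps per unit
    max_raw = (1 << (int_bits + frac_bits)) - 1   # largest positive code
    min_raw = -(1 << (int_bits + frac_bits))      # most negative code
    raw = round(value * scale)                    # snap to the nearest step
    raw = max(min_raw, min(max_raw, raw))         # saturate instead of wrapping
    return raw / scale


if __name__ == "__main__":
    for v in (0.3, 0.5, 1.5, 3.0, -3.0):
        print(v, "->", to_fixed(v))
```

Under these assumptions, values inside the range lose at most half a step (1/2048) of precision, while anything outside it is pinned to the nearest limit, which is the kind of overflow the original post describes.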
|
|
|
|
|
#3 |
|
Irregular
Join Date: Feb 2002
Posts: 1,170
|
OK, let's see how different cards execute PS1.1-PS1.3 shaders:
GF3/4: texture ops: float (FP32?); arithmetic ops: FX9 [-1; +1]
R8500: texture ops: FX16 [-8; +8]; arithmetic ops: FX16 [-8; +8]
GFFX: texture ops: float (FP32?); arithmetic ops: FX12 [-2; +2]
R9700: texture ops: float (FP24); arithmetic ops: float (FP24)
Refrast: texture ops: float (FP32); arithmetic ops: float (FP32)
For PS1.4, replace texture ops with phase 1 and arithmetic ops with phase 2. So your bottlenecks are the R8500 for texture ops and the GF3/4 for arithmetic ops. |
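As a rough illustration of why the arithmetic range matters, the Python sketch below averages two values with the intermediate sum saturated to each of the arithmetic ranges listed above. It models only the range clamp, not the precision, and the dictionary keys and function names are hypothetical labels for this example.

```python
# Arithmetic ranges as listed in the post above (labels are illustrative only).
ARITHMETIC_RANGE = {
    "GF3/4": (-1.0, 1.0),
    "GFFX": (-2.0, 2.0),
    "R8500": (-8.0, 8.0),
}


def clamp(value, lo, hi):
    """Saturate value to the closed interval [lo, hi]."""
    return max(lo, min(hi, value))


def average_with_clamp(a, b, lo, hi):
    """Compute (a + b) * 0.5, clamping the intermediate sum and the result."""
    total = clamp(a + b, lo, hi)
    return clamp(total * 0.5, lo, hi)


def average_reference(a, b):
    """Same computation with no range limit, as a float reference would do."""
    return (a + b) * 0.5


if __name__ == "__main__":
    a, b = 0.9, 0.8
    print("reference:", average_reference(a, b))
    for name, (lo, hi) in ARITHMETIC_RANGE.items():
        print(name, average_with_clamp(a, b, lo, hi))
```

With a = 0.9 and b = 0.8 the intermediate sum is about 1.7, so the [-1, +1] case saturates it to 1.0 and returns 0.5, while the wider ranges match the unclamped reference. Inputs and outputs are all within [0, 1] here; only the intermediate overflows, which is what makes this kind of bug hard to spot.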
|
|
|
|
|
#4 |
|
Regular
Join Date: Apr 2003
Location: Louvain-la-Neuve, Belgium
Posts: 523
|
The GeForce FX's fixed-point units (for PS 1.1-1.4 and fixed-point calculations in OpenGL) have a [-2,2) range. But I think that this range is only exposed in NVIDIA's OGL extensions and is limited to [-1,1] in DX.
I always thought that this could be a problem, as PS1.4 shaders may have been designed with the ATI R200/R300 range of [-8,8] in mind. Of course NVIDIA could choose to use the FP units for PS1.4 calculations (I think they did it with some drivers, maybe those could help you), but that's not really good for performance. |
|
|
|
|
|
#5 |
|
Registered
Join Date: Jul 2003
Posts: 5
|
Well, I sent an email to nvidia about 1-2 months ago detailing some shader code I was using, so that they could just paste it into the MFCPixelShader app that comes with the DX8 SDK and compare the reference rasterizer to their own hardware.
I never heard back, so either they are too busy or they've heard about this problem one too many times. From our development standpoint it really kind of sucks. It's quite an obscure issue unless you research it a bit, which we did. Luckily it doesn't show up that often, but you know, what's the purpose of a shader if it can't abstract away what hardware it's on? Oh well, thanks for all the info, guys. |
|
|
|
|
|
#6 | |
|
Senior Member
Join Date: Jul 2002
Location: UK
Posts: 1,758
|
From the DX9 SDK under 'Registers ps 1_x':
Quote:
|
|
|
|
|
|
|
#7 |
|
Registered
Join Date: Jul 2003
Posts: 5
|
We saw that in the DX8 caps as well. I guess the purpose behind this thread/question is more "why" this is the behavior of said cards. Some possible answers might be to make things consistent across all their cards (3/4/FX), or maybe 12-bit is too little. Either way, I guess they won't change it.
|
|
|
|