What exactly is wrong with Nvidia's Shader 1.4 implementation?

ChrisRay

R.I.P. 1983-
Veteran
Now, I hate synthetic tests, but everything seems to point to Nvidia's Pixel Shader 1.4 support just not being that great.

The 3DMark2001 SE test is useful, though, because it has a fallback to 1.1 shaders with more passes (theoretically slower).

I've been playing with 3DAnalyze to try to figure out why the Shader 1.4 score is so much lower than you'd expect, and to verify whether it's just slow (what I suspected) or whether it was indeed just falling back to Shader 1.1 (which is not the case).

My specs are listed below, but they're not really necessary to understand the differences.

Pixel Shader 1.4 (Advanced Pixel Shader Test)

Shader 1.4
FPS 119

Shader 1.1
FPS 147.5

Is Nvidia's Shader 1.4 emulated or something? I can't understand why something that should require fewer passes is significantly slower than using more passes.
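For scale, the gap between the two scores above is easier to judge as frame time than as FPS. A minimal Python sketch of that arithmetic follows; the function names are made up for illustration, and only the two FPS figures come from the post.

```python
def frame_time_ms(fps):
    """Average time per frame in milliseconds for a given frame rate."""
    return 1000.0 / fps


def percent_slower(slow_fps, fast_fps):
    """How far slow_fps falls below fast_fps, as a percentage of fast_fps."""
    return (fast_fps - slow_fps) / fast_fps * 100.0


ps14_fps = 119.0   # Shader 1.4 score quoted in the post
ps11_fps = 147.5   # Shader 1.1 score quoted in the post

print(f"PS 1.4: {frame_time_ms(ps14_fps):.2f} ms per frame")
print(f"PS 1.1: {frame_time_ms(ps11_fps):.2f} ms per frame")
print(f"PS 1.4 is {percent_slower(ps14_fps, ps11_fps):.1f}% slower")
```

That puts the PS 1.4 path at roughly 19% slower than the multi-pass PS 1.1 fallback in this one test.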
 
Proper implementation of PS 1.4 requires higher precision than FX12. PS 1.1 doesn't require as high of precision.

Therefore, when using PS 1.4, nVidia needs to use floating-point accuracy. We all know of the NV3x's FP processing woes.
 
Chalnoth said:
Proper implementation of PS 1.4 requires higher precision than FX12. PS 1.1 doesn't require as high of precision.

Therefore, when using PS 1.4, nVidia needs to use floating-point accuracy. We all know of the NV3x's FP processing woes.
I don't really buy this explanation. NVidia was willing to force PS 2.0 shaders to use FP16, but they're not willing to force PS 1.4 to FX12? The difference would not be very noticeable except in some dependent texture lookups.

Then in NVidia's HL2 rebuttal, they say some shaders can be made PS 1.4 instead of PS 2.0. Why would they care if your theory was correct?

I think it has to do with the flexible dependent texturing of PS 1.4 (and PS 2.0). NVidia is fine with dependent texturing in DX8 because they are fixed modes and it appears NVidia uses a different pipeline (borrowed from GF3/4?) for that.

I'd love to see some simple offset mapping benchmarks between R3xx and NV3x, as well as some without the offset instructions. As simple as possible, with only 3 or 4 texture accesses.

MDolenc's fillrate tester also shows that with the newer drivers PS 1.4 isn't always slower: http://www.beyond3d.com/forum/viewtopic.php?t=6142&postdays=0&postorder=asc&start=118.
 
Keep in mind that MDolenc's fillrate tester uses extremely simplistic shaders (essentially just MADs) that aren't indicative of real-world shaders.
 
rn    Temporary register    -MaxPixelShaderValue to +MaxPixelShaderValue    All versions
tn    Texture register      -MaxTextureRepeat to +MaxTextureRepeat          1_4

For pixel shader version 1_1 to 1_3, MaxTextureRepeat must be a minimum of one. For 1_4, MaxTextureRepeat must be a minimum of eight.
This means in practice that NV3x cannot use FX12 in the first phase of a two-phase PS1.4 shader, i.e. for dependent texture reads.
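The range problem can be sketched numerically. The snippet below assumes FX12 is a 12-bit signed fixed-point format with 10 fractional bits, i.e. a range of roughly [-2, 2). That format description is an assumption brought in for illustration, not something stated in the documentation quoted above, and the helper names are hypothetical. It shows that a value of 8, which a PS 1.4 texture register must be able to hold, does not survive a round trip through such a format.

```python
# Assumed FX12 layout: 12-bit signed fixed point, 10 fractional bits.
FX12_SCALE = 1 << 10          # 1024 steps per unit
FX12_RAW_MIN = -(1 << 11)     # -2048, i.e. -2.0
FX12_RAW_MAX = (1 << 11) - 1  #  2047, i.e. just under +2.0


def quantize_fx12(value):
    """Round value to the nearest assumed-FX12 step, clamping to the format's range."""
    raw = round(value * FX12_SCALE)
    raw = max(FX12_RAW_MIN, min(FX12_RAW_MAX, raw))
    return raw / FX12_SCALE


def fits_fx12(value):
    """True if value lies inside the assumed FX12 range (no clamping needed)."""
    return FX12_RAW_MIN <= round(value * FX12_SCALE) <= FX12_RAW_MAX


for coord in (0.5, 1.0, 8.0, -8.0):
    print(coord, "->", quantize_fx12(coord), "fits" if fits_fx12(coord) else "clamped")
```

Under that assumption, coordinates within the PS 1.1 to 1.3 minimum of one pass through unchanged, while the PS 1.4 minimum of eight gets clamped, which is consistent with the point that the first phase needs a wider format.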
 
[rant]
At least the 'texcrd' instruction works in the NV3x implementation.

It used to work on ATI cards, but apparently they don't care enough to perform regression testing.
[/rant]
 
I'm not sure how accurate it is to compare 3dmark2001's Advanced Pixel Shader test between PS1.4 and PS1.1.

Certainly making wide sweeping conclusions based off of it will lead to inaccurate conclusions. Sometimes NV3x is faster with PS1.4, sometimes it's slower.

When NV3x uses PS1.4 in 3dmark03 instead of PS1.1 it is faster, for example.
 
StealthHawk said:
I'm not sure how accurate it is to compare 3dmark2001's Advanced Pixel Shader test between PS1.4 and PS1.1.

Certainly making wide sweeping conclusions based off of it will lead to inaccurate conclusions. Sometimes NV3x is faster with PS1.4, sometimes it's slower.

When NV3x uses PS1.4 in 3dmark03 instead of PS1.1 it is faster, for example.


Trying to remember the test where an Nvidia card does better than ATI in the Shader 1.4 test... Gah, what was it called?


I think I should've been more specific: what's wrong with Nvidia's Shader 1.4 implementation in the synthetic 3DMark2001 SE ;p
 
ChrisRay said:
StealthHawk said:
I'm not sure how accurate it is to compare 3dmark2001's Advanced Pixel Shader test between PS1.4 and PS1.1.

Certainly making wide sweeping conclusions based off of it will lead to inaccurate conclusions. Sometimes NV3x is faster with PS1.4, sometimes it's slower.

When NV3x uses PS1.4 in 3dmark03 instead of PS1.1 it is faster, for example.


Trying to remember the test where an Nvidia card does better than ATI in the Shader 1.4 test... Gah, what was it called?


I think I should've been more specific: what's wrong with Nvidia's Shader 1.4 implementation in the synthetic 3DMark2001 SE ;p

The 1.4 test is just more graphics card limited.

The 1.1 test is very CPU limited.

You should see the following post (from micb) and the reply from Neeyik:

http://www.beyond3d.com/forum/viewtopic.php?t=10105
 
PeterAce said:
The 1.4 test is just more graphics card limited.

The 1.1 test is very CPU limited.

You should see the following post (from micb) and the reply from Neeyik:

http://www.beyond3d.com/forum/viewtopic.php?t=10105
The PS1.1 test is only very CPU limited on an NV3x at resolutions less than 1024 x 768; across the rest of the resolutions it drops lots of fill rate. You can see this easily in Dave's 6800 Ultra preview:

http://www.beyond3d.com/previews/nvidia/nv40/index.php?p=20

Note that all the NV3x cards are still not 100% CPU limited in the PS1.1 test.
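The CPU-limited versus card-limited distinction in this exchange can be illustrated with a toy model in which each frame costs whichever is larger: a fixed CPU time or a GPU time that grows with the pixel count. This is only a sketch; the function names and every number in it are hypothetical and not measurements from any card or from the linked preview.

```python
def frame_rate(cpu_ms, gpu_ms_per_megapixel, width, height):
    """Toy model: frame time is the larger of a fixed CPU cost and a
    GPU cost proportional to the number of pixels rendered."""
    megapixels = width * height / 1_000_000
    gpu_ms = gpu_ms_per_megapixel * megapixels
    return 1000.0 / max(cpu_ms, gpu_ms)


def is_cpu_limited(cpu_ms, gpu_ms_per_megapixel, width, height):
    """True when the CPU cost, not the GPU cost, sets the frame time."""
    megapixels = width * height / 1_000_000
    return cpu_ms >= gpu_ms_per_megapixel * megapixels


# Hypothetical costs, chosen only to show the shape of the curve.
CPU_MS = 5.0
GPU_MS_PER_MEGAPIXEL = 5.0

for width, height in [(640, 480), (1024, 768), (1280, 1024), (1600, 1200)]:
    fps = frame_rate(CPU_MS, GPU_MS_PER_MEGAPIXEL, width, height)
    limit = "CPU" if is_cpu_limited(CPU_MS, GPU_MS_PER_MEGAPIXEL, width, height) else "GPU"
    print(f"{width}x{height}: {fps:.1f} fps ({limit} limited)")
```

In this model a CPU-limited test shows the same frame rate at every low resolution and only starts to fall once the GPU cost overtakes the CPU cost, which is the pattern to look for when reading scores across resolutions.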
 