Futuremark's technical response

Dave H said:
And skinning is a light workload vertex shader operation, comparable to transforming, which is done on all the vertices.

Isn't skinning transforming? The difference between skinning and normal transforming is that more than one world transform is applied per vertex. Or are we talking about something else?
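To make the comparison concrete, here is a minimal sketch of what "more than one world transform per vertex" means, assuming simple linear blending of 3x4 affine matrices. The function names, matrices and weights are all made up for illustration; nothing here is taken from 3DMark03.

```python
# Illustrative sketch only: matrices are 3x4 affine transforms stored as rows [r0, r1, r2, t].

def transform(matrix, point):
    """Ordinary transform: apply one 3x4 affine matrix to a 3D point."""
    x, y, z = point
    return tuple(row[0] * x + row[1] * y + row[2] * z + row[3] for row in matrix)


def skin(point, bones, weights):
    """Skinning: transform the point by every bone matrix and blend by weight.

    With a single bone of weight 1.0 this is exactly an ordinary transform.
    """
    assert abs(sum(weights) - 1.0) < 1e-6, "weights are assumed to sum to 1"
    result = [0.0, 0.0, 0.0]
    for matrix, weight in zip(bones, weights):
        transformed = transform(matrix, point)
        for i in range(3):
            result[i] += weight * transformed[i]
    return tuple(result)


IDENTITY = ((1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0))
SHIFT_X_2 = ((1, 0, 0, 2), (0, 1, 0, 0), (0, 0, 1, 0))  # translate by 2 along x

vertex = (1.0, 1.0, 0.0)
plain = skin(vertex, [SHIFT_X_2], [1.0])                    # one transform
blended = skin(vertex, [IDENTITY, SHIFT_X_2], [0.5, 0.5])   # two transforms, blended
```

The per-vertex cost grows with the number of bones, which is why skinning is somewhat heavier than a plain transform but still the same kind of work.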
 
RussSchultz said:
It would be interesting to see the same shader compiled with both DX9 HLSL and Cg when targeting PS2.0 and an R300. It would also be interesting to see how the code differs when compiled at runtime for an NV30 vs. R300, GF4/9000/9100, etc.
Neither Cg nor DX9 HLSL will do any "special" optimizations based on which hardware it runs on. There really isn't much point in runtime compilation right now...
 
I thought I understood this, but others seem to say otherwise and I could do with some confirmation or correction: is floating point output a specification of the shader version, or the DirectX version?

I.e., is there such a thing as a "pixel shader 1.1 program working on floating point values", or does producing/accepting floating point values automatically make it "pixel shader 2.0", regardless of the instructions used?

The language used in, for example, Futuremark's discussion of GT 4, led me to believe "pixel shader version" is determined by the instruction set and length, and it simply makes sense to me that it would be this way.
 
From what I remember from the DX spec, it's mentioned that first-generation pixel shaders use one byte per channel. I think this implies that second-generation shaders use higher precision, and that precision isn't part of the spec (except perhaps with one byte per channel as the minimum).
 
demalion said:
Hyp-X said:
Hellbinder[CE] said:
1. 3dmark03's shader routines were written with HLSL.

Only in the pixel shader 2.0 test.

It is!? Where is this info provided so I know what I failed to read closely enough...

Ahem.
You can find the info ... if you know where to look. ;)

This eases my mind a bit if it is compiled at run time...this is exactly the type of optimization nvidia should focus on achieving (if they can)...and maybe when the DX 9 HLSL compiler is further along this will be facilitated.

No, it's precompiled, unfortunately.

But it wouldn't help much if it was in source form, because the DX 9 HLSL compiler is implemented in D3DX, which is linked statically to the app, so Futuremark would have to release a patch anyway to update it.
 
demalion said:
I thought I understood this, but others seem to say otherwise and I could do with some confirmation or correction: is floating point output a specification of the shader version, or the DirectX version?

Neither.

I.e., is there such a thing as a "pixel shader 1.1 program working on floating point values", or does producing/accepting floating point values automatically make it "pixel shader 2.0", regardless of the instructions used?

There is such a thing as a "PS1.1 program working on FP values", but DirectX neither specifies nor forbids it. The R300 does this (there are no integer shaders in R300).
DirectX requires that PS1.1 handle at least 8 bits of precision and a range of [-1, 1].
The driver can indicate the highest absolute value it supports in 1.x shaders through the devcaps:

GF3/4: 1.0
R8500/9000: 8.0
GFFX: 2.0
R9500/9700: 340282346638528860000000000000000000000.0
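As a rough illustration of what these caps mean in practice, the sketch below clamps an "overbright" intermediate value of 4.0 to each chip's reported range. The cap in question should be `PixelShader1xMaxValue` in `D3DCAPS9`; the dictionary and helper names are made up for this example, and the R9500/9700 entry is assumed to be the largest finite single-precision float (FLT_MAX, about 3.4e38), which is what the quoted figure appears to be.

```python
import struct

# Largest finite 32-bit float, built from its bit pattern.
FLT_MAX = struct.unpack("<f", struct.pack("<I", 0x7F7FFFFF))[0]

# Caps as quoted in the list above; R9500/9700 is assumed to be FLT_MAX.
PS1X_MAX_VALUE = {
    "GF3/4": 1.0,
    "R8500/9000": 8.0,
    "GFFX": 2.0,
    "R9500/9700": FLT_MAX,
}


def clamp_to_cap(value, cap):
    """Clamp a shader value to the symmetric range [-cap, cap]."""
    return max(-cap, min(cap, value))


overbright = 4.0
surviving = {chip: clamp_to_cap(overbright, cap) for chip, cap in PS1X_MAX_VALUE.items()}
```

Under these assumptions the value survives intact on the R8500/9000 and R9500/9700, but is cut to 2.0 on the GFFX and to 1.0 on the GF3/4.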

The language used in, for example, Futuremark's discussion of GT 4, led me to believe "pixel shader version" is determined by the instruction set and length, and it simply makes sense to me that it would be this way.

The pixel shader version has to be explicitly specified in the shader source.
For HLSL the version has to be passed to the compiler.
The HLSL compiler has no knowledge of the hardware apart from the PS version passed to it!
 
demalion said:
I thought I understood this, but others seem to say otherwise and I could do with some confirmation or correction: is floating point output a specification of the shader version, or the DirectX version?

I.e., is there such a thing as a "pixel shader 1.1 program working on floating point values", or does producing/accepting floating point values automatically make it "pixel shader 2.0", regardless of the instructions used?

The language used in, for example, Futuremark's discussion of GT 4, led me to believe "pixel shader version" is determined by the instruction set and length, and it simply makes sense to me that it would be this way.
The PS specs put certain requirements on the instruction set, number of registers, read ports, instruction limits, etc., and they also impose some numerical requirements.
AFAIK PS 1.x only defines what numerical range an implementation must support, but no minimum precision. It should be perfectly valid to use float values in a PS1.1 implementation.
PS 2.0 however requires float values, at least s10e5 in some areas, and at least s15e8 in others.
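For a feel of what s10e5 precision buys you, here is a small sketch. It assumes s10e5 means 1 sign bit, 10 mantissa bits and 5 exponent bits, and uses Python's IEEE half-precision `struct` format as a stand-in for it. Real hardware may round or handle special values differently; the helper name is made up.

```python
import struct


def to_s10e5(value):
    """Round a float to the nearest 16-bit half float (1 sign, 5 exponent, 10 mantissa bits)."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


one = to_s10e5(1.0)                       # exactly representable
one_step = to_s10e5(1.0 + 1.0 / 1024)     # one mantissa step above 1.0, also exact
too_fine = to_s10e5(1.0 + 1.0 / 4096)     # finer than the step near 1.0, rounds back to 1.0
coarse = to_s10e5(4097.0)                 # spacing is 4 in [4096, 8192), rounds to 4096.0
```

Near 1.0 the step size is 1/1024, which is already finer than the 1/255 step of a byte per channel, but large values lose integer resolution quickly.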
 
Is this for real? Why is the NV30 capped at 2.0? And how does the R300 get so high a value? How does this affect the overbright lighting and HDR rendering capabilities of the NV30?

I'll expand on Dio's "floating point" response. ;)

NV30 is capped at 2.0 because it uses an "integer pipeline" when doing 1.1 pixel shaders. (And we still don't know, IIRC, what NV30 does with PS 1.4 shaders...)

It can affect overbright lighting, etc., for DirectX 8.x code. However, this would mean that the code is checking the range caps for each card. In other words, using the Radeon 9500/9700 chip, "overbright" and other effects requiring high dynamic range can be performed in DirectX 8 shaders. You must use DX9 shaders on the FX to get the same result.
 
JF_Aidan_Pryde said:
Why is the NV30 capped at 2.0? How does this affect the overbright lighting and HDR rendering capabilities of the NV30?

That's 1.x shaders only.
NV30 has both integer and floating-point shader hardware: 1.x shaders are executed in the integer hardware, 2.0 shaders in the FP hardware.
So it doesn't affect its HDR capabilities as long as you use 2.0 shaders.

(We still don't know how exactly PS1.4 is implemented, my guess is that 1st phase uses FP, while 2nd phase uses integer.)

Edit: Damn I wasn't fast enough. :)
 
Plus no-one will do HDR stuff on ps_1_1 anyway...

Why, because nVidia can't do it? Is there any technical reason not to? If nVidia's PS 2.0 shader path really is as slow as it currently appears, wouldn't it be nice to be able to run some effects in the PS 1.1 path that don't need PS2.0 shaders?
 
Joe DeFuria said:
Why, because nVidia can't do it? Is there any technical reason not to? If nVidia's PS 2.0 shader path really is as slow as it currently appears, wouldn't it be nice to be able to run some effects in the PS 1.1 path that don't need PS2.0 shaders?
And what would you do with the HDR in ps_1_1? You still have to bring the HDR back into the [0,1] range for display, you know. You don't even have any fancy texture lookups (as in ps_1_4 and above). What are you going to do with it?
Pixel shaders 2.0 will run just fine once drivers are optimised.
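On the point about bringing HDR values back into [0,1] for display, the sketch below compares a plain clamp with one simple compression curve, x / (1 + x). The curve is picked purely for illustration and is not something attributed to any hardware or benchmark discussed here; the function names are made up.

```python
def clamp01(value):
    """Hard clamp: everything above 1.0 is flattened to 1.0."""
    return max(0.0, min(1.0, value))


def compress(value):
    """Illustrative curve x / (1 + x): maps [0, infinity) into [0, 1) and keeps ordering."""
    return value / (1.0 + value)


hdr_values = [0.0, 1.0, 3.0]
clamped = [clamp01(v) for v in hdr_values]
compressed = [compress(v) for v in hdr_values]
```

With the clamp, 1.0 and 3.0 both end up at 1.0, while the curve keeps them distinguishable (0.5 and 0.75). Doing that kind of remapping per pixel is the step that is awkward without the more flexible arithmetic and lookups of later shader versions.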
 
And what would you do with the HDR in ps_1_1?

Perhaps you're right. PS 1.1 is extremely limiting. Devs won't do much with it at all in terms of really noteworthy features. ;)

So what about PS 1.4? Does FX support high dynamic range in PS 1.4? (And just as important, does the FX, with WHQL drivers, have acceptable speed with PS 1.4?)
 
Joe DeFuria said:
So what about PS 1.4? Does FX support high dynamic range in PS 1.4? (And just as important, does the FX, with WHQL drivers, have acceptable speed with PS 1.4?)
As Hyp-X said: we don't know. The obvious guess would be that they have to run at least the first phase in float mode (since it deals with texture coordinates, which might go outside the [-2,2] range).
Why wouldn't FX be able to provide acceptable speed with ps_1_4 with WHQL drivers? From what we know about FX, it is much more complex than Radeon 9700 (it executes pixel shaders from video memory, TMUs are able to output up to 4 texels per clock if they can fetch in advance, ...). I think it's quite obvious that NVidia optimised the OpenGL NV30 path first (since they are running all their demos on this path), while all other paths were made to work and nothing more.
 