Vertex Shaders and Pixel Shaders beyond version 2.0:

Ostsol

Veteran
I don't know how much you guys already know about this, but in the DirectX 9 SDK the documentation contains not only the expected information on VS and PS 2.0, but information on "extended" versions of each, plus version 3.0. Here's a brief list of the new features:

Vertex Shader 2.0:

New instructions:

Setup instructions - defb, defi
Arithmetic instructions - mova
Macros - abs, crs, expp, logp, lrp, nrm, pow, sincos, slt
Static flow control instructions - call, callnz, else, end, endif, endloop, endrep, if, label, loop, rep, ret

New registers - constant float, constant integer, constant Boolean, loop counter

Vertex Shader 2.0 Extended:

New features (with a cap set):

Dynamic flow control instructions - break, breakc, ifc
Predication - setp instruction, p# register
Static flow control nesting depth
Number of temporary registers

Vertex Shader 3.0:

New features:

Static flow control nesting depth
Dynamic flow control instructions - break, breakc, ifc
Predication - setp instruction, p# register
Number of temporary registers
Indexing registers
Vertex textures - texld texture address instruction
Vertex stream frequency

Pixel Shader 2.0:

New instructions:

Setup instructions: def, ps
Arithmetic instructions: add, cmp, cnd, dp3, dp4, lrp, mad, mov, mul, nop, sub
Macros: exp, frc, log, m3x2, m3x3, m3x4, m4x3, m4x4
Texture instructions:
Added: texldb, texldp
Removed: tex, texbem, texbeml, texcoord, texcrd, texdepth, texdp3, texdp3tex, texm3x2depth, texm3x2pad, texm3x2tex, texm3x3, texm3x3pad, texm3x3tex, texm3x3spec, texm3x3vspec, texreg2ar, texreg2gb, texreg2rgb
New registers: constant float, sampler, output color, output depth

New modifiers: negate, partial precision, saturate

Removed co-issuing of instructions

Pixel Shader 2.0 Extended:

New static flow control instructions (with a cap set): call, callnz, else, end, endif, endloop, endrep, if, label, loop, rep, ret
Static flow control nesting depth
Number of temporary registers
Dynamic flow control instructions: break, breakc, ifc
Gradient instructions: dsx and dsy
Texture instructions: texldd
Predication - setp instruction, p# register
New registers: constant integer, constant Boolean, loop counter, predicate

New modifiers (with a cap set): arbitrary swizzle

Pixel Shader 3.0:

Static flow control instructions: call, callnz, else, end, endif, endloop, endrep, if, label, loop, rep, ret
Static flow control nesting depth
Number of temporary registers
Dynamic flow control instructions: break, breakc, ifc
Predication: setp instruction, p# register
New registers: constant integer, constant Boolean, loop counter, predicate

New modifiers: arbitrary swizzle

If the NV30 is to support the "Extended" versions of both VS and PS 2.0, the difference is quite significant. These versions provide dynamic flow control for vertex shader programs and both static and dynamic flow control for pixel shader programs.

Some more significant information on these new versions:

Instruction Counts:

VS 2.0 - 256
VS 2.0 Extended - 256
VS 3.0 - 512 (min)

PS 2.0 - 64 arithmetic + 32 texture
PS 2.0 Extended - 96 (min) to 512 (max)
PS 3.0 - 512 (min)
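
To make the instruction counts above easier to compare, here is a rough sketch that stores them in a table and checks whether a shader of a given length would fit. The version labels (`vs_2_x`, `ps_2_x`, and so on) and the function names are just names picked for this example, and only the minimum figures quoted above are used; a real card may report more through its caps.

```python
# Minimum instruction slots per shader version, taken from the list above.
# The keys are labels chosen for this sketch, not values read from any API.
MIN_SLOTS = {
    "vs_2_0": 256,
    "vs_2_x": 256,      # VS 2.0 Extended
    "vs_3_0": 512,
    "ps_2_0": 64 + 32,  # 64 arithmetic + 32 texture
    "ps_2_x": 96,       # PS 2.0 Extended minimum (cards may report up to 512)
    "ps_3_0": 512,
}


def fits_minimum(version: str, instruction_count: int) -> bool:
    """True if the shader fits the minimum slot count listed for `version`."""
    return instruction_count <= MIN_SLOTS[version]


def fits_ps_2_0(arithmetic: int, texture: int) -> bool:
    """PS 2.0 budgets arithmetic and texture instructions separately."""
    return arithmetic <= 64 and texture <= 32
```

For example, `fits_minimum("ps_2_x", 200)` is false even though some PS 2.0 Extended hardware could run such a shader, which is exactly why the cap has to be checked per card.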
 
Interesting.
I wonder why it's called "Vertex Shader 2.0 Extended" instead of say "Vertex Shader 2.1".
 
That's a good point. . . Perhaps because it is not a requirement of DirectX 9? Then again, PS1.4 or even 1.3 didn't seem to be a requirement of DirectX 8.1. . .
 
Humus said:
Interesting.
I wonder why it's called "Vertex Shader 2.0 Extended" instead of say "Vertex Shader 2.1".

I think that's because each and every cap in the Extended spec is optional.

In PS 2.0 Extended the R300 supports the cap "Number of temporary registers" because it's higher than in the standard DX9 PS 2.0, but it cannot support the others AFAIK.
 
Would be nice to have DX9 PS & VS 3.0 hardware available now, wouldn't it? :)

Looks like DX9 software is going to represent a whole lot of different things to a lot of platforms.

I was kinda hoping to bite the DX9 magic bullet and get it over and done with. :D
 
I think it makes perfect sense in the context of a HLSL.

Just need to see some DX 9 HLSL functionality demonstrated now.

I don't think it need necessarily be construed as "shaders ATI" or "shaders NVIDIA" if DX 9 HLSL is used to take advantage of the specifications instead of custom coded "shader assembly". How it will actually be deployed, in the context of DX 9, is still an open question (I think).
 
From Tom's Hardware "Vertex Shaders and Pixel Shaders" ...

"For DX8, we had two types of pixel shader, PS1.0 and PS1.1. However DX8.1 introduces PS1.2, PS1.3 and PS1.4. If this carries on for DX9 and DX10, we could end up supporting about ten different versions of pixel shaders, and each one will probably correspond to a small number of hardware devices. It would seem more sensible to only increment the version numbers for each major release, and then, to only increase them by one."

Sensible, but for real? :)
 
It sounds to me like PS 2.0 Extended represents a set of possible extensions to PS 2.0 that can be detected with caps. A card might have some but not all of them; I don't know that we can assume that any card will have all of them. PS 3.0 represents the next level, where you know the card supports a set of features and you don't have to check for each explicitly.
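
A minimal sketch of that difference, modelled in plain Python rather than against the real Direct3D caps structures. The `Ps2xCap` flags and the `Card` type are hypothetical stand-ins invented for this example, and the sketch assumes (as suggested above) that a PS 3.0 part implies the whole feature set.

```python
from dataclasses import dataclass
from enum import Flag


class Ps2xCap(Flag):
    """Hypothetical flags for the optional PS 2.0 Extended features."""
    STATIC_FLOW = 1
    DYNAMIC_FLOW = 2
    GRADIENTS = 4
    PREDICATION = 8
    ARBITRARY_SWIZZLE = 16


@dataclass
class Card:
    """Hypothetical description of a card: PS major version plus optional caps."""
    ps_major: int
    caps: Ps2xCap = Ps2xCap(0)


def supports(card: Card, needed: Ps2xCap) -> bool:
    """Check whether `card` can run a shader that needs the `needed` features."""
    if card.ps_major >= 3:
        # Assumption: PS 3.0 guarantees the features, so no per-cap check.
        return True
    if card.ps_major < 2:
        return False
    # PS 2.0 / 2.0 Extended: every required feature must be reported as a cap.
    return (card.caps & needed) == needed
```

With this, `supports(Card(ps_major=2), Ps2xCap.DYNAMIC_FLOW)` is false for a plain PS 2.0 part, while `supports(Card(ps_major=3), Ps2xCap.DYNAMIC_FLOW)` is true without looking at any cap bits.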

I imagine the HLSL compiler will target PS 2.0 and PS 3.0, but won't necessarily be smart enough to make use of every possible combination of PS 2.0 Extended capabilities. Probably you'll have to use assembly to use PS2.0 Extended capabilities, rather than HLSL.

The Cg runtime will be different, since a profile will target a card-specific set of extensions.
 
antlers4 said:
...
I imagine the HLSL compiler will target PS 2.0 and PS 3.0, but won't necessarily be smart enough to make use of every possible combination of PS 2.0 Extended capabilities. Probably you'll have to use assembly to use PS2.0 Extended capabilities, rather than HLSL.

Why would there be any reason for the compiler not to take advantage of these capabilities?
 
demalion said:
antlers4 said:
...
I imagine the HLSL compiler will target PS 2.0 and PS 3.0, but won't necessarily be smart enough to make use of every possible combination of PS 2.0 Extended capabilities. Probably you'll have to use assembly to use PS2.0 Extended capabilities, rather than HLSL.

Why would there be any reason for the compiler not to take advantage of these capabilities?

It wouldn't necessarily know about them (for example, if a card with new features came out slightly after DX9 went gold).

Unless, of course, part of the installable driver for the card includes the backend of the HLSL compiler. Anybody know if this is true or not?
 
demalion said:
antlers4 said:
...
I imagine the HLSL compiler will target PS 2.0 and PS 3.0, but won't necessarily be smart enough to make use of every possible combination of PS 2.0 Extended capabilities. Probably you'll have to use assembly to use PS2.0 Extended capabilities, rather than HLSL.

Why would there be any reason for the compiler not to take advantage of these capabilities?

Because it takes intelligence to use each set of abilities to its fullest. You can optimize the compiler back-end for one instruction set or another, but not for 2^n instruction sets where n represents the number of optional capabilities. Although now that I think about it, it depends somewhat on how closely the extended capabilities map to HLSL features; if they map closely, it would be straightforward to target them if they are available.
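
To put a number on the 2^n point, here is a small sketch that enumerates every on/off combination of the optional features. The five names in `OPTIONAL_CAPS` are taken from the PS 2.0 Extended list earlier in the thread and are only an illustrative choice; the numeric caps (temporary register count, nesting depth) would multiply the total further.

```python
from itertools import combinations

# Illustrative on/off features from the PS 2.0 Extended list in this thread.
OPTIONAL_CAPS = [
    "static flow control",
    "dynamic flow control",
    "gradient instructions",
    "predication",
    "arbitrary swizzle",
]


def all_cap_subsets(caps):
    """Yield every subset of `caps`, from the empty set up to the full set."""
    for size in range(len(caps) + 1):
        for subset in combinations(caps, size):
            yield frozenset(subset)


subsets = list(all_cap_subsets(OPTIONAL_CAPS))
print(len(subsets))  # 2**5 = 32 possible feature combinations
```

Even with only five boolean caps, a back-end tuned per combination would have 32 targets, which is the scaling problem described above.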
 
DirectX 9.0 HLSL can currently target everything (edit: but 1.0 shaders, since all hardware supports at least 1.1 anyway) up to 2.0. It does not support the extended 2.0 model, nor does it support the 3.0 shader model. That's where Cg comes in...
 
MDolenc said:
DirectX 9.0 HLSL can currently target everything up to 2.0. It does not support the extended 2.0 model, nor does it support the 3.0 shader model. That's where Cg comes in...

Ah, so Cg does offer an advantage currently (for nVidia hardware).

My question is then whether DX 9 HLSL is precluded from offering this as well? Is Microsoft planning on enhancing the compiler? Is Microsoft the only one capable of doing so?

If the answer to the last is "yes", that's a shame. But if the answer to the question before that is "yes", the question becomes "When?"
 
demalion said:
My question is then whether DX 9 HLSL is precluded from offering this as well? Is Microsoft planning on enhancing the compiler? Is Microsoft the only one capable of doing so?
It would be hard for DX9 HLSL to properly support extended 2.0 shaders, since many features (including register counts) are capped and since DX9 HLSL does not know anything about what hardware it is running on.
Microsoft is planning to improve the compiler, and this should happen sometime next year. And yes, since DX9 HLSL is integrated in DirectX (Direct3DX to be exact), Microsoft is the only one that can write a profile and change the compiler.
 
Hmm...a possibly significant advantage offered by Cg if Microsoft is not timely in their compiler updates and locks out vendor optimized compilers. :-?
 
I fear there's some serious painting themselves into a corner going on with the HLSLs. Both Cg and DX9 HLSL are built on the idea of standardising the compiler, rather than just making the language a standard and letting the compiler be a part of the driver. Make a compiler that targets a standardized instruction set and we'll repeat the whole x86 problem, but for GPUs.
 
MDolenc said:
It would be hard for DX9 HLSL to properly support extended 2.0 shaders, since many features (including register counts) are capped and since DX9 HLSL does not know anything about what hardware it is running on.

Well, you could still include profiles to let HLSL know about the specific hardware (like in Cg), but I'm not a programmer so I don't know whether it is 'allowed' to expose the VS 2.0+ and PS 2.0+ caps in a profile?
 