DirectX 9.0 macros and dynamic branching?

Is there any confirmation as to the R300's ability to dynamically branch? We now know it can run vertex programs with lengths greater than 65k instructions (does anyone know the exact number?) and supports up to 32 vertex registers in hardware. If it can dynamically branch and execute these DirectX 9 macros as effectively as the CineFX architecture can for trig functions, then the two processors are fairly comparable in terms of geometric performance.
Can anyone verify whether the R300 executes sin/cos functions similarly to the NV30?

This information would be very helpful in clearing things up.
 
I don't know about the dynamic branching, but the sin/cos "macros" basically mean that the VS/PS assembler can take a "sin" instruction and expand it into a bunch of simpler instructions. If you do a web search on Maclaurin/Taylor series expansion, you'll find that you can approximate sin/cos with a polynomial (a truncated sum of powers of x). That's one way to support sin/cos without actually having a matching hardware instruction. The thing that sucks about this method is that a single sin/cos instruction can take up a lot of code space.
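
As a rough illustration of the series idea above, here is a minimal Python sketch that approximates sin(x) using only multiplies and adds. The function name sin_macro and the choice of five terms are assumptions made for this example; this is not the expansion any actual driver or assembler emits.

```python
import math

# Maclaurin coefficients for sin(x): x - x^3/3! + x^5/5! - x^7/7! + x^9/9!
# Five terms is an arbitrary choice for illustration.
COEFFS = [(-1) ** k / math.factorial(2 * k + 1) for k in range(5)]

def sin_macro(x):
    """Approximate sin(x) for x in [-pi, pi] with multiplies and adds only."""
    x2 = x * x                          # one multiply
    acc = COEFFS[-1]
    for c in reversed(COEFFS[:-1]):     # Horner's rule: one multiply-add per term
        acc = acc * x2 + c
    return acc * x                      # final multiply
```

Each pass through the loop corresponds to one multiply-add, which is why a single "sin" in the source can turn into several simple instructions once expanded. Accuracy drops toward the ends of the range, so a real implementation would normally reduce the argument first.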
 
I'm still very confused here...

As I said in my previous reply, according to the info released so far, R300 is capable of 1024 vertex instructions, while NV30 is capable of 256 vertex instructions, and that number can rise to 65536 through the use of data-dependent branching and subroutines. I'm not really sure R300 is capable of that, and no other info right now suggests that it is.

If you look carefully at the table Dave presented in the R9700pro review, you'll see that the number of vertex instructions doesn't make any sense (and it's not 65k as u say).

Also, if I'm not mistaken, he noted in one thread that it's actually 16 vertex registers in hardware and not 32 (so basically, it's the same amount as in NV30, while it was previously reported that R300 is capable of only 12 vertex registers in hardware).

I would really like someone to clear this all up! :D
 
Simple answers:

1) The total number of possible vertex shader instructions executed is 65,026 on the 9700. The computation is as follows:
a) The 9700 supports any number in the range [0,255] as a loop count
b) There are 256 possible "static" instructions
c) Loops can exist in any of the first 255 instruction slots (not in the last, since you need another instruction, at least, to represent the looping "code")
So, total instruction execution = 255*255 + 1 = 65,026
I believe that most DX9 parts will execute something similar. Saying 64k is reasonable.

2) The R300 has 32 temporary registers in the vertex and pixel shaders (64 "total"). We currently "reveal" 12 in the pixel shader (not sure about vertex shader), following DX9 recommendations. We will raise that as caps bits allow or DX9 specs change.
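
The arithmetic in point 1 can be checked with a few lines of Python. The constant names are made up for this sketch; the numbers are the ones given in the post.

```python
MAX_LOOP_COUNT = 255                 # largest loop count in the range [0, 255]
STATIC_SLOTS = 256                   # "static" instruction slots
LOOPABLE_SLOTS = STATIC_SLOTS - 1    # the last slot cannot hold a looped instruction

def max_executed_instructions():
    # Up to 255 looped instructions, each run up to 255 times,
    # plus one instruction for the looping "code" itself.
    return LOOPABLE_SLOTS * MAX_LOOP_COUNT + 1

print(max_executed_instructions())
```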

Hope this helps.
 
We support branching (JSR, JMP, Loop) per primitive, as DX9 specifies (i.e. each primitive can have a different execution through the shader, following different branch points). Basically, each primitive can supply its own branching flags. Not exactly sure what "dynamic" means.
 
He means that the branch can happen based on a condition code dynamically (not constant per shader)


For example:

x = x * 10;
if(x > 1000) goto foo;

as opposed to static branching which is

if(CONSTANT) goto foo;
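
The same contrast as a small runnable Python sketch. The function names and the returned labels are hypothetical; they only mirror the pseudocode above.

```python
def dynamic_branch(x):
    """Data-dependent: the condition uses a value computed inside the shader."""
    x = x * 10
    if x > 1000:
        return "foo"
    return "fallthrough"

def static_branch(x, constant):
    """Static: the condition is a constant fixed before the shader runs."""
    x = x * 10
    if constant:
        return "foo"
    return "fallthrough"
```

With dynamic_branch, two inputs can take different paths in the same run (50 falls through, 200 jumps to foo). With static_branch, every input takes the same path for a given value of constant.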
 
Our branching is based on constants which can be changed per polygon. Following DX9 specs, as far as I know.

Hope that clears it.
 
I believe the term "dynamic", with respect to nvidia, to my limited understanding, means data-dependent branching (both forwards and backwards) which is not limited to only constants.

Edit: Oops, Democoder, I did not see your post. That is a much better explanation of what I meant. Even I now have a better understanding of what I meant, in the face of all these tech docs.
 
Can anyone verify whether the r300 executes sin/cos functions similarly to the nv30?
We'd need a NV30 for such a verification :)

I was told that SIN/COS is supported as macros on the R300: the function is available but may be implemented as multiple instructions. Microsoft defines SIN/COS as requiring 8 or fewer instructions, but the R300 uses far fewer than 8.
 
If you look carefully at the table Dave presented in the R9700pro review, you'll see that the number of vertex instructions doesn't make any sense (and it's not 65k as u say).

Also, if I'm not mistaken, he noted in one thread that it's actually 16 vertex registers in hardware and not 32 (so basically, it's the same amount as in NV30, while it was previously reported that R300 is capable of only 12 vertex registers in hardware).
I believe it was me rather than Dave that said that but perhaps dave said so elsewhere.

As sireric said, it's 32 + 32 (PS + VS) in metal. "12" is quoted because that's what DX8.1 limitation/specs are. DX9 allows [12,32].
 
It appears not. What it does support is branching off of the shader's constant registers. That is, you can write

if(constant[n]) do something

But constant[n] is a read only register and can't be modified by the shader, only by the CPU before the shader is executed.


Thus, this functionality is basically equivalent to using the C preprocessor "at runtime" during the shader's execution. Different parts of the shader can be turned off/on via a flag supplied by the CPU before you start rendering that primitive.
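
A minimal Python sketch of that model, assuming a made-up shader with two flags (flip_y and double_x are hypothetical names, as are shade and draw_primitive). The CPU side fixes the flags for a primitive, and the shader only sees a read-only view of them.

```python
from types import MappingProxyType

def shade(vertex, constants):
    """Hypothetical shader body: may read `constants` but cannot write them."""
    x, y = vertex
    if constants["flip_y"]:        # branch on a CPU-supplied flag only
        y = -y
    if constants["double_x"]:
        x = 2 * x
    return (x, y)

def draw_primitive(vertices, **flags):
    """CPU side: fix the flags, then run the same shader on every vertex."""
    constants = MappingProxyType(dict(flags))   # read-only view for the shader
    return [shade(v, constants) for v in vertices]
```

Every vertex in one draw_primitive call follows the same path through shade; only a new call with different flags changes which parts of the shader are active.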
 
alexsok said:
I'm still confused, does R300 support true dynamic branching like NV30 does?

what exactly are you so confused about? - R300 branching is static-per-primitive. it does not support "true dynamic branching like NV30 does".
 
DemoCoder said:
It appears not. What it does support is branching off of the shader's constant registers. That is, you can write

if(constant[n]) do something

But constant[n] is a read only register and can't be modified by the shader, only by the CPU before the shader is executed.


Thus, this functionality is basically equivalent to using the C preprocessor "at runtime" during the shader's execution. Different parts of the shader can be turned off/on via a flag supplied by the CPU before you start rendering that primitive.

This explains things pretty nicely, thx m8! :D
 
what exactly are you so confused about? - R300 branching is static-per-primitive. it does not support "true dynamic branching like NV30 does".

Of course, if this is true then NV30's vertex shaders can't be SIMD, which is...interesting.

I think that would be rather expensive, in terms of transistor count.
 
Maverick said:
Of course, if this is true then NV30's vertex shaders can't be SIMD, which is...interesting.
Why not? Of course an IF (or IFC, for that matter) statement evaluates only one expression, not four.
 
DaveBaumann said:
I believe it was me rather than Dave that said that but perhaps dave said so elsewhere.

It was in the review.
I must've missed it then... I was talking about what ATI says re their vertex shader registers specs and what their drivers reveal instead.
 