Convert SM3.0 to SM2.0?

nelg

Veteran
Please feel free to ignore this if it is a stupid question.

If the R420 supports most of the instructions for PS3.0 and all of them for VS3.0, could a tool such as ASHLI be used to convert SM3.0 routines to this SM2.0+ form?
 
I'll post the one obvious case it can't emulate.

while (some condition)
{
do something that affects the condition in a non-obvious way
}

It can emulate a lot of looping by unwinding the loop within its instruction limit. If it can't determine the worst-case loop count (for example, if the loop depth is based on texture data), then it can't emulate the loop.

I doubt that this will be a particularly common construct.
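
The unwinding described above can be sketched in plain C rather than shader code. This is only an illustration under assumptions: MAX_UNROLL is a made-up stand-in for the instruction limit, and body() is a placeholder loop body, neither comes from any real compiler or driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical unroll budget standing in for the SM2.0 instruction limit. */
#define MAX_UNROLL 8

/* Placeholder loop body (assumption, chosen only so results are easy to check). */
static int body(int x) { return x * 2 + 1; }

/* Reference: a real loop with a runtime bound. */
int run_dynamic(int x, int n) {
    for (int i = 0; i < n; i++) x = body(x);
    return x;
}

/* Emulation: a fixed trip count (which a compiler can fully unroll), where
 * every step is computed and then kept or discarded by a select.
 * Returns false when n exceeds the budget, i.e. the loop cannot be emulated. */
bool run_unrolled(int x, int n, int *out) {
    assert(out != NULL);
    if (n > MAX_UNROLL) return false;
    for (int i = 0; i < MAX_UNROLL; i++) {
        int next = body(x);
        x = (i < n) ? next : x;  /* masked step: no real branch needed */
    }
    *out = x;
    return true;
}
```

Note that run_unrolled always does MAX_UNROLL steps of work, however small n is, and simply refuses when the worst case doesn't fit.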
 
The two things that I came up with that couldn't be easily emulated (without restrictions or potentially massive performance penalties) were:

Tracing a ray through a data set (which I think is potentially very useful for a number of effects).

And working on two large sparse vectors.

But a mandelbrot set is a good example of an unbounded problem.
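
To make the Mandelbrot case concrete, here is a rough C sketch of the escape-time loop. The max_iter cap is my own assumption, added so the function terminates for points inside the set; the exit test itself depends only on computed data, which is what makes the trip count hard to predict.

```c
#include <assert.h>

/* Escape-time iteration count for the point c = (cr, ci).
 * max_iter is an assumed cap: without it the loop never ends
 * for points that stay inside the set. */
int mandelbrot_iterations(double cr, double ci, int max_iter) {
    assert(max_iter >= 0);
    double zr = 0.0, zi = 0.0;
    int i = 0;
    while (i < max_iter) {
        /* Data-dependent exit: |z|^2 > 4 means the orbit has escaped. */
        if (zr * zr + zi * zi > 4.0) break;
        double t = zr * zr - zi * zi + cr;  /* real part of z*z + c */
        zi = 2.0 * zr * zi + ci;            /* imaginary part of z*z + c */
        zr = t;
        i++;
    }
    return i;
}
```

Neighbouring points can need wildly different iteration counts, so any fixed unroll either wastes work or cuts the computation short.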
 
The number of iterations isn't always constant (even though loop only takes a constant integer register). You can conditionally break out of the loop based on whatever comparison you want... ;)
 
Yes, you can write a loop with a dynamically calculated upper bound as the following

Code:
upper_bound = f(x);
for(int i=0; i<LARGEST_INTEGER; i++) //  i<upper_bound
{
  if(i >= upper_bound) break;
}

This can still be emulated if you can do effectively infinite length pixel shaders, but you take a penalty that is effectively the maximum penalty you can take. :) e.g. if you can handle say, 65,536 instruction shaders, then you have a penalty of 65,536 cycles. :)
 
MDolenc said:
The number of iterations isn't always constant (even though loop only takes a constant integer register). You can conditionally break out of the loop based on whatever comparison you want... ;)
A bit OT, but MDolenc, will you upgrade your tester to test PS3.0? If so, how do you think you will test the PS2.0/PS3.0 gap?
 
DemoCoder said:
This can still be emulated if you can do effectively infinite length pixel shaders, but you take a penalty that is effectively the maximum penalty you can take. :) e.g. if you can handle say, 65,536 instruction shaders, then you have a penalty of 65,536 cycles. :)
Which, of course, you would handle by making multiple passes of, oh, 100-1000 cycles or so apiece. You wouldn't need to execute the worst-case loop: you could instead trade off that performance hit for one involving periodically checking whether or not the loop has completed.
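
A rough C sketch of that multipass idea, under assumptions: LoopState is a hypothetical stand-in for per-pixel data kept between passes, the loop body is a placeholder, and how the completion check would actually be done on real hardware is left open.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-pixel state that would be stored between passes. */
typedef struct {
    int i;      /* iterations completed so far */
    int value;  /* running result of the loop body */
    bool done;  /* set once the exit condition has been met */
} LoopState;

/* One fixed-length pass: at most pass_len body steps, so it fits a bounded shader. */
void run_pass(LoopState *s, int upper_bound, int pass_len) {
    for (int k = 0; k < pass_len; k++) {
        if (s->i < upper_bound) {  /* masked step */
            s->value += s->i;      /* placeholder body */
            s->i++;
        }
    }
    s->done = (s->i >= upper_bound);
}

/* Host-side driver: issue passes until the loop reports completion.
 * Returns the number of passes that were needed. */
int run_multipass(LoopState *s, int upper_bound, int pass_len) {
    assert(pass_len > 0);
    int passes = 0;
    do {
        run_pass(s, upper_bound, pass_len);
        passes++;
    } while (!s->done);
    return passes;
}
```

With upper_bound = 10 and pass_len = 4 this takes three passes, so the wasted work is bounded by one pass length instead of the worst-case loop length.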
 
Forgive my newbness on this matter, but if you had

Code:
upper_bound = f(x); 
for(int i=0; i<LARGEST_INTEGER; i++) //  i<upper_bound 
{ 
  //actually do something useful
  if(i >= upper_bound) break; 
}
and say LARGEST_INTEGER is a huge number and so is upper_bound, wouldn't an SM3.0 architecture still have to execute the code for as long as i<LARGEST_INTEGER, until it breaks at upper_bound? I wouldn't think that example would be that great on any SM architecture, though I freely admit your knowledge on this is much more extensive than mine.

If I were writing said code, however, I'd approximate the upper_bound, or perhaps have several upper bounds (a low number, uppr_bounda, for higher performance, then a high number, uppr_boundb, if necessary) and just write out as many statements as necessary. Then I'd have an if statement to determine which uppr_bound* is needed and which set of statements to run. I do, however, assume SM2.0 allows nested if statements (like I said, I really haven't had time to get into HLSL).
 
LARGEST_INTEGER is meant to be infinity. Basically, the loop is as follows

while(1)
{
if(i >= n) break;
}

Using the IF statements for upper bound in PS2 won't gain you any benefit unless they are static. But we are presupposing that upper_bound is not something you can "estimate" and replace.


For example, it may represent a condition for looping through a dynamic datastructure.
 
and say LARGEST_INTEGER is a huge number and so is upper_bound, wouldn't an SM3.0 architecture still have to execute the code for as long as i<LARGEST_INTEGER, until it breaks at upper_bound? I wouldn't think that example would be that great on any SM architecture, though I freely admit your knowledge on this is much more extensive than mine.

If I were writing said code, however, I'd approximate the upper_bound, or perhaps have several upper bounds (a low number, uppr_bounda, for higher performance, then a high number, uppr_boundb, if necessary) and just write out as many statements as necessary. Then I'd have an if statement to determine which uppr_bound* is needed and which set of statements to run. I do, however, assume SM2.0 allows nested if statements (like I said, I really haven't had time to get into HLSL).

Well the difference is that upper_bound need not be a compile-time constant in PS3 (afaik), that's what upper_bound = f(x) refers to. So you couldn't write a similar for loop in PS2, as your compiler would not accept executing that if statement on a variable.
 
Chalnoth said:
DemoCoder said:
This can still be emulated if you can do effectively infinite length pixel shaders, but you take a penalty that is effectively the maximum penalty you can take. :) e.g. if you can handle say, 65,536 instruction shaders, then you have a penalty of 65,536 cycles. :)
Which, of course, you would handle by making multiple passes of, oh, 100-1000 cycles or so apiece. You wouldn't need to execute the worst-case loop: you could instead trade off that performance hit for one involving periodically checking whether or not the loop has completed.

Yes, but I thought we were talking about what transformations the compiler/driver could do. Auto-multipassing would be a bit too much to ask.
 
DemoCoder said:
LARGEST_INTEGER is meant to be infinity. Basically, the loop is as follows

while(1)
{
if(i >= n) break;
}

Using the IF statements for upper bound in PS2 won't gain you any benefit unless they are static. But we are presupposing that upper_bound is not something you can "estimate" and replace.


For example, it may represent a condition for looping through a dynamic datastructure.

Heh, that makes sense, and I can see where you'd want upper_bound to be dynamic. I thought, if you wanted to, you could write a function with recursion, i.e.

Code:
upper_bound = f(x);

y(upper_bound) {
  // do something useful, evaluate a temp to test with
  if (temp >= upper_bound) {
    return y(upper_bound);  // recurse
  } else {
    return;                 // stop recursing
  }
}

Then I did some research and found out HLSL (at least SM2.0) doesn't support recursion of functions! Doh, looks like there are some things where you can definitely be SOL if you're using SM2.0 as opposed to SM3.0.

All this talk has got me wanting to write some HLSL stuff, and SM2.0 looks too restrictive, I guess it's NVIDIA this round for me.
 
Bear with my ignorance here, guys, but looking at this it seems that there is not too much of a difference WRT PS2.x vs PS3.0. Even with the loop/endloop instructions, could you not just use rep/endrep instead with an arbitrarily high counter?

ps_3_0 Features
New features:

Consolidated 10 Input Registers (v#)
Indexable Constant Float Register (c#) with Loop Counter Register (aL)
Number of Temporary Registers (r#) increased to 32
Number of Constant Float Registers (c#) increased to 224
New instructions:

Setup instruction - dcl_usage
Static flow instructions - loop, endloop
Arithmetic instruction - sincos (new syntax)
Texture instruction - texldl
New registers:

Input Register (v#)
Position Register (vPos)
Face Register (vFace)
 
joe emo said:
Then I did some research and found out HLSL (at least SM2.0) doesn't support recursion of functions! Doh, looks like there are some things where you can definitely be SOL if you're using SM2.0 as opposed to SM3.0.
Neither SM3.0 nor the OpenGL2.0 shading language allow recursive functions either, so you are SOL in any case. SM3.0 has a 'call' instruction, but it is only allowed to do forward calls (that is, calls where the program counter of the called routine is strictly greater than the program counter of the call instruction) - so there is no way for a function to call itself.
 
Right, but I was saying you could possibly do a similar task to what DemoCoder defined by using recursion of functions. With SM2.0 you can't do loops, and you can't do recursion, so you're totally SOL. With SM3.0 you can do loops, but you can't do recursion, so you can do it one way.
 
joe emo said:
Right, but I was saying you could possibly do a similar task to what DemoCoder defined by using recursion of functions. With SM2.0 you can't do loops, and you can't do recursion, so you're totally SOL. With SM3.0 you can do loops, but you can't do recursion, so you can do it one way.

It's possible for the compiler to do tail recursion. :)
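
For anyone unfamiliar with the trick, here is a minimal C sketch of tail recursion and the loop it can be turned into. The function names and the summing body are made up for illustration; this says nothing about what any particular shader compiler actually does.

```c
#include <assert.h>

/* Tail-recursive form: the recursive call is the very last thing the
 * function does, so nothing needs to be kept on the stack across it. */
int sum_to_recursive(int i, int upper_bound, int acc) {
    /* Guard for this sketch only: without tail-call elimination,
     * each step costs a stack frame. */
    assert(upper_bound - i <= 100000);
    if (i >= upper_bound) return acc;
    return sum_to_recursive(i + 1, upper_bound, acc + i);
}

/* Loop form a compiler could derive: the parameters become loop
 * variables and the recursive call becomes a jump back to the top. */
int sum_to_loop(int i, int upper_bound, int acc) {
    while (i < upper_bound) {
        acc += i;
        i++;
    }
    return acc;
}
```

Both give the same result, so a recursive function of this shape is really just a loop in disguise, and is subject to the same looping limits on the target.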
 