Humus said:
1, 3, 4 have been discussed to death in several lengthy threads already. I and many others don't believe 1 is much of a problem, 4 is easily solved, and 3 isn't anything new (buglessness has never been a guarantee). 2 is wrong, unless you know some inconsistency in the spec?

I have to agree that I've probably overstated 4, although it probably isn't as insignificant as some would claim. 2 is a result of the way certain IHVs attempt to control the market; I wouldn't put it past them to do things subtly differently and then just use the excuse that they are "the industry standard" so must be correct, but hey, that's just me being paranoid. I think 1 may be a problem, but to be honest I'd admit that the driver has to do a level of parsing of the tokenised format it gets from DX, which is also going to be subject to some bugs, so score even on this.

Humus said:
1 also means that bugs can completely halt your development or force you to ditch certain visual attributes for many months until MS update their runtime. 2 is a bad thing, extensibility is desirable. 3 sucks: targets other than the hardware will be suboptimal, nor does it guarantee or make it significantly more likely (or at all) that it will run correctly on all hardware.

Regarding your objection to 1, two things: how much more responsive are the IHVs than MS at fixing "critical" bugs? And second, can you give examples where there has been such a bug in the DX9 HLSL that wasn't fixed before it was found?
Humus said:
The only way to get a good enough intermediate format would be if it's completely free from restrictions. Basically, to get around all limitations you will end up with a MS compiler that does nothing but parsing.

Hmm, parsing, tokenising, expression evaluation, etc: I'm really not interested in having to do these. It takes a fair amount of work off my hands if I'm given an intermediate format that uses simple ops, preserves flow-control constructs and makes no assumptions about temps (the last one probably only relevant on HW that won't be available for 1-2 years).
DemoCoder said:
Parsing is trivial. It's a done deal. It took me approximately 3 hours to take a C++ grammar and modify it to parse Cg and GLSlang. ARB already provides a YACC grammar for GLSlang that is likely to be used by implementors. This particular issue is not worrisome.

As I've said already, I have probably overstated this issue.
DemoCoder said:
As for 3rd party syntax extensions, this is far more likely with a public intermediate representation, since anyone can write a compiler front-end in any language they want to generate the intermediate representation. NVidia could provide a HLSL that is "ML like" (functional) and ATI could provide one that is XML-ish.

Not sure that this argument has any real bearing on this discussion; given a defined programming interface you're free to stick whatever you like on top of it, but that would be completely missing the point of having a standard. Take this to the extreme: if I wanted, I could add extensions to OGL that completely replace the functionality of the whole API in a manner that I consider "better", but then this would be defeating the point of having a standard API.
DemoCoder said:
Assumes parser bugs are the biggest issue with compilers; they aren't. Parsers are generated from automated tools which take LL or LR grammars, and as long as your grammar is correct, your parser will be, unless your YACC/PCCTS/ANTLR/CUP/etc generator is bugged.

Yes, I overstated this.
DemoCoder said:
Secondly, the syntax isn't fixed, because now multiple front-end parsers can generate the intermediate representation.

No, the syntax of MS HLSL is fixed; if you choose to use a third-party front end then you're not using MS HLSL.
DemoCoder said:
Third, parsing is the smallest part of the work in a compiler, so the fact that you've removed parsing out of the pipeline is irrelevant.

Yes, generally true, except I think you should look at the length of many of the shaders that are being pushed through these things, i.e. short, and often shorter in intermediate form than in HLSL form; in these cases the parsing IS the significant part of compilation. Obviously this becomes less of an issue as shaders get longer.
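To make that point concrete, here is a hedged sketch of the kind of shader being discussed; the assembly shown is illustrative of the general shape of ps_2_0 output, not actual fxc output:

    // HLSL source: barely more text than the assembly it produces,
    // so the front-end (parsing) is a real fraction of the total work
    sampler2D s : register(s0);
    float4 tint : register(c0);

    float4 main(float2 uv : TEXCOORD0) : COLOR
    {
        return tex2D(s, uv) * tint;
    }

    // Roughly the ps_2_0 form the driver receives:
    //   ps_2_0
    //   dcl t0.xy
    //   dcl_2d s0
    //   texld r0, t0, s0
    //   mul r0, r0, c0
    //   mov oC0, r0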
DemoCoder said:
The fact is, Microsoft's so-called "intermediate representation" (DX9 assembly) cannot model many of the semantics that you may want to pass to the hardware for acceleration. How do you represent a Perlin noise() function, or a matrix transpose? Even normalization isn't passed to the HW, so if HW had a hardware normalizer built in, it becomes much more work (remember, minimal work necessary?) for the driver to detect and optimize it. The limitations exposed in their intermediate representation deny many optimizations that could be possible on more powerful hardware (loop/funcall limits, register limits, no type information at all!)

I think you're overestimating the value of these things, e.g. NO HW in existence has specific acceleration for "Perlin noise()" or matrix transpose; the MS approach takes this into account, you have to supply a function to do it. This is unlikely to change given the insistence that HW needs to be programmable to be useful now.
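As a hedged aside, here is what "you have to supply a function" looks like in practice; the function name and hash constants below are illustrative, not from any spec or HW:

    // DX9 HLSL has no noise() intrinsic: the author supplies one, and the
    // compiler lowers it to plain arithmetic before the driver ever sees it.
    // (hash-based stand-in for illustration, not real Perlin noise)
    float mynoise(float x)
    {
        return frac(sin(x * 12.9898) * 43758.5453) * 2 - 1; // pseudo-random in [-1, 1)
    }

    // GLSlang instead defines noise1() as a built-in:
    //     float n = noise1(x);   // IHV's choice: software approximation or HW
    // which is exactly the kind of semantic the DX9 instruction stream can't carry.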
The current IL provided may not be perfect, but my own experience working at the driver level has shown that it is NOT a significant bar to optimisation.

DemoCoder said:
Well, there's the rub. MS's "intermediate language" is nothing like the intermediate representations used in most compilers, be it Java, .NET CLR, C compilers, etc. Most compiler IR reps use either tree IR or 3/4-address code, and do not use actual register names, but use autogenerated labels which do not get assigned registers until near the end.

You need to get away from what "could have been", and deal with what the reality is: the reality is, DX9 VS/PS 2.0/3.0 do not have the representational power that other intermediate representations have, and won't allow the kinds of optimizations that GLSlang's model will.

If Microsoft had an intermediate representation for 3D that was more like .NET CLR, and less like some low-level assembly syntax, I might be tempted to agree with you. About a year ago, I was arguing on this board that MS should provide an interface whereby you can plug in source code and get back a parse tree or IR rep, and thereby allow the driver to do its work based on higher-level information, instead of having to translate DX9 assembly.

But since all of your claimed benefits don't actually show up unless MS rethinks the way they're currently doing things, talking about how a hypothetically rich IR with separate parsing would be ideal is moot, since the reality is: DX9 FXC + assembly vs OGL2.0 HLSL.
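For readers without a compiler background, a minimal sketch of the difference being argued here; both snippets are illustrative, not taken from any real toolchain:

    ; DX9 assembly: operands are concrete register names, i.e. MS's compiler
    ; has already done register allocation before the driver sees the program
    mul r0, r1, c0
    add r0, r0, r2

    ; a conventional compiler IR (3-address form) keeps virtual names, so the
    ; backend (here, the driver) is free to allocate registers for its own HW
    t1 = mul v_in1, const0
    t2 = add t1, v_in2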
Chalnoth said:
Let me get this straight, JohnH.

JohnH said:
IHVs _cannot_ tweak the syntax of the HLSL as they do not have access to it. They also can't tweak the intermediate format as it won't get past validation.
You like DX9 HLSL because IHV's can't cheat on it, because IHV's have in the past "modified" DX9 shaders?
Think about that for a second.
Which means the "cheat" angle is a non-issue here as it can be done regardless of compiler used.JohnH said:This has nothing to do with intermediat vs direct compilation as being discussed here, instead it has everything to do with someone make a bad descision during driver development.
John.
JohnH said:
Many of the claimed benefits do exist. Tell me how you guarantee that something written for GLSlang and tested on one driver/HW is guaranteed to work on any piece of HW in the field? The profiles used for the intermediate target are a good step to fixing this; they need beefing up a bit by inclusion of a few "external" caps, but as I said to Humus, the 3.0 profile does this.

One of the easiest ways to solve this would be that each IHV offers a download of small "validation tools" that share their code with the shader validation mechanism in the driver. Then a developer doesn't need the hardware to know whether it will run or not. It has the implicit assumption that upcoming hardware is at least as capable as its predecessor, but I don't think that's a problem.
Xmas said:
One of the easiest ways to solve this would be that each IHV offers a download of small "validation tools" that share their code with the shader validation mechanism in the driver. ...

Having a standard is supposed to help you avoid having to do things like that.
DemoCoder said:
Yes, current drivers have to "parse" DX9 assembly byte codes and recreate internal compiler data structures: DAGs, register interference graphs, use-def chains or SSA, etc.

Careful, you might start advocating a return to fixed-function pipelines, and then, finally, some of us may find the time to get a life.
JohnH said:
I think you're overestimating the value of these things, e.g. NO HW in existence has specific acceleration for "Perlin noise()" or matrix transpose; the MS approach takes this into account.

Thus ensuring that these functions will never be accelerated by HW, since it would be next to impossible for drivers to take advantage of it from DX9 instruction streams. MS's intermediate representation has no notion of external linkage to "intrinsic" built-in library functions. GLSlang provides a large library of such functions and leaves it up to the IHV to determine how they will be implemented: as software approximations or emulations, or as HW.
DiGuru said:It is not about efficiency or great graphics. It is about power and money. And that explains quite neatly why Microsoft is doing what it does.
Joe DeFuria said:

DiGuru said:
It is not about efficiency or great graphics. It is about power and money. And that explains quite neatly why Microsoft is doing what it does.
Well, "duh."
That also explains quite neatly the existence of Cg.
Every one of these companies does what it thinks is ultimately in its best interest. However, that doesn't mean one company's best interest can't also coincide with the interests of consumers.
JohnH said:
Regarding your objection to 1, two things: how much more responsive are the IHVs than MS at fixing "critical" bugs? And second, can you give examples where there has been such a bug in the DX9 HLSL that wasn't fixed before it was found?

JohnH said:
2 is a very good thing, it ensures consistency; HLSL should not be extended by changes in syntax, only by additional intrinsics which should still follow the defined syntax. Extensions are a bit of a funny one really: as an IHV I think they're very useful, however as a programmer and part-time graphics fiddler I can't see how they don't lead to fragmentation of the API.

JohnH said:
3 doesn't suck, it's very useful to have at least half a chance of being able to run on a different piece of HW that claims support for a specific profile; how does this work in GLSL (I did ask this question in my post)? I also think it's very easy for people to blame the effect of poorly written drivers on the mismatch between the intermediate formats and HW internals; the reality is that the currently provided profiles aren't, or rather don't need to be, the problem here. And I'm saying this based on experience within drivers, so it's not just speculation.

JohnH said:
Not sure that this argument has any real bearing on this discussion; given a defined programming interface you're free to stick whatever you like on top of it, but that would be completely missing the point of having a standard. Take this to the extreme: if I wanted, I could add extensions to OGL that completely replace the functionality of the whole API in a manner that I consider "better", but then this would be defeating the point of having a standard API.

JohnH said:
Tell me how you guarantee that something written for GLSlang and tested on one driver/HW is guaranteed to work on any piece of HW in the field?
JohnH said:Careful, you might start advocating a return to fixed function pipelines, and then, finally, some of us may find the time to get a life
DemoCoder said:

JohnH said:
Careful, you might start advocating a return to fixed-function pipelines, and then, finally, some of us may find the time to get a life.

Err, maybe I was having a small joke!!
DemoCoder said:
I don't think it's the same thing. Many of the complex PS instructions, like DP4, could in fact be macros that get translated into multiple micro-instructions on some architectures. But MS didn't make DP4 a macro, they made it an intrinsic. Now the driver can decide how DP4 is implemented, since the driver actually receives the byte code for a DP4. Normalize can be implemented as either N * rsq(N.N) (and RSQ is, in fact, another compound instruction), or with a Newton-Raphson approximation, or using a HW unit. But DX9 doesn't specify that normalize makes it into the driver layer. It gets expanded by the compiler and, I think, by the assembler as well. Ditto for POW/EXP/SAT/etc.
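A minimal sketch of the normalize case being described; the assembly is the standard rsq-based expansion, with register choices illustrative:

    HLSL source:
        float3 Nn = normalize(N);

    What the driver actually receives (the already-expanded form):
        dp3 r0.w, r0, r0        ; N.N
        rsq r0.w, r0.w          ; 1 / sqrt(N.N)
        mul r0.xyz, r0, r0.w    ; N * rsq(N.N)

A hardware normalizer (or a Newton-Raphson variant) can only be used if the driver pattern-matches those three instructions back into "normalize".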
There are many future "intrinsic" instructions which could be added: noise, transpose, smoothstep, matrix multiplication, reflect, FFT/DCT, pack/unpack, etc.
Even if these aren't hardware accelerated, DirectX's expansion of these into further DX9 code is not necessarily the most optimal implementation for all platforms. The DX9 compiler assumes, for example, that the best way to implement SINCOS is to expand it into a power series eating up 8 instruction slots, which is definitely not the optimal implementation on the NV30, which has HW support for SINCOS.
What's the NV30 driver to do? Try to "recognize" a sequence of 8 instructions and assume it's a SINCOS?
Adding support for high-level function intrinsics isn't a return to fixed function, since there is usually an adequate software emulation of these functions (e.g. SINCOS); it's simply giving the driver the opportunity to detect and replace a SINCOS() call with native HW instead of letting the silicon sit idle and burning up regular vector unit slots.
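To make the SINCOS case concrete, a hedged sketch of the vs_2_0 path (register numbers illustrative, coefficient values omitted):

    ; vs_2_0 source form: SINCOS is a macro, not a native op, and needs two
    ; constant registers pre-loaded with the series coefficients
    def c10, ...                    ; D3DSINCOSCONST1 coefficients (values omitted)
    def c11, ...                    ; D3DSINCOSCONST2 coefficients (values omitted)
    sincos r0.xy, r1.w, c10, c11    ; r0.x = cos(r1.w), r0.y = sin(r1.w)

    ; the macro counts as 8 instruction slots of mul/mad series evaluation,
    ; and that expansion is what the driver receives, even on HW where
    ; SIN/COS are single native instructions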