Humus said:
1, 3, 4 have been discussed to death in several lengthy threads already. I and many others don't believe 1 is much of a problem, 4 is easily solved, and 3 isn't anything new (buglessness has never been a guarantee). 2 is wrong, unless you know some inconsistency in the spec?

I have to agree that I've probably overstated 4, although it probably isn't as insignificant as some would claim. 2 is a result of the way certain IHVs attempt to control the market; I wouldn't put it past them to do things subtly differently and then just use the excuse that they are "the industry standard" so must be correct, but hey, that's just me being paranoid. I think 1 may be a problem, but to be honest I'd admit that the driver has to do a level of parsing of the tokenised format it gets from DX, which is also going to be subject to some bugs, so score even on this.

Humus said:
1 also means that bugs can completely halt your development or force you to ditch certain visual attributes for many months until MS update their runtime. 2 is a bad thing, extensibility is desirable. 3 sucks: targets other than the hardware will be suboptimal, nor does it guarantee or make it significantly more likely (or at all) that it will run correctly on all hardware.

Regarding your objection to 1, two things: how much more responsive are the IHVs than MS at fixing "critical" bugs? And second, can you give examples where there has been such a bug in the DX9 HLSL that wasn't fixed before it was found?
Humus said:
The only way to get a good enough intermediate format would be if it's completely free from restrictions. Basically, to get around all limitations you will end up with a MS compiler that does nothing but parsing.

Hmm, parsing, tokenising, expression evaluation, etc: I'm really not interested in having to do these. It takes a fair amount of work off my hands if I'm given an intermediate format that uses simple ops, preserves flow-control constructs and makes no assumptions about temps (the last one probably only relevant on HW that won't be available for 1-2 years).
DemoCoder said:
Parsing is trivial. It's a done deal. It took me approximately 3 hours to take a C++ grammar and modify it to parse Cg and GLSlang. ARB already provides a YACC grammar for GLSlang that is likely to be used by implementors. This particular issue is not worrisome.

As I've said already, I have probably overstated this issue.
DemoCoder said:
As for 3rd party syntax extensions, this is far more likely with a public intermediate representation, since anyone can write a compiler front-end in any language they want to generate the intermediate representation. NVidia could provide a HLSL that is "ML like" (functional) and ATI could provide one that is XML-ish.

Not sure that this argument has any real bearing on this discussion; given a defined programming interface you're free to stick whatever you like on top of it, but that would be completely missing the point of having a standard. Take this to the extreme: if I wanted, I could add extensions to OGL that completely replace the functionality of the whole API in a manner that I consider "better", but then this would be defeating the point of having a standard API.
DemoCoder said:
Assumes parser bugs are the biggest issue with compilers; they aren't. Parsers are generated from automated tools which take LL or LR grammars, and as long as your grammar is correct, your parser will be, unless your YACC/PCCTS/ANTLR/CUP/etc generator is bugged.

Yes, I overstated this.
DemoCoder said:
Secondly, the syntax isn't fixed, because now multiple front-end parsers can generate the intermediate representation.

No, the syntax of MS HLSL is fixed; if you choose to use a third-party front end then you're not using MS HLSL.
DemoCoder said:
Third, parsing is the smallest part of the work in a compiler, so the fact that you've removed parsing out of the pipeline is irrelevant.

Yes, generally true, except I think you should look at the length of many of the shaders that are being pushed through these things, i.e. short, and often shorter in intermediate form than in HLSL form; in these cases the parsing IS the significant part of compilation. Obviously this becomes less of an issue as shaders get longer.
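To make that point concrete, here is a hedged sketch of the kind of shader being discussed; the assembly shown is illustrative of the general shape of ps_2_0 output, not actual fxc output:

    // HLSL source: barely more text than the assembly it produces,
    // so the front-end (parsing) is a real fraction of the total work
    sampler2D s : register(s0);
    float4 tint : register(c0);

    float4 main(float2 uv : TEXCOORD0) : COLOR
    {
        return tex2D(s, uv) * tint;
    }

    // Roughly the ps_2_0 form the driver receives:
    //   ps_2_0
    //   dcl t0.xy
    //   dcl_2d s0
    //   texld r0, t0, s0
    //   mul r0, r0, c0
    //   mov oC0, r0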
DemoCoder said:
The fact is, Microsoft's so-called "intermediate representation" (DX9 assembly) cannot model many of the semantics that you may want to pass to the hardware for acceleration. How do you represent a Perlin noise() function, or a matrix transpose? Even normalization isn't passed to the HW, so if HW had a hardware normalizer built in, it becomes much more work (remember, minimal work necessary?) for the driver to detect and optimize it. The limitations exposed in their intermediate representation deny many optimizations that could be possible on more powerful hardware (loop/funcall limits, register limits, no type information at all!)

I think you're overestimating the value of these things, e.g. NO HW in existence has specific acceleration for "Perlin noise()" or matrix transpose; the MS approach takes this into account, you have to supply a function to do it. This is unlikely to change given the insistence that HW needs to be programmable to be useful now.
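As a hedged aside, here is what "you have to supply a function" looks like in practice; the function name and hash constants below are illustrative, not from any spec or HW:

    // DX9 HLSL has no noise() intrinsic: the author supplies one, and the
    // compiler lowers it to plain arithmetic before the driver ever sees it.
    // (hash-based stand-in for illustration, not real Perlin noise)
    float mynoise(float x)
    {
        return frac(sin(x * 12.9898) * 43758.5453) * 2 - 1; // pseudo-random in [-1, 1)
    }

    // GLSlang instead defines noise1() as a built-in:
    //     float n = noise1(x);   // IHV's choice: software approximation or HW
    // which is exactly the kind of semantic the DX9 instruction stream can't carry.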
The current IL provided may not be perfect, but my own experience working at the driver level has shown that it is NOT a significant bar to optimisation.

DemoCoder said:
Well, there's the rub. MS's "intermediate language" is nothing like the intermediate representations used in most compilers, be it Java, .NET CLR, C compilers, etc. Most compiler IR reps use either tree IR or 3/4-address code, and do not use actual register names, but use autogenerated labels which do not get assigned registers until near the end.

You need to get away from what "could have been", and deal with what the reality is: the reality is, DX9 VS/PS 2.0/3.0 do not have the representational power that other intermediate representations have, and won't allow the kinds of optimizations that GLSlang's model will.

If Microsoft had an intermediate representation for 3D that was more like .NET CLR, and less like some low-level assembly syntax, I might be tempted to agree with you. About a year ago, I was arguing on this board that MS should provide an interface whereby you can plug in source code and get back a parse tree or IR rep, and thereby allow the driver to do its work based on higher-level information, instead of having to translate DX9 assembly.

But since all of your claimed benefits don't actually show up unless MS rethinks the way they're currently doing things, talking about how a hypothetically rich IR with separate parsing would be ideal is moot, since the reality is: DX9 FXC + assembly vs OGL2.0 HLSL.
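For readers without a compiler background, a minimal sketch of the difference being argued here; both snippets are illustrative, not taken from any real toolchain:

    ; DX9 assembly: operands are concrete register names, i.e. MS's compiler
    ; has already done register allocation before the driver sees the program
    mul r0, r1, c0
    add r0, r0, r2

    ; a conventional compiler IR (3-address form) keeps virtual names, so the
    ; backend (here, the driver) is free to allocate registers for its own HW
    t1 = mul v_in1, const0
    t2 = add t1, v_in2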
Chalnoth said:
Let me get this straight, JohnH.

JohnH said:
IHVs _cannot_ tweak the syntax of the HLSL as they do not have access to it. They also can't tweak the intermediate format as it won't get past validation.
You like DX9 HLSL because IHV's can't cheat on it, because IHV's have in the past "modified" DX9 shaders?
Think about that for a second.
Which means the "cheat" angle is a non-issue here as it can be done regardless of compiler used.JohnH said:This has nothing to do with intermediat vs direct compilation as being discussed here, instead it has everything to do with someone make a bad descision during driver development.
John.
JohnH said:
Many of the claimed benefits do exist. Tell me how you guarantee that something written for GLSlang and tested on one driver/HW is guaranteed to work on any piece of HW in the field? The profiles used for the intermediate target are a good step to fixing this; they need beefing up a bit by inclusion of a few "external" caps, but as I said to Humus, the 3.0 profile does this.

One of the easiest ways to solve this would be that each IHV offers a download of small "validation tools" that share their code with the shader validation mechanism in the driver. Then a developer doesn't need the hardware to know whether it will run or not. It has the implicit assumption that upcoming hardware is at least as capable as its predecessor, but I don't think that's a problem.
Xmas said:
One of the easiest ways to solve this would be that each IHV offers a download of small "validation tools" that share their code with the shader validation mechanism in the driver. ...

Having a standard is supposed to help you avoid having to do things like that.
DemoCoder said:
Yes, current drivers have to "parse" DX9 assembly byte codes and recreate internal compiler data structures: DAGs, register interference graphs, use-def chains or SSA, etc.

Careful, you might start advocating a return to fixed-function pipelines, and then, finally, some of us may find the time to get a life.
JohnH said:
I think you're overestimating the value of these things, e.g. NO HW in existence has specific acceleration for "Perlin noise()" or matrix transpose; the MS approach takes this into account.

Thus ensuring that these functions will never be accelerated by HW, since it would be next to impossible for drivers to take advantage of it from DX9 instruction streams. MS's intermediate representation has no notion of external linkage to "intrinsic" built-in library functions. GLSlang provides a large library of such functions and leaves it up to the IHV to determine how they will be implemented: as software approximations or emulations, or as HW.
DiGuru said:It is not about efficiency or great graphics. It is about power and money. And that explains quite neatly why Microsoft is doing what it does.
Joe DeFuria said:

DiGuru said:
It is not about efficiency or great graphics. It is about power and money. And that explains quite neatly why Microsoft is doing what it does.
Well, "duh."
That also explains quite neatly the existence of Cg.
Every one of these companies does what it thinks is ultimately in its best interest. However, that doesn't mean one company's best interest can't also coincide with the interests of consumers.
JohnH said:
Regarding your objection to 1, two things: how much more responsive are the IHVs than MS at fixing "critical" bugs? And second, can you give examples where there has been such a bug in the DX9 HLSL that wasn't fixed before it was found?

JohnH said:
2 is a very good thing, it ensures consistency; HLSL should not be extended by changes in syntax, only by additional intrinsics which should still follow the defined syntax. Extensions are a bit of a funny one really: as an IHV I think they're very useful, however as a programmer and part-time graphics fiddler I can't see how they don't lead to fragmentation of the API.

JohnH said:
3 doesn't suck, it's very useful to have at least half a chance of being able to run on a different piece of HW that claims support for a specific profile; how does this work in GLSL (I did ask this question in my post)? I also think it's very easy for people to blame the effect of poorly written drivers on the mismatch between the intermediate formats and HW internals; the reality is that the currently provided profiles aren't, or rather don't need to be, the problem here. And I'm saying this based on experience within drivers, so it's not just speculation.

JohnH said:
Not sure that this argument has any real bearing on this discussion; given a defined programming interface you're free to stick whatever you like on top of it, but that would be completely missing the point of having a standard. Take this to the extreme: if I wanted, I could add extensions to OGL that completely replace the functionality of the whole API in a manner that I consider "better", but then this would be defeating the point of having a standard API.

JohnH said:
Tell me how you guarantee that something written for GLSlang and tested on one driver/HW is guaranteed to work on any piece of HW in the field?
JohnH said:Careful, you might start advocating a return to fixed function pipelines, and then, finally, some of us may find the time to get a life
DemoCoder said:

JohnH said:
Careful, you might start advocating a return to fixed-function pipelines, and then, finally, some of us may find the time to get a life.

Err, maybe I was having a small joke!!
DemoCoder said:
I don't think it's the same thing. Many of the complex PS instructions, like DP4, could in fact be macros that get translated into multiple micro-instructions on some architectures. But MS didn't make DP4 a macro, they made it an intrinsic. Now the driver can decide how DP4 is implemented, since the driver actually receives the byte code for a DP4. Normalize can be implemented as either N * rsq(N.N) (and RSQ is, in fact, another compound instruction), or with a Newton-Raphson approximation, or using a HW unit. But DX9 doesn't specify that normalize makes it into the driver layer. It gets expanded by the compiler and, I think, by the assembler as well. Ditto for POW/EXP/SAT/etc.
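A minimal sketch of the normalize case being described; the assembly is the standard rsq-based expansion, with register choices illustrative:

    HLSL source:
        float3 Nn = normalize(N);

    What the driver actually receives (the already-expanded form):
        dp3 r0.w, r0, r0        ; N.N
        rsq r0.w, r0.w          ; 1 / sqrt(N.N)
        mul r0.xyz, r0, r0.w    ; N * rsq(N.N)

A hardware normalizer (or a Newton-Raphson variant) can only be used if the driver pattern-matches those three instructions back into "normalize".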
There are many future "intrinsic" instructions which could be added: noise, transpose, smoothstep, matrix multiplication, reflect, FFT/DCT, pack/unpack, etc.
Even if these aren't hardware accelerated, DirectX's expansion of these into further DX9 code is not necessarily the most optimal implementation for all platforms. The DX9 compiler assumes, for example, that the best way to implement SINCOS is to expand it into a power series eating up 8 instruction slots, which is definitely not the optimal implementation on the NV30, which has HW support for SINCOS.
What's the NV30 driver to do? Try to "recognize" a sequence of 8 instructions and assume it's a SINCOS?
Adding support for high-level function intrinsics isn't a return to fixed function, since there is usually an adequate software emulation of these functions (e.g. SINCOS); it's simply giving the driver the opportunity to detect and replace a SINCOS() call with native HW instead of letting the silicon sit idle and burning up regular vector unit slots.
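To make the SINCOS case concrete, a hedged sketch of the vs_2_0 path (register numbers illustrative, coefficient values omitted):

    ; vs_2_0 source form: SINCOS is a macro, not a native op, and needs two
    ; constant registers pre-loaded with the series coefficients
    def c10, ...                    ; D3DSINCOSCONST1 coefficients (values omitted)
    def c11, ...                    ; D3DSINCOSCONST2 coefficients (values omitted)
    sincos r0.xy, r1.w, c10, c11    ; r0.x = cos(r1.w), r0.y = sin(r1.w)

    ; the macro counts as 8 instruction slots of mul/mad series evaluation,
    ; and that expansion is what the driver receives, even on HW where
    ; SIN/COS are single native instructions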