DemoCoder said:
demalion said:
It is actually better than the "problem" we have right now for implementing game performance improvements, because targeting the "LLSL" and the "90% of the work of glslang" you propose for IHVs is not statically linked. Your self-contradiction is mind boggling, as is this bogeyman of "DX HLSL will make you have to download patches every month!".
Sorry, I don't see the contradiction. I propose making 100% of the compiler dynamic and under the control of the IHV. The fact that you can achieve partial dynamism today is not a self-contradiction.
On the one hand, you state that HLSL requires statically linked upgrades to "all your programs" to get improvements when you download a new driver.
On the other hand, you state that IHVs have already done "90%" of the work necessary for implementing a glslang compiler by implementing the driver side optimizer for DX "LLSL".
If they've done "90%" of the work for their driver-side optimizer, why do you have to "download patches for all your programs" to get the performance improvements from the driver? I have the same question with other numbers that are less than "90%", but not "0%".
In fact, you can achieve 100% dynamism if you ship FXC with your game and invoke it via D3DXCompileShader*
Yes, I remember observing that in the last discussion, among other things. Are you proposing this explains something you're trying to say, or just bringing it up? What does that asterisk mean?
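Whatever the asterisk is meant to cover, for anyone following along, here is roughly what "shipping FXC with your game" amounts to in practice, assuming the D3DX9 interface is what's meant; the shader source, entry point, and flags below are purely illustrative:

    // Rough sketch only: compiling HLSL at run time through D3DX instead of
    // shipping pre-built bytecode. The profile string is the per-target knob.
    #include <string.h>
    #include <d3dx9.h>

    LPD3DXBUFFER CompileAtRuntime(const char* src, const char* entry, const char* profile)
    {
        LPD3DXBUFFER bytecode = NULL;
        LPD3DXBUFFER errors   = NULL;

        HRESULT hr = D3DXCompileShader(
            src, (UINT)strlen(src),
            NULL, NULL,      // no #defines, no #include handler
            entry,           // e.g. "main"
            profile,         // e.g. "ps_2_0" or "ps_2_a"
            0,               // compile flags
            &bytecode, &errors, NULL);

        if (FAILED(hr))
        {
            if (errors) errors->Release();   // errors buffer holds the compiler output
            return NULL;
        }
        if (errors) errors->Release();
        return bytecode;                     // hand this to CreatePixelShader()
    }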
But of course, it still prevents the poor IHV from making decisions early in the compile phase. Despite what you think, and as I have explained in the past, compiler front-ends share a lot of code in common, but that is not in contradiction with the fact that early-phase optimizations are still a win. The code that implements the high-level optimizations may be the same, but the heuristic which determines whether the optimization gets performed or not is architecture-specific.
Hey, wouldn't it be neat if you had different "thingies" for changing compiler behavior, based on introducing such architecture specific heuristics as necessary? You could call them "profiles" or something, and then the question becomes whether the "poor IHVs" or MS would find optimizations in a more timely fashion. Pity it took so long to discuss that, as this was pointed out in my initial post listing the factors.
Go look up how inlining and cache collision optimization algorithms work, for example. (Oh please, spare me the "oh, but they are not the same, those are CPU algorithms" anti-analogy Demalion response. Go look up register allocators then.)
Cache collision reminds me of some of the principles discussed in Nick's vertex buffer thread, which would seem to be most directly relevant to texture-load-type instructions if I understand you correctly. We discussed inlining and register allocation in our last discussion, though at the time you said you didn't feel like continuing it, even though I found it interesting...has that changed?
OK, so how do these things answer any of the points I brought up?
For example, all the code for parsing, type checking, semantic analysis, and even optimizations can be shared. However, which optimizations are switched on and how they are chosen is not always a decision which can be made STATICALLY and GLOBALLY based on some idea of a "profile" that applies to an entire category of hardware.
What is the connection between the above and this conclusion? The problem with your analogies is they fail to establish it.
"Not always" consists of "sometimes, and sometimes not". What is the "sometimes not" that "WILL" cause HLSL performance to be lacking?
(sorry, can't even parse what you are trying to say)
Do you mean "statically linked"? For my usage, "build time" inclusion is "static linking" and "run time" inclusion is the result of "dynamic linking".
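To illustrate the distinction as I'm using it, a minimal sketch of "run time" inclusion, with a hypothetical DLL and export name standing in for whatever compiler module is being shipped:

    // Illustrative only: the compiler arrives as a separately updatable module
    // loaded at run time, rather than being baked into the executable at build time.
    #include <windows.h>

    typedef int (*CompileShaderFn)(const char* source);   // hypothetical signature

    CompileShaderFn LoadRuntimeCompiler()
    {
        HMODULE module = LoadLibraryA("shader_compiler.dll");            // hypothetical name
        if (!module)
            return NULL;
        return (CompileShaderFn)GetProcAddress(module, "CompileShader"); // hypothetical export
    }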
That's "not a good job" because their hardware is deficient and requires lots of work to run well at all.
No. Just because they created an architecture that is register usage sensitive doesn't excuse the fact that a compiler could be written to take advantage of the strengths.
DC, stuff like this just doesn't make sense.
You proposed that IHVs did a "bad job". I provided a reason why that seems a valid characterization for one particular IHV's compiler: because their hardware needed it badly. This relates to Cg as well, which was a compiler written to take advantage of the strengths of one particular piece of hardware, and the IHV failed at it in comparison to HLSL.
The problem was your insisting that all IHVs did a bad job with HLSL as a premise for proposing that IHVs will be universally better served by glslang if they simply weren't "crappy". It doesn't make sense to then go off on a rant about how I'm being ludicrous for accusing nVidia of doing a bad job when I'm just pointing out the difference between nVidia and other IHVs in the context of your assertion, and that the distinction is associated with demonstrably bad decisions.
You may as well complain that since standard compiled C or Fortran doesn't run well on massively parallel supercomputers, it is a fault of the hardware, and not the fault of the standard compiler not being aware of how to parallelize.
Likewise, it was Intel's fault that they designed "deficient" HW since taking advantage of SSE2 automatically requires a lot of work to run well.
Hmm...well, let's see why I have such a problem with certain analogies: if a massively parallel supercomputer offered less performance than a single CPU when running "standard compiled C", and we ignored the "LLSL" compilation issue completely for the moment, your analogy would have some relation to correcting my characterization of "deficient hardware". However, I daresay the hardware would rather clearly still be "deficient", even if called a "supercomputer", if code optimized for it still compared unfavorably for performance to the alternative. Which is the situation we were actually talking about when I mentioned "deficient" hardware.
Or it could be that I'm against all analogies, and not just ones used to change the conversation direction at points inconvenient for someone who doesn't want to discuss problems with what they said. I don't happen to think so, though.
BTW, as far as your second analogy, it would be more like comparing 3dnow! to SSE, I think, at least as far as getting rid of the obvious problems.
And, I ask again, what is the baseline of optimization, conflicts, and bugs that this will establish? It is not a complicated question, it is just an unanswered one. If you have an answer, please share.
I don't know, but shared source is all over the place, in applications, compilers, MS's operating system, etc. Either you'll be stung by the bugs in MS's FXC (shared by everyone), or you'll be stung by bugs in some open source DDK for OGL2.0 driver writers.
This looks familiar. Almost like I discussed something similar before. If I ask you to read my initial discussion of the comparison to the approaches again, or otherwise point out that this was already a topic of my conversation before your analogies steered away from it, will this illustrate the problem with them more clearly?
FXC offers no advantages, unless you somehow think that a CLOSED SOURCE compiler written by Microsoft will have fewer bugs than an open-source DDK that has thousands of developers examining it and building their own compilers.
I invite you to address the elements of my initial post, where I itemize and list something other than "no advantages", in some way besides proposing an MCD/ICD parallel as a substitute for actually discussing them. That way, when you go on to propose there are "no advantages" as a premise of your argument again, analogies like those you've used might be something other than simply a mechanism for asserting your viewpoint in a vacuum of dissenting discussion.
This concept isn't complex, just wholly ignored and sacrificed on the altar of your arguing from the standpoint of "I'm right, why should I bother to actually pay attention to what you say?".
Will DX HLSL and the profile system stop evolving? Or is it simply that you propose it will need to evolve "every month" for some reason?
So do you propose that profiles be created for each and every chipset in existence, and have them all maintained by Microsoft and shipped in DX9.0 updates (which is how new FXC versions get distributed)?
No, and though I'm disappointed that you've forgotten the answers I gave back in July when we discussed this last, I am relatively glad you asked here (because it isn't a flawed analogy, and actually deals with the content of my commentary...can we keep this up, please?).
I propose that more profiles be created as necessary to reflect an LLSL expression suitable for "intermediate->GPU opcode" optimization by IHVs. How many will that be? Well, that depends on the suitability of the LLSL and the profile characteristics itemized at that time.
Your alternative proposes that a new and unique type of compilation paradigm will be 1) introduced every month and 2) unable to be handled by a back-end optimizer working from the "shorter instruction count" type of intermediary, or whatever other characteristics prior profiles establish.
1) seems absolutely ludicrous, and I think I've specified why.
2) depends on how many different approaches are needed for what IHVs do. We have two at the moment, one being "shorter operation itemization count, and executing within the temp register requirement without significant slowdown". I think this is a pretty generally useful intermediate guideline that more than one IHV should be able to utilize. You seem to propose that every single IHV product introduction will be an exception to it, such that a new set of guidelines will be required in accordance with 1).
So if Nvidia, ATI, 3dLabs, et al want to ship a new FXC with an updated profile to correct some optimizations, they will have to wait until Microsoft wants to ship another FXC update?
Maybe if you stop going on about "every month" we can discuss that possibility in a reasonable fashion. If you can do that, please note that I've provided an answer when I discussed the advantages and disadvantages initially. For future reference, please don't have a discussion with someone who thinks HLSL is perfect and call them by my handle.
Along the lines of my prior discussion, I propose that IHVs should already be communicating with MS about taking these matters into account, and their profile decisions should reflect such considerations. If they don't, glslang will have an opportunity to display a significant competitive advantage, if it overcomes the challenges it faces while DX HLSL is at a disadvantage, and IHVs find implementing a custom compiler less problematic than communicating with MS.
What about that was unclear?
The set of advantages and challenges in the respective comparisons are not the same, and the end result is not pre-ordained by substituting one set for the other.
Classic Demalion.
Please stop arguing against my handle/name, and argue against my discussion.
The response to any analogy is to claim "but they're not exactly the same". Carmack and 50+ other developers signed a petition to have MS ship the MCD devkit, because they were afraid that OpenGL drivers were too complex to implement.
Unless the petition involved the details of the issues about the challenges for compilation, replacing a discussion with this analogy still seems problematic for actually evaluating the current situation. If you weren't busy throwing around "typical Demalion", perhaps you'd have noticed that this is what I'd just said the first time.
...petition...
This is exactly the argument that others have posted in this forum: OpenGL2.0 drivers that have a full compiler embedded in them will be more complex, more buggy, more incomplete, etc. Hence, the analogy between the cries that ICDs were too difficult.
Yes, and that is a superficial analogy, because whether people are arguing about one approach being too complex or prone to bugs compared to another is not the issue; the issue is whether it actually will be, and why. That's why I deride your insistence on it, and try to encourage discussions that don't depend on its substitution. How is my initial observation about the analogy incorrect?
Now, you can cry about how writing a compiler and writing an ICD are two entirely different things, but that's just you playing critic about something you have no idea about.
So writing an ICD and writing a compiler are the same thing, then? I'm missing something...how does accusing me again change the relationship between the two things?
Personally, as someone who has implemented many compilers, I find compilers MUCH easier to write than writing a large device driver like OpenGL ICD.
How about writing a compiler that has to interact directly with all the elements of a large device driver like the OpenGL ICD?
Compiler writing has a large body of study behind it. There are thousands of papers and hundreds of books on it. It is taught to undergraduate computer science majors. It is a well understood task, and you can find many people who know how to do it.
On the other hand, writing device drivers is not an academic exercise. It is an exercise in engineering based on specific experience, a black art that few people have experience in. There are far more people who can write compilers than people who can do OpenGL guy's job. He's got it much tougher.
What are the challenges in this "black art"? I do think maintaining API compliance with complex interaction is part of it. Which actually seems to point out that a compiler in the driver increases the complexity of the driver-writing challenge, and doesn't simply replace it with the "ease" of the challenge of writing a compiler, and actually seems to support the things you accuse people of "crying" about.
However, I understand the separate relevance to the proposition that writing a compiler as optimal as Microsoft's should be easy, at least as far as a standard compiler starting point goes. First, this does not seem to have been demonstrated so far, though we seem to have agreed that this was a "bad job" by nVidia. What I continue to not see, as I specified initially, is how a compiler can both resolve such conflict issues and be freely and easily changed to suit individual architectures more than HLSL can, without re-introducing those issues as challenges. Hey, maybe we can actually discuss that now?
The problem with OpenGL2.0 isn't the compiler, it's all the other stuff they added IMHO.
Err...maybe "all the other stuff" is related to what I'm talking about? How is it that ARB decisions are self-evidently right when you agree with them, and irrelevant when you view them as a problem?
In isolation, I think this comment has merit, and I think I agree with it. However, I don't think it is a universal view, or one that validly precludes viewing DX, in its current incarnation, as easier.
OpenGL is preferred by the majority of game developers.
Hmm...OK, this doesn't seem to be demonstrated at the moment. This seems to remind me more of comments related to DX 7 and earlier rather than DX 9. However, I'm glad if this is actually so, as OpenGL needs factors like this to counter the effect of trailing DX HLSL as much as it has.
It is easier to use, easier to understand, and self-consistent. DirectX is a mishmash of crappy Windowisms, and lots of legacy holdovers from 9 different versions of the API. While DirectX9 fixed a lot of stuff, it still takes way fewer lines of OpenGL code to accomplish something, and the code is easier to understand. The only reason DX still exists and wasn't killed is because of MS.
Yeah, monopolies suck that way, eh...?
I think MS has evolved DX in response to OpenGL's strengths. I really hope glslang makes a strong showing, as it is important that such an alternative continue to apply pressure in this regard if DX evolution isn't just going to stagnate and ossify. Similarly, I believe OpenGL's focus on one shader expression instead of just continuing the extension system was hastened to respond to the centralized DX implementation methodology and the advantages it offers for developers and ease of implementation for various hardware.
Just a few years ago, game developers were petitioning MS to drop DX and throw all effort into OGL.
Err...a lot changes in a "few years", so I don't quite get the relevance of this. However, if we're still talking about the first part of my statement, where I agreed, and not the second, that doesn't matter much.
Do I understand you correctly to be proposing somehow that DX HLSL is simply not an "abstraction", so is completely denied this "power of abstraction"?
DX is an abstraction, but it still presents a view of the HW as pointers to HW memory structures, even if that is not really the case; it's initialization heavy, and is based on passing around C structures. OpenGL treats the gfx HW as an abstract state machine. OpenGL objects are referenced by handles instead of memory pointers. OpenGL is procedural, and you build up the "scene" by streaming instructions to the hardware. DirectX is based on creating objects and structures, setting up a bunch of struct fields, and then calling methods to send these structs to the HW. It's just much harder and more annoying to use. Hundreds of lines of initialization code.
From discussions of exactly this that I recall here, I don't think this view is quite universal, though I do tend to agree with the gist of your opinion.
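Since the gist came up, a rough side-by-side of the two styles being described, with error handling omitted and the particular calls and parameters chosen only for illustration:

    // Illustrative contrast only; not complete programs.
    #include <windows.h>
    #include <d3d9.h>
    #include <GL/gl.h>

    // Direct3D 9 style: fill out a structure, then pass it to a creation method.
    void CreateD3DDevice(IDirect3D9* d3d, HWND hwnd, IDirect3DDevice9** device)
    {
        D3DPRESENT_PARAMETERS pp;
        ZeroMemory(&pp, sizeof(pp));
        pp.Windowed         = TRUE;
        pp.SwapEffect       = D3DSWAPEFFECT_DISCARD;
        pp.BackBufferFormat = D3DFMT_UNKNOWN;

        d3d->CreateDevice(D3DADAPTER_DEFAULT, D3DDEVTYPE_HAL, hwnd,
                          D3DCREATE_HARDWARE_VERTEXPROCESSING, &pp, device);
    }

    // OpenGL style: drive an abstract state machine with procedure calls;
    // objects are referenced by integer handles rather than memory pointers.
    GLuint CreateGLTexture()
    {
        GLuint tex = 0;
        glGenTextures(1, &tex);
        glBindTexture(GL_TEXTURE_2D, tex);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
        return tex;
    }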
This seems a pretty ludicrous precept, given, among other things, that Microsoft has met the requirement of "deciding" as well.
Microsoft is pragmatic. Rather than spend time architecting the most elegant solution, they pick the low-hanging fruit and ship it early. This is the case for all their software: IE 1.0/2.0, Windows 1.0/2.0/3.0, DirectX 1.0/2.0/3.0/..., Windows CE 1.0/2.0/3.0/... (PocketPC, etc.).
Yes, but you're answering why MS has an advantage, not addressing what I actually had a problem with: "You seem to be proposing that once you decide on something, achieving success with it, compared to what someone else decided to work toward, is a given?"
That's why Microsoft software takes on average 4-5 revisions before people acknowledge it as stable and of useful quality.
Well, what if the competition doesn't even show up until after that occurs? Hopefully the programming advantages you mention will work to counter however much of that eventuality manifests.
If you look at the discussions that go on at the ARB in the meeting notes, and compare them to the internal DX9 beta program, it is clear that Microsoft's prerogatives are not to design a clean, elegant solution, but to get something out as quickly as possible to get the early market lead.
Not sure things are quite that simple, as DX seems to have evolved to be cleaner as it has progressed...so that would appear to be part of what MS is considering too. Yay competition.
ARB members are far more concerned about getting things right and NOT DAMAGING THE API with hacks.
Well, I don't think that such concern automatically means they don't do that, unfortunately, and that still leaves me wondering whether you considered the rest of my discussion around the comment you're addressing.