DemoCoder said:
John said:
Err, but you get those instructions down at the driver; they're all part of the intermediate format, so no information is lost there (well, other than the annoying bug with sincos).
An HLSL branch, depending on the code, can be written as MIN/MAX, CMP, LRP, IF, and IF_PRED. On some HW, it is more efficient to use CMP, on some, LRP/MIN/MAX, on others, branching, and perhaps on others, predicates. The FXC compiler is forced to pick one of these, so let's say it picks CMP to implement HLSL if(cond).
A conditional should be left as a conditional, no argument there, although many of the combinations you list are _easy_ to convert back into something different (generally all but LRP).
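To make the equivalence under discussion concrete, here is a minimal Python sketch of one scalar select expressed with CMP, LRP and MIN/MAX semantics. The function names are illustrative only, and each opcode is modelled on a single float rather than a 4-component register. CMP is taken as `src0 >= 0 ? src1 : src2` and LRP as `src0 * (src1 - src2) + src2`, following the ps_2_0 instruction descriptions.

```python
def cmp_op(s0, s1, s2):
    """CMP semantics on one component: pick s1 when s0 >= 0, else s2."""
    return s1 if s0 >= 0.0 else s2


def lrp_op(s0, s1, s2):
    """LRP semantics on one component: blend s2 -> s1 by weight s0."""
    return s0 * (s1 - s2) + s2


def select_branch(cond, a, b):
    """The source-level form: a real conditional."""
    return a if cond else b


def select_cmp(x, a, b):
    """if (x >= 0) a else b, lowered to a single CMP."""
    return cmp_op(x, a, b)


def select_lrp(mask, a, b):
    """Same select lowered to LRP; only a true select when mask is 0.0 or 1.0."""
    return lrp_op(mask, a, b)


def clamp_low_branch(x):
    """if (x < 0) x = 0, written as a conditional."""
    if x < 0.0:
        x = 0.0
    return x


def clamp_low_max(x):
    """The same clamp lowered to MAX."""
    return max(x, 0.0)
```

Going from `select_cmp` back to `select_branch` is mechanical, since the comparison is still visible in the instruction. With `select_lrp` the weight could be any value, so recognising it as a former conditional needs extra knowledge that the mask is strictly 0 or 1, which may be why LRP is the awkward case.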
Well, CMP performs badly on NV30, so you are now asking their driver to take low level assembly, which has been inlined and reordered, and reverse engineer loop branches out of it.
Loop branches? Well, if you're talking about VS2.0, loops get left as loops; if you're talking about PS, then the NV30 doesn't support loops, so what is there to reverse engineer?
And IHVs are supposed to have an easier time developing DX9 drivers because of this? Why don't they just put a DECOMPILER back to SOURCE in there while they are at it?
If your HW is only really capable of supporting profile X, and that's all you attempt to expose, then it's a whole lot easier. Problems only start to arise when someone gets it wrong, and then attempts to claim that something is something that it isn't.
Your answer to everything is "profiles, profiles, and more profiles!" Despite the fact the number of profiles will have to grow quite large, Microsoft will have to maintain all of them, and it still doesn't remove the burden from the IHVs to write compilers to deal with DX9 assembly.
Profiles are a solution to a problem that you have repeatedly failed to address.
Actually it's not that heavy duty to spot these types of optimisations, but yes, it is more work than if you had higher level information available to you.
It's not that heavy duty if they remain relatively intact like I showed you above, but it will be hellishly difficult if the instructions get reordered through a scheduler, registers get packed, and some of those intermediate results are reused by the compiler.
The driver still has to create a DAG to do its own scheduling, and this will often reveal the majority of opportunities that were not obvious from simple examination. The scheduling performed by the DX9 HLSL compiler shouldn't attempt to take account of latencies (AFAIK it doesn't), as it needs to be neutral; instead it only tries to make things fit with respect to things like the dependent read chain limits defined by the profile. And before you shout about it, there is one profile that doesn't include those limits.
Isn't that a bug with the HLSL compiler? It has no impact on the intermediate format.
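As a rough illustration of the DAG point, the Python sketch below rebuilds a dependency graph from a linear, already-scheduled instruction list. The `Instr` tuple, the register names and the sample program are all hypothetical, and only true (read-after-write) dependencies are tracked; a real driver scheduler would also have to rename registers or add anti-dependencies before reordering anything.

```python
from collections import namedtuple

# Hypothetical instruction record: opcode, destination register, source registers.
Instr = namedtuple("Instr", "op dst srcs")


def build_dag(instrs):
    """Return deps, where deps[i] is the set of earlier instruction
    indices whose results instruction i reads."""
    last_writer = {}  # register name -> index of the instruction that last wrote it
    deps = []
    for i, ins in enumerate(instrs):
        deps.append({last_writer[r] for r in ins.srcs if r in last_writer})
        last_writer[ins.dst] = i
    return deps


def depths(deps):
    """Length of the longest dependency chain ending at each instruction."""
    d = []
    for ds in deps:
        d.append(1 + max((d[j] for j in ds), default=0))
    return d


# Made-up program in which r0 is reused after packing.
prog = [
    Instr("mul", "r0", ("v0", "c0")),
    Instr("add", "r1", ("v1", "c1")),
    Instr("cmp", "r2", ("r0", "r1", "c2")),
    Instr("mov", "r0", ("r2",)),
    Instr("mad", "r3", ("r0", "r2", "r1")),
]

dag = build_dag(prog)
chain = depths(dag)
```

Because each source is resolved to the instruction that last wrote it, the reuse of `r0` does not hide which value the final `mad` actually consumes, so the graph recovers the data flow regardless of the order or register assignment the upstream compiler chose.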
Yes, but I am listing the inadequacies of the whole platform. And right now, Microsoft has one poorly optimizing compiler to rule them all.
The issue is actually that the MS HLSL compiler shouldn't be attempting many optimisations, as they can/should be performed inside the driver's optimiser. This is an issue that needs addressing.
MS expands any and all DX9 macros. MS fubar's constant folding. And when will this be fixed?
All=some. When will things be fixed? Ask MS. When can I start writing GLSlang shaders that are guaranteed to work on all HW with some kind of undefined shader support (as required by the specification)?
And when people are playing HL2, will their SINCOS units be sitting idle?
Yes, probably quite idle, irrespective of their presence or the compiler used; I think you'll find that the majority of shaders in Half-Life don't use sincos.
Basically none of these issues need lead to less efficient code, but yes, it does lead to some effort in writing the driver-based compiler, I won't deny it.
^^^^^^^^^^^^^^^^^^^^^^
Are you listening Joe Defuria?
Hmm, yes.
Of course you're completely ignoring the HUGE difficulties created by having to support all features in GLSlang if you expose the extensions. This will only go away when HW that fully supports it is available, and then you're still left with what to do for the huge pile of legacy HW out there.
John