DaveBaumann said:
ATI and 3DLabs will have to do this ANYWAY for DX9 HLSL and they will have to implement a backend/optimizer for DX9 HLSL.
I was under the impression that MS would do the DX9 HLSL compiler, since it would just be compiling to generic DX9 assembly. IIRC, I already mentioned what you are saying here, and the idea was there in principle, but I don't think that was how it was going to work.
MS will provide generic DX9 assembly output, but that won't get you optimized performance on all platforms. C compilers can emit "generic 80x86" output, but it takes widely varying performance hits on different architectures (486, 586, P3, P4, AMD, etc.). Modern compilers are sensitive to the number of pipelines, cache architecture, scheduling constraints (e.g. the CPU can't do a DIV and a MUL at the same time, but can do a MUL and an ADD, so instructions are rearranged to maximize ILP), branch prediction, etc.
Same goes for GPUs. A given vertex shader will execute with different performance characteristics on different hardware. For example, it may be that you can do a texture instruction in parallel with a color op, or that an RCP or SIN/COS function operates more efficiently on some platforms than others. Perhaps RCP eats up extra resources and you need to insert scalar ops into any open slots in the schedule.
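The "color op in parallel" case already exists in DX9-era pixel shader assembly as co-issue: the "+" prefix pairs a vector (RGB) instruction with a scalar (alpha) instruction in the same cycle on hardware that supports it. A rough sketch (register and constant names are illustrative):

```
; ps_1_1-style co-issue
mul r0.rgb, t0, v0       ; color op runs in the vector (RGB) unit
+add r0.a, t0.a, c0.a    ; scalar alpha op co-issued ("+" prefix)
```

Whether pairing like this is a win, and which pairings are legal, varies per chip, which is exactly why a one-size-fits-all scheduler leaves performance on the table.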
A given vertex shader can be optimized somewhat by the device driver, but there are limits, since the assembly shader is a poor intermediate representation: much of the original high-level semantics has already been lost.
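As an example of that lost semantics, consider what a generic compiler might emit for an HLSL normalize() (register numbers are illustrative):

```
; HLSL: result = normalize(v)
dp3 r0.w, r1, r1         ; v . v
rsq r0.w, r0.w           ; 1 / sqrt(v . v)
mul r0.xyz, r1, r0.w     ; v * (1 / |v|)
```

A driver that only sees this three-instruction sequence has to pattern-match the idiom before it can substitute, say, a single hardware normalize, whereas a vendor's own HLSL compiler sees the normalize() call directly and can emit the best code for its chip from the start.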
If ATI or 3DLabs really wanted the most optimal output, they'd take the HLSL and compile it themselves, performing high-level optimizations tuned to the known characteristics of their hardware.
My position has always been:
1) There will be a "catch all" generic compiler for all hardware. (both Cg and MS D3DX will provide this)
2) The "catch all" won't provide the best performance
3) Vendors wishing to eke out better performance are going to implement their own HLSL compiler
4) NVidia has done so for NV30 (to the disdain of the ati *boy consortium)
5) If ATI/3DLabs/et al want top performance, they should do so as well
6) Reinventing the wheel is a bad thing. That's why "unified drivers" exist: a lot of the significant work in writing an OpenGL ICD is reusable. The same goes for a compiler.
7) There exist open-source compilers these companies can use to get themselves started (GCC, NVidia's Cg, and many more)
As GPUs become general-purpose computation devices, the SAME trends we saw with CPUs and compilers will apply to GPUs. Computing started with fixed-function devices (ENIAC, Colossus, etc.), graduated to assembly language (punch cards), and then moved to high-level languages; GPUs are now just reaching their first high-level languages. Over the next decade you will see many, many HLSLs for GPUs, not just one. And you will see many, many compilers and IDEs.
Those hoping for a single programming language for the GPU are going to be in for a rude awakening.