MfA said:
C isn't all that high level, and something with fully virtualized resources can't really be termed low level. They will probably never allow custom functions ... but the only other fundamental difference will just be that variables have numbers instead of names, AFAICS.
Well, you can view just about any instruction set as "virtualized" if you want to. That's essentially how Intel and AMD view the x86 instruction set these days. There are also Java chips that run Java bytecode in HW.
On the other hand, no one will deny that the x86 "virtual" instruction set causes problems, both by restricting CPU architecture and by making compilers harder to write. For example, an assembly language programmer can typically beat an x86 compiler by 50%, but on a SPARC or StrongARM, human and machine will come within 10% of each other.
The choice of intermediate representation is important, not just for freeing HW manufacturers from a specific implementation, but also for making compilers easier to write. GCC is a classic example of this problem: the compiler's machine-independent passes produce substandard code on the x86.
The problem is that although the PS is virtualized, the DX compiler is making choices about instruction selection, ordering, and register labeling that obscure the original source expressions and make it harder for the driver to "recognize" patterns of instructions.
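To make this concrete, here is a hedged sketch of the kind of pattern recognition a driver might attempt. The mini instruction format, register names, and the "fuse mul+add into mad" peephole are all invented for illustration, not real DX assembly or any actual driver's logic:

```python
# Hypothetical driver-side peephole: fuse a multiply whose result feeds
# the immediately following add into a single "mad" instruction.
# Instructions are (opcode, dst, src_a, src_b) tuples; names are made up.

def find_mad_pairs(program):
    """Return indices where a mul's result feeds the very next add."""
    pairs = []
    for i in range(len(program) - 1):
        op1, dst1, _, _ = program[i]
        op2, _, src2a, src2b = program[i + 1]
        if op1 == "mul" and op2 == "add" and dst1 in (src2a, src2b):
            pairs.append(i)
    return pairs

# What the source expression d = a*b + c looks like if emitted naively:
straightforward = [
    ("mul", "r0", "a", "b"),
    ("add", "r1", "r0", "c"),
]

# The same computation after an intermediate compiler reorders and
# relabels: an unrelated instruction is scheduled between the pair.
reordered = [
    ("mul", "r2", "a", "b"),
    ("mov", "r3", "e", "e"),   # unrelated instruction hoisted here
    ("add", "r4", "r2", "c"),
]

print(find_mad_pairs(straightforward))  # pattern visible at index 0
print(find_mad_pairs(reordered))        # same computation, pattern missed
```

A real driver would look across wider windows than adjacent instructions, but the point stands: every choice the intermediate compiler makes is another transformation the driver has to see through.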
Let me explain it this way: the original source code has variables with labels A1, A2, ..., AN, which form an interference graph in the DX compiler. The output assembly has labels R0, R1, ..., RM, which form a new interference graph for the driver compiler. The two graphs will not be identical, but will be isomorphic with some additional constraints. By the well-known graph isomorphism problem, it is very difficult for the driver compiler to find the isomorphism back to the original source graph. That means some structures amenable to HW optimization may be obscured by the morphed graph.
It doesn't matter that the two graphs are essentially identical: simply relabel the vertices and add some edges, and a compiler that can easily detect optimizable constructs in one graph will no longer easily detect the same patterns in the other. In fact, this property is exploited in cryptographic zero-knowledge proof systems.
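A tiny sketch of the relabeling problem, with invented variable and register names. The source-level interference graph is a simple chain over A1..A4; the "driver-side" graph is the same chain relabeled to R0..R3 with one extra interference edge added by scheduling. Recovering the correspondence means searching over relabelings:

```python
# Brute-force search for a relabeling that embeds the source interference
# graph into the driver-side one. All names here are hypothetical.
from itertools import permutations

# Source-level interference graph: A1-A2-A3-A4 (a chain).
source_edges = {("A1", "A2"), ("A2", "A3"), ("A3", "A4")}

# Driver-side graph after relabeling to R0..R3, plus one extra
# interference edge introduced by the compiler's scheduling choices.
asm_edges = {("R2", "R0"), ("R0", "R3"), ("R3", "R1"), ("R2", "R1")}

def is_embedding(mapping, small, big):
    """Does `mapping` carry every edge of `small` onto an edge of `big`?"""
    undirected = {frozenset(e) for e in big}
    return all(frozenset((mapping[u], mapping[v])) in undirected
               for u, v in small)

src_nodes = sorted({v for e in source_edges for v in e})
asm_nodes = sorted({v for e in asm_edges for v in e})

# Finding the original structure means trying all N! relabelings --
# trivial for 4 nodes, hopeless at the size of a real shader.
found = [dict(zip(src_nodes, p)) for p in permutations(asm_nodes)
         if is_embedding(dict(zip(src_nodes, p)), source_edges, asm_edges)]
print(len(found), "relabelings embed the source graph")
```

Note that the extra edge also makes several relabelings fit equally well, so even an exhaustive search can't tell the driver which one the DX compiler actually used.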
Anyway, this is one of the reasons why architecture-specific compilers can trash generic compilers like GCC.