First, what I was trying to say is that for x86 this isn't that big a problem; I wasn't arguing about shaders and HLSLs.
Chalnoth said:
The main reason is simply this: video hardware is changing at a breakneck rate. If we get bogged down in a standardized instruction set, then that instruction set will hold progress back, just as has happened with the x86 architecture. While it is true that you can, for example, squeeze a little bit more performance out of x86 by going straight to the assembly, the truth is that our processors would be running one heck of a lot faster if the HLLs had been standardized instead of the processor instruction set.
You mean Java, C or Visual Basic CPUs? And then you say that RISC machines are better than CISCs... Are you saying that processors should translate C (or any other high-level language) on the fly? How long do you think it takes a program to compile? BTW, the HLLs are already standardized, and they can be used with any available instruction set. The problem is that once you have an established base of software for a platform (in binary form, not source code), you want to keep using it. That is not an engineering problem, but an economic one.
Independence from the ISA has a cost: either you store the source code and recompile, or you provide a layer of translation between ISAs (à la Transmeta).
Chalnoth said:
One other example: What would you rather have in three years: a 1GHz GPU running on an equivalent of the x86 instruction set, or a 1GHz GPU running on an equivalent of a RISC instruction set? Which would be faster? Obviously the more advanced one would be.
I'm really hoping that DX10 takes a "hands off" approach to assembly programming, and goes all HLSL. I also hope that 3DLabs' proposal to standardize the HLSL, not the assembly, goes through for OpenGL 2.0.
I can't argue much about graphics (because I know little about this topic), but I see it as a different problem. CPUs are designed to be general purpose and flexible, and the ISA is part of that flexibility (even if it becomes a burden because of the compatibility issue). I'm sure that moving to an HLSL is good for graphics APIs, since it is a high-level abstraction of the hardware (just as a programming language is for a CPU), and as long as shaders remain small they can be compiled from the HLSL to the specific hardware ISA at run time. But in the CPU world, compiling Word every time you want to run it and expecting it to be as fast as a statically, natively compiled binary is just crazy (although MS would be happy to hand Intel/AMD/whoever more arguments for faster CPUs with their .NET VM approach).

I have been studying the problem of binary translation for quite some time, and I think there are strong reasons why approaches like Transmeta's don't work properly (beyond the stupid choice of using a VLIW ISA to translate a CISC ISA on the fly).
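To make the run-time compilation point concrete, here is a minimal sketch using the OpenGL 2.0 shader calls (the GLSL path from the 3DLabs proposal). The function name and log buffer size are just placeholders of mine, and it assumes a GL 2.0 context is already current; on some platforms these entry points have to be fetched through an extension loader rather than the header.

[code]
/* Minimal sketch: the application hands the driver plain high-level
 * source text, and the driver's own compiler lowers it to the native
 * ISA of whatever GPU is installed, so the application never targets
 * a vendor assembly format directly. */
#define GL_GLEXT_PROTOTYPES
#include <GL/gl.h>
#include <stdio.h>

/* compile_fragment_shader is a name made up for this example. */
GLuint compile_fragment_shader(const char *glsl_source)
{
    GLuint shader = glCreateShader(GL_FRAGMENT_SHADER);

    glShaderSource(shader, 1, &glsl_source, NULL);
    glCompileShader(shader);   /* high-level source -> hardware ISA, at run time */

    GLint ok = GL_FALSE;
    glGetShaderiv(shader, GL_COMPILE_STATUS, &ok);
    if (ok != GL_TRUE) {
        char log[1024];
        glGetShaderInfoLog(shader, sizeof log, NULL, log);
        fprintf(stderr, "shader compile failed: %s\n", log);
    }
    return shader;
}
[/code]

The D3DX runtime compiler plays the same role on the HLSL side: the application ships high-level source and the compile down to the native ISA happens on the user's machine, which only stays cheap because shaders are small.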
Chalnoth said:
Yes, it is a problem. Particularly for a GPU, having to decode would require far more precious transistors. On a GPU, those transistors could be put to use much more effectively than the same transistors in a CPU. And, just as you stated before, a compiler can't be quite as optimal as programming right to the assembly. Don't you think that the internal translator in those CPUs reduces performance?
That could be a problem; however, I don't know how many of the P4 or Athlon transistors go to the translation process, but in CPUs more of the transistor budget goes to the caches than to any other part of the chip. In fact, over the next few years we will see that there are 'too many' transistors: with a billion transistors you start to have trouble putting them to use (other than larger caches or embedded memory) with current architecture models (delay penalties between units, lack of exploitable ILP).