Battle of three HLSLs: OpenGL 2.0 / DX9 / Cg

Yeah, I'm not enamored with the idea of functional syntax itself. We could start with any Cg-like language, add higher-order first-class functions, remove mutating assignment, add recursion, add monads for texture sampling and for constant registers not known at compile time, etc.
 
I think Cg exists just because Nvidia is mad that it cannot get on the OpenGL board. ATI, Matrox, and 3Dlabs are just mad that Nvidia showed them how to make products quickly (both in time to market and in how fast they run). So they (the other graphics manufacturers) won't allow Nvidia membership; that way they can determine what OpenGL should do, and make it favor their cards. ;)
 
Just a bit more news on this:

I listened in on the ATI DX9 / HLSL presentation yesterday, and there were a few interesting tidbits:

1) When asked about when DX9 would be officially released, ATI said that they couldn't speak for Microsoft, but they believed it to be sometime next week.

2) The Microsoft HLSL compiler appears to be very good at generating near-optimal machine code. (The presenter said that they first created hand-tuned assembly versions of the shaders they demoed, then ported them to HLSL. When compiled, the HLSL versions ran just as fast as, or even faster than, the hand-tuned assembly.)

3) HLSL code can be compiled to any of the pixel shader and vertex shader version targets, at design time or at run time. (See the sketch after this list.)

4) The OpenGL HLSL is not dependent on GL 2.0. In other words, expect the OpenGL HLSL to be available before GL 2.0 is finalized. (It will, of course, compile to GL 1.4.)
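
For what it's worth, here's a minimal sketch of what point 3 looks like in practice with the D3DX9 runtime compiler: the same HLSL source fed to D3DXCompileShader with different profile strings. The shader source and the CompileFor helper are my own illustration, not anything ATI showed.

Code:
    #include <d3dx9shader.h>

    // Hypothetical pixel shader source, just for illustration.
    static const char g_src[] =
        "sampler s0;\n"
        "float4 main(float2 uv : TEXCOORD0) : COLOR\n"
        "{\n"
        "    return tex2D(s0, uv) * 0.5f;\n"
        "}\n";

    // Compile the same source for a given target, e.g. "ps_1_4" or "ps_2_0".
    // This can run at design time (in a tool) or at run time (in the app).
    LPD3DXBUFFER CompileFor(const char* profile)
    {
        LPD3DXBUFFER code = NULL;
        LPD3DXBUFFER errors = NULL;
        HRESULT hr = D3DXCompileShader(g_src, sizeof(g_src) - 1,
                                       NULL, NULL,   // no #defines, no includes
                                       "main", profile,
                                       0, &code, &errors, NULL);
        if (FAILED(hr))
            return NULL;  // errors, if non-NULL, holds the compiler log
        return code;      // token stream for IDirect3DDevice9::CreatePixelShader
    }

CompileFor("ps_1_4") and CompileFor("ps_2_0") then yield two token streams from the one source.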
 
Joe DeFuria said:
4) The OpenGL HLSL is not dependent on GL 2.0. In other words, expect the OpenGL HLSL to be available before GL 2.0 is finalized. (It will, of course, compile to GL 1.4.)

Rather, it will be an extension like GL2_fragment_program (or something along those lines) that compiles for the client hardware instead of targeting a particular pixel shader version or extension. The idea is to get rid of the assembly language step entirely.
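
To make the contrast concrete: under that model the application hands the high-level source straight to the driver and never sees an assembly target at all. A hedged sketch, using ARB_shader_objects-style entry points (the exact GL2 extension names were still provisional when this was written):

Code:
    #include <GL/gl.h>
    #include <GL/glext.h>   // assumes the extension entry points are resolved

    // The driver compiles the high-level source for whatever the hardware is;
    // no ps_x_x / vertex shader version target appears anywhere.
    GLhandleARB CompileFragmentShader(const GLcharARB* source)
    {
        GLhandleARB shader = glCreateShaderObjectARB(GL_FRAGMENT_SHADER_ARB);
        glShaderSourceARB(shader, 1, &source, NULL);
        glCompileShaderARB(shader);              // native code generated here

        GLhandleARB program = glCreateProgramObjectARB();
        glAttachObjectARB(program, shader);
        glLinkProgramARB(program);
        return program;
    }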
 
ATI made the same claims at Mojo day, but I don't buy them. The shaders they were feeding DX9 FXC were relatively simple and easy to compile, yet the hand-coded assembly still beat it by a few instructions in length. A few instructions in a pixel shader can yield huge performance differences. Hell, one extra instruction in a 10-instruction shader could be a 10% performance hit.

Moreover, FXC can only optimize for the DX9 assembly format; it doesn't know what the underlying hardware is, so it cannot do instruction scheduling. Relying on the driver to figure this out from the assembly fed to it will be suboptimal.

Thus, I still believe you will see third-party HLSL compilers "tuned" for the R300, NV30, etc. pipelines.
 
Instruction scheduling can still be handled very effectively by the driver. The only problem with this approach is that if the DX version the compiler is targeting doesn't support all the features of your hardware, you'll get a non-optimal result.

John.
 
That depends on how much CPU and memory the driver can burn. DX9 pixel shaders can be considered a form of intermediate representation. However, to do really well versus hand-coded assembly against a known device, the driver would have to create and destroy a lot of data structures. A lot of this is simplified by the lack of loops, branches, and memory writes, but I'm willing to bet that until these drivers become very, very mature, most of them will opt for very simple "low hanging fruit" driver-side translations.
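
As an example of the sort of "low hanging fruit" I mean, here's a toy peephole pass over a made-up intermediate form (the opcodes and fields are illustrative, not the actual DX9 token stream): fusing a dependent mul/add pair into a single mad.

Code:
    #include <vector>

    enum Op { MUL, ADD, MAD, MOV };
    struct Instr { Op op; int dst, src0, src1, src2; };

    // Fuse  mul r,a,b / add r,r,c  into  mad r,a,b,c.
    // A real pass would also check that the mul result isn't read later;
    // anything subtler (scheduling, register pressure, co-issue) needs the
    // heavier machinery described above.
    void FuseMulAdd(std::vector<Instr>& code)
    {
        for (size_t i = 0; i + 1 < code.size(); ++i) {
            const Instr& m = code[i];
            const Instr& a = code[i + 1];
            if (m.op == MUL && a.op == ADD &&
                a.src0 == m.dst && a.dst == m.dst) {
                Instr fused = { MAD, a.dst, m.src0, m.src1, a.src1 };
                code[i] = fused;
                code.erase(code.begin() + i + 1);
            }
        }
    }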
 
Performance Hit

Hell, one extra instruction in a 10-instruction shader could be a 10% performance hit.

If it meant that DX9 shading gained widespread and rapid acceptance amongst game and app developers, wouldn't that be an acceptable performance hit?
 
The shaders they were feeding DX9 FXC were relatively simple and easy to compile, yet the hand-coded assembly still beat it by a few instructions in length.

In this particular presentation, the compiled assembly either matched the instruction count of the hand-coded assembly or beat it by one instruction.
 
The data structures associated with optimising the intermediate assembly format aren't huge, particularly when you take into account the size of the programs currently being optimised. The data is also only transient. Any application or driver that is worth its salt will also hash (and cache) the created shaders, so the time and memory overhead associated with shader compilation shouldn't be repeated.
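
Something like the following, say. A minimal sketch of that hash-and-cache idea; the names and the CompileAndCreate helper are hypothetical, not any real driver's internals:

Code:
    #include <d3d9.h>
    #include <functional>
    #include <map>
    #include <string>

    // Hypothetical helper: runs the HLSL compiler and CreatePixelShader().
    IDirect3DPixelShader9* CompileAndCreate(IDirect3DDevice9* dev,
                                            const std::string& src);

    std::map<size_t, IDirect3DPixelShader9*> g_cache;

    IDirect3DPixelShader9* GetShader(IDirect3DDevice9* dev, const std::string& src)
    {
        // Hash the source; a real cache would also compare the full text
        // to guard against collisions.
        size_t key = std::hash<std::string>()(src);
        std::map<size_t, IDirect3DPixelShader9*>::iterator it = g_cache.find(key);
        if (it != g_cache.end())
            return it->second;   // compiled earlier; no repeated overhead

        IDirect3DPixelShader9* ps = CompileAndCreate(dev, src);
        g_cache[key] = ps;
        return ps;
    }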

John.
 
Is what DemoCoder suggests part of DX9? Can DX9 driver writers substitute their own compiler that goes directly from HLSL to device-specific shader instructions, or is the DX9 assembly language a required intermediate step?
 
DX9's HLSL compiles to DX9's intermediate assembly format; I'm not sure if there's a way to insert your own backend (which I believe Cg allows).
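
For comparison, this is roughly what picking a backend looks like through the Cg runtime: the profile argument selects the code generator, so a vendor could in principle supply one that targets its hardware directly. A sketch, assuming the standard Cg runtime entry points:

Code:
    #include <Cg/cg.h>

    // Compile Cg/HLSL-style source with a chosen backend profile, e.g.
    // CG_PROFILE_FP30 for NV30 fragment programs.
    const char* CompileWithProfile(CGcontext ctx, const char* source,
                                   CGprofile profile)
    {
        CGprogram prog = cgCreateProgram(ctx, CG_SOURCE, source,
                                         profile, "main", NULL);
        if (cgGetError() != CG_NO_ERROR)
            return NULL;
        // Whatever text the selected backend emits.
        return cgGetProgramString(prog, CG_COMPILED_PROGRAM);
    }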

John.
 
If they first hand-tuned the assembly versions and then ported them to HLSL, it's quite likely that they were thinking in a rather assembly-friendly way when writing the HLSL. So the compiler could have had a rather easy time optimizing it to the exact same code.

A common reason for compilers to be better than hand optimization is that they don't get tired, they don't lose concentration for a few instructions, and they don't miss the "low hanging fruit".
So while what you said about the demonstration indicates that the compiler certainly doesn't suck, it doesn't prove that it's really good. I would like to see some actual examples before giving it such a label.
Note: I'm not saying "the compiler isn't good", just that we don't know that yet.

But even if the compiler isn't all that good at finding the clever tricks, it could be quite OK anyway. Generally, if I use C for this kind of low-level programming, I just see it as a nicer assembler syntax. I'd do most optimizations myself, and if the compiler adds some, that's just a perk.
 