Carmack's comments on NV30 vs R300, DOOM developments

Discussion in 'Architecture and Products' started by boobs, Jan 30, 2003.

  1. Doomtrooper

    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    3,328
    Likes Received:
    0
    Location:
    Ontario, Canada
    I don't agree at all; there is no reason the standard ARB extensions could not be used. That's why, more and more, I want DX9 to succeed: then the need for proprietary extensions will be a thing of the past.

    I never thought I'd say that, but it appears to be the only solution for developers like Bioware, and for Neverwinter Nights players who went months without features that other card owners enjoyed, even though their hardware supported them and they paid the same amount of money for the title.

    Will the OGL 2.0 HLSL support proprietary extensions? I hope not.
     
  2. Luminescent

    Veteran

    Joined:
    Aug 4, 2002
    Messages:
    1,036
    Likes Received:
    0
    Location:
    Miami, Fl
    I also believe the language should be universal, a result of input from IHVs, without proprietary extensions. Each IHV should then build its own compiler and optimize it to translate for its hardware.
     
  3. Joe DeFuria

    Legend

    Joined:
    Feb 6, 2002
    Messages:
    5,994
    Likes Received:
    71
    There are advantages and disadvantages for IHVs making their own compilers, vs. one "body" making a single, multi-platform compiler. (Interestingly, the latter approach is being carried out presently with DirectX).

    On the one hand, every IHV making their own compilers would provide the best opportunity for hardware-specific optimization. (Best performance.)

    On the other hand, every IHV making their own compilers leaves the door open for wide variation in "implementation" and could make life more difficult for developers in terms of bugs. (The same HLSL code producing different results on different platforms, and different versions of compilers on the same hardware...). It also places an additional resource burden on the IHVs to each develop and support a compiler.

    Keeping the consumers in mind, I don't see either way as a clear-cut winner.
     
  4. demalion

    Veteran

    Joined:
    Feb 7, 2002
    Messages:
    2,024
    Likes Received:
    1
    Location:
    CT
    Hmmm...

    If the HLSL retains standard behavior, and developers support the HLSL (and, notably, NOT proprietary extensions), then there is no problem.

    Where we seem to have some disagreement is that you see the proprietary extensions existing as a guarantee that they'd be used (by developers). My problem with proprietary extensions has always been with their being used to disadvantage competitors with equivalent functionality, not with their ability to exploit the advantages of a particular architecture. As long as the HLSL is actually adopted and used in place of proprietary extensions, my particular problem with proprietary extensions does not exist.

    I'm unclear as to whether the OpenGL HLSL will compile to the specification of an extension or all the way down to the GPU instruction stream. It seems to me that, for maintaining a standard, the behavior of the HLSL->ARB shader extension compiler should be defined and maintained by the standards body. What I view Chalnoth as proposing is that IHVs have the ability to replace the compiler/specify a different target, including proprietary extensions... and as long as this retains the defined behavior of the standard compiler, I see little problem with that regarding my particular objection to proprietary extensions.

    If the existence of proprietary extensions is the concern, with the idea that "if they are there, someone might use them", I can see some validity in that, in which case I'd echo Luminescent's statement. My understanding for OpenGL 2.0 going forward is that no new proprietary LLSL extensions would be created, so currently I just view this as an issue for existing LLSL extensions, and I tend not to be concerned within that scope.
     
  5. OpenGL guy

    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    2,357
    Likes Received:
    28
    IHVs already make their own compiler to convert VS/PS instructions into machine code. Think of the HLSL compiler as the front-end and the IHV's compiler as the back-end.
    Such bugs already exist, as some hardware doesn't meet the specs. When a game developer uses such a broken platform for development, they assume the output is correct, not realizing that the result is actually wrong.

    It seems sufficient to have a single HLSL compiler that generates VS/PS instructions for the IHV's compiler to convert and optimize.
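    The front-end/back-end split described above can be sketched in code. This is a toy illustration by the editor, not anything from the thread or a real driver: a shared front-end lowers a high-level expression into portable "VS/PS"-style instructions, and a hypothetical IHV back-end maps each portable op onto its own (made-up) machine opcodes.

    ```python
    # Toy sketch of the two-stage compile. All instruction and opcode
    # names here are invented for illustration.

    def front_end(expr):
        # Lowers an "a*b+c" style input into portable, assembly-like
        # instructions (the shared, API-level stage).
        a, rest = expr.split("*")
        b, c = rest.split("+")
        return [f"mul t0, {a}, {b}", f"add t1, t0, {c}"]

    def back_end(instrs, opcode_map):
        # The IHV-specific stage: translate each portable op into a
        # native opcode, leaving the operands untouched.
        out = []
        for ins in instrs:
            op, args = ins.split(" ", 1)
            out.append(f"{opcode_map[op]} {args}")
        return out

    portable = front_end("a*b+c")
    native = back_end(portable, {"mul": "FMUL", "add": "FADD"})
    # native == ["FMUL t0, a, b", "FADD t1, t0, c"]
    ```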
     
  6. Joe DeFuria

    Legend

    Joined:
    Feb 6, 2002
    Messages:
    5,994
    Likes Received:
    71
    Thanks for the tip! I presume that the back-end compiling of VS/PS "instructions" into machine code is much more "straightforward" than the front-end compiler, which translates HLSL into discrete PS/VS instructions. Correct?

    I'll take your word for it. ;)
     
  7. Humus

    Humus Crazy coder
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    3,217
    Likes Received:
    77
    Location:
    Stockholm, Sweden
    Really? I was pretty sure that's Carmack's work ;)

    As for ARB_fragment_program, it doesn't matter a whole lot who led the work. In the end nVidia voted yes for it, so if they weren't happy with the design they have themselves to blame.
     
  8. Humus

    Humus Crazy coder
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    3,217
    Likes Received:
    77
    Location:
    Stockholm, Sweden
    Fear not, for the OpenGL 2.0 HLSL works fundamentally differently from DirectX9 HLSL and Cg. The language is standardized, but not the compiler. The driver provides the compiler and targets the underlying hardware directly, without passing through a redundant middle layer of assembly language. There are pretty much only advantages to doing it this way.
     
  9. Humus

    Humus Crazy coder
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    3,217
    Likes Received:
    77
    Location:
    Stockholm, Sweden
    I don't believe this will be the case. Parsing the shader into a workable data set is not a particularly complex task. Also, mapping a higher-level data set to the hardware isn't any harder than mapping from a low-level instruction set. Unless the low-level instruction set maps pretty much directly to the hardware, I would almost argue the opposite is true.
     
  10. Humus

    Humus Crazy coder
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    3,217
    Likes Received:
    77
    Location:
    Stockholm, Sweden
    I don't agree. It's much harder to read the developer's intentions from a low-level piece of code than from a high-level one. If the developer specifies something the hardware supports directly, but the compiler translates it into lower-level instructions, it will be hard for the back-end to reverse that and figure out that it can use its special hardware, especially if the high-level compiler tries to optimize the code.
     
  11. Joe DeFuria

    Legend

    Joined:
    Feb 6, 2002
    Messages:
    5,994
    Likes Received:
    71
    Unless, of course, there is no truly "workable and 100% valid" data set that it can be parsed into, due to hardware limitations.

    In a nutshell, I see the Microsoft DX9 approach as a trade-off favoring compatibility/stability/consistency to the detriment of performance.

    I see the "IHV effectively create both front and back-end compilers" as a trade-off in the opposite direction.

    I have no issue with each approach being valid. But I do see each trade-off as "real." I've witnessed the effect of the evolution of OpenGL 1.X ICDs on consumers, and it was pretty painful for consumers, developers, and IHVs. (Again, not to say that the DirectX evolution didn't have its own pain... just a different type of pain.)
     
  12. OpenGL guy

    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    2,357
    Likes Received:
    28
    Sometimes it's easier to optimize at the symbolic level. Now every IHV will have to have its own symbolic compiler/optimizer.

    Either method has its advantages/disadvantages.
     
  13. Humus

    Humus Crazy coder
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    3,217
    Likes Received:
    77
    Location:
    Stockholm, Sweden
    With "workable data set" I mean a binary representation of the semantics of the shader. For instance, this expression:
    a = (b + 1) * (7 - c);

    this can be turned into something like

    assign(a, mul(add(b, 1), sub(7, c)));

    or in something like an expression tree,

    Code:
    assign
     |    \
    mul   a
     |  \
    add sub
    |\   / \
    b 1 7   c
    
    A data structure like this is pretty straightforward to work with.
    Now if we do it the DX9 HLSL way, the driver can be passed a representation of ps1.1, ps1.2, ps1.3, ps1.4, ps2.0, and so on, and the list will only grow with time. Instead of needing one path per card to map a single data structure onto the hardware, each new card needs ways to map every old version onto the new hardware. I can't see how this is advantageous for anyone. In the long run it will be compatibility hell rather than helping compatibility, and you lose performance too.
    It's like Java vs. C++ (when written in a portable way): the only advantage of Java is that you don't need to recompile it for each platform, but performance and compatibility suffer.
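    Humus's expression tree above can be made concrete with a small sketch. This is the editor's illustration, not code from the thread: a tiny node class holds the tree for a = (b + 1) * (7 - c), and evaluating it walks the tree exactly as a driver back-end would walk the semantic data set.

    ```python
    # Minimal, illustrative expression tree for a = (b + 1) * (7 - c).
    # The Node class and names are this sketch's assumptions.

    class Node:
        def __init__(self, op, *children):
            self.op = op
            self.children = children

        def eval(self, env):
            # Leaves carry a variable name or a constant value.
            if self.op == "var":
                return env[self.children[0]]
            if self.op == "const":
                return self.children[0]
            a, b = (c.eval(env) for c in self.children)
            return {"add": a + b, "sub": a - b, "mul": a * b}[self.op]

    # mul(add(b, 1), sub(7, c)), the right-hand side of the assignment
    expr = Node("mul",
                Node("add", Node("var", "b"), Node("const", 1)),
                Node("sub", Node("const", 7), Node("var", "c")))

    a = expr.eval({"b": 2, "c": 4})   # (2 + 1) * (7 - 4) = 9
    ```

    A back-end would pattern-match such a tree against hardware operations instead of evaluating it, but the traversal is equally straightforward.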
     
  14. psurge

    Regular

    Joined:
    Feb 6, 2002
    Messages:
    955
    Likes Received:
    52
    Location:
    LA, California
    HZ and stencil optimization idea:

    Each node in the HZ buffer stores
    - z-min
    - z-max
    - stencil value

    The z-min allows for acceleration of the depth fail technique. Furthermore, it allows block operations for stencil: i.e. if it can be determined that all the pixels in a tile pass/fail the depth test, increment/decrement the tile's stencil value (but not the lowest level stencil buffer value).

    To obtain an actual stencil value for a pixel, note that the pixel is contained in a series of ever smaller tiles (HZ pyramid nodes) T0, T1, T2,... with stencil values s0, s1, s2,...

    Then the pixel's stencil value is s0 + s1 + s2 + ... + sp (where sp is the per-pixel stencil value). Since rasterization involves hierarchical descent of the HZ pyramid, the 'addition' of the per-tile stencil values can be done in parallel with rasterization at no performance penalty.

    This should save some off-chip stencil buffer writes.
    --

    Furthermore, having a z-range available for HZ level n - 1 (where level n is the z-buffer) could be used for z-buffer compression. If the z-range is small enough, z-values could be stored at reduced precision (16 or 8 bits instead of 32 or 24).

    The stencil buffer could also be compressed: at the lowest level, you only need enough space to accommodate the stencil ops which filter all the way down to the pixel level.

    Assuming that the z/stencil buffer is accessed in blocks as defined by HZ level n-1, the z-bit depth and stencil bit-depth could be adjusted every time a block is written out to memory (without precluding other compression techniques).

    Comments?
    Serge

    [edit] grammar
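    The stencil reconstruction psurge describes can be sketched as follows. This is the editor's hedged illustration of the idea, not a hardware implementation; the tile values below are arbitrary examples.

    ```python
    # A pixel's effective stencil value is the sum of the stencil values
    # s0, s1, s2, ... of the nested HZ tiles T0, T1, T2, ... containing
    # it, plus its own per-pixel value sp (the lowest-level buffer entry).

    def effective_stencil(tile_stencils, per_pixel):
        # tile_stencils: s0, s1, ... along the HZ pyramid descent path
        return sum(tile_stencils) + per_pixel

    # A whole-tile stencil pass/fail touches only that tile's value,
    # not every pixel beneath it; the per-pixel buffer stays untouched.
    tiles = [1, 0, 2]   # s0, s1, s2 for the tiles over this pixel
    sp = 3              # per-pixel stencil value
    result = effective_stencil(tiles, sp)   # 1 + 0 + 2 + 3 = 6
    ```

    This is what lets block-level increments/decrements substitute for per-pixel stencil writes: the summation along the descent path recovers the exact per-pixel value.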
     
  15. demalion

    Veteran

    Joined:
    Feb 7, 2002
    Messages:
    2,024
    Likes Received:
    1
    Location:
    CT
    How could you depend on the z range for reducing precision?

    I think the n-1 z range might end up changing later in rendering, and wouldn't you have to convert your compression data once that happened?

    Sorry if I'm missing something simple.
     
  16. psurge

    Regular

    Joined:
    Feb 6, 2002
    Messages:
    955
    Likes Received:
    52
    Location:
    LA, California
    If z-max and z-min are close, the number of possible z values between the two is much smaller than the total possible z-values.

    Take, for example, a 24-bit integer z-value: it has 16777216 possible values. If z-min and z-max are separated by less than 65536, you could represent z as a 16-bit offset relative to z-min. If they are separated by less than 256, you can use an 8-bit offset, and so on.

    As for the z-range changing during the course of rendering: yes it would. I am saying that the number of bits used to represent a z-value could potentially change every time a block of HZ level n-1 is written to memory.

    I don't know if this is worth doing, it's just an idea...

    Serge
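    The range-based compression psurge outlines can be sketched like this. This is one possible scheme assumed by the editor for illustration (block layout, candidate bit widths, and function names are not from the thread): when z-max - z-min is small, each z in a block is stored as an offset from z-min at reduced precision.

    ```python
    # Per-block z compression: store offsets from z-min at the smallest
    # candidate bit width that covers the block's z span.

    def choose_offset_bits(z_min, z_max):
        # Smallest of the assumed candidate widths that holds the span.
        span = z_max - z_min
        for bits in (8, 16, 24):
            if span < (1 << bits):
                return bits
        return 32

    def compress_block(zs):
        z_min = min(zs)
        bits = choose_offset_bits(z_min, max(zs))
        return z_min, bits, [z - z_min for z in zs]

    def decompress_block(z_min, offsets):
        return [z_min + o for o in offsets]

    # 24-bit z values clustered within a 256-wide range fit in 8-bit
    # offsets, as in psurge's example above.
    block = [1000000, 1000010, 1000255, 1000100]
    z_min, bits, offsets = compress_block(block)
    # bits == 8, and decompressing recovers the block losslessly
    ```

    Re-running choose_offset_bits each time a block is written out matches the post's point that the bit depth can change per write without precluding other compression techniques.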
     
  17. Sxotty

    Legend

    Joined:
    Dec 11, 2002
    Messages:
    5,496
    Likes Received:
    866
    Location:
    PA USA
    You people are slightly confusing me. When you say you dislike proprietary extensions, are you implying you dislike them only insofar as they are redundant?

    Because otherwise the hardware makers would have to wait on boardroom debates and months of planning to even start their hardware, or wait a few months to show what their hardware can do.

    So I agree that proprietary extensions should cease, unless the extension opens up new functionality that could not be accessed otherwise.
     
  18. demalion

    Veteran

    Joined:
    Feb 7, 2002
    Messages:
    2,024
    Likes Received:
    1
    Location:
    CT
    psurge,

    I understand the basic idea of how the range reduces bits required to represent the z values within the range. ;)

    What I didn't understand, and understand a bit better now, is that you were thinking of conserving the bandwidth used to send the block to memory, not storage space. Sorry. :p
     
  19. Tahir2

    Veteran

    Joined:
    Feb 7, 2002
    Messages:
    2,978
    Likes Received:
    86
    Location:
    Earth
    This is where, I think, you can't have your cake and eat it at the same time. Bit of a catch-22.

    To clarify: this is what the DX model is supposed to do, but when DX is late and its features are added or subtracted at a late stage, you are going to find yourself back at square one, writing your own compilers etc. to expose your features until the API is ready to do it in a more standard form that all IHVs can use/implement.
     
  20. Doomtrooper

    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    3,328
    Likes Received:
    0
    Location:
    Ontario, Canada
    Yes, that's true; they must wait until the ARB supports that function, yet DirectX has operated that way since the beginning.
    When a graphics company is allowed to introduce proprietary extensions, it can use that to leverage developers by stating: well, we can significantly improve x, y, z with our extensions.

    The waiting part is a moot point: shaders were introduced two years ago, yet Daniel Vogel from Epic states here that they don't need them for UT 2003. What would a couple of months' wait to get a standard shader extension two years ago have done :?:

    As a consumer making a value choice when selecting your hardware, I'd much rather see the ARB extensions supported. Otherwise, who knows: down the road you may end up buying a PC title, comparing it with a friend's, and realizing his game looks so much better due to specific code paths and extensions...

    A good example of that was Neverwinter Nights, a very popular title. Bioware's own tech people on their forums were telling Radeon 8500/9000 users their hardware couldn't support pixel shader effects for water :!:

    Bioware chose a proprietary Nvidia extension to show those effects, which is not exactly great for the consumer, is it?
     


  • About Us

    Beyond3D has been around for over a decade and prides itself on being the best place on the web for in-depth, technically-driven discussion and analysis of 3D graphics hardware. If you love pixels and transistors, you've come to the right place!

    Beyond3D is proudly published by GPU Tools Ltd.