Why doesn't DX9 support simple Quads?

Humus

Crazy coder
Veteran
I've wondered about this before, but it hasn't bothered me much since I mostly use OpenGL, and in the few cases when I use D3D I haven't really needed it. Why does D3D, even in DX9, still not support quads as a primitive type? Doesn't pretty much all hardware support it anyway?

I just ported my particle system code from my old framework into my new one, and made it API independent. A particle system basically produces loads of independent quads every frame. In OpenGL I can just pass the quads to the API and be done. In D3D, however, I must work around the lack of quads by either duplicating vertices (50% higher AGP traffic) or using an index buffer (33% higher AGP traffic, unless I create one static "large enough" buffer).

I can't see any real reason why quads shouldn't be supported.
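
A minimal sketch of the static "large enough" index buffer workaround mentioned above, assuming each quad owns four consecutive vertices stored in order around the quad. The function name `buildQuadIndices` and the fan-style split are illustrative choices, not something taken from either API:

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Builds triangle-list indices for `quadCount` independent quads.
// Assumption: quad i owns vertices 4*i .. 4*i+3, stored in order around
// the quad, and is split like a fan into (0,1,2) and (0,2,3).
std::vector<std::uint16_t> buildQuadIndices(std::size_t quadCount) {
    // 16-bit indices address at most 65536 vertices, i.e. 16384 quads.
    if (quadCount > 16384) {
        throw std::length_error("too many quads for 16-bit indices");
    }
    std::vector<std::uint16_t> indices;
    indices.reserve(quadCount * 6);
    const std::uint16_t corners[6] = {0, 1, 2, 0, 2, 3};
    for (std::size_t q = 0; q < quadCount; ++q) {
        const std::size_t base = q * 4;  // first vertex of this quad
        for (std::uint16_t c : corners) {
            indices.push_back(static_cast<std::uint16_t>(base + c));
        }
    }
    return indices;
}
```

Because the pattern never changes, the array can be generated once for the largest expected particle count and reused every frame; only the vertex data then has to be uploaded, and all quads go out in a single indexed draw call.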
 
Humus said:
I've wondered about this before, but it hasn't bothered me much since I mostly use OpenGL, and in the few cases when I use D3D I haven't really needed it. Why does D3D, even in DX9, still not support quads as a primitive type? Doesn't pretty much all hardware support it anyway?
I don't think so.
I just ported my particle system code from my old framework into my new one, and made it API independent. A particle system basically produces loads of independent quads every frame. In OpenGL I can just pass the quads to the API and be done. In D3D, however, I must work around the lack of quads by either duplicating vertices (50% higher AGP traffic) or using an index buffer (33% higher AGP traffic, unless I create one static "large enough" buffer).

I can't see any real reason why quads shouldn't be supported.
I don't understand your complaints here. A triangle strip = 4 vertices, same as a quad, right? Why not use a triangle strip/fan for a "quad"? If you really want to save AGP bandwidth (not a major concern in most cases anyway), use point sprites, as they are only a single vertex.

Edit: Even a quad strip can be efficiently represented as a triangle strip, i.e. it takes two vertices to add another quad to a quad strip, and two vertices add two triangles to a tri strip.
 
OpenGL guy said:
Humus said:
I've wondered about this before, but it hasn't bothered me much since I mostly use OpenGL, and in the few cases when I use D3D I haven't really needed it. Why does D3D, even in DX9, still not support quads as a primitive type? Doesn't pretty much all hardware support it anyway?
I don't think so.
I think so :p
Seriously, without direct hardware support an implementation would have to create index buffers on the fly. There isn't any (measurable) evidence for that.
OpenGL guy said:
I don't understand your complaints here. A triangle strip = 4 vertices, same as a quad, right? Why not use a triangle strip/fan for a "quad"? If you really want to save AGP bandwidth (not a major concern in most cases anyway), use point sprites, as they are only a single vertex.

Edit: Even a quad strip can be efficiently represented as a triangle strip, i.e. it takes two vertices to add another quad to a quad strip, and two vertices add two triangles to a tri strip.
That's all nice and dandy, but you can't render multiple disjoint quads as a triangle strip. You'll have to stop after four vertices and restart.
GL_ARB_multi_draw_elements can handle this, as can GL_NV_primitive_restart, but that's OpenGL again. The former defines an additional 'primitive run' array and is IMO implemented in software everywhere which makes it kinda pointless atm. Even if it gets fully hardware supported, it's useless in this scenario because the bandwidth overhead for the additional array will outweigh any gains, even more so when compared to real quads.
The latter defines a 'restart' index that will behave like a glEnd()/glBegin(same_thing_as_last_time) pair.
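
To make the restart behaviour concrete, here is a rough software model of how a strip with a restart index could be expanded into triangles. The sentinel value `kRestart` and the function name are hypothetical; a real implementation lets the application choose the restart value and does this work in hardware:

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sentinel that marks "end this strip, start a new one".
constexpr std::uint32_t kRestart = 0xFFFFFFFFu;

using Triangle = std::array<std::uint32_t, 3>;

// Expands a triangle-strip index stream into individual triangles,
// starting a fresh strip whenever the restart index is encountered.
std::vector<Triangle> expandStripWithRestart(
        const std::vector<std::uint32_t>& indices) {
    std::vector<Triangle> tris;
    std::size_t runLength = 0;   // vertices seen since the last restart
    std::uint32_t a = 0, b = 0;  // the two most recent strip vertices
    for (std::uint32_t idx : indices) {
        if (idx == kRestart) {
            runLength = 0;       // behaves like ending and re-beginning the strip
            continue;
        }
        if (runLength >= 2) {
            // Swap the first two vertices on every second triangle so
            // that all triangles keep the same winding.
            if (runLength % 2 == 0) {
                tris.push_back(Triangle{a, b, idx});
            } else {
                tris.push_back(Triangle{b, a, idx});
            }
        }
        a = b;
        b = idx;
        ++runLength;
    }
    return tris;
}
```

With the stream `0 1 2 3 <restart> 4 5 6 7` this yields four triangles forming two disjoint quads. Without the restart marker the same eight vertices would form one continuous strip of six triangles, two of which would bridge the gap between the quads.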

Anyway, it's not basic functionality by any stretch of imagination. NV_primitive_restart requires NV3x ...

Is this sort of functionality available in DX9?
 
Current hardware does not support quad primitives. The only "support" they have is via tessellation into triangles in setup.

PC chips haven't supported real quads for some time...
 
Wow, this is a field where the PS2 can do it better :) I mix triangles and quads in triangle strips via a primitive restart bit that the PS2 hardware can encode in the data passed from VU1 to the GS. It's a very nice thing 8)
ciao,
Marco
 
The problem with quads is that they are open to abuse. The hardware just tessellates them into two triangles (even if the hardware consumes quads, it still internally treats them in a similar manner to triangle strips). But the quad interface leaves a big dark hole around the non-planar vertex issue: 4 vertices are usually not co-planar. The hardware usually ignores the issue, but how it ignores it causes different cards (and indeed the same card under different clipping conditions) to radically alter the linear interpolation. So all your perspective-correct goodness just went out the window.

As indexed quads can be consumed as efficiently as indexed triangles, D3D just didn't bother.

The only real case where OpenGL has an advantage is non-indexed planar quads (i.e. particles and the like), but in doing so OpenGL has a big quality hole around non-planar quads, which is why it's generally considered a "deprecated" interface.

Point sprites are the beginning of a planar quad interface for particles but are sadly lacking in many critical ways. I'd like to see both OpenGL and D3D support planar quads, but OpenGL should remove non-planar quads (or at least define EXACTLY how they are tessellated).
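
The dependence on the split can be shown with a small sketch. It assumes plain linear interpolation (no perspective correction) and a quad whose footprint is a parallelogram, so that its centre is the midpoint of both diagonals; the struct and function names are made up for illustration:

```cpp
// Depth (or any other attribute) at the four corners, in order
// around the quad.
struct QuadAttr {
    float v0, v1, v2, v3;
};

// Value at the quad centre when the quad is split along diagonal 0-2.
// The centre lies on that shared edge, so only v0 and v2 contribute.
float centreSplit02(const QuadAttr& q) {
    return 0.5f * (q.v0 + q.v2);
}

// Value at the quad centre when the quad is split along diagonal 1-3.
// Here only v1 and v3 contribute.
float centreSplit13(const QuadAttr& q) {
    return 0.5f * (q.v1 + q.v3);
}
```

For corner depths 0, 1, 0, 1 (a non-planar quad) the first split gives 0 at the centre and the second gives 1, whereas for a planar set such as 0, 1, 2, 1 both splits agree. Which diagonal gets used is exactly what the quad interface leaves open.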
 
Hmm. 1 Quad = 4 vertices = Triangle strip of 2 = 4 vertices, no overhead.

If you don't want to use strips, use indexed triangles; 1 quad still = 4 vertices. Yes, it costs you two additional indices, but the BW of these isn't normally relevant. E.g. each vertex = 32 bytes x 4 = 128 bytes, same for both tris and quads. Tris require 12 vs 8 bytes of index data (assuming you're not using 32-bit indices when you don't need to, which should be pretty much all the time). This doesn't mean 33% extra BW at all; it means about 3% more BW for geometry data that actually needs quads (which is what? maybe subdiv surfs? still not required though), which is not relevant if you look at the whole picture.

That said, a planar quad vs two triangles requires slightly less setup, so it could, in theory, be fractionally faster than two triangles. It's questionable whether the increase would be worth the extra area and effort, given that you still need triangles anyway. So even if supported, I'd be surprised if they weren't still just converted to two triangles in the core before rasterisation.

Of course an API could add support for them in such a way that it becomes difficult to not do them properly, at least at the front end of the pipeline...

John.
 
arhra said:
For a particle system, wouldn't you be better off using point sprites?

Well, first off, not all hardware supports this. Also, since in my case the size varies with each individual particle and is defined in world space rather than screen space it's not directly usable.
 
OpenGL guy said:
Humus said:
I've wondered about this before, but it hasn't bothered me much since I mostly use OpenGL, and in the few cases when I use D3D I haven't really needed it. Why does D3D, even in DX9, still not support quads as a primitive type? Doesn't pretty much all hardware support it anyway?
I don't think so.

Are you saying that the hardware doesn't accept quads directly, but that the driver instead has to preprocess the vertex buffer or index buffer to tessellate it into triangle lists?

I don't understand your complaints here. A triangle strip = 4 vertices, same as a quad, right? Why not use a triangle strip/fan for a "quad"? If you really want to save AGP bandwidth (not a major concern in most cases anyway), use point sprites, as they are only a single vertex.

I don't want to have to do 1000+ DrawPrimitive() calls every frame for a single particle system, for obvious reasons.
 
Tagrineth said:
Current hardware does not support quad primitives. The only "support" they have is via tessellation into triangles in setup.

PC chips haven't supported real quads for some time...

Well, ok, that's basically what I mean by quads: that the hardware can accept four vertices and make two triangles out of them. Not that it should perform rasterization on quad primitives.
 
DeanoC said:
The problem with quads is that they are open to abuse. The hardware just tessellates them into two triangles (even if the hardware consumes quads, it still internally treats them in a similar manner to triangle strips). But the quad interface leaves a big dark hole around the non-planar vertex issue: 4 vertices are usually not co-planar. The hardware usually ignores the issue, but how it ignores it causes different cards (and indeed the same card under different clipping conditions) to radically alter the linear interpolation. So all your perspective-correct goodness just went out the window.

As indexed quads can be consumed as efficiently as indexed triangles, D3D just didn't bother.

The only real case where OpenGL has an advantage is non-indexed planar quads (i.e. particles and the like), but in doing so OpenGL has a big quality hole around non-planar quads, which is why it's generally considered a "deprecated" interface.

Point sprites are the beginning of a planar quad interface for particles but are sadly lacking in many critical ways. I'd like to see both OpenGL and D3D support planar quads, but OpenGL should remove non-planar quads (or at least define EXACTLY how they are tessellated).

I don't see this as a problem. The result of non-planar quads is undefined, thus you shouldn't use non-planar quads unless you like to live with no guarantees. There are many other uses of the API that leave undefined results, yet don't cause any problems. For instance, after using a vertex array the corresponding data types used in immediate mode are undefined. When using ARB_vertex_program, the use of certain vertex attributes leaves the conventional vertex/normal/etc. attributes undefined. They may or may not alias.
 
JohnH said:
Hmm. 1 Quad = 4 vertices = Triangle strip of 2 = 4 vertices, no overhead.

If you don't want to use strips, use indexed triangles; 1 quad still = 4 vertices. Yes, it costs you two additional indices, but the BW of these isn't normally relevant. E.g. each vertex = 32 bytes x 4 = 128 bytes, same for both tris and quads. Tris require 12 vs 8 bytes of index data (assuming you're not using 32-bit indices when you don't need to, which should be pretty much all the time). This doesn't mean 33% extra BW at all; it means about 3% more BW for geometry data that actually needs quads (which is what? maybe subdiv surfs? still not required though), which is not relevant if you look at the whole picture.

In my example it's 12 bytes vs. zero bytes, and not 12 vs. 8, because with support for quads I wouldn't need to index my data at all. Anyway, I still made a mistake in my calculation: it's an increase of 8.3% since my vertices are 36 bytes, and with your size of 32 bytes it would be 9.4%.
 
I can't see why you don't just use PointSprites (together with per particle PSIZE).

Yes, it's only supported on GF3+ (don't know about ATI, but I suspect 8500 upwards should do), but then quad rendering is supported nowhere in DX (as it's ill-defined when the vertices aren't coplanar).

Use D3DXSprite as a fallback for non-PSIZE-hardware... ;)
 
Tagrineth said:
Current hardware does not support quad primitives. The only "support" they have is via tessellation into triangles in setup.
That qualifies for "direct hardware support" in my book. It happens inside the chip with no external (app/driver) control required.
PC chips haven't supported real quads for some time...
It doesn't matter if it's rendered as two triangles (which is obvious, given the wording of the gl spec on the subject).

Humus is right on the mark with the index stuff. Particle system quads don't require indices at all.
Large scale particle systems can also straddle the 65536 vertex limit, so you need uints, which pile up to 6*4=24 bytes of index traffic vs nil using straight quads. Vertex traffic is equal for both methods.
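
The percentages traded in this exchange can be checked with a few lines of arithmetic. This is only a sketch of the calculation as described in the posts: it compares the index traffic of an indexed triangle list (six indices per quad) against the vertex traffic of that quad (four vertices), with non-indexed quads as the zero-index baseline. The function name is made up:

```cpp
// Extra index traffic per quad, as a percentage of the vertex traffic
// for that quad. The baseline is a non-indexed quad, which sends four
// vertices and no indices at all.
double indexOverheadPercent(int vertexBytes, int indexBytes) {
    const double vertexTraffic = 4.0 * vertexBytes;  // 4 vertices per quad
    const double indexTraffic = 6.0 * indexBytes;    // 2 triangles = 6 indices
    return 100.0 * indexTraffic / vertexTraffic;
}
```

With 16-bit indices this gives roughly 8.3% for 36-byte vertices and 9.4% for 32-byte vertices, matching the corrected figures above. Switching to 32-bit indices doubles the index traffic to 24 bytes per quad, which for 36-byte vertices works out to roughly 16.7%.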
 
Humus said:
JohnH said:
Hmm. 1 Quad = 4 vertices = Triangle strip of 2 = 4 vertices, no overhead.

If you don't want to use strips, use indexed triangles; 1 quad still = 4 vertices. Yes, it costs you two additional indices, but the BW of these isn't normally relevant. E.g. each vertex = 32 bytes x 4 = 128 bytes, same for both tris and quads. Tris require 12 vs 8 bytes of index data (assuming you're not using 32-bit indices when you don't need to, which should be pretty much all the time). This doesn't mean 33% extra BW at all; it means about 3% more BW for geometry data that actually needs quads (which is what? maybe subdiv surfs? still not required though), which is not relevant if you look at the whole picture.

In my example it's 12 bytes vs. zero bytes, and not 12 vs. 8, because with support for quads I wouldn't need to index my data at all. Anyway, I still made a mistake in my calculation: it's an increase of 8.3% since my vertices are 36 bytes, and with your size of 32 bytes it would be 9.4%.

If your data's not indexed, then why can't it be converted directly into a non-indexed triangle strip with no overhead?

Edit: Oh hang on, you're using it for sprites, so you'd have to submit in separate prim calls. Though I still doubt that the extra data for the indices is going to make that much difference to your performance, unless of course there's something dodgy with the HW you're running on.

Of course this would be fixed by a programmable prim processor.

John.
 
The big problem is that there isn't much native support.

Converting to triangles isn't equivalent (because the interpolation should employ four barycentrics rather than three). Then there is the non-flat quad problem - and while we are using imprecise arithmetic, almost all quads are non-flat at some level....

I think we're better with just triangles than with quads incorrectly emulated as triangles...
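
The "non-flat at some level" point can be illustrated with a simple coplanarity check based on the scalar triple product. This is a sketch with made-up names; the choice of tolerance is an assumption that any real test would have to tune:

```cpp
#include <cmath>

struct Vec3 {
    double x, y, z;
};

Vec3 sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }

Vec3 cross(Vec3 a, Vec3 b) {
    return {a.y * b.z - a.z * b.y,
            a.z * b.x - a.x * b.z,
            a.x * b.y - a.y * b.x};
}

double dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Scalar triple product of the three edges leaving p0. It is six times
// the signed volume of the tetrahedron p0..p3 and is zero exactly when
// the four points are coplanar.
double planarityError(Vec3 p0, Vec3 p1, Vec3 p2, Vec3 p3) {
    return dot(cross(sub(p1, p0), sub(p2, p0)), sub(p3, p0));
}

// Treats the quad as planar if the error is within `tolerance`.
bool isPlanar(Vec3 p0, Vec3 p1, Vec3 p2, Vec3 p3, double tolerance) {
    return std::fabs(planarityError(p0, p1, p2, p3)) <= tolerance;
}
```

A unit square passes even with a tolerance of zero, but lifting one corner by as little as 1e-9 already fails the exact test and only passes once a tolerance is allowed, which is the situation most quads end up in after transformation with finite precision.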
 
Dio said:
The big problem is that there isn't much native support.

Converting to triangles isn't equivalent (because the interpolation should employ four barycentrics rather than three). Then there is the non-flat quad problem - and while we are using imprecise arithmetic, almost all quads are non-flat at some level....

I think we're better with just triangles than with quads incorrectly emulated as triangles...
I think we're even better off with a primitive type called "quad" that is defined as a 2-triangle-fan. Simple as that, no "ill-defined" problems any more. Just the same as it is in reality in any OpenGL implementation.
 