Direct3D versus OpenGL

It depends on what you mean by DirectX. If you mean DX9, then I'd say OGL is better, but not so with DX10. DX10 incorporates most of the features that make OGL better (GPU sharing, very little kernel-mode code) and has a much more streamlined API, but it's categorically not extensible, it's non-portable, and it's not a public industry standard. Thus DX10 is slightly better from a technical perspective, but slightly worse from a cross-platform/maintainability perspective.

However, this advantage will not last. First of all, control of OpenGL will be transferred from the ARB to the Khronos Group by the end of this year, which will help ensure faster updates and stronger marketing. Second, OpenGL has excellent profiling, at least with NVIDIA, thanks to the latest versions of NVPerfHUD. Most importantly, there are already plans for an OGL 3.0 that breaks backwards compatibility and is far more sophisticated than DX10. Compatibility for existing applications is still maintained by separating the API into legacy and current profiles, which the application selects by how it loads OGL. Anyway, the proposed API fully abstracts all state into well-structured objects, exposes shaders for geometry, framebuffer sampling and texture sampling, offers more types of interpolants, adds instancing, and throws in some more sophisticated data formats/arrays. It will be rolled out as OGL extensions, starting as soon as possible, until the final API is decided on.

http://www.gamedev.net/columns/events/gdc2006/article.asp?id=233
 
I looked at Q3 source and it seemed to have glVertex3f() scattered in several places. From what I read, D3D is basically 1000 batches per frame max. Does such a limit exist for OpenGL?
 
SoftwareGuy256 said:
I looked at Q3 source and it seemed to have glVertex3f() scattered in several places. From what I read, D3D is basically 1000 batches per frame max. Does such a limit exist for OpenGL?

Not exactly.
OpenGL doesn't have the small-batch problem in the same way.
It's still a good idea to limit the number of batches submitted, but it's nowhere near as crucial.
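
To illustrate the contrast (a minimal sketch; the function names and the flat vertex layout are mine): immediate mode pays API overhead per vertex, whereas a vertex array submits the same geometry as a single batch.

#include <GL/gl.h>

/* Immediate mode, as in the Q3 renderer: one API call per vertex,
   so submission cost scales with vertex count. */
static void draw_immediate(const float *verts, int tri_count)
{
    int i;
    glBegin(GL_TRIANGLES);
    for (i = 0; i < tri_count * 3; ++i)
        glVertex3fv(&verts[i * 3]);
    glEnd();
}

/* Vertex arrays (core since GL 1.1): the whole mesh goes down as
   one batch, so per-call overhead stops mattering. */
static void draw_batched(const float *verts, int tri_count)
{
    glEnableClientState(GL_VERTEX_ARRAY);
    glVertexPointer(3, GL_FLOAT, 0, verts);
    glDrawArrays(GL_TRIANGLES, 0, tri_count * 3);  /* one batch */
    glDisableClientState(GL_VERTEX_ARRAY);
}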
 
SoftwareGuy256 said:
I looked at Q3 source and it seemed to have glVertex3f() scattered in several places. From what I read, D3D is basically 1000 batches per frame max. Does such a limit exist for OpenGL?
No. OpenGL has always had a proper driver model, so the API overhead was entirely up to the implementors (ATI, NVIDIA et al) and in practice is much much lower than Direct3D, which is making switches to kernel mode and back all the time for no good reason.

Of course this has never been relevant up until now, but because Microsoft has promised to fix this glaring flaw with Vista, it's now the most important thing ever :rolleyes:

Beware of people bullshitting you into buying Vista for its "DX9L support to make your old games run even smoother". Trust me, it will happen soon. Those are the same people that used to say OpenGL is "for workstation apps, not for games".
 
zeckensack said:
No. OpenGL has always had a proper driver model, so the API overhead was entirely up to the implementors (ATI, NVIDIA et al) and in practice is much much lower than Direct3D, which is making switches to kernel mode and back all the time for no good reason.

Of course this has never been relevant up until now, but because Microsoft has promised to fix this glaring flaw with Vista, it's now the most important thing ever :rolleyes:

Beware of people bullshitting you into buying Vista for its "DX9L support to make your old games run even smoother". Trust me, it will happen soon. Those are the same people that used to say OpenGL is "for workstation apps, not for games".

No. The fixes in Vista have nothing to do with unnecessary calls to the kernel. I don't know why the myth persists that D3D calls the kernel more often than OpenGL. They both buffer commands into command buffers in user mode and then send them to the kernel. They may choose to use different buffer sizes, but fundamentally they do the same thing. Where they differ is that the D3D runtime on XP builds machine-independent buffers and the IHV driver translates them into machine-specific tokens in kernel mode; on Vista the runtime calls a user-mode driver to build machine-specific buffers, avoiding that translation step.

The catch is that IHV drivers MUST ensure that these user-mode machine-specific command buffers do not contain operations that can compromise the security or robustness of the overall system (e.g., DMA access to arbitrary system memory). It remains to be seen whether OpenGL drivers ever provided that level of security, but it is a fundamental requirement for Vista to improve stability.
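
As a toy sketch of that shared buffering pattern (every name below is invented for illustration; real runtimes and drivers are vastly more involved):

#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { CMD_BUF_SIZE = 64 * 1024 };

struct cmd_buffer {
    uint8_t data[CMD_BUF_SIZE];
    size_t  used;
};

/* Stand-in for the one real kernel transition: a single submission
   flushes a whole buffer of accumulated commands. */
void kernel_submit(const void *cmds, size_t bytes);

/* Each API call merely appends a token in user mode. On XP the
   tokens are machine-independent and get translated in kernel mode;
   on Vista the user-mode driver emits hardware-specific tokens, so
   that kernel-mode translation step disappears. */
static void emit(struct cmd_buffer *cb, const void *tok, size_t len)
{
    if (cb->used + len > CMD_BUF_SIZE) {
        kernel_submit(cb->data, cb->used);  /* the rare kernel call */
        cb->used = 0;
    }
    memcpy(cb->data + cb->used, tok, len);
    cb->used += len;
}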

Read the WinHEC slides on the driver model; it has been publicly documented for at least three years now.
 
db said:
No. The fixes in Vista have nothing to do with unnecessary calls to the kernel.

I have the feeling that the D3D9 XP runtime makes more kernel calls than the D3D9 User Mode driver on Vista. I cannot prove this yet but I am working on a benchmark for this.
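
In the meantime, here is the kind of crude timing proxy I have in mind (a sketch only, not a kernel-call counter; it assumes an already-created device with a vertex buffer bound via SetStreamSource/SetFVF, and omits all error handling):

#include <windows.h>
#include <d3d9.h>

/* Flood the API with one-triangle batches and compare the elapsed
   time across the XP and Vista driver models. */
static double time_small_batches(IDirect3DDevice9 *dev, int batches)
{
    LARGE_INTEGER freq, t0, t1;
    int i;
    QueryPerformanceFrequency(&freq);

    IDirect3DDevice9_BeginScene(dev);
    QueryPerformanceCounter(&t0);
    for (i = 0; i < batches; ++i)
        IDirect3DDevice9_DrawPrimitive(dev, D3DPT_TRIANGLELIST, 0, 1);
    IDirect3DDevice9_EndScene(dev);
    IDirect3DDevice9_Present(dev, NULL, NULL, NULL, NULL);
    QueryPerformanceCounter(&t1);  /* includes the Present flush */

    return (double)(t1.QuadPart - t0.QuadPart) / (double)freq.QuadPart;
}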
 
db said:
No. The fixes in Vista have nothing to do with unnecessary calls to the kernel. I don't know why the myth persists that D3D calls the kernel more often than OpenGL.
Maybe because the number of draw calls you can make through Direct3D currently coincides with the speed of thread scheduling? Whichever way you want to explain that, the performance differential is not a myth, no sir.
db said:
They both buffer commands into command buffers in user mode and then send them to the kernel. They may choose to use different buffer sizes, but fundamentally they do the same thing.
The truth is that you can't make any generalizations on how an OpenGL ICD functions, because it isn't specified anywhere. Implementors can thus compete on driver efficiency, and they have produced very good results.
db said:
The catch is that IHV drivers MUST ensure that these user-mode machine-specific command buffers do not contain operations that can compromise the security or robustness of the overall system (e.g., DMA access to arbitrary system memory). It remains to be seen whether OpenGL drivers ever provided that level of security.
Do I smell FUD there? Care to explain how a crash in userland can "compromise the security or robustness of the overall system"?
If you're talking about something with a substantial portion of its code in kernel mode, with BSODs and all those nasty problems, then yeah, I can see a problem there. It just doesn't apply to OpenGL.

And besides, parameter validation doesn't explain the performance characteristics of Direct3D anyway. OpenGL has error handling, too, in case you really didn't know.
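
For the record, a minimal sketch of GL's error model (the wrapper function is mine, purely illustrative):

#include <GL/gl.h>
#include <stdio.h>

/* OpenGL validates parameters too; it just reports asynchronously.
   Errors are latched per context and polled via glGetError(),
   rather than returned per call like a Direct3D HRESULT. */
static void draw_checked(GLint first, GLsizei count)
{
    GLenum err;
    glDrawArrays(GL_TRIANGLES, first, count);
    while ((err = glGetError()) != GL_NO_ERROR)
        fprintf(stderr, "GL error 0x%x\n", err);
}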
 
Demirug said:
I have the feeling that the D3D9 XP runtime makes more kernel calls than the D3D9 User Mode driver on Vista. I cannot prove this yet but I am working on a benchmark for this.
Whenever the 3DCenter forums come back up, you might want to search for a thread of mine called "Schnelle Thread-Kommunikation" ("fast thread communication") or something like that on the programming board. It should have the full source for something that can be used as a thread-scheduling benchmark.
 
zeckensack said:
Whenever the 3DCenter forums come back up, you might want to search for a thread of mine called "Schnelle Thread-Kommunikation" or something like that on the programming board. It should have full source for something that can be used as a thread scheduling benchmark.

Don’t get me wrong, but I don’t want to measure thread scheduling. I want to count the kernel calls to the graphics subsystem.
 
Demirug said:
Don’t get me wrong, but I don’t want to measure thread scheduling. I want to count the kernel calls to the graphics subsystem.
Fair enough. I used it for reference, or rather I was struck by the similarity in performance. Any context switch involves executing some kernel code after all.

So even if you find that there is no switch to kernel code, there are certainly context switches through kernel code.

And the test is relevant for another reason: I reckon the runtime does this context-switching to force serialization for crazily multi-threaded applications. The test is a very naive* implementation of a mechanism that does just that.

*wrongly assumes sane OS implementations of sync primitives and thread scheduling
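
For the curious, a rough reconstruction of that kind of ping-pong test (a naive Win32 sketch, not the original source):

#include <windows.h>
#include <stdio.h>

/* Two-thread ping-pong: each round trip forces two context switches
   through the kernel scheduler, so the per-round cost approximates
   the scheduling overhead being discussed. */

#define ROUNDS 100000

static HANDLE g_ping, g_pong;

static DWORD WINAPI echo_thread(LPVOID arg)
{
    int i;
    (void)arg;
    for (i = 0; i < ROUNDS; ++i) {
        WaitForSingleObject(g_ping, INFINITE);
        SetEvent(g_pong);
    }
    return 0;
}

int main(void)
{
    LARGE_INTEGER freq, t0, t1;
    HANDLE th;
    int i;

    g_ping = CreateEvent(NULL, FALSE, FALSE, NULL); /* auto-reset */
    g_pong = CreateEvent(NULL, FALSE, FALSE, NULL);
    th = CreateThread(NULL, 0, echo_thread, NULL, 0, NULL);

    QueryPerformanceFrequency(&freq);
    QueryPerformanceCounter(&t0);
    for (i = 0; i < ROUNDS; ++i) {
        SetEvent(g_ping);
        WaitForSingleObject(g_pong, INFINITE);
    }
    QueryPerformanceCounter(&t1);

    WaitForSingleObject(th, INFINITE);
    printf("%.1f ns per round trip\n",
           1e9 * (t1.QuadPart - t0.QuadPart) / freq.QuadPart / ROUNDS);
    return 0;
}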
 