The Technology of a 3D Engine - Part One

so the only way we will get these massive speedups promised by dx10 is if the dev says "sod dx9" ?

That depends on how you define "massive speedups". It is possible to write a D3D9/10 engine that takes advantage of the lower call overhead and makes use of other advanced Direct3D 10 features. But to do this, the engine needs to be written with both 9 and 10 in mind. The current engines are based on Direct3D 9, and Direct3D 10 support was added late.
 
Thanks for the explanations. I am beginning to understand now. I did not look at it that way before.

I think what Davros meant by "massive speedups" is this: if game A is implemented purely in DX 9, and another version of game A is written purely in DX 10, then the DX 10 version with all the new features enabled will look better, use fewer resources, and generate higher fps than the DX 9 version. Is that correct?
 
Sometimes... It really depends on where your performance bottleneck is located.

If you're completely maxing out on, say, the multipliers in the pixel shaders, then, on the same HW, you'll render that part at the same speed no matter which API you're using.

In practice, a scene has different parts with different properties. Some will be floating point limited, others will be texture unit limited etc.
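
To make the bottleneck argument concrete, here is a toy model (not a measurement). It assumes each part of a scene takes as long as its slowest stage, so cutting API overhead only helps the parts that were limited on the CPU side. The stage names, the `cpu_scale` parameter and all numbers are made up for illustration.

```python
# Toy model: a scene part finishes when its slowest stage finishes.
# Stage names and costs are hypothetical, chosen only for illustration.

def part_time(cpu_submit_ms, shader_alu_ms, texture_ms):
    """Time for one scene part, assuming the stages overlap perfectly."""
    return max(cpu_submit_ms, shader_alu_ms, texture_ms)


def frame_time(parts, cpu_scale=1.0):
    """Sum of part times; cpu_scale models a cheaper API (0.5 = half the call overhead)."""
    return sum(part_time(cpu * cpu_scale, alu, tex) for cpu, alu, tex in parts)


# (cpu_submit_ms, shader_alu_ms, texture_ms) per scene part
scene = [
    (4.0, 1.0, 1.0),  # many small draw calls: limited by CPU/API overhead
    (1.0, 6.0, 2.0),  # heavy pixel shader math: limited by the ALUs
    (1.0, 2.0, 5.0),  # many texture fetches: limited by the texture units
]

before = frame_time(scene)                 # 4 + 6 + 5
after = frame_time(scene, cpu_scale=0.5)   # 2 + 6 + 5
print(before, after)
```

In this sketch only the first part gets faster when the API overhead is halved; the shader-limited and texture-limited parts take exactly as long as before.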
 
A pretty interesting read is here where profiling was done:
http://developer.nvidia.com/docs/IO/8230/BatchBatchBatch.pdf

The hardware it mentions is a bit old now, and it lacks statistics for D3D10 (and Vista), but it's still a good read :)
 
Thanks for the link to the pdf. I understand most of it, but does the term 'batches', as I understand it, refer to 'jobs for drawing objects'?
 
The first page says that "Every DrawIndexedPrimitive() is a batch". For non-D3D users, DrawIndexedPrimitive is the "draw call" referred to in the article, i.e. the CPU asking the GPU to draw the triangles (the "objects", as you call them) it has prepared for rendering.
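
As a rough illustration of why the number of batches matters, the sketch below counts draw calls for a handful of objects, first with one call per object and then with objects that share a material merged into one call. It is only a model: the object list, the material names and the `batch_by_material` helper are hypothetical, and real batching also requires the merged objects to share vertex format, textures and so on.

```python
from collections import defaultdict

# Hypothetical scene: (object name, material, triangle count)
objects = [
    ("rock1", "stone", 200),
    ("rock2", "stone", 180),
    ("tree1", "bark", 500),
    ("rock3", "stone", 220),
    ("tree2", "bark", 450),
]


def batch_by_material(objs):
    """Merge objects that share a material into one batch (one draw call each)."""
    batches = defaultdict(int)
    for _name, material, triangles in objs:
        batches[material] += triangles
    return dict(batches)


unbatched_calls = len(objects)          # one draw call per object
batches = batch_by_material(objects)
batched_calls = len(batches)            # one draw call per material

total_triangles = sum(tri for _n, _m, tri in objects)
print(unbatched_calls, batched_calls, batches)
```

The GPU draws the same number of triangles either way; what changes is how many times the CPU has to ask.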
 
Thanks!
 
Here's something from Tom Forsyth about the cost of draw calls (and state changes) with D3D9:
http://tomsdxfaq.blogspot.com/2006_04_01_archive.html

I know the kernel mode switch is expensive, although it seems it might not be the most expensive operation performed during a draw call.

A quick summary of draw call costs would be:
D3D9 > OpenGL >= D3D10

(I'd like to spend some time making my own Draw Call speed tests in a real case scenario/game, however I don't have any time atm.)
 
I wouldn't be so sure about that. I remember one NVidia GPU architect-engineer once said that D3D10 will be much better than D3D9 in terms of draw call cost but it won't be as good as OpenGL. Unfortunately I couldn't find the post where he said it.

Maybe this person was talking about the overhead of a single call down to the API. As Direct3D 10 still uses a runtime between the application and the driver, there is some additional overhead for bookkeeping and some data conversion.

But developers normally refer to something different when they talk about "draw call overhead". This overhead includes the processing time for any pipeline updates that prepare the draw call. Direct3D 10 has an advantage here, as you need fewer API calls to do the same work. Additional API features like state objects and resource views allow the driver to move work from runtime to creation time.
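
A small model of that last point, loosely in the spirit of per-state calls such as SetRenderState in Direct3D 9 versus a prebuilt blend state object in Direct3D 10. This is not the real API: the `PerStateDevice` and `StateObjectDevice` classes, the `validate` helper and the counters are all hypothetical, and they only count how many calls and validations each style needs.

```python
VALID_FACTORS = {"zero", "one", "src_alpha", "inv_src_alpha"}


def validate(key, value):
    """Stand-in for the checking a runtime/driver does on a state value."""
    if key in ("src", "dst") and value not in VALID_FACTORS:
        raise ValueError(f"bad blend factor: {value}")
    if key == "enabled" and not isinstance(value, bool):
        raise ValueError("enabled must be a bool")


class PerStateDevice:
    """D3D9-like model: one call per state, validated every time it is set."""
    def __init__(self):
        self.api_calls = 0
        self.validations = 0

    def set_render_state(self, key, value):
        self.api_calls += 1
        self.validations += 1
        validate(key, value)


class StateObjectDevice:
    """D3D10-like model: validate once at creation, bind with a single call."""
    def __init__(self):
        self.api_calls = 0
        self.validations = 0

    def create_blend_state(self, enabled, src, dst):
        self.api_calls += 1
        for key, value in (("enabled", enabled), ("src", src), ("dst", dst)):
            self.validations += 1
            validate(key, value)
        return (enabled, src, dst)  # immutable, already validated

    def set_blend_state(self, state):
        self.api_calls += 1  # nothing left to check at draw time


DRAWS = 1000

old = PerStateDevice()
for _ in range(DRAWS):
    old.set_render_state("enabled", True)
    old.set_render_state("src", "src_alpha")
    old.set_render_state("dst", "inv_src_alpha")

new = StateObjectDevice()
alpha_blend = new.create_blend_state(True, "src_alpha", "inv_src_alpha")
for _ in range(DRAWS):
    new.set_blend_state(alpha_blend)

print(old.api_calls, old.validations, new.api_calls, new.validations)
```

In this model the per-state style pays for three calls and three validations on every draw, while the state object style pays for the validation once, when the object is created.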

It is not a surprise that OpenGL 3.0 aims for a very similar overall design.
 
I am not sure what you mean by "format conversion" here.
Gah, I meant to reply to this at the time. I meant changing the representation of a surface on disk to something the GPU likes to consume, before it's sent to the hardware.
 
Thanks for putting this together. It's a very interesting introduction before I start diving into IEEE papers.

DK
 