The Technology of a 3D Engine - Part One

Davros · Jan 1, 2008

so the only way we will get these massive speedups promised by dx10 is if the dev says "sod dx9" ?

Demirug · Jan 2, 2008

Davros said:
so the only way we will get these massive speedups promised by dx10 is if the dev says "sod dx9" ?

Depends on what you define as "massive speedups". It is possible to write a D3D9/10 engine that takes advantage of the lower call overhead and makes use of other advanced Direct3D 10 features. But to do this, the engine needs to be written with both 9 and 10 in mind. The current engines are based on Direct3D 9, and the Direct3D 10 support was added late.

suryad · Jan 2, 2008

Thanks for the explanations. I am beginning to understand now. I did not look at it that way before.

I think what Davros meant by "massive speedups" is this: if game A is implemented purely in DX 9, and another version of game A is written purely in DX 10, then the DX 10 version with all the new features enabled will look better but use fewer resources and generate higher fps than the DX 9 version. Is that correct?

silent_guy · Jan 2, 2008

suryad said:
I think what Davros meant by "massive speedups" is this: if game A is implemented purely in DX 9, and another version of game A is written purely in DX 10, then the DX 10 version with all the new features enabled will look better but use fewer resources and generate higher fps than the DX 9 version. Is that correct?

Sometimes... It really depends on where your performance bottleneck is located.

If you're completely maxing out on, say, the multipliers in the pixel shaders, then, on the same HW, you'll render that part at the same speed no matter which API you're using.

In practice, a scene has different parts with different properties. Some will be floating point limited, others will be texture unit limited etc.
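To make the bottleneck point above concrete, here is a toy model (a sketch only, not a measurement). It assumes CPU submission and GPU rendering overlap, so the slower of the two sets the frame time. The function name and all the numbers are hypothetical and chosen purely for illustration.

```python
def frame_time_ms(draw_calls, cpu_us_per_call, gpu_ms):
    """Toy model: CPU submission and GPU work overlap, so the slower side wins."""
    cpu_ms = draw_calls * cpu_us_per_call / 1000.0
    return max(cpu_ms, gpu_ms)


# Hypothetical per-draw-call CPU costs in microseconds (assumed, not measured).
HIGH_OVERHEAD_US = 40.0  # stands in for an API with expensive draw calls
LOW_OVERHEAD_US = 10.0   # stands in for an API with cheaper draw calls

# Scene A: shader-limited. The GPU needs 30 ms and there are only 500 draw calls.
gpu_bound_high = frame_time_ms(500, HIGH_OVERHEAD_US, 30.0)
gpu_bound_low = frame_time_ms(500, LOW_OVERHEAD_US, 30.0)

# Scene B: submission-limited. The GPU needs 10 ms but there are 2000 draw calls.
cpu_bound_high = frame_time_ms(2000, HIGH_OVERHEAD_US, 10.0)
cpu_bound_low = frame_time_ms(2000, LOW_OVERHEAD_US, 10.0)

print("GPU-bound scene:", gpu_bound_high, "ms vs", gpu_bound_low, "ms")
print("CPU-bound scene:", cpu_bound_high, "ms vs", cpu_bound_low, "ms")
```

In this model the shader-limited scene takes the same time with either per-call cost, while the submission-limited scene speeds up in proportion to the cheaper calls.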

pthiben · Jan 2, 2008

silent_guy said:
Sometimes... It really depends on where your performance bottleneck is located.

If you're completely maxing out on, say, the multipliers in the pixel shaders, then, on the same HW, you'll render that part at the same speed no matter which API you're using.

In practice, a scene has different parts with different properties. Some will be floating point limited, others will be texture unit limited etc.

A pretty interesting read is here where profiling was done:
http://developer.nvidia.com/docs/IO/8230/BatchBatchBatch.pdf

The hardware it mentions is a bit old now, and it lacks statistics about D3D10 (and Vista), but it's still a good read.

suryad · Jan 3, 2008

Thanks for the link to the pdf. I understand most of it, but as I understand it, the term 'batches' refers to 'jobs for drawing objects'?

pthiben · Jan 4, 2008

suryad said:
Thanks for the link to the pdf. I understand most of it, but as I understand it, the term 'batches' refers to 'jobs for drawing objects'?

On the 1st page, it is written that "Every DrawIndexedPrimitive() is a batch". For non-D3D users, DrawIndexedPrimitive is the "draw call" referred to in the article, i.e. the CPU asking the GPU to draw the triangles (the "objects", as you call them) it prepared for rendering.

suryad · Jan 4, 2008

pthiben said:
On the 1st page, it is written that "Every DrawIndexedPrimitive() is a batch". For non-D3D users, DrawIndexedPrimitive is the "draw call" referred to in the article, i.e. the CPU asking the GPU to draw the triangles (the "objects", as you call them) it prepared for rendering.

Thanks!

Rodéric · Jan 10, 2008

Here's something from Tom Forsyth about Draw Calls (& state changes) costs w/ D3D9:
http://tomsdxfaq.blogspot.com/2006_04_01_archive.html

I know the kernel mode switch is expensive, although it seems it might not be the most expensive operation performed during a Draw Call.

A quick summary of Draw Call costs would be :
D3D9 > OpenGL >= D3D10

(I'd like to spend some time making my own Draw Call speed tests in a real case scenario/game, however I don't have any time atm.)

hoho · Jan 10, 2008

Rodéric said:
A quick summary of Draw Call costs would be :
D3D9 > OpenGL >= D3D10

I wouldn't be so sure about that. I remember one NVidia GPU architect-engineer once said that D3D10 will be much better than D3D9 in terms of draw call cost but it won't be as good as OpenGL. Unfortunately I couldn't find the post where he said it.

Demirug · Jan 10, 2008

hoho said:
I wouldn't be so sure about that. I remember one NVidia GPU architect-engineer once said that D3D10 will be much better than D3D9 in terms of draw call cost but it won't be as good as OpenGL. Unfortunately I couldn't find the post where he said it.

Maybe this person was talking about the overhead of a single call down to the API. As Direct3D 10 still uses a runtime between the application and the driver, there is some additional overhead for bookkeeping and some data conversion.

But developers normally refer to something different when they talk about “Draw call overhead”. This overhead includes the processing time for any pipeline updates that prepare the draw call. Direct3D 10 has an advantage here, as you need fewer API calls to do the same work. Additional API features like state objects and resource views allow the driver to move work from runtime to creation time.

It is not a surprise that OpenGL 3.0 aims for a very similar overall design.
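The point above about state objects can be sketched as a small call-counting model. This is not the Direct3D API: every class and method name here is hypothetical, and the model assumes the blend state has to be re-applied before every draw (for example because a different material was drawn in between). It only counts API calls and per-setting validation work.

```python
class PerCallStateDevice:
    """Hypothetical model: each render state is set and validated individually."""

    def __init__(self):
        self.api_calls = 0
        self.validations = 0
        self.state = {}

    def set_render_state(self, key, value):
        self.api_calls += 1
        self.validations += 1  # validated every time it is set
        self.state[key] = value

    def draw(self):
        self.api_calls += 1


class StateObjectDevice:
    """Hypothetical model: settings are bundled and validated once at creation."""

    def __init__(self):
        self.api_calls = 0
        self.validations = 0
        self.bound = None

    def create_state(self, settings):
        self.api_calls += 1
        self.validations += len(settings)  # all validation happens here, once
        return dict(settings)  # treated as immutable from here on

    def bind_state(self, state_object):
        self.api_calls += 1  # binding does no validation in this model
        self.bound = state_object

    def draw(self):
        self.api_calls += 1


BLEND = {"alpha_blend": True, "src": "src_alpha", "dst": "inv_src_alpha"}
DRAWS = 100

old = PerCallStateDevice()
for _ in range(DRAWS):
    for key, value in BLEND.items():
        old.set_render_state(key, value)
    old.draw()

new = StateObjectDevice()
blend_state = new.create_state(BLEND)  # done once, at load time
for _ in range(DRAWS):
    new.bind_state(blend_state)
    new.draw()

print("per-call states:", old.api_calls, "calls,", old.validations, "validations")
print("state objects:  ", new.api_calls, "calls,", new.validations, "validations")
```

In this model the state-object path needs roughly half the API calls, and the validation cost is paid at creation time instead of once per draw.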

Rys · Jan 11, 2008

Demirug said:
I am not sure what you mean by “format conversion” here.

Gah, I meant to reply to this at the time. I meant changing the representation of a surface on disk to something the GPU likes to consume, before it's sent to the hardware.

dkanter · Jan 19, 2008

Thanks for putting this together. It's a very interesting introduction before I start diving into IEEE papers.

DK
