Multi-core CPUs & gaming (Tim Sweeney interview inside)

cristic · Mar 14, 2005

Link.

I found it a little disappointing, meh... too general.

Moderators: sorry if this is the wrong part of the forum to post such things...

Killer-Kris · Mar 15, 2005

Finally, keep in mind that the Windows XP driver model for Direct3D is quite inefficient, to such an extent that in many applications, the OS and driver overhead associated with issuing Direct3D calls approaches 50% of available CPU cycles. Hiding this overhead will be one of the major immediate uses of multi-core.

I had never heard that before. How much truth is there to that?

CMAN · Mar 15, 2005

If Microsoft or the IHVs were able to make Direct3D calls more efficient, could we see large performance gains in CPU-limited situations such as UT2004, etc.?

Pete · Mar 15, 2005

Didn't Bungie complain about huge D3D overhead WRT Halo performance? D3D inefficiency doesn't sound implausible.

Edit: Right, it was probably Gearbox, and it may well have referred to shaders.

AlNom · Mar 15, 2005

Pete said:
Didn't Bungie complain about huge D3D overhead WRT Halo performance? D3D inefficiency doesn't sound implausible.

You mean for Halo PC? I recall Gearbox complaining about it when they were working on "fastShaders".

Geo · Mar 15, 2005

Errm, didn't WinXP performance nearly catch Win98 the last time anyone did an analysis? Surely not 50% range?

Inane_Dork · Mar 15, 2005

Killer-Kris said:
I had never heard that before. How much truth is there to that?

Well, considering the source, I'd say it's accurate to within 5%.

Think about what D3D has to do for you on PC. All the drivers, graphics card hardware and CPU instruction sets added together make the API and HAL functions of D3D take a long, long time (comparatively). Think about it. It's something like Renderware for the myriad of graphics devices out there.

Sweeney has probably been under the Xbox's and Xenon's hoods more than any console before them, so he now probably has a more educated perspective on how optimized D3D can be in an ideal environment.

jvd · Mar 15, 2005

I would think that simply by having the OS and DX on one core and letting the game have full use of the second core, we will see major performance gains.

Then, as time goes on, tools get better and developers get more accustomed to dual cores, tri cores and whatever, the engines and threading will become more and more complex but easier and easier to do.

I mean, once they get over the second-core hump, it should be easy to move to tri- and quad-core CPUs.

Demirug · Mar 15, 2005

Yes, the XP model is very inefficient.

The main problem is the double encode/decode.

First, the runtime encodes everything into a command buffer. This buffer is then sent to the driver (kernel mode). The driver needs to decode everything and re-encode it into a memory block the GPU can read. Finally, the GPU decodes this buffer and executes the commands.

With Longhorn this changes.

The runtime redirects the calls directly to a driver in user mode. This driver can then build the command buffer for the GPU. This saves one encode/decode operation.
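To make the difference concrete, here is a toy Python sketch that only counts the encode/decode passes on each path. It is not real Direct3D or driver code: every function name and the tuple-based "buffer" format are made up for illustration, and real command buffers are binary.

```python
# Toy model of the two driver paths. All names are hypothetical.

def runtime_encode(calls):
    """Runtime packs API calls into a generic command buffer."""
    return [("cmd", name, args) for name, args in calls]


def driver_decode(command_buffer):
    """Kernel-mode driver unpacks the runtime's generic buffer."""
    return [(name, args) for _tag, name, args in command_buffer]


def driver_encode_for_gpu(calls):
    """Driver packs the calls into a GPU-readable buffer."""
    return [("gpu", name, args) for name, args in calls]


def gpu_decode(gpu_buffer):
    """GPU unpacks its buffer and 'executes' the commands."""
    return [(name, args) for _tag, name, args in gpu_buffer]


def xp_path(calls):
    """XP-style path: encode, decode, re-encode, decode (4 passes)."""
    passes = 0
    buf = runtime_encode(calls)
    passes += 1
    decoded = driver_decode(buf)
    passes += 1
    gpu_buf = driver_encode_for_gpu(decoded)
    passes += 1
    executed = gpu_decode(gpu_buf)
    passes += 1
    return executed, passes


def longhorn_path(calls):
    """User-mode driver path: calls go straight into the GPU buffer (2 passes)."""
    passes = 0
    gpu_buf = driver_encode_for_gpu(calls)
    passes += 1
    executed = gpu_decode(gpu_buf)
    passes += 1
    return executed, passes
```

Both paths end up executing the same commands; the second one just skips the intermediate encode/decode pair. The sketch ignores the user/kernel mode switch itself, which is a separate cost.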

The Halo problem was not caused directly by Direct3D. They used the effect framework of the Direct3D extension library (D3DX). The version of this framework they used has very inefficient state management behavior. Newer versions are much better.

AlNom · Mar 15, 2005

Demirug said:
The Halo problem was not caused directly by Direct3D. They used the effect framework of the Direct3D extension library (D3DX). The version of this framework they used has very inefficient state management behavior. Newer versions are much better.

Ah, thanks for clearing that up.

So there's no way to fix this issue externally (besides recoding the game), is there...

Is this part of the reason why some ports from the Xbox seem a little lacking in performance, except when they are OpenGL on PC?

mjtdevries · Mar 15, 2005

Anybody here with a modern SMP configuration who can confirm that large overhead caused by DX?

Seems to me that the overhead of DX will be in different threads than the game itself. So that would mean that you should already get a large performance boost even if you run today's single-threaded games on an SMP config.

I have my doubts whether DX will really have such a big overhead.

It's a shame that Anandtech didn't bother to test the impact of multicore on today's games, using a dual-CPU system. (And that he didn't bother to interview Brad Wardell, who has been using multithreading from the very beginning.)

Demirug · Mar 15, 2005

If you have a single-threaded game, everything runs in that thread.

The only exception is "Software Vertex Processing". If you have a multi-CPU system, the runtime can use both CPUs to do the job.

Pete · Mar 16, 2005

Thanks for the explanations, Demirug.

mjtdevries · Mar 16, 2005

@demirug

When my single-threaded application uses the DX API, the DX code itself could spawn new threads.

Do I understand from your reply that DX itself is also single-threaded?

That would mean that not only game developers will have to work on multithreaded code, but Microsoft has a lot of work to do too.

Demirug · Mar 16, 2005

mjtdevries said:
@demirug

When my single-threaded application uses the DX API, the DX code itself could spawn new threads.

Do I understand from your reply that DX itself is also single-threaded?

That would mean that not only game developers will have to work on multithreaded code, but Microsoft has a lot of work to do too.

DX methods run in the thread that calls them. If you have a single-threaded game, everything is done in that thread.

Because DX uses a state model, you cannot use multiple threads to make rendering faster: at any given time, only one of your threads can change the states and start drawing. If you do this from more than one thread, you will end up in chaos.

The performance problem that we have with DX today is not based on threading. The problem is the encode/decode process and the expensive switches between user and kernel mode. This should change with Longhorn. MS is trying to reduce the overhead to <=10% of the current overhead we see with the XP driver model.

WGF 2.0 could reduce this even more because it should be possible to do more work with fewer calls.
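As a rough back-of-the-envelope check, here is what the <=10% target would mean if you combine it with the roughly 50% overhead figure from the interview. Both numbers are estimates, the function name is made up, and the result only applies to a fully CPU-limited case where nothing else changes.

```python
def frame_speedup(overhead_share, remaining_fraction):
    """Speedup of CPU frame time if only the API/driver overhead shrinks.

    overhead_share: fraction of CPU time spent in API/driver overhead today.
    remaining_fraction: fraction of that overhead still left afterwards.
    """
    new_time = (1.0 - overhead_share) + overhead_share * remaining_fraction
    return 1.0 / new_time


# 50% overhead today, cut down to 10% of its current size:
# new time = 0.5 + 0.05 = 0.55 of the old time.
print(round(frame_speedup(0.5, 0.1), 2))  # prints 1.82
```

So even in the best case this is less than a 2x gain on the CPU side, and a GPU-limited game would see much less.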

ShootMyMonkey · Mar 22, 2005

There's still only one GPU and it's centralized, so you can't really individually control the states of each pipeline within the GPU. So no matter what, DirectX or no DirectX, you'd still have to put all rendering in a dedicated thread, and at best, issue stuff to that thread from various other ones. I still don't entirely like the idea of issuing from multiple threads, especially when you start worrying about things like alpha blending and multi-pass rendering where order is important.

However, there was that old blog entry (I forget whose) where the guy said that MS is supposedly providing full hardware documentation for Xenon. That would mean that developers don't have to use DirectX at all, and can write their own drivers and API layer. Still, that's not something you can do until you have the real hardware on your desk, so no chance of that in the first round of games... but for the future, you never know. I can imagine a lot of people finding it very much worth the effort. I really don't have faith in the idea that MS could make DX run fast enough given the fact that DirectX is supposed to be a catch-all API to suit everybody's needs. You can't please everybody. That's why engines like Gamebryo and Renderware perform so terribly out of the box -- they're too generalized.
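For what it's worth, the "one dedicated render thread, fed by the others" arrangement mentioned above could be sketched roughly like this in Python. This is only a toy under stated assumptions: the "device state" is a plain dict, the command names (`set_state`, `draw`) and all function names are invented, and nothing here talks to a real graphics API.

```python
import queue
import threading


def render_thread(commands, log):
    """Sole owner of the (pretend) device state; drains commands in FIFO order."""
    state = {}
    while True:
        cmd = commands.get()
        if cmd is None:  # sentinel: shut down
            break
        kind, payload = cmd
        if kind == "set_state":
            state.update(payload)
        elif kind == "draw":
            # Record which state the draw actually saw.
            log.append((payload, dict(state)))


def submit_batch(commands, lock, batch):
    """Enqueue a whole batch under a lock so its state changes and draws stay together."""
    with lock:
        for cmd in batch:
            commands.put(cmd)


def run_demo():
    commands = queue.Queue()
    log = []
    lock = threading.Lock()
    renderer = threading.Thread(target=render_thread, args=(commands, log))
    renderer.start()

    batches = [
        [("set_state", {"blend": "opaque"}), ("draw", "terrain")],
        [("set_state", {"blend": "alpha"}), ("draw", "smoke")],
    ]
    workers = [
        threading.Thread(target=submit_batch, args=(commands, lock, batch))
        for batch in batches
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()

    commands.put(None)  # tell the render thread to stop
    renderer.join()
    return log
```

Each draw sees the state set by its own batch, because only the render thread ever touches the state. What the sketch does not give you is a fixed order between batches: whether "terrain" or "smoke" is drawn first depends on thread scheduling, which is exactly the worry with alpha blending and multi-pass rendering.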

DegustatoR · Mar 23, 2005

Demirug said:
Yes, the XP model is very inefficient.

The main problem is the double encode/decode.

First, the runtime encodes everything into a command buffer. This buffer is then sent to the driver (kernel mode). The driver needs to decode everything and re-encode it into a memory block the GPU can read. Finally, the GPU decodes this buffer and executes the commands.

With Longhorn this changes.

The runtime redirects the calls directly to a driver in user mode. This driver can then build the command buffer for the GPU. This saves one encode/decode operation.

Does that mean that we'll see performance improvements in today's D3D games on Longhorn?

cristic · Mar 23, 2005

Most likely.
