games and multi-threading

3dcgi

I've been wondering about how a game would be multi-threaded to take advantage of dual processors or Intel's Hyper-Threading technology. I've read that Quake3 has dual processor support, but that it doesn't really help performance.

Does anyone know what parts of the game are most able to be split into parallel threads? Collision detection, animation, AI, etc. Obviously this will vary somewhat from game to game.

My interest isn't just informational. I've got dual processors for running 3ds max and I hope that someday #2 will be able to help out with games.

Thanks.
 
I have a dual-processor machine and I play Quake 3. I definitely notice a difference: during the busy scenes my FPS doesn't hit the floor.

As for multi-threading in games, games have been multithreaded for some time; it's just a matter of how they're multi-threaded. You have to have the threads run largely independently to "support" multi-processor machines. Otherwise, you'll hurt performance.
 
3dcgi said:
Does anyone know what parts of the game are most able to be split into parallel threads? Collision detection, animation, AI, etc. Obviously this will vary somewhat from game to game.

My interest isn't just informational. I've got dual processors for running 3ds max and I hope that someday #2 will be able to help out with games.
Thanks.

As Saem pointed out, whether you gain performance from separating work into threads essentially boils down to the statistics of the work flow. It's not always worth putting things that seem naturally concurrent at first glance into different threads; it is often better to target things that may not seem so naturally concurrent but are more or less a bottleneck in the work flow. This may require radical redesigns of your algorithms and of control flow in general. You need to be able to distinguish the separable 'work blocks' in the overall work flow (as well as the sub-blocks in the bigger blocks), take them as separate tasks, and then reassemble them in the most parallel-friendly way, preserving the original concept (read: functionality) of the code. Once you've done all that, you just put them in their respective threads and make sure that they actually co-operate :)
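To put a rough number on the point about going after the bottleneck, here is a minimal sketch using Amdahl's law. The post above doesn't name it, so treat it as one common way to estimate this rather than what the poster had in mind. The function name and the example fractions are made up for illustration.

```python
def amdahl_speedup(parallel_fraction: float, num_cpus: int) -> float:
    """Upper bound on speedup when only part of the frame's work is threaded.

    parallel_fraction: share of single-CPU frame time that can run in parallel (0..1).
    num_cpus: number of processors the parallel part is spread across.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if num_cpus < 1:
        raise ValueError("num_cpus must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / num_cpus)


# Hypothetical numbers: threading a task that is 10% of the frame barely helps,
# while threading a bottleneck that is 60% of the frame helps a lot more.
small_task = amdahl_speedup(0.10, 2)   # 1 / 0.95, roughly 1.05x
bottleneck = amdahl_speedup(0.60, 2)   # 1 / 0.70, roughly 1.43x
```

This ignores synchronization cost entirely, so real gains would be lower. It only shows why the size of the block you split off matters more than how "naturally concurrent" it looks.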
 
I found out Dungeon Siege is multi-threaded and there is a definite performance increase with dual processors. I tested by setting the processor affinity after loading the benchmark. The benchmark numbers don't do justice to the true visual difference.

Average frame rate with dual processors - 31.35 fps (vsync was on).
Average frame rate with one processor - 28.28 fps (vsync was on).
The minimum frame rate was 6.00 fps for both runs, but dual processors had far fewer stutters.
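For anyone who wants the relative difference, a quick sketch of the arithmetic on the two averages above. The helper name is just for illustration, and since vsync was on, the averages may understate the gap.

```python
def percent_gain(baseline_fps: float, new_fps: float) -> float:
    """Relative improvement of new_fps over baseline_fps, in percent."""
    return (new_fps / baseline_fps - 1.0) * 100.0


single_cpu_fps = 28.28  # average from the one-processor run above
dual_cpu_fps = 31.35    # average from the dual-processor run above

gain = percent_gain(single_cpu_fps, dual_cpu_fps)
print(f"Dual processors: {gain:.1f}% higher average frame rate")

# The same averages expressed as time per frame.
single_ms = 1000.0 / single_cpu_fps
dual_ms = 1000.0 / dual_cpu_fps
print(f"Average frame time: {single_ms:.1f} ms vs {dual_ms:.1f} ms")
```

That works out to roughly an 11% higher average, which says nothing about how often the frame rate dipped. The stutter difference doesn't show up in an average at all.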

I believe this was with a resolution of 1024x768 or higher. When I tried to manually set the benchmark resolution I got a black screen. Dual processors might make a larger difference at a lower resolution because I'm sure in many cases I'm limited by my G400 Max. My processors are 733MHz.

I still don't know what portions of the game are multi-threaded. Does anyone know?
 
Remember that supporting hyperthreading (which should have huge market saturation by the end of next year thanks to Prescott) and multithreading (which will probably never affect more than a small fraction of the market) are different things.

With multithreading, the point of congestion is main memory access. Your two threads, running on different CPUs, have their own execution units and cache.

With hyperthreading, things are trickier. There's no doubt that you can tease out significantly more performance with two well thought-out threads, but figuring out where the bottlenecks are takes some time. If two threads are banging the integer ALUs at the same time, you're going to see a 0% (or worse) performance benefit. Does anyone know whether VC++ 7.1 will support HT hints and compile-time optimizations?
 
I agree that multithreading and hyperthreading work differently and that their bottlenecks are thus different. However, I believe that just by writing code to take advantage of one method, the other will show some benefit. It just might not be as optimal as possible.

Also, I looked more into how Dungeon Siege is multithreaded and I found this quote on the Planet Dungeon Siege forum.

DS is multithreaded, and will take advantage of multiproc big time. In addition to the primary thread where all the logic and UI runs, we have a world streamer thread that loads and sometimes generates objects and terrain to load, there are a couple sound threads from Miles, and then DirectPlay uses some too (for background network updates I think).

Hyperthreading might help a lot here since the threads are doing very different things.
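A minimal sketch of the "primary thread plus world streamer thread" split described in that quote. This is not Dungeon Siege's actual code. Every name here (world_streamer, run_main_loop, the region strings) is hypothetical, and the loading work is a placeholder.

```python
import queue
import threading
import time


def world_streamer(requests, loaded):
    """Background thread: turn region names into 'loaded' data until told to stop."""
    while True:
        region = requests.get()
        if region is None:  # sentinel: the main thread asked us to shut down
            break
        # Stand-in for slow disk I/O or terrain generation.
        loaded.put((region, f"terrain for {region}"))


def run_main_loop(regions_to_visit):
    """Main thread: keeps running 'frames' and picks up finished loads each frame."""
    requests = queue.Queue()
    loaded = queue.Queue()
    streamer = threading.Thread(target=world_streamer, args=(requests, loaded))
    streamer.start()

    for region in regions_to_visit:
        requests.put(region)  # ask for data ahead of time, do not wait for it
    requests.put(None)

    world = {}
    pending = len(regions_to_visit)
    while pending:
        time.sleep(0.001)  # stand-in for one frame of logic and UI work
        while True:  # drain whatever finished since the last frame
            try:
                region, data = loaded.get_nowait()
            except queue.Empty:
                break
            world[region] = data
            pending -= 1

    streamer.join()
    return world


world = run_main_loop(["farm", "crypt"])
```

The two threads only touch each other through the queues, which is the "largely independent" property mentioned earlier in the thread. Note that standard CPython won't run the two threads' Python code on two CPUs at once, so this shows the structure only, not a speedup.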

I'm still curious if anyone knows of any games that actually multithread the rendering engine instead of just splitting off sound and data handling.
 
Oompa Loompa said:
Remember that supporting hyperthreading (which should have huge market saturation by the end of next year thanks to Prescott) and multithreading (which will probably never affect more than a small fraction of the market) are different things.

With multithreading, the point of congestion is main memory access. Your two threads, running on different CPUs, have their own execution units and cache.

The problem of congestion is severe on systems where the processors share a common bus (Intel typical). It is somewhat reduced on the EV-6 bus, although the CPUs are still connected to the same memory channel. It is greatly reduced on systems like the Hammer, where each added processor also adds an independent memory controller and path, i.e. bandwidth is increased linearly with the number of CPUs. Interlocks and contention are not totally solved, but the problem is pretty much non-existent compared with multiprocessor systems that connect to the same bus and memory. (On Hammer multiprocessor systems, even single-threaded problems will enjoy great benefits from the increased memory subsystem performance.)

Entropy
 
Since almost all games are graphics card limited now, it seems to me that none of this would have much effect. And I'm not sure it's worth it for developers to spend too much extra time implementing it (especially considering that very few people have multiprocessor systems). Not that it hurts anything if they do it, though, and people with multiprocessors deserve to benefit from them.
 
Please point out which games are "graphics card limited"?

I believe the CPU is still the most important part, followed closely by its memory sub-system and the amount of RAM, and then, depending on the game, the video card.
 
Entropy said:
It is greatly reduced on systems like the Hammer, where each added processor also adds an independent memory controller and path, i.e. bandwidth is increased linearly with the number of CPUs.
No argument, but for gaming purposes, expect threading optimizations to be directed almost exclusively at hyperthreading. Intel will probably ship 8 million HT-enabled P4s next year, whereas the SMP market (be it P3, Xeon, K7, or K8) will continue to be a non-gaming niche.

With respect to K8, I hope AMD knows what they are doing. If all they can do is hang on to their existing market, a lot of their effort will have been misplaced. x86-64 is not aimed at the mid-range desktop consumer.
 
Oompa Loompa said:
No argument, but for gaming purposes, expect threading optimizations to be directed almost exclusively at hyperthreading. Intel will probably ship 8 million HT-enabled P4s next year, whereas the SMP market (be it P3, Xeon, K7, or K8) will continue to be a non-gaming niche.

Threads are threads. (Er, um, not really, but for argument's sake :)) Intel's approach allows them to potentially benefit from threading on a single CPU, which is neat. Demonstrated gains so far have been modest to non-existent, but should improve a bit as time goes on. AMD took another approach, and made multiprocessing really simple by putting all the necessary logic on-chip. Building a dual+ hammer board is not much more difficult than building a single-processor MB (although I fully expect dual-CPU boards to come at a premium anyway unless competition heats up in that area). AMD users will only reap the benefits of multithreaded applications if they use a multiprocessor system, but will on the other hand do so more than is possible on a hyperthreaded CPU, making dual-CPU setups more desirable than they are today. So maybe dual systems won't be so unlikely if clawhammerDP prices are modest. Furthermore, a hammer MP system has the additional benefit that the available bandwidth to the CPU increases, thus even single-threaded applications will benefit from being run on a dual+ system, unlike today's typical x86 SMP. That is a tangible, real-world benefit. Gamers will take notice, I'll wager. It would seem that Intel's hyperthreading initiative could actually help drive multiprocessing more into the mainstream.

With respect to K8, I hope AMD knows what they are doing. If all they can do is hang on to their existing market, a lot of their effort will have been misplaced. x86-64 is not aimed at the mid-range desktop consumer.

Hmm. According to AMD roadmaps the hammer is aimed at all traditional market niches, completely replacing the K7 from mobile applications and upwards within a year from its initial launch. Judging from die size, it will be modestly more expensive to manufacture than the current Athlon design; the difference in logic gate count is relatively small. Since it will be manufactured using higher-density processes, yield per wafer is even likely to increase eventually.
AMD will probably do fine with the hammer, unless they run into major manufacturing problems. However, their road to the point where the hammer is a steady source of revenue looks bumpy at the moment.

Obligatory 3D content :) : The ability of the gfx-cards to handle increased amounts of geometry data is growing faster than that of the host CPU/datapath system. It might be worthwhile to discuss how 3D-engines need to be adjusted in order to minimize the amount of data that has to be touched by the CPU.

Entropy
 
Since this is clearly off-topic, I won't comment further except to suggest that it will take some sort of miracle for AMD to drive multiprocessing into the mainstream. IMHO.
 
Oompa Loompa said:
Since this is clearly off-topic, I won't comment further except to suggest that it will take some sort of miracle for AMD to drive multiprocessing into the mainstream. IMHO.

You are probably right, since multiprocessing will always come at some extra cost, and most users don't push the single processing limits at any given time. But it could grow from a small niche to a somewhat wider niche to include gamers since these are people who would actually benefit from the additional performance.

Besides, it's not all that off topic. After all, 3D-engines do not run on the gfx-cards alone. The tendency of this forum to focus on the GPU sometimes becomes a bit myopic. Also IMHO, of course.

Entropy
 
Saem said:
Please point out which games are "graphics card limited"?

I believe the CPU is still the most important part, followed closely by its memory sub-system and the amount of RAM, and then, depending on the game, the video card.

I don't know what games you play but I totally disagree with your assessment. I'd put the memory subsystem at the very bottom. The difference between DDR and PC133 memory in most games is minuscule. The sites show you all those whacked-out graphs running at 640*480 and come to the conclusion that "memory makes a huge difference". Funny how at 1024*768 and above you don't see it. Same with CPU.

I've also seen little or no performance difference between 256 MB and 512 MB. It's just not an issue in almost any game. There was even a test run on some site that showed almost no difference in performance in several games when running with 256 MB and 512 MB of memory.

Right off the top of my head I can name Tribes 2, Serious Sam, Quake 3, Wolfenstein (and almost every game based on Q3 engine), UT2k3 (based on the performance test, on all but the fastest cards) as being graphics card limited. The simple truth is if you have a 1 GHz+ processor these games will run almost identically. Obviously if you run in 640*480 this won't be the case, but now almost everyone runs at 1024 or even 1280.

Hint: if you put a new graphics card in and a game runs faster it's GRAPHICS CARD LIMITED. In a game that is not limited by graphics card (UT) you'll get the same FPS with a Geforce 2 MX as with a Geforce 4. If games are not currently graphics card limited, then why are ATI, Nvidia and Matrox (and others) coming out with faster cards. How do all those Ti4600s sell when all games are CPU limited? Please explain your reasoning...

In my experience you benefit a LOT more from buying a fast graphics card and a mid-range CPU, than buying a high-end CPU and a mediocre graphics card.
 
You make a very good point there. The Geforce 4 is obviously CPU limited, and so is the Geforce 3 on my system, mainly because games get faster with every few MHz increase in processor speed, compared with GPU and gfx card memory increases.
 
Nagorak said:
Right off the top of my head I can name Tribes 2, Serious Sam, Quake 3, Wolfenstein (and almost every game based on Q3 engine), UT2k3 (based on the performance test, on all but the fastest cards) as being graphics card limited. The simple truth is if you have a 1 GHz+ processor these games will run almost identically. Obviously if you run in 640*480 this won't be the case, but now almost everyone runs at 1024 or even 1280.

Hint: if you put a new graphics card in and a game runs faster it's GRAPHICS CARD LIMITED. In a game that is not limited by graphics card (UT) you'll get the same FPS with a Geforce 2 MX as with a Geforce 4. If games are not currently graphics card limited, then why are ATI, Nvidia and Matrox (and others) coming out with faster cards. How do all those Ti4600s sell when all games are CPU limited? Please explain your reasoning...
The point of faster video cards is to allow the user to enable more features and use higher resolutions without sub-par performance. You can make nearly any game on the market video card limited if you increase the resolution or turn on 4x (or more) AA. Does this mean this is the typical case? Hardly.

Also, just because a game runs faster with a faster video card, doesn't mean you are wholly video card limited. Frame rates given by benchmarks are generally the average over a number of frames. Most times, there are places in the benchmark where the CPU is more important than the video card and vice versa.
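A toy model of that point, assuming CPU and GPU work overlap so the slower of the two sets each frame's time. That is a simplification, and the per-frame costs below are invented, not measured from any game.

```python
def frame_time_ms(cpu_ms, gpu_ms):
    """Simplified model: the slower of CPU and GPU sets the frame time."""
    return max(cpu_ms, gpu_ms)


def average_fps(frames, cpu_scale=1.0, gpu_scale=1.0):
    """Average FPS over a run; a scale below 1.0 means that part got faster."""
    total_ms = sum(frame_time_ms(cpu * cpu_scale, gpu * gpu_scale) for cpu, gpu in frames)
    return 1000.0 * len(frames) / total_ms


def worst_frame_ms(frames, cpu_scale=1.0, gpu_scale=1.0):
    """Longest single frame in the run, which is what you feel as a stutter."""
    return max(frame_time_ms(cpu * cpu_scale, gpu * gpu_scale) for cpu, gpu in frames)


# Hypothetical (cpu_ms, gpu_ms) costs: two quiet frames and one busy, CPU-heavy frame.
frames = [(8.0, 20.0), (8.0, 20.0), (40.0, 20.0)]

baseline = average_fps(frames)                   # 3 frames in 80 ms
faster_gpu = average_fps(frames, gpu_scale=0.5)  # 3 frames in 60 ms
faster_cpu = average_fps(frames, cpu_scale=0.5)  # 3 frames in 60 ms

# Both upgrades raise the average by the same amount, but only the faster CPU
# shortens the worst frame (40 ms stays 40 ms with the faster GPU).
gpu_worst = worst_frame_ms(frames, gpu_scale=0.5)
cpu_worst = worst_frame_ms(frames, cpu_scale=0.5)
```

With these made-up numbers the average goes up with either upgrade, so a higher benchmark score from a new video card doesn't by itself show the game was never waiting on the CPU.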
In my experience you benefit a LOT more from buying a fast graphics card and a mid-range CPU, than buying a high-end CPU and a mediocre graphics card.
That might suit your needs, but not everyone's. Read Beyond 3D's 2.53 P4 review for examples like Dungeon Siege or Jedi Knight II (based on the Quake 3 engine, BTW) which are largely CPU limited.
 
I've been impressed that the Quake 3 engine is flexible enough to be both CPU and GPU limited. That's a good thing because it means that no matter which you upgrade your performance improves.
 
OpenGL guy said:
That might suit your needs, but not everyone's. Read Beyond 3D's 2.53 P4 review for examples like Dungeon Siege or Jedi Knight II (based on the Quake 3 engine, BTW) which are largely CPU limited.

Go over to Anandtech, in a recent review they show CPU scaling graphs.

http://www.anandtech.com/showdoc.html?i=1608

Almost all games improved very little from 1200-1700 MHz in speed. I'm talking at most 5-10 FPS, whereas moving to a faster graphics card resulted in a much more noticeable jump. It's not just my opinion that you're better off with a fast graphics card and mid-range system, it's simply the truth.

Let's see here: In the UT performance test there is NO IMPROVEMENT whatsoever with clock speed unless you have a GF4. Obviously the graphics card is more important in this case.

Serious Sam shows no improvement whatsoever with CPU speed increase.

Jedi Knight and Comanche are CPU limited, I'll give you that.

Quake 3 and Wolfenstein do run a little faster as processor speed increases, but just moving up to a faster graphics card results in a much more noticeable improvement. A 1267 MHz w/GF4 is faster than 1733 w/Radeon 8500/GF3 so I don't see how you can claim "processor is more important". If you have less than a GF3 performance doesn't increase AT ALL.

Looking forward to tomorrow's games, they're all going to push more polys. That's going to cause graphics cards to choke and perform equally no matter how fast your processor is.

So yeah, you can upgrade your processor and get a whopping 5 FPS increase. And the point is, IF they did introduce multiprocessor support for many games, it would just propel them straight into the graphics card wall (even if they aren't already hitting it), so you'd see little benefit from it.

Also as far as DUNGEON SIEGE... don't make me laugh. If you have serious performance problems with that game, you must be running on a P200 MMX w/Voodoo 1. Not only do you not need as many FPS in that game as you would in an FPS, but compared to other games it's simple enough that performance shouldn't even be an issue. I'm sure you're right though, a P4 2.7 GHz will give you more FPS in DS... I can only ask, why? By the way, look at why DS is CPU limited: simple low-poly scenes, fairly low resolutions (no higher than 1024).

Anyway, what it really comes down to is multiprocessor systems are <5% of the gaming market (if not <1%), so that's why the feature isn't supported better.
 