Vysez said:
Actually, a game has a lot of threads: TnL, 3D sound, complex physics, complex AI, etc. Therefore, a CPU with more than two cores makes a lot more sense on a console than the same multi-core does on a PC, since a PC would very rarely have more than two resource-demanding threads running at the same time (except PC games... if they were optimized, that is).
Yes, a game has a lot of potential threads, but they are not independent - they are synchronous.
In the PC example that I gave, I am running DVD decoding on one CPU and something like IE with a Flash animation on the other. Neither of these has any dependency on the other, hence they can execute completely asynchronously.
In the case of TnL, 3D sound, etc., they all have to operate in a synchronous way. In the most basic example of this, my sound has to be synchronised with the action on screen (i.e. the TnL and graphics).
One result of this is that I can't process an arbitrary amount of work into the future to use my execution resources fully - the total amount of sound processing that I can do is limited by what has happened in the game, which is running at a fixed rate.
What's the result of this? Well, let's say I make the simplest split of work possible - sound on one processing unit (PU), TnL on another, physics on another, etc. We can quickly see that my overall throughput may well become tied to the longest processing time for one task on any single PU - the other PUs are relying on that task finishing to generate the data for the next set of tasks they have to do (I have to know where my monsters will be and what they are doing before I can decide what sounds occur, etc.).
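To make this concrete, here's a minimal sketch in C++ with std::thread (the per-task millisecond costs are made-up placeholders, not measurements): the simplest split puts each subsystem on its own PU, but every frame has to join on all of them, so the frame time is set by the slowest task while the other PUs sit idle for the rest of it.

```cpp
#include <chrono>
#include <cstdio>
#include <thread>

// Hypothetical per-frame workloads; the millisecond costs are invented
// purely to illustrate the imbalance between subsystems.
static void run_physics() { std::this_thread::sleep_for(std::chrono::milliseconds(9)); }
static void run_tnl()     { std::this_thread::sleep_for(std::chrono::milliseconds(5)); }
static void run_sound()   { std::this_thread::sleep_for(std::chrono::milliseconds(2)); }

int main() {
    using clock = std::chrono::steady_clock;
    for (int frame = 0; frame < 3; ++frame) {
        auto start = clock::now();

        // Simplest possible split: one subsystem per processing unit.
        std::thread physics(run_physics);
        std::thread tnl(run_tnl);
        std::thread sound(run_sound);

        // Every PU must finish before the next frame's data can be built,
        // so the frame time is gated by the slowest task (~9 ms here);
        // the sound and TnL threads just wait out the remainder.
        physics.join();
        tnl.join();
        sound.join();

        auto ms = std::chrono::duration_cast<std::chrono::milliseconds>(clock::now() - start).count();
        std::printf("frame %d took ~%lld ms\n", frame, static_cast<long long>(ms));
    }
}
```

Even in this toy version the physics thread gates everything: the sound thread does ~2 ms of work and then waits ~7 ms, which is exactly the starvation described next.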
So this is inherently inefficient - some of my PUs will finish ahead of others and start to starve, which gives me poor utilisation. If my longest task is itself parallelisable then I might be able to mitigate this and rebalance things by using multiple PUs for this task, but if the gating task is serial then I can't.
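That "serial gating task" observation is essentially Amdahl's law; here is a small sketch of the bound it implies (the 30% serial fraction is just an assumed figure for illustration):

```cpp
#include <cstdio>

// Amdahl's law: if a fraction 's' of the frame is inherently serial
// (the gating task), the best possible speedup on n PUs is
// 1 / (s + (1 - s) / n), no matter how many PUs you add.
static double max_speedup(double serial_fraction, int pus) {
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / pus);
}

int main() {
    const double s = 0.30;  // assumed: 30% of the frame is a serial gating task
    for (int n : {1, 2, 4, 8, 16}) {
        std::printf("%2d PUs -> at most %.2fx\n", n, max_speedup(s, n));
    }
    // Even with 16 PUs the speedup tops out near 1/s = 3.3x,
    // because every other PU ends up waiting on the serial task.
}
```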
I might want to dynamically redistribute tasks to get a better execution balance, but now I run into the communication issue - to switch tasks around between PUs I have to move the appropriate state for each task around, i.e. redistribute information. The amount of information this requires may vary from small to very large, depending on the task. If it's very large then I can't redistribute it very often (or maybe not at all).

If all my state for all tasks is static, and I have a large local store on each PU, then I may be able to keep the state for all my different tasks on every PU locally, and switching overhead is low. If my state is dynamic and changing then I may not be able to get around the overhead of copying data around, because I can't keep an up-to-date copy on all my PUs all the time.

If my total state is too large to be held in local memory then I will need to swap out the state for one task to swap in the state for another - and what happens if later I decide I want to run that first task again? Yup - I need to swap it back in, so I may need to add some sort of hysteresis to my distribution of tasks to avoid thrashing back and forth between states, causing unnecessary copies.
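A toy sketch of what that hysteresis might look like (all names and thresholds here are invented for illustration): the balancer only migrates the gating task once the same PU has been the clear bottleneck for several frames in a row, so a one-frame spike doesn't trigger an expensive state copy that immediately has to be undone.

```cpp
#include <cstdio>

// Hysteresis in task redistribution: only migrate a task when one PU has
// been the clear bottleneck for several consecutive frames, so transient
// imbalance doesn't cause task state to ping-pong between PUs.
struct Balancer {
    int    hot_streak   = 0;     // consecutive frames the same PU was slowest
    int    last_hot_pu  = -1;
    double ratio_limit  = 1.25;  // slowest PU must be 25% over the average...
    int    streak_limit = 5;     // ...for at least 5 frames before we migrate

    bool should_migrate(const double* pu_ms, int pu_count) {
        double total = 0.0;
        int    hot   = 0;
        for (int i = 0; i < pu_count; ++i) {
            total += pu_ms[i];
            if (pu_ms[i] > pu_ms[hot]) hot = i;
        }
        const double avg = total / pu_count;
        const bool   over = pu_ms[hot] > avg * ratio_limit;

        if (over && hot == last_hot_pu) ++hot_streak;
        else                            hot_streak = over ? 1 : 0;
        last_hot_pu = hot;

        // Migrating means copying that task's state to another PU,
        // so only do it once the imbalance looks persistent.
        return hot_streak >= streak_limit;
    }
};

int main() {
    Balancer b;
    const double frames[8][3] = {
        {9, 5, 2}, {9, 5, 2}, {4, 5, 9}, {9, 5, 2},  // bottleneck moves around
        {9, 5, 2}, {9, 5, 2}, {9, 5, 2}, {9, 5, 2},  // now it is persistent
    };
    for (int f = 0; f < 8; ++f) {
        const bool migrate = b.should_migrate(frames[f], 3);
        std::printf("frame %d: %s\n", f, migrate ? "migrate hottest task" : "hold");
    }
}
```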
This is why the case of a multi-threaded, multi-tasking OS can map well to multiple cores, whereas trying to parallelise what is, in essence, a single task - "run a game" - can cause a lot more headaches.
Vysez said:
As I said, theoretically and in the absolute, you're right. But personally, I was arguing the subject in the context of today's market and today's R&D problems.
So, in other words, we agree; we're just discussing the subject from different POVs.
Agreed. (At least to some extent.)
The types of opportunities for multiple cores on the PC desktop are very different from those in consoles, and getting good performance boosts is (relatively speaking) a simple thing for multiple independent tasks. So today's market and R&D problems are definitely pushing towards a migration to multiple cores on the modern PC desktop.
This is not at all the same thing as the multi-processing expected in a console environment, and I don't think the pressures are the same, which is why I thought the reference to Intel's current plans was not necessarily relevant to this debate.
- Acknowledging that you can get better performance running independent applications on a PC desktop with multiple cores is one thing - here multiple cores have many advantages, not least that they can actually _avoid_ the context switches that have to occur on a single core when moving from task to task.
- Saying that you can get better performance from multiple cores by extracting parallelism from a single task with multiple dependent communicating subtasks is a different matter, and far more complex. While it's certainly possible, it's not easy.