Tim Sweeney interview on the state of multicore programming on consoles + engine stuff

http://interviews.teamxbox.com/xbox/2169/Epic-Games-Tim-Sweeney-Interview/p1/


In the past, you have commented about the fact that we now have multiple cores in a CPU, but the software has yet to catch up with the hardware; that the traditional programming tools were not designed with multithreading in mind. You also said you were doing some research on that front. Any update on your efforts?

Tim S: Yeah, well, programming for multicore architectures is hard. If we programmers had our way, we’d just program single-threaded applications forever, because it’s much easier. But it’s also clear that there’s an irreversible hardware trend towards multicore, because that’s the only way to deliver maximum power economically. You just can’t keep pushing clock rates ever higher—at some point your CPU becomes a microwave and melts your computer. Multicore is here to stay.

What we’ve found in this generation—and here are some scary numbers—is that writing an engine system designed for multicore, one that can scale to multiple threads efficiently, takes about double the effort of a single-threaded version. It takes double the design effort, implementation effort, lifetime support effort, debugging…all the cost metrics are multiplied by a factor of two for multicore. That’s pretty expensive, but it ends up being bearable.

Whereas some of the other hardware trends are even worse than that. Programming for Cell, we found, had a [five-times] productivity divisor. It’s five times harder, and that really starts to hit…you have to question whether it’s economically viable for mainstream developers to put real effort into it at that point. And then there’s taking a non-graphics algorithm and running it on the graphics processor. Given the limitations of the current languages there, we found that the multiplier is 10x or more, which is well out of the realm of economic viability.


Anyway, there are a few interesting general replies from Tim in there; probably nothing this forum doesn't know, but a good summation of some things, I thought.
 
Found this part interesting:
What do you think of this trend: nVidia bought Ageia, Intel acquired Havok? Do you think physics on a GPU makes sense? I mean, isn’t like you’ll always want all the power of the GPU for your graphics?

Tim S: Physics is a highly parallel task. It’s one of the few things in a game engine that’s so embarrassingly parallel that it scales up to a GPU architecture very easily—almost as easily as graphics. So GPU-based physics is a great thing, and there’ll also be some interesting hardware architectures coming along that can do physics well, as well as physics on multicore CPUs; because it’s parallel, it works everywhere.

We’re really agnostic on that. We’ve been very happy with the Ageia physics API, and we know that other people are equally happy with the Havok API. They’ll be ported to different architectures, and if it’s running on the GPU and achieving excellent performance, we’ll be thrilled with it and take advantage of it in UE3. I think that’s very viable.

What’s more a question is, can you take the other systems in a game engine and run them on the GPU? Because the GPUs are evolving more and more in general computing functionality, while CPUs are, at the same time, moving to more cores and more GPU-like vector units, so at some point there’s going to be this clash where they come together, and it’s not clear which model is going to be superior at that point.

Currently, I see a much clearer and lower-risk path to CPUs taking over graphics and everything else than I see GPUs taking over computing, because CPUs have solved so many of the hard problems of computing properly over the past few decades, and that involved massive engineering efforts: cache-coherency protocols, cache architectures in general, extracting high performance out of random branching code. There’s really a lot of research there, and the GPU makers will have a lot of catching up to do, whereas the CPU people…well, what do they need that the GPU has? Wider vector units and better vector instruction sets are really the only things there. CPU makers definitely have a lead at the moment—but, of course, it all comes down to execution.
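To make "embarrassingly parallel" concrete: when each body's update reads and writes only its own state, a physics step is just an independent loop over bodies, and that loop can be split across cores with no locking at all. A minimal sketch, assuming a toy particle system (the Particle struct and the fixed-chunk threading scheme are illustrative, not how UE3 does it):

Code:
#include <algorithm>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

struct Particle { float x, y, z, vx, vy, vz; };

// Integrate one chunk of particles. No particle reads another particle's
// state, so chunks can run on separate cores with no locking.
static void integrateChunk(std::vector<Particle>& p, std::size_t begin,
                           std::size_t end, float dt) {
    const float g = -9.81f; // gravity along -z (illustrative)
    for (std::size_t i = begin; i < end; ++i) {
        p[i].vz += g * dt;
        p[i].x  += p[i].vx * dt;
        p[i].y  += p[i].vy * dt;
        p[i].z  += p[i].vz * dt;
    }
}

void integrateParallel(std::vector<Particle>& p, float dt, unsigned threads) {
    std::vector<std::thread> pool;
    std::size_t chunk = (p.size() + threads - 1) / threads;
    for (unsigned t = 0; t < threads; ++t) {
        std::size_t begin = t * chunk;
        std::size_t end   = std::min(p.size(), begin + chunk);
        if (begin >= end) break;
        pool.emplace_back(integrateChunk, std::ref(p), begin, end, dt);
    }
    for (std::thread& th : pool) th.join();
}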
 
This part is cool:

Tim S: Wow. Our engine-feature prioritization comes from all three, but our biggest source of inspiration has always been the next major game we’re working on. In this case, these new UE3 features were inspired by Gears of War 2. That’s been in development ever since the first Gears shipped, but we didn’t announce it until recently. The physics-destruction system was designed for that. The soft-body system became a high priority when we planned out the features of the game.

We haven’t said anything about the game yet, except that it exists, but you can extrapolate in your mind what some of these features will mean for [Gears of War 2].

Gears of War 2 getting cool tech confirmed.
 
I think we have a pretty good idea of how they're going to be using the tech from the meat-cube demo. :p

edit: I'll discuss the Gears 2 implications in the games forum...
 
Physics is a highly parallel task.

:LOL:
I wonder if mathematics is also a "highly parallel task"?
What about chemistry?
 
This presentation is about how to make some aspects of the constraint solver and collision solver highly parallel.
It has nothing to do with the funny statement about "parallel physics".

If collision isn't part of physics, I don't know what science to put it under... Partway through the presentation you can clearly see they're sorting objects that collide and do not collide for parallel processing. So what exactly was funny :?: :???:
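For what it's worth, the "sorting" in that kind of presentation is usually constraint batching: contacts get grouped so that no two contacts in the same batch touch the same body, and then all the contacts within one batch can be solved in parallel. A rough sketch of the greedy batching step (the Contact struct and body indexing are made-up illustrations, not taken from the presentation):

Code:
#include <cstddef>
#include <utility>
#include <vector>

struct Contact { int bodyA, bodyB; /* plus normal, penetration depth, ... */ };

// Greedily split contacts into batches such that no two contacts in the
// same batch share a body. Within one batch, every contact touches a
// disjoint pair of bodies, so the whole batch can be solved in parallel;
// the batches themselves run one after another.
std::vector<std::vector<Contact>> batchContacts(
        const std::vector<Contact>& contacts, int numBodies) {
    std::vector<std::vector<Contact>> batches;
    std::vector<Contact> remaining = contacts;
    while (!remaining.empty()) {
        std::vector<char> used(numBodies, 0); // bodies claimed by this batch
        std::vector<Contact> batch, deferred;
        for (const Contact& c : remaining) {
            if (!used[c.bodyA] && !used[c.bodyB]) {
                used[c.bodyA] = used[c.bodyB] = 1;
                batch.push_back(c);
            } else {
                deferred.push_back(c); // conflicts; goes in a later batch
            }
        }
        batches.push_back(std::move(batch));
        remaining = std::move(deferred);
    }
    return batches;
}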
 
Perhaps you could let the rest of us in on the joke? I see nothing amusing there.

Say, chemistry, isn't it also a "highly parallel task"?

Not to mention that most of the algorithms used to solve physical systems are ODE/PDE solvers, which are not that "parallel" at all.
You can make them parallel by using simpler, more straightforward (and therefore slower) algorithms.
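That trade-off shows up clearly in iterative linear solvers, which is what many constraint and PDE problems reduce to. Gauss-Seidel uses values already updated within the current sweep, so it converges in fewer sweeps, but the loop carries a dependency; Jacobi reads only the previous iterate, so every row is independent and trivially parallel, at the price of more sweeps. A minimal sketch of one sweep of each for Ax = b (dense matrix purely for illustration):

Code:
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<Vec>; // dense row-major matrix, illustration only

// Gauss-Seidel sweep: x[i] uses x[0..i-1] already updated in *this* sweep,
// so the outer loop carries a dependency and cannot simply be parallelized.
void gaussSeidelSweep(const Mat& A, const Vec& b, Vec& x) {
    for (std::size_t i = 0; i < x.size(); ++i) {
        double s = b[i];
        for (std::size_t j = 0; j < x.size(); ++j)
            if (j != i) s -= A[i][j] * x[j];
        x[i] = s / A[i][i];
    }
}

// Jacobi sweep: reads only the previous iterate xOld, so every row is
// independent and the outer loop parallelizes trivially -- but it
// typically needs more sweeps to converge than Gauss-Seidel.
void jacobiSweep(const Mat& A, const Vec& b, const Vec& xOld, Vec& xNew) {
    for (std::size_t i = 0; i < xOld.size(); ++i) { // safe to split across cores
        double s = b[i];
        for (std::size_t j = 0; j < xOld.size(); ++j)
            if (j != i) s -= A[i][j] * xOld[j];
        xNew[i] = s / A[i][i];
    }
}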
 
:???: Physics = physical interactions, a readily parallel problem that comes under the Physics school of science.
 
:???: Physics = physical interactions, a readily parallel problem that comes under the Physics school of science.

Err, lighting is also "physical interaction". Basically everything you know about is a "physical interaction".

And, as I've already said: it's not parallel.
 
Err, lighting is also "physical interaction". Basically everything you know about is a "physical interaction".
I was explaining the use of the term physics in computer games. Your definition is inappropriate for the context, and there's nothing silly about Tim Sweeney's or anyone else's comment about physics being a parallel task. In fact, lighting using a physics model is also extremely parallel: you can trace light rays per pixel, in parallel.
 
Yeah, wtf? Lighting (as done in real-time computer graphics) is also an extremely parallel task - hence pixel shaders calculating light on each pixel independently, running in parallel.

Anything where you have a large number of independent objects being run through the same or similar formulas is going to be an easy candidate for parallelism. That can be physics, lighting, shading, I'm sure a bunch of different chemistry calculations, and whatever else you can think of that meets this criterion.
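As a concrete example of "same formula, independent elements": a CPU-side version of per-pixel diffuse lighting has no cross-pixel dependencies, which is exactly why it maps so well onto thousands of GPU threads. A minimal sketch (the types and scene layout are invented for illustration):

Code:
#include <algorithm>
#include <execution>
#include <vector>

struct Vec3 { float x, y, z; };

static float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// One diffuse (Lambert) term per pixel. Each output depends only on that
// pixel's own normal, so the standard library is free to spread the
// transform across all cores (C++17 parallel algorithms).
void shadeDiffuse(const std::vector<Vec3>& normals, Vec3 lightDir,
                  std::vector<float>& out) {
    std::transform(std::execution::par, normals.begin(), normals.end(),
                   out.begin(), [lightDir](Vec3 n) {
                       return std::max(0.0f, dot(n, lightDir));
                   });
}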
 
Parallel Programming

Tim S: Yeah, well, programming for multicore architectures is hard. If we programmers had our way, we’d just program single-threaded applications forever, because it’s much easier.
...
What we’ve found in this generation—and here are some scary numbers—is that writing an engine system designed for multicore, one that can scale to multiple threads efficiently, takes about double the effort of a single-threaded version. It takes double the design effort, implementation effort, lifetime support effort, debugging…all the cost metrics are multiplied by a factor of two for multicore. That’s pretty expensive, but it ends up being bearable.
As someone who has been working with parallel programming for much longer than consumer multi-core chips have been available, I actually take a good bit of offense at this section of the interview (or at any statement about anything being 'hard', really, as I usually adopt the mindset of "it's not hard, you just don't understand it well enough to realize how easy it is!"). For the longest time, when I've had to develop single-threaded applications, I've always run into sections of code that I immediately recognize as naturally splitting into separate, independent tasks, and I've always had to force my brain to ignore this and write serial code instead.

I think the main issue most programmers switching over to parallel programming have is that it's just new to them. It requires a completely different mindset than serial programming, with a whole host of considerations that most people just aren't used to thinking about. They write a bunch of code they think should work in parallel, and when they get data corruption, deadlocks, race conditions, et al., they chalk it up to "random, unpredictable errors" and tear their hair out spending hours debugging the simplest code looking for fixes (or 'workarounds' for what they consider to be inherent platform bugs that are really just their own errors in logic). When you've been working with this type of thing for a very long time, these bugs become a rare case rather than the common, everyday case they are for people new to parallelism. They're also fairly obvious and quickly fixed by people with experience: all they have to do is look at what's happening, reduce the list of possible suspects to a handful of options, and have it fixed in about the same amount of time as any other programming bug.
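To make that concrete, here's the classic beginner case: two threads bumping a shared counter with no synchronization. It "mostly works," the total comes up short by a different amount each run, and that run-to-run variation is exactly what gets written off as random. A minimal sketch of the race and the standard fix:

Code:
#include <atomic>
#include <thread>

long unsafeCount = 0;           // shared, unsynchronized
std::atomic<long> safeCount{0}; // shared, race-free

void worker() {
    for (int i = 0; i < 1000000; ++i) {
        ++unsafeCount; // data race: the load/add/store can interleave
        ++safeCount;   // atomic read-modify-write: always the right total
    }
}

int main() {
    std::thread a(worker), b(worker);
    a.join();
    b.join();
    // safeCount ends at exactly 2000000 every run; unsafeCount usually
    // comes up short, by a different amount each time -- "random" only
    // in appearance.
}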

I've gotten to the point where I'm so used to thinking in parallel that I don't personally feel I spend a noticeably longer amount of time developing parallel applications than I would developing the same thing in serial. I'm sure under the surface it requires slightly more effort than purely serial programming, but I certainly wouldn't say it's twice as much. After a while it just seems to come naturally... but maybe that's just me.
 
Developers like Insomniac are probably at the head of all this, and they say some interesting things in their presentation on Resistance 2 development:

"Isn't it harder to program for the SPUs?
No."

"But isn't programming for the SPUs different?
The SPU is not a magical beast only tamed by wizards.
It's just a CPU
Get your feet wet. Code something.

Highly Recommend Linux on the PS3!"

Conclusions

* It's not that complicated.
* Good data and good design work well on the SPUs (and will work well anywhere).
  - Sometimes you can get away with bad design and bad data on other platforms
  - ...for now. Bad design will not survive this generation.
* Lots of opportunities for optimization.

A PC/360 developer (who worked on the PC/360 game Prey) responded to this presentation:

"Sure you can just about get away with bad code now on the 360"

"Regardless of managed memory or cached memory, the concepts and methods Mike has presented is highly portable. In the case of cached memory, that method results in optimized cache locality and cache utilization (something extremely important when multiple threads are sharing L1 on a single core, and multiple cores are sharing L2), and a predictable way to optimally prefetch. Good data locality, minimal sync points, branch elimination, and vectorization are all required to be able to extract great performance out of the 360 as well."

"Multi-processing is not new
Trouble with the SPUs usually is just trouble with multi-core.
You can't wish multi-core programming away. It's part of the job." http://www.insomniacgames.com/tech/articles/0208/files/insomniac_spu_programming_gdc08.ppt
 