SPU usage in games

Shifty Geezer · Jul 6, 2007

patsu said:
We have all of our animations running on the S.P.U.s of the Cell's chip because you couldn't draw armies or basically animate armies

Rest of interview here: http://www.gamepro.com/news.cfm?article_id=110368

Don't really agree with this. What's Lair doing with armies that other systems can't handle? Kameo showed a couple thousand clones with cloned animations, which is all Lair appears to be.

Nesh · Jul 6, 2007

Probably they are talking about efforts to make teams of armies act independently.

Many games showed hordes of enemies on screen before anyways, like Dynasty Warriors.

I am not sure how well LAIR is doing it. Perhaps it's the number of enemies being calculated both in the far distance and in the foreground, doing different things in teams, and all rendered simultaneously.

I think the game that best showcases the SPUs' ability to process huge numbers of enemies independently is HS.

nAo · Jul 6, 2007

I'd like to see RSX and CELL both shading the same image in a deferred renderer, with each processor working on different screen areas + load balancing.
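A minimal sketch of the "different screen areas + load balancing" idea, assuming the frame is cut into tiles and each tile has a cost estimate (for example, measured on the previous frame). The names `split_tiles`, `gpu_speed` and `cpu_speed` are hypothetical, and the greedy heuristic is only one possible policy, not something taken from any RSX or CELL toolchain.

```python
def split_tiles(tile_costs, gpu_speed=1.0, cpu_speed=1.0):
    """Greedily assign screen tiles to two processors.

    tile_costs: dict mapping a tile id to an estimated shading cost.
    gpu_speed / cpu_speed: assumed relative throughput of each processor.
    Returns (assignment, finish), where finish is the estimated busy time
    of each processor after all tiles are handed out.
    """
    speed = {"gpu": gpu_speed, "cpu": cpu_speed}
    finish = {"gpu": 0.0, "cpu": 0.0}
    assignment = {"gpu": [], "cpu": []}

    # Most expensive tiles first, each to whichever processor would finish it sooner.
    for tile, cost in sorted(tile_costs.items(), key=lambda kv: kv[1], reverse=True):
        target = min(finish, key=lambda p: finish[p] + cost / speed[p])
        finish[target] += cost / speed[target]
        assignment[target].append(tile)
    return assignment, finish
```

With four tiles costing 4, 3, 2 and 1 units and a GPU assumed to be twice as fast as the CPU side, the sketch gives three tiles to the GPU and one to the CPU, so both finish at roughly the same time.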

archangelmorph · Jul 6, 2007

Titanio said:
Not sure if I agree. We've had a variety of examples from papers and presentations where SPUs were being used, but being idled a lot of the time also. Not of games specifically, because no developer has given us that level of insight, but from various physics demos, the 'lots of ducks' demo etc. and the reasons as to why that happened probably arise in some games too.

You can be sure most of these games indicating x% of SPU usage are probably spreading that over all SPUs, so in such cases you'd be having a lot of idle-time there..most of it would be idle-time, in fact.

Yeah but optimisation allows you to look at this kind of scenario and say "ok well we have all this idle time available at periodic intervals per frame.. so what can we do with it?"

Maybe set up a separate set of software threads which spend some CPU time processing some other sub-system (e.g. cloud simulation) and then get the SPU to check if there's other work available when it's idling..

The size of each interval and the frequency of them per frame determine to some degree the amount of extra work that can be done, but balancing and reshuffling can sometimes help to make room for whatever you need..
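A rough sketch of that idea, assuming the idle gaps in a frame and the background job lengths are known in microseconds. The function name `fill_idle_gaps` and the simple "run the next queued job if it fits" policy are hypothetical illustrations, not how any particular engine schedules its SPUs.

```python
from collections import deque


def fill_idle_gaps(idle_gaps_us, background_jobs_us):
    """Fill each idle gap with queued background jobs, in order, while they fit.

    idle_gaps_us: lengths of the idle intervals in one frame (microseconds).
    background_jobs_us: lengths of low-priority jobs waiting to run.
    Returns (schedule, leftover): the jobs run in each gap, and the jobs
    that did not fit anywhere this frame.
    """
    queue = deque(background_jobs_us)
    schedule = []
    for gap in idle_gaps_us:
        remaining = gap
        ran = []
        # Keep pulling work only while the next job fits in what is left of the gap.
        while queue and queue[0] <= remaining:
            job = queue.popleft()
            remaining -= job
            ran.append(job)
        schedule.append(ran)
    return schedule, list(queue)
```

With gaps of 1000, 500 and 2000 microseconds and jobs of 400, 400, 600 and 1500 microseconds, the 1500 microsecond job never fits behind the 600 one. That is the sort of case where splitting or reshuffling jobs makes room.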

In short, a full-blown production-level game engine would try to squeeze as much work as possible onto each clock cycle.. The only case where this doesn't happen is when all the sub-systems (the ones you planned, designed and accounted for..) are in place and your engine is running as planned.. At this point you're not going to go back and say "ok so what else can we cram into this thing to use up more cycles?" because it does everything you intended, and in reality you probably don't have the time/budget available to do it (especially if the added feature would remain unused since the development schedule doesn't allow enough art time to support it in the game)..

Terarrim · Jul 6, 2007

IGN: Was there anything particular about the PS3 that you could utilize greatly in the game?

Tikkanen: We are pushing the hardware more than any other PSN-game we are aware of. For example the physics, collisions and particle effects run natively on Cell. We control the graphics at the lowest available level for maximum efficiency.

We are able to get over 10,000 active objects with physics and collisions and over 75,000 particles simulated and drawn @60fps. That said, we were unable to use all the available processing power from Cell for this game, so for the next game there are still plenty of reserves left

http://ps3.ign.com/articles/794/794165p2.html

Above is in regard to StardustHD

Killzone 2 quote:

In this talk, we will discuss our approach to face this challenge and how we designed a deferred rendering engine that uses multi-sampled anti-aliasing (MSAA). We will give in-depth description of each individual stage of our real-time rendering pipeline and the main ingredients of our lighting, post-processing and data management. We’ll show how we utilize PS3’s SPUs for fast rendering of a large set of primitives, parallel processing of geometry and computation of indirect lighting. We will also describe our optimizations of the lighting and our parallel split (cascaded) shadow map algorithm for faster and stable MSAA output.

http://www.killzoneunit.com/kz/?p=647

almighty · Jul 6, 2007

nAo said:
I'd like to see RSX and CELL both shading the same image in a deferred renderer, with each processor working on different screen areas + load balancing.

What benefits and gains in performance would Cell bring to different parts of the rendering engine? Gains in pixel? Vertex? Shadows? Or all? And how easy would that split rendering be to implement?

This thread's like a reference library of all SPE usage, sweet

rekator · Jul 6, 2007

nAo said:
I'd like to see RSX and CELL both shading the same image in a deferred renderer, with each processor working on different screen areas + load balancing.

Maybe in your next game?

Titanio · Jul 6, 2007

archangelmorph said:
Yeah but optimisation allows you to look at this kind of scenario and say "ok well we have all this idle time available at periodic intervals per frame.. so what can we do with it?"

Maybe set up a separate set of software threads which spend some CPU time processing some other sub-system (e.g. cloud simulation) and then get the SPU to check if there's other work available when it's idling..

Sure, in fact the Lots of Ducks demo was used as an example of squeezing more out of fewer SPUs via a management system (might have been SPURS) rather than spreading across more SPUs and idling them for longer.

But if you've no plans to use your freed-up SPU power, you might as well leave things as is.
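To make the "fewer, busier SPUs" trade-off concrete, here is a toy sketch that packs per-frame jobs onto as few SPUs as a simple first-fit-decreasing heuristic manages. `pack_jobs` and the numbers are hypothetical; this is not a description of how SPURS or the Lots of Ducks demo actually schedules work.

```python
def pack_jobs(job_costs, budget):
    """Pack jobs onto SPUs using first-fit decreasing.

    job_costs: per-frame cost of each job, in arbitrary time units.
    budget: time available on one SPU per frame, in the same units.
    Returns a list of SPUs, each a list of the job costs placed on it.
    """
    spus = []
    for cost in sorted(job_costs, reverse=True):
        if cost > budget:
            raise ValueError("job does not fit in one frame on a single SPU")
        for jobs in spus:
            # Put the job on the first SPU that still has room this frame.
            if sum(jobs) + cost <= budget:
                jobs.append(cost)
                break
        else:
            # No existing SPU has room, so bring another one into use.
            spus.append([cost])
    return spus
```

Six jobs costing 6, 5, 4, 3, 2 and 2 units against a 16-unit frame budget fit on two SPUs. Spread one job per SPU, the same work would occupy six SPUs that each sit idle for most of the frame.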

nAo said:
I'd like to see RSX and CELL both shading the same image in a deferred renderer, with each processor working on different screen areas + load balancing.

I don't know about the latter bits..working on different screen areas and general load balancing, but Alan Heirich was/is doing work for SCEA on load balancing pixel shading across Cell and RSX in a deferred renderer. I'm sure you remember that though..can't remember if the paper was published or not.

nAo · Jul 6, 2007

almighty said:
What benefits and gains in performance would Cell bring to different parts of the rendering engine? Gains in pixel? Vertex? Shadows? Or all? And how easy would that split rendering be to implement?

This thread's like a reference library of all SPE usage, sweet

CELL could help RSX in rendering G-buffers by removing primitives that don't contribute to the final image. Moreover, CELL could share with RSX the burden of rendering a frame via full-screen passes (remember that this is a deferred renderer).
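As an illustration of "removing primitives that don't contribute", here is a minimal sketch of one common test: dropping back-facing and zero-area triangles after projection to screen space. It assumes counter-clockwise front faces and 2D screen coordinates. The function names are hypothetical, and the post does not say which culling tests would actually run on CELL.

```python
def signed_area_2d(a, b, c):
    """Signed area of a screen-space triangle (positive if counter-clockwise)."""
    return 0.5 * ((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1]))


def cull_triangles(triangles):
    """Keep only triangles that face the viewer and cover some area.

    triangles: iterable of (a, b, c) tuples, each vertex an (x, y) pair.
    Assumes front faces are wound counter-clockwise on screen.
    """
    return [tri for tri in triangles if signed_area_2d(*tri) > 0.0]
```

Every triangle rejected here is one the GPU never has to fetch, set up or rasterise.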

Cheezdoodles · Jul 6, 2007

almighty said:
What benefits and gains in performance would Cell bring to different parts of the rendering engine? Gains in pixel? Vertex? Shadows? Or all?

SPEs are vertex beasts...

nAo · Jul 6, 2007

Titanio said:
I don't know about the latter bits..working on different screen areas and general load balancing, but Alan Heirich was/is doing work for SCEA on load balancing pixel shading across Cell and RSX in a deferred renderer. I'm sure you remember that though..can't remember if the paper was published or not.

LOL! I don't remember that! I'm really getting old...
Anyway..if CELL could also take care of full screen effects then one might overlap CELL and RSX rendering times, with CELL doing full screen effects while RSX is generating the next frame's G-buffers, shadow maps and occlusion terms, at the cost of having one sync point between the 2 processors.
If we also have transparent stuff to render then we probably need to sync processors a couple of times per frame.
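A back-of-the-envelope sketch of what that overlap buys, under stated assumptions: fixed per-frame costs for each processor, two sets of buffers so RSX can run one frame ahead, and a single sync point per frame. The function names and the timing model are hypothetical.

```python
def serial_finish_times(gpu_ms, cell_ms, frames):
    """Frame completion times when the GPU pass and the CELL pass run back to back."""
    return [(n + 1) * (gpu_ms + cell_ms) for n in range(frames)]


def pipelined_finish_times(gpu_ms, cell_ms, frames):
    """Frame completion times when CELL post-processes frame n while the GPU renders frame n+1.

    Assumes two buffer sets: the GPU may start frame n only after CELL has
    finished frame n-2 and released its buffers.
    """
    gpu_done, cell_done = [], []
    for n in range(frames):
        gpu_start = gpu_done[n - 1] if n >= 1 else 0
        if n >= 2:
            gpu_start = max(gpu_start, cell_done[n - 2])  # wait for a free buffer set
        gpu_done.append(gpu_start + gpu_ms)

        cell_start = gpu_done[n]  # sync point: CELL needs the GPU's output
        if n >= 1:
            cell_start = max(cell_start, cell_done[n - 1])
        cell_done.append(cell_start + cell_ms)
    return cell_done
```

With 12 ms of GPU work and 8 ms of CELL work per frame, the serial version delivers a frame every 20 ms. The pipelined one settles at a frame every 12 ms, the slower of the two stages, with each frame's full-screen pass running alongside the next frame's GPU work.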

almighty · Jul 6, 2007

nAo said:
Anyway..if CELL could also take care of full screen effects then one might overlap CELL and RSX rendering times, with CELL doing full screen effects while RSX is generating the next frame's G-buffers, shadow maps and occlusion terms, at the cost of having one sync point between the 2 processors.
If we also have transparent stuff to render then we probably need to sync processors a couple of times per frame.

That sounds like one pain in the ass to get working properly

patsu · Jul 6, 2007

Shifty Geezer said:
Don't really agree with this. What's Lair doing with armies that other systems can't handle? Kameo showed a couple thousand clones with cloned animations, which is all Lair appears to be.

F5's comment is (has always been) controversial. I think they are referring to the entire army of little men, wooden boats, flying dragons and gigantic monsters (on both sides) all at the same time, and their approach to the problem. The larger monsters should be complex to process (and I'm curious about their geometry work). As for AI, the little men seem repetitive, but I don't know how well the bosses behave.

nAo · Jul 6, 2007

almighty said:
That sounds like one pain in the ass to get working properly

totally agree

but I guess it can make sense for some games

Cheezdoodles · Jul 6, 2007

Shifty Geezer said:
Don't really agree with this. What's Lair doing with armies that other systems can't handle? Kameo showed a couple thousand clones with cloned animations, which is all Lair appears to be.

Hey, NNN has up to 1-2k enemies on screen, with individual AI. Shitty game, but still, I agree with you, Lair hasn't shown anything that could only be done on the Cell.

almighty · Jul 6, 2007

Ostepop said:
Lair hasn't shown anything that could only be done on the Cell

What about the sheer number of on-screen objects and their AI, and the culling it's doing as well? Surely that takes some serious horsepower?

Titanio · Jul 6, 2007

nAo said:
LOL! I don't remember that! I'm really getting old...

Here's a link to the abstract if you're interested..

http://www.aurorasoft.net/symposiums/ccs06/presentations.php

Re. Lair and games with 'clones', they're difficult to compare without knowing how much processing is going on per 'clone', so to speak. Although I thought I read Lair has armies of up to 10,000 soldiers...but again, that's difficult to really compare against. I'd hesitate to say anything about "only possible on Cell" at this point...it'll happen over time, no doubt, but I think we'll have to see more of what this generation has to offer before we can draw a line with a good degree of confidence.

betan · Jul 6, 2007

Shifty Geezer said:
Don't really agree with this. What's Lair doing with armies that other systems can't handle? Kameo showed a couple thousand clones with cloned animations, which is all Lair appears to be.

He says sometimes AI can take up to an SPU, implying most of the time it does not. So yes, it looks like the AI part of Lair is certainly doable on other systems. Many individual SPU tasks should be doable on other systems (possibly even more easily).

Titanio · Jul 6, 2007

Coincidentally, Insomniac was asked this question in an interview GameDaily has just put up. They sort of wonder about the merit of sticking a single number on 'usage'.

BIZ: This is not the easiest question to answer, but how much of the PS3's power do you think you were using with Resistance and how much more are you using for Ratchet?

TP: Honestly, we don't know. You can't look at something as complex as the PS3 and put numbers on how much power you're using. What I can say is that when we finished Resistance, we did feel like we had only scratched the surface of what we could do. We were using the SPUs and that's a key to having a game that runs fast on the PS3, but there were a lot of things that we knew we could improve, and we have been improving them on Ratchet. And even with Ratchet, we're still seeing more and more things we can do; it's kind of like peeling off the layers of an onion.

[James Stevenson, Community Relations Manager, chimes in] Our tech team is constantly looking at new ways to optimize the engine. They're finding new ways to utilize things, new ways to get things working better... just look at the difference between the screenshots we've put out between Resistance and Ratchet and you can see we've made a lot of jumps. And we'll continue to refine the engine. Look at the PS2 and God of War II, for example, compared to the first-gen PS2 games. There's a lot more power to unlock on the PS3.

http://biz.gamedaily.com/industry/feature/?id=16729&page=1

patsu · Jul 7, 2007

betan said:
He says sometimes AI can take up to an SPU, implying most of the time it does not. So yes, it looks like the AI part of Lair is certainly doable on other systems. Many individual SPU tasks should be doable on other systems (possibly even more easily).

I can understand Shifty and Ostepop's points. The issue is with F5's statements. They are not detailed/exact enough to articulate the real meat. I am hoping the real game will also include "Behind the Scene" and "Developer Interviews" to elaborate on the finer details.

These guys presented 2 sessions in SIGGRAPH. So it would be great if they can help mortals like me understand their work better.

With the latest SpiderWasp screenshot they released, I am ever more anxious to see the entire game in action. I agree with the notion that Lair is impressive because it can handle so many details all at the same time at a good framerate, even though individually they are doable on existing systems.
