1st International Symposium on CELL Computing

Jawed · Jul 3, 2006

Ah well, 5 SPEs is a lot. How do you run the rest of the game?...

I think we're stuck trying to guess at where the bottlenecks lie.

Deferred lighting, because it uses multiple render targets, is very bandwidth/fillrate intensive. I can't tell whether this Cell+RSX technique is attacking that problem specifically or whether it's using the simplified data structures that deferred lighting produces as intermediate steps as input into a compute-intensive phase that runs on Cell (leaving RSX to start work on the next frame's render targets).
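To make the two-pass split being guessed at here concrete, a rough sketch follows. It is an illustration only, not the method from the abstract: `geometry_pass`, `lighting_pass` and `gbuffer_bytes` are made-up names, the scene is a flat grey surface, and the render target count and format in the usage note are assumed figures, not anything known about RSX.

```python
# Minimal deferred-lighting sketch (illustrative; all names are hypothetical).
# Pass 1 writes per-pixel attributes to a G-buffer; pass 2 shades from it.
from dataclasses import dataclass


@dataclass
class GBufferTexel:
    normal: tuple  # unit surface normal (x, y, z)
    albedo: tuple  # diffuse colour (r, g, b) in 0..1
    depth: float   # view-space depth


def geometry_pass(width, height):
    """Stand-in for the GPU pass: every pixel sees a flat grey surface facing +z."""
    return [[GBufferTexel((0.0, 0.0, 1.0), (0.5, 0.5, 0.5), 1.0)
             for _ in range(width)] for _ in range(height)]


def lighting_pass(gbuffer, light_dir, light_colour):
    """Compute-heavy pass that reads only the G-buffer.
    Lambert diffuse: albedo * light_colour * max(0, N . L)."""
    out = []
    for row in gbuffer:
        out_row = []
        for t in row:
            n_dot_l = max(0.0, sum(n * l for n, l in zip(t.normal, light_dir)))
            out_row.append(tuple(a * c * n_dot_l
                                 for a, c in zip(t.albedo, light_colour)))
        out.append(out_row)
    return out


def gbuffer_bytes(width, height, targets, bytes_per_target):
    """Bytes written per frame just to fill the G-buffer (assumed layout)."""
    return width * height * targets * bytes_per_target
```

With an assumed 4 render targets of 4 bytes each at 1280x720, `gbuffer_bytes(1280, 720, 4, 4)` comes to 14,745,600 bytes written per frame before any lighting is done, which is the bandwidth/fillrate cost in question. The lighting pass only needs the G-buffer as input, which is why it is at least plausible to run it somewhere other than the GPU.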

Jawed

Titanio · Jul 3, 2006

Jawed said:
Ah well, 5 SPEs is a lot. How do you run the rest of the game?...

Assuming they're free, on the other SPE and PPE?

One wonders how games ever got by before without multiple 3.2GHz processors...

But seriously, I imagine the figure with 5 SPEs is their best result, hence its inclusion in the abstract (perhaps performance doesn't scale with a 6th, or maybe it's doing something else aside from core shader computation). And I'm sure that this would still be of benefit with fewer SPEs - although again, we're stuck guessing as to how performance might scale with different numbers of SPEs and/or how many you would need to employ to see a benefit (if more than one is practically required).

chris1515 · Jul 3, 2006

Titanio said:
Assuming they're free, on the other SPE and PPE? One wonders how games ever got by before without multiple 3.2GHz processors...

But seriously, I imagine the figure with 5 SPEs is their best result, hence its inclusion in the abstract (perhaps performance doesn't scale with a 6th, or maybe it's doing something else aside from core shader computation). And I'm sure that this would still be of benefit with fewer SPEs - although again, we're stuck guessing as to how performance might scale with different numbers of SPEs and/or how many you would need to employ to see a benefit (if more than one is practically required).

They can use 5 SPEs for real-time cutscenes, with one of the remaining SPEs used for sound and the last reserved for the OS.

Titanio · Jul 3, 2006

I've opened a dialogue with the author, btw. I started by confirming two things (that may have been more or less obvious): a) the gop figure is for just the 5 SPUs (no PPE involvement, apparently) and b) the GPU this figure is "comparable with" is a 6800 that they're testing in a workstation.

If anyone has any other questions (Jawed?), I'll relay them..

Jawed · Jul 3, 2006

I suppose a list of the data being shared between RSX and Cell would be a start. Something that's easy to list and lets us fill in the details, or at least scratch our brains.

Jawed

Urian · Jul 3, 2006

I can see a lot of developers using the extra power for postprocessing image FX. It is going to be great if they use some tricks from cinema combined with the real time graphics made by the Shader GPU.

GPU: Renders the image.
5 SPE: Postprocesses the image.
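A rough sketch of what "5 workers postprocess the image" could look like: the rendered frame is cut into row bands and each band is tone mapped independently. Everything here is an assumption for illustration: the function names are invented, Python threads stand in for SPEs, and the simple `v / (1 + v)` Reinhard-style operator is just one example of a post effect.

```python
# Sketch: split a rendered frame into row bands and postprocess each band
# on its own worker. Threads stand in for SPEs; names are hypothetical.
from concurrent.futures import ThreadPoolExecutor


def split_rows(height, workers):
    """Return (start, end) row ranges, spreading any remainder over the first bands."""
    base, extra = divmod(height, workers)
    bands, start = [], 0
    for i in range(workers):
        end = start + base + (1 if i < extra else 0)
        bands.append((start, end))
        start = end
    return bands


def tone_map_band(image, band):
    """Apply v / (1 + v) to every luminance value in one band of rows."""
    start, end = band
    return [[v / (1.0 + v) for v in row] for row in image[start:end]]


def postprocess(image, workers=5):
    """Tone map the whole image, one band per worker, and stitch the bands back."""
    bands = split_rows(len(image), workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = list(pool.map(lambda b: tone_map_band(image, b), bands))
    return [row for part in parts for row in part]
```

Because each band only reads its own rows, the worker count is a free parameter: `postprocess(image, workers=2)` gives the same result as `workers=5`, just with a different split.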

mckmas8808 · Jul 3, 2006

Urian said:
I can see a lot of developers using the extra power for postprocessing image FX. It is going to be great if they use some tricks from cinema combined with the real time graphics made by the Shader GPU.

GPU: Renders the image.
5 SPE: Postprocesses the image.

Why not use 2 or 2.5 SPEs?

Urian · Jul 3, 2006

mckmas8808 said:
Why not use 2 or 2.5 SPEs?

Developers can use as many SPEs as they want; this is part of the Cell architecture's flexibility.

I believe that the T-Buffer idea from 3dfx can be revived, but programmable and with more powerful FX for better image quality.

Shifty Geezer · Jul 3, 2006

These results indicate that a hybrid solution in which the Cell and GPU work together can produce higher performance than either device working alone

You don't say. And in more breaking news, scientists have discovered that if you have two men digging a hole, the hole gets dug faster than one man on his own.

Jawed said:
Ah well, 5 SPEs is a lot. How do you run the rest of the game?...

It was 85Hz though (well, best case figure...). Scaling back to 30 fps it might not be so demanding. But I'd want to see what it's actually bringing to the table. I suppose if you're only rendering visible pixels, you're saving a lot of wasted pixel shading. So if you're not drawing 2 out of 3 pixels from a conventional renderer, that's 3x as much pixel shading joy you can apply per pixel. Or waste a few more cycles on branching shader perhaps. Ultimately, it'll be one more tool for devs to choose when balancing gameplay and visuals.
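The back-of-envelope numbers above can be written out as follows. This is only a sketch under two loud assumptions that nobody has confirmed: that the work scales linearly with frame rate and with SPE count, and that "2 out of 3 pixels not drawn" means only a third of conventionally shaded pixels end up visible. The function names are invented for illustration.

```python
# Back-of-envelope arithmetic for the figures in this post.
# Assumes perfectly linear scaling, which is unverified.

def spe_equivalents(spes_used, measured_hz, target_hz):
    """SPEs' worth of time needed at target_hz if spes_used ran at measured_hz."""
    return spes_used * target_hz / measured_hz


def shading_budget_multiplier(visible_fraction):
    """How much more shading each visible pixel can get if hidden pixels are never shaded."""
    return 1.0 / visible_fraction
```

Under those assumptions, `spe_equivalents(5, 85, 30)` is about 1.76 SPEs' worth of time at 30 fps, and `shading_budget_multiplier(1 / 3)` gives the 3x figure.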

Titanio · Jul 3, 2006

Shifty Geezer said:
It was 85Hz though (well, best case figure...). Scaling back to 30 fps it might not be so demanding. But I'd want to see what it's actually bringing to the table. I suppose if you're only rendering visible pixels, you're saving a lot of wasted pixel shading.

This can be achieved more or less by other means also, though. I think "what it brings to the table" and how valuable that could be, depends on how much load it takes off the GPU beyond that saving.

mckmas8808 · Jul 3, 2006

Titanio said:
This can be achieved more or less by other means also, though. I think "what it brings to the table", depends on how much load it takes off the GPU beyond that saving.

So if it takes say 15 gop (whatever that means) from the RSX because Cell can do it, what will the RSX do instead? Will it do extra stuff that it wouldn't have been doing before?

Fafalada · Jul 4, 2006

Jawed said:
I dare say you could have a lot of fun constructing a tile based deferred renderer in software on Cell, with a deferred lighting engine used to shade each pixel in a tile.

People have done it on PS2; SPEs should be all the more natural a fit.
Of course, with PS2 the idea was that you could leverage FP pixel shading in a system with no pixel shading capabilities at all, so there was at least one compelling reason for research in that direction.

Jawed said:
I can't tell whether this Cell+RSX technique is attacking that problem specifically or whether it's using the simplified data structures that deferred lighting produces as intermediate steps as input into a compute-intensive phase that runs on Cell (leaving RSX to start work on the next frame's render targets).

Abstract specifically refers to pixel shading running on SPEs - so I'd guess it's the latter.

codewarrior · Jul 4, 2006

RSX handles the slow per-pixel displacement mapping and texturing, while Cell does the 128-bit lighting, tone mapping or HDRI, and some post effects.

Arwin · Jul 4, 2006

I just now watched the 10-minute behind-the-scenes video on Warhawk, which is really cool because you really do get behind the scenes, i.e. you briefly see some compilers at work, people working on source code, and so on. And they do mention a lot of things, like the separate programs they develop, and that every time they do graphics stuff, they can choose between doing it on Cell or on RSX depending on what seems best suited for it. It seems, in other words, to touch on the current discussion.

http://www.beyond3d.com/forum/showthread.php?p=786262#post786262

Actually, yes, they are on gametrailers, see the 6 behind the scenes videos.

http://www.gametrailers.com/gamepage.php?id=1681

Then again, most of you probably saw this already.

Acert93 · Jul 4, 2006

Arwin said:
and that every time they do graphics stuff, they can choose between doing them on Cell or on RSX depending on what seems best suited for it. It seems, in other words, to touch on the current discussion.

IIRC, the statement from the Warhawk crew was specifically about vertex shading.

Arwin · Jul 4, 2006

Acert93 said:
IIRC, the statement from the Warhawk crew was specifically about vertex shading.

Yes, it seems you're right, I think they say the pixel shading is all done on the RSX ...

Titanio · Jul 4, 2006

Arwin said:
Yes, it seems you're right, I think they say the pixel shading is all done on the RSX ...

The clouds' pixel shading is done on Cell

But no, they're not arbitrarily shifting around workloads from RSX to Cell etc. They've pin-pointed specific things that are suited to Cell, to do on Cell.

It does raise an interesting question regarding approaches to rendering on Cell, though - do you try for more general-case sharing of work, or do you play to each chip's strengths with the rendering on each (each doing specific things)? I'm hoping both roads will be explored.

sonyps35 · Jul 4, 2006

But Warhawk doesn't look that amazing. So..I dunno.

Do Crysis with this technique and I'll be impressed (to say the least LOL)

inefficient · Jul 4, 2006

Titanio said:
The clouds' pixel shading is done on Cell

But no, they're not arbitrarily shifting around workloads from RSX to Cell etc. They've pin-pointed specific things that are suited to Cell, to do on Cell.

It does raise an interesting question regarding approaches to rendering on Cell, though - do you try for more general-case sharing of work, or do you play to each chip's strengths with the rendering on each (each doing specific things)? I'm hoping both roads will be explored.

From Warhawk lead programmer Bruce Woodard:

"There is actually some overlap in the processing between the RSX and what you can achieve on an SPU. So it's kind of up to us. In terms of where can I skin my geometry - should I put it on the vertex program, or should I put it on the SPU. So we can actually, given the situation, decide where to put these things."

Titanio · Jul 4, 2006

sonyps35 said:
But Warhawk doesn't look that amazing. So..I dunno.

But the clouds do, IMO

They're probably the single best looking thing about the game. Put it this way, the game would probably look worse without this.

Warhawk really just proves the concept, it doesn't represent the best it could offer in other hands.

inefficient - in terms of geometry processing, there's more flexibility there, sure (although you still have to write different code for different chips). But rendering through shading and/or writing out pixels is another kettle of fish.
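For reference, the kind of geometry work in the Woodard quote (skinning) is roughly the following. This is a generic linear blend skinning sketch, not Warhawk's code: the function names and the 3x4 row-major matrix layout are assumptions made for illustration. The maths is the same whether it runs in a vertex program or on an SPU, which is what makes it a movable workload.

```python
# Generic linear blend skinning sketch (not Warhawk's implementation).
# Matrix layout and names are assumptions for illustration.

def transform_point(m, p):
    """Apply a 3x4 affine matrix (three rows of [a, b, c, t]) to a 3D point."""
    x, y, z = p
    return tuple(r[0] * x + r[1] * y + r[2] * z + r[3] for r in m)


def skin_vertex(position, influences, bone_matrices):
    """Blend the vertex as transformed by each influencing bone.
    influences is a list of (bone_index, weight) pairs whose weights sum to 1."""
    out = [0.0, 0.0, 0.0]
    for bone_index, weight in influences:
        tp = transform_point(bone_matrices[bone_index], position)
        for i in range(3):
            out[i] += weight * tp[i]
    return tuple(out)
```

A vertex weighted half to an identity bone and half to a bone translated 2 units along x moves 1 unit along x. Each vertex depends only on its own data plus the bone matrices, so the work splits easily into independent batches.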
