IBM Cell paper from GSPx

chris1515

I'm sorry if this news is old, but IBM have four papers up from presentations at GSPx last year:

http://www-306.ibm.com/chips/techlib/techlib.nsf/products/Cell_Broadband_Engine

White paper about FFT:
http://www-306.ibm.com/chips/techlib/techlib.nsf/techdocs/0AA2394A505EF0FB872570AB005BF0F1/$file/GSPx_FFT_paper_legal_0115.pdf

White paper about Cell and DRM:
http://www-306.ibm.com/chips/techlib/techlib.nsf/techdocs/3F88DA69A1C0AC40872570AB00570985/$file/GSPx_CellSecurityArch.ibm.pdf

White paper about An Implementation of the Feldkamp Algorithm for Medical Imaging on Cell:
http://www-306.ibm.com/chips/techlib/techlib.nsf/techdocs/5B1968BDD8D11639872570AB005A3A39/$file/GSPx2005paper%20for%20sdk.pdf

Another presentation of the TRE:
http://www-306.ibm.com/chips/techlib/techlib.nsf/techdocs/05CB9A9C5794A5A8872570AB005C801F/$file/2056_IBM_TRE.pdf

A revision to a Peter Hofstee document:
http://www-306.ibm.com/chips/techlib/techlib.nsf/techdocs/D21E662845B95D4F872570AB0055404D/$file/2053_IBM_CellIntro.pdf

All the white papers are dated 31 October. I found the documentation by chance.
 
Nice find, thank you!

The Terrain Rendering Engine paper has a bit more detail than I remember from previous papers. The suggested "future work" is also interesting..

Embed the computed depth information for each pixel in the returned framebuffer. By preloading this depth information into the z-buffer, we can combine the ray-cast scene with GPU rasterized objects.


Level of Detail rendering.

Interactive modifications of the terrain. This will enable us to simulate shock waves, add fluid simulation to the lakes, avalanches to the snow, and other simulations that require real-time change in terrain geometry.

It'd be pretty sweet if they did this and made their work public again. I know we've discussed the potential for sharing rendering work between Cell and a GPU (RSX), but they seem to be looking at a more general route than has usually been considered: compositing full Cell-rendered and GPU-rendered frames (as opposed to Cell taking some completely independent part of rendering like shadows or transparencies).
 
Titanio said:
Nice find, thank you!

The Terrain Rendering Engine paper has a bit more detail than I remember from previous papers. The suggested "future work" is also interesting..



It'd be pretty sweet if they did this and made their work public again. I know we've discussed the potential for sharing rendering work between Cell and a GPU (RSX), but they seem to be looking at a more general route than has usually been considered: compositing full Cell-rendered and GPU-rendered frames (as opposed to Cell taking some completely independent part of rendering like shadows or transparencies).

I've always known about that way of doing it ;)
 
!eVo!-X Ant UK said:
I've always known about that way of doing it ;)


Well of course, I mean it has been proposed before, and it is doable. But I think there has been scepticism about it; there's scepticism about Cell taking on any traditional rendering tasks in games, let alone this. That's why it's interesting to see these guys following that line of thinking, and moreover, it'd be interesting to see their results (even if it is "just" a tech demo).
 
The obvious source of much skepticism is the fact that you need some Cell left over for actual gameplay! This paper suggests compositing GPU objects, without any mention of application in games.

It would seem embedding the data as suggested would allow a GPU to composite objects over the top of the Cell terrain rendering, but it wouldn't free up any Cell resources. So in games, a raytracing method combined with GPU scanline rendering is still going to be a processing sink gobbling up processor resources. E.g. for a flight sim, where this terrain would be ideally suited, you could add RSX particles and plane models (let's say lots of space aliens instead of a one-on-one dogfight), but there wouldn't be any Cell capacity left over for gameplay, AI and audio (unless RSX does the audio??). Perhaps some workarounds, like rendering only every second pixel in a checkerboard pattern and interpolating the missing pixels, could find their way into games though, such that this terrain rendering could be applied to Magic Carpet's next-gen sequel?
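
To make the checkerboard workaround concrete, here's a toy sketch in plain C. Nothing here is from the paper, and trace_terrain() is just a made-up stand-in for the expensive per-pixel terrain ray cast: cast rays for half the pixels, then reconstruct the skipped ones from their cast neighbours.

/* Toy sketch of the checkerboard idea: ray-cast only half the pixels,
   then reconstruct the rest from their cast neighbours. trace_terrain()
   is a made-up stand-in for the expensive per-pixel terrain ray cast. */
#include <stdio.h>
#include <stdint.h>

#define W 8
#define H 8

/* Placeholder for the per-pixel ray cast; returns a grey level. */
static uint8_t trace_terrain(int x, int y)
{
    return (uint8_t)((x * 31 + y * 17) & 0xFF);
}

int main(void)
{
    uint8_t frame[H][W] = {{0}};

    /* Pass 1: cast rays only where (x + y) is even (checkerboard). */
    for (int y = 0; y < H; y++)
        for (int x = 0; x < W; x++)
            if (((x + y) & 1) == 0)
                frame[y][x] = trace_terrain(x, y);

    /* Pass 2: fill each skipped pixel by averaging its cast neighbours. */
    for (int y = 0; y < H; y++)
        for (int x = 0; x < W; x++) {
            if (((x + y) & 1) == 0)
                continue;
            int sum = 0, n = 0;
            if (x > 0)     { sum += frame[y][x - 1]; n++; }
            if (x < W - 1) { sum += frame[y][x + 1]; n++; }
            if (y > 0)     { sum += frame[y - 1][x]; n++; }
            if (y < H - 1) { sum += frame[y + 1][x]; n++; }
            frame[y][x] = (uint8_t)(sum / n);
        }

    /* Dump the tiny frame so the sketch runs end to end. */
    for (int y = 0; y < H; y++) {
        for (int x = 0; x < W; x++)
            printf("%3u ", (unsigned)frame[y][x]);
        printf("\n");
    }
    return 0;
}

You obviously lose detail wherever neighbouring pixels differ sharply, which is the trade-off: half the ray casts for a softer image.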
 
Shifty Geezer said:
The obvious source of much skepticism is the fact that you need some Cell left over for actual gameplay! This paper suggests compositing GPU objects, without any mention of application in games.

It would seem embedding the data as suggested would allow a GPU to composite objects over the top of the Cell terrain rendering, but it wouldn't free up any Cell resources. So in games, a raytracing method combined with GPU scanline rendering is still going to be a processing sink gobbling up processor resources. E.g. for a flight sim, where this terrain would be ideally suited, you could add RSX particles and plane models (let's say lots of space aliens instead of a one-on-one dogfight), but there wouldn't be any Cell capacity left over for gameplay, AI and audio (unless RSX does the audio??). Perhaps some workarounds, like rendering only every second pixel in a checkerboard pattern and interpolating the missing pixels, could find their way into games though, such that this terrain rendering could be applied to Magic Carpet's next-gen sequel?

This is all true. And even though the PPE is not involved in rendering here, we can't necessarily say it's free for gameplay or the like, because one of the PPE threads here is used for "frame preparation and SPE communication".

However, I think it'd be interesting to see what you could do in terms of beyond-GPU rendering activity with, say, 4 SPEs and "some of the PPE", if you want to put it that way, leaving you with some of the PPE and 2 SPEs for the rest (assuming 6 SPEs are available to devs). We should remember that the PPE alone is quite a lot more powerful than the CPUs in the current systems, for example, when considering how much "game" you could afford.

(Of course, perhaps we should be thinking differently: not reserving SPEs for certain tasks but, for example, running the rendering work across all of them for a portion of each frame before switching to non-rendering tasks. It all depends on what would be most efficient or fastest.)
 
It sounds to me like you could use a cluster of SPUs for raycasting, i.e. as a first-pass hidden surface removal algorithm on CELL, then submit the visible geometry to RSX, thus reducing overdraw and saving bandwidth. Kinda coarse deferred shading...
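
Something like this, maybe, as a toy single-core C sketch of the coarse pass (all the objects, numbers and the bounding-sphere test are made up, and there's no actual SPU/RSX code here; the real thing would run the ray loop on the SPUs and hand the surviving list to RSX):

/* Toy sketch of a coarse HSR pass: cast a sparse grid of rays against
   object bounding spheres, flag whatever each ray hits first as visible,
   and build a draw list from the survivors. Everything here is made up;
   it just shows the shape of the idea on one core. */
#include <stdio.h>
#include <math.h>

typedef struct { float cx, cy, cz, r; int visible; } Sphere;

/* Hit distance along a +z ray starting at (ox, oy, 0), or -1 on a miss. */
static float ray_hit(const Sphere *s, float ox, float oy)
{
    float dx = ox - s->cx, dy = oy - s->cy;
    float d2 = dx * dx + dy * dy;
    if (d2 > s->r * s->r) return -1.0f;
    return s->cz - sqrtf(s->r * s->r - d2);  /* front intersection */
}

int main(void)
{
    Sphere objs[] = {
        { 0.0f, 0.0f, 10.0f, 2.0f, 0 },   /* near object                   */
        { 0.0f, 0.0f, 20.0f, 1.5f, 0 },   /* fully occluded by the first   */
        { 6.0f, 0.0f, 15.0f, 2.0f, 0 },   /* off to the side, visible      */
    };
    const int n = sizeof(objs) / sizeof(objs[0]);

    /* Coarse ray grid: far sparser than one ray per pixel. */
    for (float y = -8.0f; y <= 8.0f; y += 1.0f)
        for (float x = -8.0f; x <= 8.0f; x += 1.0f) {
            int best = -1; float best_t = 1e9f;
            for (int i = 0; i < n; i++) {
                float t = ray_hit(&objs[i], x, y);
                if (t >= 0.0f && t < best_t) { best_t = t; best = i; }
            }
            if (best >= 0) objs[best].visible = 1;
        }

    /* "Draw list": only the objects some ray actually saw get submitted. */
    for (int i = 0; i < n; i++)
        if (objs[i].visible)
            printf("submit object %d to the GPU\n", i);
    return 0;
}

The point is just that the output of the pass is a short draw list rather than a frame, which is why it saves overdraw and bandwidth on the GPU side.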
 
It sounds like a little more than HSR though..

we can combine the ray-cast scene with GPU rasterized objects.

It seems to be talking about synthesising the final frame out of CPU and GPU rendered elements.
 
Jaws said:
It sounds to me that you can use a cluster of SPUs for raycasting, i.e. for a first pass hidden surface removal algorithm in CELL.
The TRE isn't performing the same object ray-collisions that a real game scene would need. It's taking predictable data in uniform batches and processing it. The moment you go thrashing around in RAM to see whether or not a ray intersects one of the 300 triangle meshes, I don't know that Cell is any better at a prepass than RSX would be.

To me, it's clear that the points on that paper of future work are just a matter of composition using the depth buffer. Render terrain with raytracing. Render Alien Scum with GPU. Overlay the GPU output onto the terrain image and for each pixel, compare the Z of the Alien Scum object with the Z of the terrain point, and where Alien Scum Z > Terrain Z, show the terrain pixel instead of the Alien Scum pixel.
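
In code it's basically just a per-pixel compare. A bare-bones C sketch, with made-up buffers and depths rather than anything from the paper:

/* Bare-bones depth composite: two colour buffers plus their depth buffers,
   keep whichever sample is nearer at each pixel. All names, sizes and
   depth values are invented for the example. */
#include <stdio.h>
#include <stdint.h>
#include <float.h>

#define W 4
#define H 4
#define NPIX (W * H)

int main(void)
{
    uint32_t terrain_rgb[NPIX], gpu_rgb[NPIX], out_rgb[NPIX];
    float    terrain_z[NPIX],   gpu_z[NPIX];

    /* Fake inputs: terrain at depth 50 everywhere, one "alien" pixel
       nearer (depth 20) and one farther away (depth 80). */
    for (int i = 0; i < NPIX; i++) {
        terrain_rgb[i] = 0x00FF00;            /* green terrain        */
        terrain_z[i]   = 50.0f;
        gpu_rgb[i]     = 0x000000;            /* nothing drawn        */
        gpu_z[i]       = FLT_MAX;
    }
    gpu_rgb[5] = 0xFF0000; gpu_z[5] = 20.0f;  /* alien in front       */
    gpu_rgb[6] = 0xFF0000; gpu_z[6] = 80.0f;  /* alien behind terrain */

    /* The composite: where the GPU object's Z is greater (farther) than
       the terrain's Z, keep the terrain pixel; otherwise keep the GPU one. */
    for (int i = 0; i < NPIX; i++)
        out_rgb[i] = (gpu_z[i] > terrain_z[i]) ? terrain_rgb[i] : gpu_rgb[i];

    for (int i = 0; i < NPIX; i++)
        printf("%06X%s", (unsigned)out_rgb[i], ((i + 1) % W) ? " " : "\n");
    return 0;
}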
 
Shifty Geezer said:
The TRE isn't performing the same object ray-collisions that a real game scene would need. It's taking predictable data in uniform batches and processing it. The moment you go thrashing around in RAM to see whether or not a ray intersects one of the 300 triangle meshes, I don't know that Cell is any better at a prepass than RSX would be.

I wasn't referring to the TRE algorithm. I was referring to an HSR algorithm: a first-pass, coarse HSR algorithm on the SPUs, then let the RSX, with its H/W Z, tackle the scene with fewer Z samples. Otherwise, if the SPUs could do complete HSR, RSX would effectively be a deferred renderer... I dunno if enough SPUs are available for this, though...

Shifty Geezer said:
To me, it's clear that the points on that paper of future work are just a matter of composition using the depth buffer. Render terrain with raytracing. Render Alien Scum with GPU. Overlay the GPU output onto the terrain image and for each pixel, compare the Z of the Alien Scum object with the Z of the terrain point, and where Alien Scum Z > Terrain Z, show the terrain pixel instead of the Alien Scum pixel.

Yeah, essentially blending 2 backbuffers...
 
And some Hot Chips papers :)

Super Companion Chip with Audio Visual Interface for Cell Processor
http://www.hotchips.org/archives/hc17/2_Mon/HC17.S1/HC17.S1T3.pdf

Programming and Performance Evaluation of the CELL Processor
http://www.hotchips.org/archives/hc17/2_Mon/HC17.S1/HC17.S1T4.pdf

A novel SIMD architecture for the Cell heterogeneous chip-multiprocessor
http://www.hotchips.org/archives/hc17/2_Mon/HC17.S1/HC17.S1T1.pdf

Cell Broadband Engine Interconnect and Memory Interface
http://www.hotchips.org/archives/hc17/2_Mon/HC17.S1/HC17.S1T2.pdf

BTW, where can I find the GSPx 2005 paper "H.264 Video Encoding Algorithm on a Cell Processor"?
 
Thanks for the links. I can't find the paper about H.264 video encoding on Cell, but the Toshiba document on programming and performance evaluation of the CELL processor gives an example of an H.264 encoder implementation on Cell.
 