Alan Heirich's paper on PS3 Deferred Shading (Cell pixel shading)

Well, if necessary, I'd like to be able to use the SPEs for other stuff too. So in other words, either give me RSX in Linux, or give me two Cells. ;)
 
~400 ops is what would be called a ~100-instruction pixel shader, which is about middle-of-the-road for current game renderers. Definitely not "too complex", nor overly simplistic.
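For the arithmetic behind that (assuming the usual convention that one pixel-shader "instruction" is a 4-wide vector op): 100 instructions/pixel × 4 scalar lanes ≈ 400 scalar ops/pixel.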
 
> ~400 ops is what would be called a ~100-instruction pixel shader, which is about middle-of-the-road for current game renderers. Definitely not "too complex", nor overly simplistic.

I think for Fable 2 they have a 200-instruction pixel shader.
 
They claim to be executing complex shaders. Perhaps it is 450 ops per pixel, and you'd get much better performance (in terms of fewer resources used) with more realistic shaders?

Although if this performance is accurate, who needs RSX in Linux?!

Unfortunately these ops only include point sampling of textures, and brute-force DMA transfers of texels for each pixel (as in the paper) would waste a lot of bandwidth unless you implement a virtual cache for texel reuse, which is doable but consumes cycles. Take all the texturing operations into account and you'd be lucky to reach 1/10th of RSX's texturing ability.
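To make "virtual cache" concrete, here's a minimal sketch of the kind of software texel cache I mean, assuming the texture lives in main memory as contiguous 32×32-texel RGBA8 tiles and a direct-mapped set of tile slots in local store. The dma_get() call and the tiled layout are hypothetical stand-ins for the real MFC intrinsics:

```c
/* Minimal direct-mapped software texel cache (sketch, not SPE-ready code).
 * Assumptions: texture stored in main memory as contiguous 32x32-texel
 * RGBA8 tiles; dma_get() stands in for the real MFC get-and-wait.        */
#include <stdint.h>

#define TILE_DIM    32
#define TILE_TEXELS (TILE_DIM * TILE_DIM)
#define N_SLOTS     16                 /* 16 tiles = 64 KiB of local store */

typedef struct {
    uint32_t tag;                      /* linear tile index held, ~0u = empty */
    uint32_t texels[TILE_TEXELS];
} tile_slot;

static tile_slot cache[N_SLOTS];

extern void dma_get(void *local, uint64_t ea, uint32_t size); /* hypothetical */

static void cache_init(void)
{
    for (int i = 0; i < N_SLOTS; i++)
        cache[i].tag = ~0u;
}

/* Point-sample texel (x,y); a miss DMAs the whole 4 KiB tile so that
 * neighbouring fetches become local-store hits instead of DMAs.        */
static uint32_t tex_fetch(uint64_t tex_ea, uint32_t tex_w_tiles,
                          uint32_t x, uint32_t y)
{
    uint32_t tag = (y / TILE_DIM) * tex_w_tiles + (x / TILE_DIM);
    tile_slot *s = &cache[tag % N_SLOTS];

    if (s->tag != tag) {               /* miss: fetch the tile */
        dma_get(s->texels,
                tex_ea + (uint64_t)tag * sizeof(s->texels),
                sizeof(s->texels));
        s->tag = tag;
    }
    return s->texels[(y % TILE_DIM) * TILE_DIM + (x % TILE_DIM)];
}
```

Even on a hit you still pay the tag check and address math on every fetch, which is where the "consumes cycles" part comes from; a real version would also double-buffer the DMAs and hash the tag to avoid conflict misses between tile-aligned accesses.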

There's a thread in B3D about that Barry Minor demo where he gives some performance figures for enabling texturing in a long mathematical shader. I did some calcs, and it worked out to 400 MTex/s IIRC.

Then there's triangle setup and rasterization with perspective correction. Who knows how this performs on Cell, as the paper above uses RSX to do that. I have a feeling that Cell will be fast enough (i.e. Gpix/s range), but I'm not sure.
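For a sense of what that stage involves, here's a minimal half-space rasterizer with perspective-correct interpolation of one attribute (purely illustrative; the paper itself leaves this on RSX):

```c
#include <stdint.h>

typedef struct {
    float x, y;        /* screen-space position                    */
    float inv_w;       /* 1/w from the projective divide           */
    float u_over_w;    /* attribute pre-divided by w               */
} vertex;

/* Signed area term: > 0 when (px,py) is left of the edge a->b. */
static float edge(const vertex *a, const vertex *b, float px, float py)
{
    return (b->x - a->x) * (py - a->y) - (b->y - a->y) * (px - a->x);
}

/* Rasterize triangle v0,v1,v2 into a w*h buffer of interpolated u values. */
void raster_tri(const vertex *v0, const vertex *v1, const vertex *v2,
                float *out, int w, int h)
{
    float area = edge(v0, v1, v2->x, v2->y);
    if (area <= 0.0f) return;                  /* back-facing or degenerate */

    for (int y = 0; y < h; y++)
        for (int x = 0; x < w; x++) {
            float px = x + 0.5f, py = y + 0.5f;
            float w0 = edge(v1, v2, px, py);   /* barycentric weights */
            float w1 = edge(v2, v0, px, py);
            float w2 = edge(v0, v1, px, py);
            if (w0 < 0 || w1 < 0 || w2 < 0) continue;

            w0 /= area; w1 /= area; w2 /= area;
            /* Perspective correction: interpolate u/w and 1/w linearly,
             * then divide; this reciprocal is the per-pixel cost Cell
             * would have to eat at every covered sample.               */
            float inv_w = w0 * v0->inv_w + w1 * v1->inv_w + w2 * v2->inv_w;
            float u     = (w0 * v0->u_over_w + w1 * v1->u_over_w
                         + w2 * v2->u_over_w) / inv_w;
            out[y * w + x] = u;
        }
}
```

A real version would iterate only the triangle's bounding box and evaluate four pixels at a time with SIMD, but the per-pixel divide for perspective correction is the part that makes Gpix/s rates on Cell an open question.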

Using RSX to handle both of these is the best way to use Cell for deferred pixel shading. There's a thread with nAo talking extensively about this, and even using RSX for pixel shading alongside Cell.
> Well, if necessary, I'd like to be able to use the SPEs for other stuff too. So in other words, either give me RSX in Linux, or give me two Cells. ;)
There's nothing limiting you that way. It just means that Cell will divide its time between shading and everything else with perfect load balancing. At worst frame time is slowed by a factor of 2, and at best you barely notice the lack of a second Cell devoted to graphics.
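The factor-of-2 bound is just arithmetic: one Cell costs t_shading + t_other per frame, while two dedicated Cells would cost max(t_shading, t_other), and the ratio of those is at most 2, hit exactly when the two loads are equal. For example, 10 ms of shading plus 5 ms of everything else is 15 ms on one Cell versus 10 ms on two, only a 1.5× slowdown.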
 
But the fetching from textures in a deferred renderer is typically performed during the rasterization stage, with the various G-buffer surfaces holding the sampled albedo and normal-map data. Only shadowmaps would need to be sampled in the proper shading pass.
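As a sketch of what that means for the shading pass (the layout and names here are just illustrative, not from any particular engine):

```c
#include <stdint.h>

/* One G-buffer texel, written during rasterization. Albedo and normal-map
 * sampling happen at raster time, so this hypothetical shading pass never
 * touches those textures; it just streams screen-sized buffers.          */
typedef struct {
    float albedo[3];    /* material colour, already texture-sampled */
    float normal[3];    /* decoded world-space normal               */
    float depth;        /* to reconstruct view-space position       */
} gbuffer_texel;

/* The one remaining texture-like fetch: the shadowmap (hypothetical). */
extern float sample_shadow(int x, int y, float depth);

/* Deferred shading pass for one directional light over a w*h frame. */
void shade(const gbuffer_texel *gb, float *out, int w, int h,
           const float light_dir[3], const float light_col[3])
{
    for (int y = 0; y < h; y++)
        for (int x = 0; x < w; x++) {
            const gbuffer_texel *g = &gb[y * w + x];
            float ndotl = g->normal[0] * light_dir[0]
                        + g->normal[1] * light_dir[1]
                        + g->normal[2] * light_dir[2];
            if (ndotl < 0.0f) ndotl = 0.0f;
            float shadow = sample_shadow(x, y, g->depth);
            for (int c = 0; c < 3; c++)
                out[(y * w + x) * 3 + c] =
                    g->albedo[c] * light_col[c] * ndotl * shadow;
        }
}
```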
 
> But the fetching from textures in a deferred renderer is typically performed during the rasterization stage, with the various G-buffer surfaces holding the sampled albedo and normal-map data. Only shadowmaps would need to be sampled in the proper shading pass.
You're right, and I acknowledged this when mentioning the use of RSX to handle textures and rasterization. I was replying to Shifty's "who needs RSX in Linux!" post.

I'm not convinced about Cell's efficiency with shadows in such a scheme, though. A texture cache is really important there to avoid redundant loads, and a software cache consumes cycles with each access. It's probably better to use RSX and copy the shadowing term over.
 
I don't know much about ACM TOG, but that impact factor seems kinda high. Where did you get it from?
http://portal.isiknowledge.com/portal.cgi/jcr, which is AFAICS the authoritative source for this sort of thing.

Impact factors computed from CiteSeer are only fair if the number of papers represented there is independent of the impact of the venue, which I doubt is the case. I think the people publishing in the higher-impact graphics journals are actually more likely to put their papers online, while most of the citations come from the lower-impact journals, which would end up dragging the computed impact factor down.

On second thought, though, TOG might be a bad example: the SIGGRAPH conference proceedings are also published as an issue of TOG, and I don't know whether those papers are excluded from the calculation.

IEEE Trans. Vis. Comput. Graph. has an impact factor of 1.794.
 
How likely is this to be used in games? Is KZ2 doing this at all?

And somebody has to ask: could Xcpu do something similar in any way? Maybe the VMX units?
 
I would very much like to know how a deferred renderer works. Does anyone have a link where rendering techniques are reviewed? I'd love to learn more about this subject. Sounds so techy and interesting hehe
 
> I would very much like to know how a deferred renderer works. Does anyone have a link where rendering techniques are reviewed? I'd love to learn more about this subject. Sounds so techy and interesting hehe
Look through the past Beyond3D articles. Deano Calver wrote one about deferred shading. Nvidia has a paper on their developer site as well.
 
For some reason, when reading the Deano Calver article on the subject after reading the nVidia paper, the Mac G5 unveiling started to play in front of my mind, where they compare Photoshop editing of the Nemo poster on a PC and on the G5 ;)

Would it be a correct oversimplification to say that deferred rendering groups and renders objects in layers, with lighting and effects applied after the image has been composed?
 