In terms of rendering power what would Cell be equal to?

About texturing: if you want to do that on Cell, you have to turn things around. Load (part of) the texture together with a bunch of pixels that access it, and process those all together.

So, instead of simply taking a pixel and seeing what resources you need to shade it, you load the resources you can and hunt for the pixels that use those. You need to change your data structures accordingly.

A rasterizer and shader for Cell would look quite different from one used for a GPU, or even an off-line one.
 
Again that demo used 15 CELLs so it doesn't mean much when you're talking about ray tracing using ONE CELL. What can you do with one CELL doing raytracing at 30fps? A spinning cube?

BTW yes it is my opinion that a single CELL is about equal to a PowerVR DC chip when emulating it.

Well then it's my opinion that if you ignored some features (32 bit textures, bilinear filtering, etc) and just dealt with the PS3 emulating the DC, it could emulate it at full speed. I have no backing for this statement.
And to go even further, I think with PS2 level IQ, Cell alone could do graphics that would rival current gen games with a custom graphics engine for Cell.

BTW, why such reverence for the DC graphics chip? Raycasting or not (isn't that what id used in Wolfenstein?), it's not some mythical chip from its era that outperformed everything for years to come. If anything, I'd say comparisons to the PS2 are more accurate, since the PS2 also lacked many standard GPU features; how good could God of War look on just a Cell?
 
The Dreamcast's CLX2 was further ahead of its time than any other graphics processor, leapfrogging the competition by about twelve months.

It released in volume production around September 1998 and didn't see a comparable competitor until the high end TNT2s descended into a lower price tier near the end of 1999. Until that time, no comparably affordable graphics processor matched its performance.

Kutaragi said that Sony did consider using a second Cell chip as a GPU initially but realized through research that it wouldn't be up to the task.
 
That's like saying rasterizing solid color polygons (which Cell would coincidentally be very fast at) is the same thing as rasterizing with textures and shading. They are "similar" after all.

In the context of that RT demo, it functions the same as RC, e.g. casting realtime hard shadows.

And actually I seem to remember PVR2 HSR was not akin to raycasting, but was done with a tile depth buffer. I might be confusing it with some other chip though.

Infinite Planes have nothing to do with tile rendering. The Neon and Kyro series dropped it in favor of standard triangles. The only chips to use IP are the PowerVR DC chip and the previous versions, PCX1/2. IP allows casting of shadows from any object onto any other object.

I think he's just really very confused. That seven million polys per second (which I believe should actually be six) is the absolute maximum draw rate.

IIRC Infinite Planes doesn't incur a performance hit, since that is how the PowerVR DC handles polygons. The TnL rate of the SH-4 CPU is 10 million polys/sec. The reason why the PowerVR DC was designed with a triangle setup "limit" of 7 (not 6) million polys/sec is because the DC only had enough memory to hold 7 million polys of geometry data.

Raycasting or not (isn't that what id used in Wolfenstein?), it's not some mythical chip from its era that outperformed everything for years to come.

First of all, I didn't say the PowerVR DC chip outperformed everything. The reason why I chose it for comparison is because it used Infinite Planes equations to describe polygons instead of using the standard triangle method, kind of like how the NV1 used quadratic equations. This allowed it to cast shadows from any polygon onto any other polygon, like what happens when you use raytracing. BTW Wolfenstein didn't use IP; if it did, there would be realtime shadows and self-shadowing everywhere.
 
Infinite Planes have nothing to do with tile rendering. The Neon and Kyro series dropped it in favor of standard triangles. The only chips to use IP are the PowerVR DC chip and the previous versions, PCX1/2. IP allows casting of shadows from any object onto any other object.

"Infinite planes" is not ray casting. It's more akin to extruded shadow volumes. It's just one way to do shadows. Ray casting can do a lot more than shadows.
 
pjbliverpool said:
However I read the question as asking, "if Sony had used a second Cell in PS3 instead of RSX, what GPU would that have given the same results as".
I think the pre-RSX GPU would have been enough of a headache for some people out there, though in retrospect (PS3 launching when it did) I almost wish we got that.

archie4oz said:
Of course that would just engender a whole new class of stupid comparisons...
Well it would keep people busy for years running damage control over the specs and trying to wrap their heads around the concept, that alone almost makes it worth it.

Capeta said:
In the context of that RT demo, it functions the same as RC, e.g. casting realtime hard shadows.
According to IBM, the RT demo raytraced everything - not just shadows.
Second, CLX2 doesn't function as anything by itself - so if you want to be buttheaded about this, without supporting hardware, CLX2 can render approximately nothing, hence Cell wins.
kthxbye.
 
According to IBM, the RT demo raytraced everything - not just shadows.
Second, CLX2 doesn't function as anything by itself - so if you want to be buttheaded about this, without supporting hardware, CLX2 can render approximately nothing, hence Cell wins.
kthxbye.

You mean 15 CELLs win. :LOL:
 
I think the pre-RSX GPU would have been enough of a headache for some

Pre-RSX GPU? I thought RSX was made especially for the PS3 from the get-go? Sorry if I have my "facts" wrong. I read Sony stated that they made Cell and RSX from the get-go to work hand in hand via FlexIO?

Can you delve more into this pre-RSX GPU? The rumored dual Cell setup? :devilish:
 
You mean 15 CELLs win. :LOL:
If CLX2 can render nothing on its own, then Cell wins. One Cell. So would a 68000. And moving on with your comparison, an explanation of Infinite Planes would be useful to determine its relevance. There's no consensus on what it is. I could only find this on the subject...

http://www.ping.be/~pin10741/ISPexpl.htm

AFAICT Infinite Planes is a technique for modelling geometry with a different mathematical representation than triangle meshes. This makes surface intersection faster to calculate than wading through triangle meshes testing triads of vertices. If so, there's nothing stopping Cell from implementing that method itself, and I would expect it to handle that incredibly quickly.
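To illustrate why a plane-based representation can make intersection cheap, here is a minimal sketch in Python (chosen only for readability). It assumes a convex solid described as the intersection of half-spaces and clips a ray against each plane in turn. This is the generic textbook ray/convex test, not a description of what the CLX2 hardware actually does; the function name and the plane layout are made up for the example.

```python
def dot(a, b):
    """Dot product of two 3-component tuples."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def ray_hits_convex(origin, direction, planes, eps=1e-9):
    """Clip a ray against a convex solid given as half-spaces.

    planes: list of (normal, d); a point p is inside when dot(normal, p) <= d.
    Returns (t_enter, t_exit) along the ray, or None if the ray misses.
    """
    t_enter, t_exit = 0.0, float("inf")
    for normal, d in planes:
        denom = dot(normal, direction)
        dist = d - dot(normal, origin)
        if abs(denom) < eps:
            # Ray is parallel to this plane: reject if it starts outside.
            if dist < 0:
                return None
            continue
        t = dist / denom
        if denom < 0:
            t_enter = max(t_enter, t)  # ray is entering this half-space
        else:
            t_exit = min(t_exit, t)    # ray is leaving this half-space
        if t_enter > t_exit:
            return None
    return t_enter, t_exit


# Hypothetical test object: the unit cube [0,1]^3 as six half-spaces.
UNIT_CUBE = [
    ((1, 0, 0), 1), ((-1, 0, 0), 0),
    ((0, 1, 0), 1), ((0, -1, 0), 0),
    ((0, 0, 1), 1), ((0, 0, -1), 0),
]

hit = ray_hits_convex((0.5, 0.5, -2.0), (0, 0, 1), UNIT_CUBE)
miss = ray_hits_convex((2.0, 0.5, -2.0), (0, 0, 1), UNIT_CUBE)
```

Each plane costs one or two dot products and a compare, with no vertex lookups, which is the kind of arithmetic-heavy, data-light loop the post suggests Cell should handle well.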

As for optimal texturing, perhaps using the DC method, each texture could have a list of pixels that access it with each pixel added to the list as it's found in a first pass. A second pass could fetch each texture in turn and apply it to the pixels in its list. Perhaps four SPEs could store a quarter of each texture to keep it local in SRAM (maximum 1 MB limit, ignoring code requirements) and a fifth SPE requests interpolated data from these SPEs, or somesuch. Trying to keep the data local.
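A rough sketch of that two-pass idea, in Python for readability (real SPE code would be C with explicit DMA). Everything here is an assumption for illustration: the fragment tuple layout, the function names, and nearest-texel sampling in place of interpolation. The dictionary lookup in pass 2 stands in for fetching a texture into local store once.

```python
from collections import defaultdict


def bin_pixels_by_texture(fragments):
    """Pass 1: build, per texture id, the list of pixels that sample it.

    fragments: iterable of (x, y, texture_id, u, v) with u, v in [0, 1].
    """
    bins = defaultdict(list)
    for x, y, tex_id, u, v in fragments:
        bins[tex_id].append((x, y, u, v))
    return bins


def shade_by_texture(bins, textures, framebuffer):
    """Pass 2: fetch each texture once and apply it to every pixel in its list."""
    for tex_id, pixels in bins.items():
        texels = textures[tex_id]          # one fetch per texture, not per pixel
        height, width = len(texels), len(texels[0])
        for x, y, u, v in pixels:
            tx = min(int(u * width), width - 1)    # nearest texel, clamped
            ty = min(int(v * height), height - 1)
            framebuffer[y][x] = texels[ty][tx]


# Hypothetical data: two tiny textures and four fragments on a 2x2 screen.
textures = {0: [[1, 2], [3, 4]], 1: [[9]]}
fragments = [
    (0, 0, 0, 0.0, 0.0),
    (1, 0, 1, 0.5, 0.5),
    (0, 1, 0, 0.9, 0.9),
    (1, 1, 0, 0.9, 0.0),
]
framebuffer = [[0, 0], [0, 0]]
bins = bin_pixels_by_texture(fragments)
shade_by_texture(bins, textures, framebuffer)
```

The point of the layout is that texture data is touched once per texture while the pixel lists are streamed through it, rather than each pixel pulling in whatever texture it happens to need.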
 
Not quite sure how being more programmable equals being more powerful. Otherwise one could claim a 100 MHz Pentium is more powerful than R600. It's certainly more flexible, but would that flexibility allow it to render (in whatever way you like) graphics better than NV2A, or R300 etc...

I guess I'm interested in that question because of the early rumors that Sony actually considered using another Cell or 2 as the GPU for PS3. I think that may have been debunked though, not sure.

Good post.

As for the use of 2 Cells, I read it some time ago in an interview with Ken Kutaragi where he stated they were going to use 2 Cells (will dig up the interview). One Cell as CPU and one as GPU.

As for the rumour of 4 Cells, it started many years ago because of the claims Ken Kutaragi made about the PS3 being able to compute 1 TFLOPS. Hence 4 x 256 GFLOPS equals ~1 TFLOPS. Although it was some time ago, it could have been that Sony/K.K. said a single Cell would be 1 TFLOPS but.... :???:
 
There was talk independently of 1 Tflop etc., but I think the '4 Cell PS3' speculation was also largely fueled by those early Cell patents, and by people mistaking explanatory patent illustrations for actual product plans (one of which featured a 4-Cell system that looked a little like it could be a games console).

As for usage of Cell as a GPU, I also recall that Kutaragi interview where he acknowledges that at one point they considered using Cell as the system's GPU. Though by the sounds of things that idea wasn't entertained for very long. I know there are some developers here who are quick to rubbish the notion that it was ever seriously on the table.
 
If CLX2 can render nothing on its own, then Cell wins. One Cell. So would a 68000.

RSX can't render anything on its own either so 286>RSX...ok. :LOL:

AFAICT Infinite Planes is a technique for modelling geometry with a different mathematical representation than triangle meshes. This makes surface intersection faster to calculate than wading through triangle meshes testing triads of vertices. If so, there's nothing stopping Cell from implementing that method itself, and I would expect it to handle that incredibly quickly.

As for optimal texturing, perhaps using the DC method, each texture could have a list of pixels that access it with each pixel added to the list as it's found in a first pass. A second pass could fetch each texture in turn and apply it to the pixels in its list. Perhaps four SPEs could store a quarter of each texture to keep it local in SRAM (maximum 1 MB limit, ignoring code requirements) and a fifth SPE requests interpolated data from these SPEs, or somesuch. Trying to keep the data local.

Wolfenstein 3D at 500fps...
 
Pre-RSX GPU? I thought RSX was made especially for the PS3 from the get-go? Sorry if I have my "facts" wrong. I read Sony stated that they made Cell and RSX from the get-go to work hand in hand via FlexIO?

Can you delve more into this pre-RSX GPU? The rumored dual Cell setup? :devilish:

nAo talked about the pre-RSX GPU some months ago; it was something called RS, and he made it clear that RS had nothing to do with Cell technology.
 
RSX can't render anything on its own either so 286>RSX...ok. :LOL:
Yes, that's just as true.
Wolfenstein 3D at 500fps...
So rather than contribute a useful link to where you get your understanding of Infinite Planes from, to help consider their applicability to Cell versus PowerVR and compare performance, you post this useless remark. Not very constructive, is it?

You're saying Cell can't match PowerVR DC. You've said it uses IP tech. Now go on to further your argument as to why Cell can't use that same technology to outperform PowerVR.
 
As for optimal texturing, perhaps using the DC method, each texture could have a list of pixels that access it with each pixel added to the list as it's found in a first pass. A second pass could fetch each texture in turn and apply it to the pixels in its list. Perhaps four SPEs could store a quarter of each texture to keep it local in SRAM (maximum 1 MB limit, ignoring code requirements) and a fifth SPE requests interpolated data from these SPEs, or somesuch. Trying to keep the data local.
Agreed. That would be the way to do it. But, when you've done that first pass, you also know which part of each texture is going to be needed. So, you only need to load the parts of those textures that are used.

Further, you value math way above data lookups, so you would want a texture compression that handles fairly small blocks of data. So you can load all the bits of texture needed to process the batch of pixels in a compressed form, and simply calculate the value you need from that. They won't be decompressed and stored up front.

And you would stream that, so you make a workflow based around the blocks of texture needed for that batch of pixels, and start loading them into the first SPU until it's full. You continue loading the other blocks needed into the next SPU, etc. In the meantime, the first SPU can start processing its pixels, and when it's done, it streams the results to the next one. Etc.

That would be very efficient on Cell, and allow for quite a lot of texturing done in real time.
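A minimal sketch of the bookkeeping this implies, assuming the first pass has already produced the list of (u, v) lookups for one texture. The 8x8 block size, the per-SPU capacity and the function names are all hypothetical, and nearest-texel lookups are assumed (with filtering, neighbouring blocks could be needed too). Decompression and the actual transfers are left out.

```python
BLOCK = 8  # assumed edge length of one compressed block, in texels


def blocks_needed(lookups, tex_width, tex_height, block=BLOCK):
    """Return the sorted list of (block_x, block_y) touched by the lookups.

    lookups: iterable of (u, v) texture coordinates in [0, 1].
    """
    needed = set()
    for u, v in lookups:
        tx = min(int(u * tex_width), tex_width - 1)
        ty = min(int(v * tex_height), tex_height - 1)
        needed.add((tx // block, ty // block))
    return sorted(needed)


def pack_batches(blocks, capacity):
    """Fill one SPU's batch until it is full, then start the next one."""
    return [blocks[i:i + capacity] for i in range(0, len(blocks), capacity)]


# Hypothetical batch: four lookups into a 32x32 texture, 2 blocks per SPU.
lookups = [(0.0, 0.0), (0.01, 0.01), (0.5, 0.5), (0.99, 0.0)]
needed = blocks_needed(lookups, 32, 32)
batches = pack_batches(needed, capacity=2)
```

Here only 3 of the texture's 16 blocks are ever loaded, and each batch is sized so that one SPU can start shading while the next batch is still being filled.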
 
There was talk independently of 1 Tflop etc., but I think the '4 Cell PS3' speculation was also largely fueled by those early Cell patents, and by people mistaking explanatory patent illustrations for actual product plans (one of which featured a 4-Cell system that looked a little like it could be a games console).

As for usage of Cell as a GPU, I also recall that Kutaragi interview where he acknowledges that at one point they considered using Cell as the system's GPU. Though by the sounds of things that idea wasn't entertained for very long. I know there are some developers here who are quick to rubbish the notion that it was ever seriously on the table.


there was also reportedly an 8-CELL CPU - 72 processors total;
8 PowerPCs/PPEs plus 64 APUs/SPEs

http://www.computerpoweruser.com/ed...=articles/archive/c0305/01c05/01c05.asp&guid=


PlayStation 3 Patents Tip Sony's Hand

Sony engineers received a patent in September that shows they're serious about using cell computing, which uses dozens of processors to work on computer tasks, in its next-generation PlayStation 3 video game console. Observers believe that Sony will use 72 processors on a single chip for the main microprocessor of the PS3. Nine of those will be PowerPC control processors, which each control eight-vector processors.

(that article had a mistake, it would've been eight PowerPCs, not nine)


http://www.firingsquad.com/news/newsarticle.asp?searchid=4905

With the PS 3, Sony will apparently put 72 processors on a single chip: eight PowerPC microprocessors, each of which controls eight auxiliary processors. Using sophisticated software to manage the workload, the PowerPC processors will divide complicated problems into smaller tasks and tap as many of the auxiliary processors as necessary to tackle them. As soon as each processor or team finishes its job, it will be immediately redeployed to do something else.

now unlike the PR-speak of E3 2005 where Sony boasted of PS3 having 2 Tflops,
the above 8-CELL configuration would've provided a "real" theoretical peak of 2 Tflops ;)


something like that would've been impossible on 90nm or even 65nm.

but on 32nm or smaller for PS4, maybe ?
 
Further, you value math way above data lookups, so you would want a texture compression that handles fairly small blocks of data. So you can load all the bits of texture needed to process the batch of pixels in a compressed form, and simply calculate the value you need from that. They won't be decompressed and stored up front...
You could probably come up with a different texture storage format. Load it in as a large bitmap, but cut it into tiny portions of maybe 32x32, so you fetch just as much as is needed and keep it in LS. However, a problem with that would be interpolating on the texture-tile boundaries. A solution could be to have the texture tiles overlap by a pixel or two, so a 32x32 texture tile is actually 36x36, with only the inner 32x32 accessible and the outer margin used for interpolation.
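Something like the following, as a sketch only: a 32x32 tile stored with a 2-texel margin on every side (36x36 in total), so a bilinear lookup near a tile edge never has to reach into a neighbouring tile. The function names, the clamp-at-texture-edge behaviour and the list-of-lists texture are assumptions made for the example.

```python
TILE = 32    # inner, addressable tile size
MARGIN = 2   # extra texels on each side, giving a 36x36 stored tile


def cut_tile(texture, tile_x, tile_y, tile=TILE, margin=MARGIN):
    """Copy one tile plus its margin, clamping at the texture edges."""
    height, width = len(texture), len(texture[0])
    x0 = tile_x * tile - margin
    y0 = tile_y * tile - margin
    size = tile + 2 * margin
    return [
        [texture[min(max(y0 + j, 0), height - 1)][min(max(x0 + i, 0), width - 1)]
         for i in range(size)]
        for j in range(size)
    ]


def bilinear_in_tile(padded, lx, ly, margin=MARGIN):
    """Bilinear sample using only the padded tile.

    lx, ly: texel coordinates relative to the inner tile origin, in [0, TILE).
    """
    x, y = lx + margin, ly + margin
    x0, y0 = int(x), int(y)
    fx, fy = x - x0, y - y0
    top = padded[y0][x0] * (1 - fx) + padded[y0][x0 + 1] * fx
    bottom = padded[y0 + 1][x0] * (1 - fx) + padded[y0 + 1][x0 + 1] * fx
    return top * (1 - fy) + bottom * fy


# Hypothetical 64x64 texture whose value is a linear ramp, so the exact
# bilinear result is known: value(x, y) = x + 100 * y.
texture = [[x + 100 * y for x in range(64)] for y in range(64)]
padded = cut_tile(texture, 0, 0)
sample = bilinear_in_tile(padded, 31.5, 10.25)  # straddles the right tile edge
```

The sample at x = 31.5 needs texel 32, which belongs to the next tile over, but it is served from the margin. The cost is the extra storage: 36x36 instead of 32x32 is roughly 27% more texels per tile.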

So, who's going to start work on the B3D Cell Rendering Engine :D
 