what could be done in realtime on a 16 TFLOPs workstation ?

I was just wondering what could be accomplished on the Cell-based workstations that are coming down the road.

as expected, we already have info that the first prototype Cell-based workstations are out, pushing 2 TFLOPs. call it late 2004.

by late 2005 or early 2006, or at the latest, mid 2006, I'd expect the 16 TFLOPs Cell workstations to be out. just think what could be done on one of those, assuming it has the proportional graphics processing capabilities to back it up. I mean an equally large amount of graphics processing resources, more than PS3 will have.

we've seen a little bit of what the ~97 GFLOP GSCube could do:
*1.2 billion flat shaded polys/sec.
*many tens of millions of fully featured polys/sec
*maybe up to 300 million semi featured polys/sec
(from what i recall of the guys that did realtime Antz demo)

a 16 TFLOP workstation has ~167x more theoretical FP speed than GSCube and over 2500x more than PS2.
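
For anyone who wants to sanity-check those ratios, here is a minimal sketch using theoretical peak numbers only. The GSCube figure is the ~97 GFLOPs quoted above; the PS2 figure of 6.2 GFLOPs is an assumption (the commonly quoted Emotion Engine peak), and the names are just for illustration.

```python
# Theoretical peak numbers only; none of this says anything about sustained speed.
WORKSTATION_GFLOPS = 16_000.0  # "16 TFLOPs" taken as 16,000 GFLOPs
GSCUBE_GFLOPS = 97.0           # figure quoted in the post
PS2_GFLOPS = 6.2               # assumption: commonly quoted Emotion Engine peak


def speedup(target_gflops, baseline_gflops):
    """Ratio of two theoretical peak floating-point rates."""
    return target_gflops / baseline_gflops


if __name__ == "__main__":
    print(f"vs GSCube: {speedup(WORKSTATION_GFLOPS, GSCUBE_GFLOPS):.0f}x")
    print(f"vs PS2:    {speedup(WORKSTATION_GFLOPS, PS2_GFLOPS):.0f}x")
```

The results land in the same ballpark as the figures in the post; the exact value shifts a little depending on whether 16 TFLOPs is taken as 16,000 or 16,384 GFLOPs.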

I don't want to hype Cell too much, but it would be / will be exciting to see what can be done on those forthcoming 16 TFLOPs workstations. it's much more believable than say, distributed computing for realtime rendering over the internet.

It would be so incredibly cool if you could group consumer PS3s together to get the same performance as those workstations. going by what was done with those 60 or 70 PS2s in Illinois at UIC, I don't think what I am thinking of is too far fetched.



sorry, excuse this poorly written mess of a post...just thinking of the possibilities

:oops:
 
Those workstations aren't going to be 16Tflops, jesus. Just three or four of them would nab one of the highest spots in the Top500 list of supercomputers, machines that cost hundreds of millions of $. Hell, even the Japanese Earth Simulator "only" manages around 35Tflops, and it fills up A HOUSE of cabinets and support equipment.

Of course, a machine like the Earth Sim's likely to have much higher sustained calculation ability than a couple of Cell workstations, but even the raw numbers make this a rather silly proposition.
 
Guden Oden said:
Those workstations aren't going to be 16Tflops, jesus.

Why not? Technology does evolve and the 16TFlop number is from an official Sony Press Release.

The companies expect that a one rack Cell processor-based workstation will reach a performance of 16 teraflops or trillions of floating point calculations per second.

Maybe at the end of next year, places 50 to 500 of the Top500 supercomputer list will be Cell-based workstations ... who knows, but I'm sure we will know pretty soon, only a few months from now. Of course that list would be really funny filled almost completely with Cell workstations. ;)

Fredi
 
The question is, just how big is this rack? It sounds like a mighty cumbersome workstation, I wouldn't like to try lugging it to a LAN. :)
 
A "rack" is pretty much an industry standard size when speaking of computer/server type of equipment, no? If it's a "rack", then it is fairly given that this was meant to be installed in a "cabinet", eh? ...and no, you don't lug around racks or cabinets to LAN parties. Typically, this isn't the type of stuff "gamers" run around with. :oops:
 
Well, as I understand it a 'rack' and a 'cabinet' are the same thing. So, is this the typical 42Ux19", or something larger? Whatever the story, it's not exactly what springs to mind when you hear the word 'workstation'.
 
The original poster's question is something I've wondered about myself. IF that PDF talking about the workstation was authentic, AND the Cell rack workstation actually pulls 16 TFLOPs, what kind of performance would that give gamewise?

I for one, am still curious as to how such parallel computing will work with something as dynamic as game programming and real-time 3D rendering.
Does TBR work with such distributed computing? or are there obstacles? Are there other types of rendering that could benefit from such architecture? Raytracing?
Of course, there are many other types of processing besides rendering that could benefit from such power, distributed or not, such as character animation, AI and so on...
There are just so many factors that have to be taken into consideration besides raw floating point operations per second, like the algorithms, memory access, and not least a GPU worthy of such massive processing power.


I'm no expert, and most of the time, when one of these discussions starts here, we hear the most extreme responses, and in the end I'm still pretty much in the dark.
 
I really have no practical experience with racks or cabinets, either, but my impression was that "racks" are like "blades" that are "stacked" into a "cabinet". (Yeah, the quotes are outta control, right?) You can be assured that a rack will be of certain width and may resemble a blade wrt thickness. Some rack components may occupy n-times a single rack thickness, however. From what I've read in this forum so far, it is still unknown if this Cell rack is a "single" or some multiple of a single in terms of thickness. Either scenario seems plausible at this point, imo.
 
randycat99 said:
I really have no practical experience with racks or cabinets, either, but my impression was that "racks" are like "blades" that are "stacked" into a "cabinet". (Yeah, the quotes are outta control, right?) You can be assured that a rack will be of certain width and may resemble a blade wrt thickness. Some rack components may occupy n-times a single rack thickness, however. From what I've read in this forum so far, it is still unknown if this Cell rack is a "single" or some multiple of a single in terms of thickness. Either scenario seems plausible at this point, imo.

Are "you" "sure"?
I'm pretty "sure" that a Cell "rack" is "made" of "a" lot of "cabinets" to achieve the "16" Tflop figure"."

(Just playing)
 
Ok, my bad! ;) Maybe it is some sort of cabinet beast. How many slots are in a rack, anyway? 10? 20? Maybe they stuck 16 of these blades in a rack, which would suggest 1 blade can give 1 TFLOP? ...or maybe they are thick blades, so 8 fit in a rack, which would suggest 2 TFLOPs/blade? Just throwing out some possibilities...
 
8u-2.jpg


8U rack = 19"W*14"H*20"D

One CELL = 256GFlops @ 4GHz

16TFlops = 64 CELLs

1U rack = 8 CELLs

8U rack = 16TFlops ..as above...maybe...
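
The arithmetic above can be checked with a few lines. This is only a sketch of the guess in this post: 256 GFLOPs per Cell at 4 GHz and 8 Cells per 1U are the figures quoted here, not confirmed specs, and `rack_tflops` is a made-up helper name.

```python
CELL_GFLOPS = 256   # per-Cell peak at 4 GHz, as quoted in the post
CELLS_PER_1U = 8    # the post's guess, not a confirmed spec
RACK_UNITS = 8      # the 8U enclosure from the post


def rack_tflops(units=RACK_UNITS, cells_per_unit=CELLS_PER_1U,
                cell_gflops=CELL_GFLOPS):
    """Return (number of Cells, theoretical peak in TFLOPs) for the enclosure."""
    cells = units * cells_per_unit
    return cells, cells * cell_gflops / 1000.0


if __name__ == "__main__":
    cells, tflops = rack_tflops()
    print(f"{cells} Cells -> {tflops:.3f} TFLOPs")  # 64 Cells -> 16.384 TFLOPs
```

Changing `cells_per_unit` or `units` shows how quickly the total moves if the real packing density is different from the guess.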
 
PC-Engine said:
Realtime Monte Carlo Metropolis light transport algorithms at 30 fps, VGA resolution, physics, AI, animation...

Is the graininess in that pic a product of MCMLT?

Yes, it is only sampling a few of the potential light paths, hence the Monte Carlo part.

Edit: Here's an example of another method they compare it to, so you can see what happens with poorer sampling in the same computation time (their algorithm figures out which paths are important and then perturbs them slightly to get similar important paths, instead of wasting time on paths that never get seen).



fig5a.jpg
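
To give a rough feel for the "perturb the important paths" idea, here is a toy 1D sketch. It is not the actual light transport algorithm: a "path" is just a number in [0, 1], `brightness` is a made-up stand-in for how much light a path carries, and the chain is assumed to start on a path already known to be bright. It compares blind uniform sampling against a Metropolis-style random walk with the same number of samples.

```python
import math
import random


def brightness(x):
    """Toy 'path contribution': a narrow bright spike around x = 0.7."""
    return math.exp(-((x - 0.7) / 0.02) ** 2)


def uniform_samples(n, rng):
    """Blind sampling: every 'path' is equally likely, bright or not."""
    return [rng.random() for _ in range(n)]


def metropolis_samples(n, rng, step=0.05, x0=0.7):
    """Metropolis-style walk: slightly perturb the current path and keep the
    candidate with probability min(1, new brightness / old brightness)."""
    x = x0  # assumption: start on a path already known to carry light
    out = []
    for _ in range(n):
        candidate = x + rng.uniform(-step, step)
        if 0.0 <= candidate <= 1.0:
            accept = min(1.0, brightness(candidate) / brightness(x))
            if rng.random() < accept:
                x = candidate
        out.append(x)
    return out


def fraction_useful(samples, threshold=0.01):
    """Share of samples that land somewhere with non-negligible brightness."""
    return sum(brightness(s) > threshold for s in samples) / len(samples)


if __name__ == "__main__":
    rng = random.Random(1)
    print("uniform:   ", fraction_useful(uniform_samples(5000, rng)))
    print("metropolis:", fraction_useful(metropolis_samples(5000, rng)))
```

With the same sample budget, most of the uniform samples fall where nothing is lit, while the perturbed chain stays near the bright region. A real renderer would also need a separate estimate of overall image brightness to scale the result, which this sketch leaves out.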
 