Jawed said: Except that at best the P4 in this comparison could be 10 GFLOPS (see the graph I posted earlier).
So 15x or more supposed theoretical.
5x actual simply demonstrates, to me, what a poor architecture Cell must be. Running at 1/3 efficiency? Laughable. Particularly for something that is so strongly suited to it. Supposedly.
Jawed
Hold on a moment, you're comparing Cell's theoretical peak floating point performance to achieved floating point performance on a P4 in a particular task, and saying that represents the theoretical gulf between them? What?
A 3.2GHz P4's theoretical floating point peak is about 12.5 GFLOPS, I think (?). They achieved 8 GFLOPS in that instance, or ~64% of the peak. With specialised code, in the same task, an SPU came to ~74% of its peak. All this tells us is that, for this task, you'd want to write "specialised" code to get the best out of the SPU, but that even with non-specialised code you're still doing better than the P4 (in absolute terms, not relative to peak), and you're going to have 6 or 7 or 8 SPUs. This is hardly surprising given how different the programming model on an SPU is from what that library code would be used to.
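To make the arithmetic explicit, here's a rough sketch of the "fraction of peak" calculation. The P4 figures are the ones quoted above (and the 12.5 GFLOPS peak is only my estimate). The per-SPU peak of 25.6 GFLOPS is an assumption that isn't stated anywhere in this thread (4-wide single-precision multiply-add at 3.2GHz), so treat the SPU result as illustrative only. The function and variable names are made up for the example.

```python
def efficiency(achieved_gflops, peak_gflops):
    """Return the fraction of theoretical peak that was actually reached."""
    if peak_gflops <= 0:
        raise ValueError("peak must be positive")
    return achieved_gflops / peak_gflops


# Figures quoted in the post (the P4 peak is the poster's own estimate).
P4_PEAK_GFLOPS = 12.5
P4_ACHIEVED_GFLOPS = 8.0

# ASSUMPTION, not from the thread: per-SPU single-precision peak at 3.2GHz.
SPU_PEAK_GFLOPS_ASSUMED = 25.6
# From the post: one SPU reached ~74% of its peak with specialised code.
SPU_EFFICIENCY = 0.74

p4_eff = efficiency(P4_ACHIEVED_GFLOPS, P4_PEAK_GFLOPS)

# Work backwards from the quoted efficiency to an absolute figure for one SPU.
spu_achieved_gflops = SPU_EFFICIENCY * SPU_PEAK_GFLOPS_ASSUMED

print(f"P4 efficiency: {p4_eff:.0%}")
print(f"One SPU, achieved (under the assumed peak): {spu_achieved_gflops:.1f} GFLOPS")
print(f"One SPU vs P4, achieved: {spu_achieved_gflops / P4_ACHIEVED_GFLOPS:.2f}x")
```

The point is just that efficiency has to be achieved-over-peak for the same chip. Dividing one chip's theoretical peak by another chip's achieved figure mixes the two and tells you nothing about either.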
Acert93 said: But without knowing what a comparable CELL Blade costs it is hard to say. Are we comparing a server with 512MB of memory or 16GB? Is this a task where we need 4 dual cores, or will 2 dual cores (or 4 single cores) work, or even daisy chaining?
The question was how much it'd cost you, from a dollar perspective if you wish, to get similar performance to this. It's theoretical, of course: you're not going to be running the same demo all day. We're just assuming for the purposes of this comparison that that's all you're interested in, and comparing the cost of the processors alone.
Acert93 said: And those are questions every IT guy has to ask.
You're addressing points for the sake of it. Later in that paragraph I made it clear that I was, admittedly, looking at this from a slightly different perspective than that of the IT guy.
Acert93 said: I would not even venture there because this benchmark doesn't tell us anything about gaming.
It tells you that if you ever have cloth simulation in your game - directly of this kind, or if you wanted to be a little speculative, of other, simpler kinds of cloth - it'll fly on a system with Cell vs an equivalently clocked P4. Not really surprising, but it's good to have a little more solid information. That'd be one part of a mix of tasks in the game, but it'd be one that'd contribute to any speed advantage Cell might have overall.