First Cell Benchmarks

ATI-liens

Newcomer
Borderline spam/troll post :cry:

Not really, it's a fair opinion. Just like some people think the Wii is a novelty.

I think the Cell is powerful. I'm not biased, I just don't think it's what most people are making it out to be (when considering gaming).

The way it is being portrayed is as if we will see movie-like graphics soon. I think we will see very good graphics out of the PS3, but I wouldn't get too overhyped. I'm not convinced that it will exceed the X360's visual potential.
 

rounin

Veteran
Not really, it's a fair opinion. Just like some people think the Wii is a novelty.

I think the Cell is powerful. I'm not biased, I just don't think it's what most people are making it out to be (when considering gaming).

The way it is being portrayed is as if we will see movie-like graphics soon. I think we will see very good graphics out of the PS3, but I wouldn't get too overhyped. I'm not convinced that it will exceed the X360's visual potential.

You're bringing in elements that have nothing to do with this thread and attempting to start implications that have nothing to do with this particular section of the console forum. Please refrain :!:
 

Fox5

Veteran
My opinion:
Cell-like architectures are a clear conceptual win over what's come before.
As far as Cell itself goes, I'll wait until devs have more time with it before I declare it a performance success, failure, or break-even (with a $1 billion investment). Certainly there are loads that Cell does very well with, but generally they're the same types of loads a GPU does well with; GPGPU may win out over Cell, or vice versa. Cell in PS3, as opposed to some other CPU, will probably end up being the better choice for the PS3, but I'm more concerned about what gets adopted by the scientific research community (since the PS3 design decisions are already done, but where computing as a whole goes in the next few years is still unknown).
 

patsu

Legend
I saw the benchmarks almost a month ago; the PPE was slightly outperformed by a 1.6GHz G5.

Remember each SPE is nothing like the PPE. They don't use VMX and are not great at double precision.

Cell is a big hype.

In general, the PPE is similar in performance to 1 Xenon core. I believe the latter has some extended VMX instructions, 128 registers but more limited hardware threading (Someone please correct me if I'm wrong).

The SPE has an SIMD engine strapped to it and you don't need double precision for gaming ? It's faster than the PPE when the local store is used effectively, and there are 7 of them in PS3. Use the search to find comments from the devs on SPE.
 
Lock if old:

・Dhrystone v2.1
PS3 Cell 3.2GHz: 1879.630
PowerPC G4 1.25GHz: 2202.600
PentiumIII 866MHz: 1124.311
Pentium4 2.0AGHz: 1694.717
Pentium4 3.2GHz: 3258.068

・Linpack 100x100 Benchmark In C/C++ (Rolled Double Precision)
PS3 Cell 3.2GHz: 315.71
PentiumIII 866MHz: 313.05
Pentium4 2.0AGHz: 683.91
Pentium4 3.2GHz: 770.66
Athlon64 X2 4400+ (2.2GHz): 781.58

・Linpack 100x100 Benchmark In C/C++ (Rolled Single Precision)
PS3 Cell 3.2GHz: 312.64
PentiumIII 866MHz: 198.7
Pentium4 2.0AGHz: 82.57
Pentium4 3.2GHz: 276.14
Athlon64 X2 4400+ (2.2GHz): 538.05


source: http://rian.s26.xrea.com/nicky.cgi?DT=20061121A#20061121A
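For anyone who wants to read the Dhrystone figures above clock for clock, here is a rough sketch that divides each posted score by its clock speed. The scores are copied as posted; the post doesn't state their unit, so only the relative ordering means anything here. The `per_mhz` helper is just an illustrative name, and as noted elsewhere in the thread the Cell figure covers the PPE alone.

```python
# Dhrystone v2.1 scores exactly as posted above, paired with clock speed in MHz.
# The unit of the scores is not stated in the post, so only ratios are compared.
scores = {
    "PS3 Cell 3.2GHz": (1879.630, 3200),
    "PowerPC G4 1.25GHz": (2202.600, 1250),
    "PentiumIII 866MHz": (1124.311, 866),
    "Pentium4 2.0AGHz": (1694.717, 2000),
    "Pentium4 3.2GHz": (3258.068, 3200),
}


def per_mhz(score, mhz):
    """Score per MHz: a crude per-clock figure that ignores memory, compiler, etc."""
    return score / mhz


# Sort from best to worst per-clock result.
ranked = sorted(scores.items(), key=lambda kv: per_mhz(*kv[1]), reverse=True)

for name, (score, mhz) in ranked:
    print(f"{name:22s} {per_mhz(score, mhz):.3f} per MHz")
```

Per clock, the G4 comes out on top and the Cell's PPE at the bottom of this particular list, which says nothing about the SPEs.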

Wait, so the Wii's CPU is comparable to a 2 GHz Pentium 4?

Its Dhrystone was over 1600 DMIPS as well.

Sorry to go off topic...
 

ADEX

Newcomer
I saw the benchmarks almost a month ago; the PPE was slightly outperformed by a 1.6GHz G5.
Using benchmarks which were a lot better tuned for the G5...
e.g. Look at the stream figures, Cell is capable of a *lot* better and has been measured as such.

Remember each SPE is nothing like the PPE.

Correct, they are also 1/4 of the size, faster and use at most 1/4 of the power.

They don't use VMX and are not great at double precision.

They use an ISA remarkably like VMX (not surprising given VMX was their starting point).
Theoretical double precision isn't that exciting but it's still comparable to other processors.
The difference is in actual problems: Cell can get a lot closer to its theoretical maximum than other processors.

Cell is a big hype.

I wrote a paper on Cell a couple of years back and was accused of hyping Cell with made up benchmarks (actually they were estimates based on the theoretical figures). The umpteen research papers that have come out about Cell have subsequently confirmed what it is actually capable of - It actually exceeded all my "hype" estimates, in some cases by several times.
 
Wait, so the Wii's CPU is comparable to a 2 GHz Pentium 4?

Its Dhrystone was over 1600 DMIPS as well.

Sorry to go off topic...

That Wii Dhrystone number was obviously just fabricated by scaling the GameCube CPU's Dhrystone number by clock speed: 1125 DMIPS @ 485 MHz = 1691 DMIPS @ 729 MHz.

Dhrystone numbers only tell us one small part of the performance story. They only tell us about integer performance; there are no floating point operations in that benchmark. Wii's CPU should theoretically show good results in Dhrystone, thanks in part to the fact that it's an out-of-order CPU.

But PS3 has 8 cores, 360 has 3, and Wii only has 1. The PS3 Dhrystone number above is just the PPU's Dhrystone number, only a small fraction of the full capability of the chip. And of course when it comes to floating point performance, Wii's CPU performance is completely dwarfed by the single-core performance of either a Cell or Xenon CPU.
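The clock scaling in the post above is easy to check. A minimal sketch, assuming the score scales linearly with clock speed and that per-clock performance is identical between the two CPUs (that is the post's premise, not a measured fact); `scale_by_clock` is just an illustrative helper name:

```python
def scale_by_clock(score, from_mhz, to_mhz):
    """Scale a benchmark score linearly with clock speed.

    Assumes identical per-clock performance at both speeds.
    """
    return score * to_mhz / from_mhz


gamecube_dmips = 1125  # GameCube figure quoted in the post, at 485 MHz
wii_estimate = scale_by_clock(gamecube_dmips, from_mhz=485, to_mhz=729)

print(round(wii_estimate))  # 1691
```

So the quoted Wii number is exactly what you get from the GameCube score times 729/485, which is why it looks derived rather than measured.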
 
That Wii Dhrystone number was obviously just fabricated by scaling the GameCube CPU's Dhrystone number by clock speed: 1125 DMIPS @ 485 MHz = 1691 DMIPS @ 729 MHz.

Dhrystone numbers only tell us one small part of the performance story. They only tell us about integer performance; there are no floating point operations in that benchmark. Wii's CPU should theoretically show good results in Dhrystone, thanks in part to the fact that it's an out-of-order CPU.

But PS3 has 8 cores, 360 has 3, and Wii only has 1. The PS3 Dhrystone number above is just the PPU's Dhrystone number, only a small fraction of the full capability of the chip. And of course when it comes to floating point performance, Wii's CPU performance is completely dwarfed by the single-core performance of either a Cell or Xenon CPU.

Thanks, but that's not what I asked. :D

I asked: is the Wii's CPU comparable to a 2 GHz Pentium 4?

Oh, and why would you think the Wii DMIPS number would be made up?

BTW, I was already aware that floating point calculations are not included in Dhrystone and that one of the main reasons why the PS3 uses SPEs is because it inflates the system's floating point numbers dramatically.

I just wanted to know how it stacks up to the Pentium 4 at 2 GHz and the main PS3 CPU without the SPEs in general performance, but I seem to have answered my own question.
 
Thanks, but that's not what I asked. :D

I asked: is the Wii's CPU comparable to a 2 GHz Pentium 4?

No.

Oh, and why would you think the Wii DMIPS number would be made up?

Because no one has ever actually run and published a Dhrystone score for the Wii CPU. People with access to the machines are under NDA. The GameCube, on the other hand, people have run Linux on, and people have benchmarked it.

BTW, I was already aware that floating point calculations are not included in Dhrystone and that one of the main reasons why the PS3 uses SPEs is because it inflates the system's floating point numbers dramatically.

SPUs also "inflate" integer performance by the same logic. Thinking that SPUs are only good at floating point operations is a misunderstanding.
 

rounin

Veteran
I wrote a paper on Cell a couple of years back and was accused of hyping Cell with made up benchmarks (actually they were estimates based on the theoretical figures). The umpteen research papers that have come out about Cell have subsequently confirmed what it is actually capable of - It actually exceeded all my "hype" estimates, in some cases by several times.

Is your paper online? If so or if not, please share :!:
 
No.

Because no one has ever actually run and published a Dhrystone score for the Wii CPU. People with access to the machines are under NDA. The GameCube, on the other hand, people have run Linux on, and people have benchmarked it.

SPUs also "inflate" integer performance by the same logic. Thinking that SPUs are only good at floating point operations is a misunderstanding.

Ha ha. I never said that. I said "one of the main reasons", not the only reason.
 

patsu

Legend
Cell SDK 2.1 released

http://www-03.ibm.com/developerworks/blogs/page/powerarchitecture?entry=sdks_from_alphaworks_cell_be

The Cell Broadband Engine Software Development Kit version 2.1 is here and it completely replaces SDK 2.0. For the busy developer, though, you need to know that the biggest change is a move from Fedora Core 5 to Fedora Core 6 -- this will necessitate upgrading the operating system environment first and then upgrading to the newest SDK. The SDK 2.0 binaries will stay up for only 60 days, so you should upgrade at your earliest convenience.

You can download the free (necessary) Fedora Core 6 operating system here.

The still definitive tutorial by Sean Curry, "An introduction to the IDE for the Cell Broadband Engine SDK," will also be upgraded to match the new SDK (the upgrade is mostly screenshots and version numbers) -- although the individual instructions aren't substantially different from version 2.0, it is always a good idea to re-familiarize with the basics.

...

* upgrading of Linux kernel to 2.6.20 with enhancements for preemptive scheduling of SPE tasks, SPE logical affinity support, and improved performance via 64 KB Local Store page mapping.
* standardization on the new SPE Run-time Management Library (libspe2). The older and less functional libspe 1.x is being deprecated in this release.
* migration of example libraries and code to libspe2. A migration guide is provided to help move existing applications to libspe2.
* enhancements and improvements to the Accelerator Library and Framework (ALF), including additional examples that use ALF
* improvements and additions to SIMD math library
* addition of SIMD MASS and vector MASS libraries for SPE
* addition of example benchmarking code to measure and report on the performance of a representative set of DMA operations
* addition of GNU GCC, XL C/C++ compiler, and Full-System Simulator support for an enhanced CBEA-compliant processor with a fully-pipelined, double-precision SPE
* addition of a sample DMA channel profiling tool
* support for cycle count-profiling of code running on the SPE using OProfile
* addition of the Cell Performance Counter utility, which can be used to monitor and count cell performance events
* improved PPE model in the Full-System Simulator for better performance correlation across the Cell Broadband Engine
* improved integration between Full-System Simulator and Eclipse IDE for Cell Broadband Engine
* addition of Linux man pages for some libraries and tools
* upgrading of XL C/C++ compiler version to 0.8.2
* upgrading of binutils version to 2.18 prerelease
* upgrading of GDB version to 6.6
* upgrading of newlib version to 1.15.0.
 

patsu

Legend
For those who are keen on this sort of thing...

The other Cell 'versus' thread prompted me to take another quick look at what's out there now. I focused my search on Toshiba and found some more Cell open source software info:

On CE Linux (Linux in embedded products)...

* "Cell related Open Source Activities and PS3 as Cell Development" by Hiroyuki Machida -- Dated December 8, 2006 (Japan Technical Jamboree #12)

* "Cell broadband engine, SPE assisted user space device driver" by Hiroyuki Machida et al.
-- Dated February 22nd (Thu), 2007 (Japan Technical Jamboree #13)

* Sijam (a Japanese company specializing in Cell-related development, among other things): http://blog.cell.sijam.com/cell/linux/ (http://blog.cell.sijam.com/). Contains some info about Toshiba's Celleb reference platform.

Translated link: http://translate.google.com/transla...&hl=en&ie=UTF-8&oe=UTF-8&prev=/language_tools
 

DJ12

Veteran
Hmm, looks like 6/7 SPEs is where the dramatic performance improvements seem to stop.
Maybe there was a method to Sony's madness in cutting an SPE.
 
Hmm, looks like 6/7 SPEs is where the dramatic performance improvements seem to stop.
Maybe there was a method to Sony's madness in cutting an SPE.

I think you are reading the graph wrong.

The speedup every time you add an SPU is almost perfectly linear (and even superlinear in some cases).

It's just the completion time that looks logarithmic. Going from 50 secs to 25 secs is doubling the performance, and going from 5 secs to 2.5 secs is doubling the performance just the same. Only getting 25 secs faster "looks better" on that graph than going from 5 to 2.5.
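To put numbers on that, here is a small sketch of completion time under ideal linear speedup. The 50-second single-SPU time is a made-up figure for illustration, not taken from the graph, and `completion_time` is a hypothetical helper:

```python
def completion_time(serial_seconds, n_spus):
    """Completion time under ideal linear speedup (an assumption, not measured data)."""
    return serial_seconds / n_spus


serial_seconds = 50.0  # hypothetical single-SPU run time

rows = []
previous = None
for n in range(1, 9):
    t = completion_time(serial_seconds, n)
    speedup = serial_seconds / t
    # Seconds shaved off by adding this one SPU.
    saved = 0.0 if previous is None else previous - t
    rows.append((n, t, speedup, saved))
    previous = t

for n, t, speedup, saved in rows:
    print(f"{n} SPUs: {t:6.2f} s  speedup {speedup:.1f}x  saved {saved:5.2f} s")
```

The speedup column keeps climbing by one per SPU, while the seconds saved by each extra SPU shrink fast, so a plot of completion time flattens out even though scaling is perfect.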
 

Titanio

Legend
http://dl.alphaworks.ibm.com/technologies/cellsw/cellFMwhitepaper.pdf

Financial market applications on Cell, with a comparison against x86 multi-core.

Maybe it needs a new thread.

Somewhat coincidentally, Fraunhofer ITWM will be showing some stuff running on a PS3 cluster next week, including financial apps.

They will demonstrate a finance application running on a Playstation cluster. The cluster is integrated with Fraunhofer's Grid Middleware PHASTGrid, which parallelizes the application and keeps all the Cell SPEs busy. The compute-intensive application calculates the value of equity derivatives and is broadly used in finance. A single Cell SPE outperforms a 3.0 GHz single-core Xeon processor.

http://www.hpcwire.com/hpc/1627281.html
 