Oh noes! Can of worms!
If you're asking this question then I assume you didn't understand either the calculations or the architecture?
Okay, firstly, the Anand comparison neglects the SSE and 3DNow vector execution units. Taking these into account,
P4 = (2+4)x3.8 GHz ~ 23 GigaComps/sec
A64 = (3+4)x2.8 GHz ~ 20 GigaComps/sec
Xenon = (2+4)x3coresx3.2 GHz ~ 58 GigaComps/sec
CELL = (2+4)x1PPEx3.2GHz + (4)x7SPEx3.2GHz ~ 109 GigaComps/sec
And before anyone asks,
Xenos ~ (5)x48x0.5GHz ~ 120 GigaComps/sec
GTX512/RSX ~ as above ~ 128 GigaComps/sec
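If you want to check the arithmetic yourself, here's a quick Python sketch of the sums above. The function name gigacomps and the reading of each bracket as "comps per clock per unit" are my own labels for illustration, not anything official, and I've left GTX512/RSX out because I didn't spell out its factors above.

```python
# Rough sanity check of the figures above.
# Assumption: each figure is (comps per clock per unit) x (units) x (clock in GHz).
def gigacomps(per_clock, units, ghz):
    return per_clock * units * ghz

chips = {
    "P4": gigacomps(2 + 4, 1, 3.8),
    "A64": gigacomps(3 + 4, 1, 2.8),
    "Xenon": gigacomps(2 + 4, 3, 3.2),
    # CELL is one PPE plus seven SPEs, so the two parts are summed.
    "CELL": gigacomps(2 + 4, 1, 3.2) + gigacomps(4, 7, 3.2),
    "Xenos": gigacomps(5, 48, 0.5),
}

for name, value in chips.items():
    print(f"{name}: ~{value:.1f} GigaComps/sec")
```

The unrounded values come out a touch different from the rounded ones I quoted, e.g. 22.8 for the P4 and 108.8 for CELL.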
An important metric missing from these numbers is branch prediction capability. Roughly, from best to worst:
Out-of-order CPUs (P4, A64) > in-order CPUs (Xenon, CELL) > SM3.0 GPUs (Xenos, GTX512/RSX)