Theoretical Performance is not everything

Acert93 · Jun 29, 2005

Anand's recent article brought up something worth discussing. I have made this point before, but I believe it is frequently lost, and Anand gave yet another good example, so I think it bears exploring some more:

http://www.anandtech.com/video/showdoc.aspx?i=2461&p=5

Anand said:
Another way to look at this comparison of flops is to look at integer add latencies on the Pentium 4 vs. the Athlon 64. The Pentium 4 has two double pumped ALUs, each capable of performing two add operations per clock, that's a total of 4 add operations per clock; so we could say that a 3.8GHz Pentium 4 can perform 15.2 billion operations per second. The Athlon 64 has three ALUs each capable of executing an add every clock; so a 2.8GHz Athlon 64 can perform 8.4 billion operations per second. By this silly console marketing logic, the Pentium 4 would be almost twice as fast as the Athlon 64, and a multi-core Pentium 4 would be faster than a multi-core Athlon 64. Any AnandTech reader should know that's hardly the case. No code is composed entirely of add instructions, and even if it were, eventually the Pentium 4 and Athlon 64 will have to go out to main memory for data, and when they do, the Athlon 64 has a much lower latency access to memory than the P4. In the end, despite what these horribly concocted numbers may lead you to believe, they say absolutely nothing about performance. The exact same situation exists with the CPUs of the next-generation consoles; don't fall for it.
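To make the "marketing math" in that quote concrete, here is a minimal sketch that reproduces the arithmetic. The function name is made up for illustration; the ALU counts, adds per clock and clock speeds are the ones Anand quotes. It only computes the theoretical peak, which is exactly the number the quote warns against reading too much into.

```python
def peak_ops_per_second(alus, adds_per_alu_per_clock, clock_ghz):
    """Theoretical peak integer adds, in billions per second.

    This is a paper number only: it assumes every ALU issues an add
    on every clock and never waits on memory.
    """
    return alus * adds_per_alu_per_clock * clock_ghz


# Figures taken from the quoted article.
p4_peak = peak_ops_per_second(alus=2, adds_per_alu_per_clock=2, clock_ghz=3.8)
a64_peak = peak_ops_per_second(alus=3, adds_per_alu_per_clock=1, clock_ghz=2.8)

print(f"Pentium 4 3.8GHz: {p4_peak:.1f} billion adds/s")
print(f"Athlon 64 2.8GHz: {a64_peak:.1f} billion adds/s")
print(f"Paper advantage for the P4: {p4_peak / a64_peak:.2f}x")
```

The ratio comes out a bit over 1.8x, which is the "almost twice as fast" claim. Nothing in this calculation knows anything about instruction mix or memory latency.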

Other examples include the Xbox1 and PS2 (the PS2 had 2x the floating point performance), GPUs (which frequently do not behave in games the way theoretical FLOPs or shader ops would have us believe) and so forth.

I really think far too much has been made of certain metrics. This gen it is FLOPs, with the constant bickering over 218 GFLOPs versus 115 GFLOPs.

Before this generation it was polygons per second, before that it was MIPS and bits, and so forth. Every generation has a buzzword that people slice, dice and present as the be-all and end-all of how said platform should perform in the real world. Yet time and time again we find that these "uber stats" paint a very limited picture of real-world scenarios.

Multithreading is going to be a pain this generation. The 360 (3 physical cores) and PS3 (8 physical cores) are demanding a LOT from developers to maximize their potential. Further, the PPC cores in these chips are not very robust or feature rich and the SPEs are even less so.

But there is a lot of power in there. Obviously both MS and Sony believe more FLOPs were needed (they just disagree on the balance of GP and FLOPs performance). But measuring these platforms by theoretical peaks in best case scenarios tells us very little about the real world. As in the P4 / Athlon example Anand gives, two very similar chips (x86) with different peaks running the same code can return wildly different results from what we would expect based on a single metric.
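A toy model shows how the chip with the lower peak can still finish first. Every number below is hypothetical and picked only for illustration; none of it is a measurement of any real CPU. The model is deliberately crude: run time is compute time at peak rate plus a fixed stall for every trip to main memory, with no overlap between the two.

```python
def run_time_seconds(n_ops, peak_gops, n_mem_accesses, mem_latency_ns):
    """Crude run time: compute at peak rate plus serialized memory stalls."""
    compute_time = n_ops / (peak_gops * 1e9)
    stall_time = n_mem_accesses * mem_latency_ns * 1e-9
    return compute_time + stall_time


# Hypothetical workload: 1 billion ops, 1% of which go out to main memory.
N_OPS = 1_000_000_000
N_MEM = 10_000_000

# Hypothetical chips: A has the higher paper peak, B has the faster memory path.
time_a = run_time_seconds(N_OPS, peak_gops=15.2, n_mem_accesses=N_MEM, mem_latency_ns=100)
time_b = run_time_seconds(N_OPS, peak_gops=8.4, n_mem_accesses=N_MEM, mem_latency_ns=50)

print(f"Chip A (higher peak):    {time_a:.3f} s")
print(f"Chip B (lower latency):  {time_b:.3f} s")
```

With these made-up inputs the memory stalls dominate, so Chip B wins despite having roughly half the peak. Change the miss rate or the latencies and the result can flip, which is the whole point: the peak number alone does not tell you which way it goes.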

And sometimes we won't know the real story for years. Maybe this can put some of the arguments on this forum into perspective. Just because the XeCPU may have more general purpose performance on paper or the CELL may have more FP performance on paper does not tell us much about how they will do in real life.

About the most telling metric I can find is that the CELL is 50% larger than the XeCPU (250M transistors vs. 165M). That and the 3.2GHz. But those don't tell us much at all.

Carl B · Jun 29, 2005

I agree with the general premise of the post, but that being said, Acert do you have something that gives the official transistor count of the XeCPU as 165 million? I thought that was still in speculation.

Acert93 · Jun 29, 2005

xbdestroya said:
I agree with the general premise of the post, but that being said, Acert do you have something that gives the official transistor count of the XeCPU as 165 million? I thought that was still in speculation.

Not off hand. It was one of the numbers released in the MTV-E3 timeframe and was mentioned in semi-official channels. i.e. not in the spec sheet, but mentioned in the same breath as other information that was accurate.

I know I did not invent it, if that is what you are asking.

I just happen to remember all the numbers that are released... DD1 CELL 234M, DD2 CELL 250M, RSX ~300M, Xenos 232M + 105M, NV40 222M... I don't know what it is, but these types of numbers seem to get stuck in my head :?

Farid · Jun 29, 2005

Care to remember the point of this thread Acert? Because at first sight, you're just stating the obvious.

But you have to remember that we are on a technology enthusiast board and therefore, you can't expect people to wait a few years and collect a lot of information before they express their opinions and speculations on a subject.

Also, I almost skipped your post, Acert. Why did you have to start your thread with "Anand bringed something up worthy of discussion". Why?!?...

Carl B · Jun 29, 2005

LOL, Acert I didn't think you invented it, it's just since there has not been a public airing of the chip - a la Cell - I didn't know if those numbers had been officially released or what the deal was. I know after E3 I read some sites with transistor numbers for the XeCPU, but later on they ended up coming under fire. But if they gave that number during the MTV show, that's good enough for me, probably...

Acert93 · Jun 29, 2005

Vysez said:
Care to remember the point of this thread Acert? Because at first sight, you're just stating the obvious.

I am not sure I follow you here. What point are you saying I should remember? If you are talking about the fact that harping on any single number as an indication of a system's performance is misleading, then no, I am not stating the obvious IMO. I WISH it was obvious, but I have been here long enough to know it is often lost. It is a habitual sin of not only this generation but every generation before it. I thought Anand gave a wonderful example everyone on the forum could conceptually understand and thought it was worthy of a thread to discuss.

But you have to remember that we are on a technology enthusiast board and therefore, you can't expect people to wait a few years and collect a lot of information before they express their opinions and speculations on a subject.

And that is fine, I speculate with the best of them.

I guess my post was a call back to conservatism. Being technology enthusiasts I think we should be more responsible at times... maybe? Or more balanced? I dunno. I try to be. I have taken my fair cracks at MS and Sony although I am on record liking both designs. But I think the tendency is to gravitate to the assumption that a system with "X amount more theoretical performance" is faster. Now it is obviously hard for us to benchmark these things because they almost don't even exist. So SOME speculation is necessary.

But on the other hand there are enough examples of "Chip #1 does 384 Bungholios, and Chip #2 does 1,024 Bungholios, but in real life Chip #1 kills Chip #2. Why? More efficient chip design, it is adapted better to real world situations, more cache, better overall system balance, better memory access, etc..."

Does not really matter WHAT the "Bungholio" is, it is pretty clear every generation has a couple of them. This generation already has FLOPs, DPs, and HDR resolutions.

These metrics are valuable to say the least.

But they are not even real-world performance numbers (it is still all stuff on the drawing board and paper specs). And they don't tell us anything about how the chips will perform in a real-world environment.

The P4-vs-Athlon example Anand gave is a perfect example of this as they are two chips competing in the same market that run the exact same code. And somehow the theoretical champ gets his rear end handed to him in many situations.

I guess it is natural to be optimistic, and very few of us are electrical engineers designing chips and/or have a good idea of which bottlenecks are irrelevant and which are vital (even if they appear trivial on paper). But sometimes I think it is good to be reminded of this point.

Actually the point would have been better made at E3... sorry, my timing sucks.

Also, I almost skipped your post, Acert. Why did you have to start your thread with "Anand bringed something up worthy of discussion". Why?!?...

I only wanted serious posters like you to continue reading.

Anyone who could make it past the Anand reference after his last 2 console articles and actually read the post gets a sticker.

Shifty Geezer · Jun 29, 2005

Acert93 said:
But on the other hand there are enough examples of "Chip #1 does 384 Bungholios, and Chip #2 does 1,024 Bungholios

Your memory fails you. Chip #1 actually does 428 Bungholios, and Chip #2's 1024 Bungholios was for a 16 multiprocessor array.

nelg · Jun 30, 2005

Bungholios give me gas.
