ihamoitc2005 said:
XeCPU has 6 threads sharing just 1MB L2, average of 183kb/thread leading to higher latency GDDR3 which sounds like cache-misses waiting to happen. I would be very surprised if anyone got decent performance using all the cores.
Each thread won't necessarily be the same size. You will have small threads and large threads. With an SPE, for example, you get 256K of local store, and that is it. So if you have two "applets", one 128K and the other 384K, you are in trouble: the 384K applet does not fit, and the 128K applet leaves memory unused.
With Xenon you have more flexibility in this regard. Further, cores can easily share and work on the same information. There is also a danger of equating threads with cores. Some developers have said that the second thread on a core will frequently be doing work of the same nature as the first.
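To put rough numbers on the applet example, here is a minimal Python sketch. It is only a toy model: the function names are made up for illustration, and it assumes one applet per SPE local store with no streaming of data in and out, and treats the shared L2 as a simple pool (a real cache does not hard-fail when a working set is too big, it just misses more often).

```python
SPE_LOCAL_STORE_KB = 256   # per-SPE local store, as stated above
XENON_L2_KB = 1024         # shared 1MB L2, from the quoted post
XENON_THREADS = 6          # hardware threads, from the quoted post


def fits_fixed(applets_kb, store_kb=SPE_LOCAL_STORE_KB):
    """One applet per fixed-size store: (fits, unused_kb) for each applet."""
    return [(size <= store_kb, max(store_kb - size, 0)) for size in applets_kb]


def fits_shared(applets_kb, pool_kb=XENON_L2_KB):
    """All applets draw from one shared pool: fits if the total is within it."""
    return sum(applets_kb) <= pool_kb


applets = [128, 384]  # the two applet sizes from the example, in KB

fixed_result = fits_fixed(applets)     # 128K fits with 128K spare, 384K does not fit
shared_result = fits_shared(applets)   # 512K total against a 1024K pool
even_share_kb = XENON_L2_KB / XENON_THREADS  # even split per thread, about 171K

print(fixed_result, shared_result, round(even_share_kb, 1))
```

Note that an even split of 1MB across 6 threads works out to about 171K, a little under the figure in the quote, but the point is that nothing forces an even split in the first place.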
The two philosophies are different. What I have noticed is that there are clear lines drawn... e.g. you stick your nose up at the idea of the Xenon getting decent performance, yet the same thing has been said about CELL.
As a developer told me, multithreading is hard work, and each processor offers its own twist on how to solve that problem. There will be areas where each excels and where each fails, and it won't be easy on either one.
The search feature returns some good information (although far too much to read in even a week, and it is cluttered with a lot of trash). Needless to say, some devs have spoken up, and the CELL model does pose some hurdles and problems, especially regarding the kinds of data it works well with; if the PPE is running your OS and delegating to the SPEs, that does not leave a lot of extra room to work with.
I think this thread shows that there are different opinions, and even more that each architecture will benefit certain tasks and favor different programmers. CELL is very nice for a PS2 dev who has had to work hard with the EE. Xenon is similar to the model PCs have moved toward, has more research behind it, and will appeal to PC developers.
Ultimately it will come down to tools. Your AAA dev houses have the money, time, people, and tools to crack the case. The real issue is that 98% of your developers are NOT AAA guys. They are good--and they make great games!--but they don't have the same advantages as the big guys with 250-400 member teams and 5 development studios to share information, code, and resources. Developers need tools to help them get the most out of BOTH platforms.
So whoever makes their platform most approachable to the most developers, and allows them to get the most performance out of their work, wins a major victory. OBVIOUSLY that answer won't be the same for every dev or every title. The tradeoff of work/power may be great in one title but insufficient in another.
So tools and architecture are both important factors, more so than "peak" performance. Which reminds me of a similar scenario on the PC. Architectures that are similar, but not the same, can return results that contradict the theoretical performance of a chip. The quote below is a good example of this:
Another way to look at this comparison of flops is to look at integer add
latencies on the Pentium 4 vs. the Athlon 64. The Pentium 4 has two double
pumped ALUs, each capable of performing two add operations per clock, that's
a total of 4 add operations per clock; so we could say that a 3.8GHz Pentium
4 can perform 15.2 billion operations per second. The Athlon 64 has three
ALUs each capable of executing an add every clock; so a 2.8GHz Athlon 64
can perform 8.4 billion operations per second. By this silly console
marketing logic, the Pentium 4 would be almost twice as fast as the Athlon
64, and a multi-core Pentium 4 would be faster than a multi-core Athlon 64.
Any AnandTech reader should know that's hardly the case. No code is
composed entirely of add instructions, and even if it were, eventually the
Pentium 4 and Athlon 64 will have to go out to main memory for data, and
when they do, the Athlon 64 has a much lower latency access to memory than
the P4. In the end, despite what these horribly concocted numbers may lead
you to believe, they say absolutely nothing about performance. The exact
same situation exists with the CPUs of the next-generation consoles; don't
fall for it.
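
For anyone who wants to check the arithmetic in that quote, here is a quick Python sketch. The function name is just something I made up for illustration, and the ALU counts and clocks are the ones given in the quote. It only reproduces the "peak adds" figure, which, as the quote says, tells you nothing about real performance.

```python
def peak_adds_millions(clock_mhz, alus, adds_per_alu_per_clock):
    """Theoretical peak integer adds per second, in millions.

    Ignores everything that matters in practice: instruction mix,
    dependencies, cache misses, and memory latency.
    """
    return clock_mhz * alus * adds_per_alu_per_clock


# Figures as given in the quote: two double-pumped ALUs at 3.8GHz
# versus three single-rate ALUs at 2.8GHz.
p4 = peak_adds_millions(3800, alus=2, adds_per_alu_per_clock=2)
a64 = peak_adds_millions(2800, alus=3, adds_per_alu_per_clock=1)
ratio = p4 / a64

print(p4 / 1000, "billion vs", a64 / 1000, "billion, ratio", round(ratio, 2))
```

On paper that is 15.2 billion against 8.4 billion, a ratio of about 1.8, which is where the "almost twice as fast" claim comes from.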