What if x360 used a dual core AMD or Intel proc. instead...

Tahir2 said:
AMD could not sell the main CPU to any console manufacturer at this time. Even if AMD sold the AMD FX-60 at $100 a piece it could not do it. AMD sells everything it produces in its Dresden Fab, both Fab30 and Fab36. It will continue to sell everything it makes for the foreseeable future. Buying an AMD processor for an expected multi-million selling console is an investment into a new AMD fab and this is something that neither MS nor Sony nor Nintendo would do for a third party.

And neither would AMD allow a third party like Sony or Microsoft or Nintendo to build a fab to produce its processor technology. I could see AMD selling a third party its processors but that is about all. Far too risky to allow anyone to learn how to make your processors on the manufacturing side... this is not a 12 month cycle product - this is a generation of processors that lasts for 5 years.

Anyway why would you need an FX-60 in there at 2.6GHz with such high power and heat requirements? Even an X2-4200 would do nicely with 1MB of total cache and a 2.2GHz processor speed.

Anyway it didn't happen so I guess we can stop dreaming.

If I remember correctly, Sony actually made some investment in IBM's East Fishkill facility as a part of the CELL deal. At various points, AMD has also had deals with IBM (K6) and UMC (Athlon) for manufacturing, though these have never really gone through.
 
ERP said:
I actually disagree with this, the issue isn't "multithreading", the issue is in-order execution vs out-of-order. I suspect PCs won't ever forgo OOOe on their path towards parallelism; in that arena it would be a totally nonsensical decision. In the console space it's justifiable to some extent.

We're actually in agreement then. I read your first reply with the tradeoff with OOOe for more ALUs as questionable. I'm sure you'd prefer 3 OOOe cores as opposed to 3 IOe cores. But with around 165 million transistors, would you prefer 2 G5s @ 2 GHz or 3 PPEs @ 3.2 GHz? And in 5 years?
 
Jaws said:
We're actually in agreement then. I read your first reply with the tradeoff with OOOe for more ALUs as questionable. I'm sure you'd prefer 3 OOOe cores as opposed to 3 IOe cores. But with around 165 million transistors, would you prefer 2 G5s @ 2 GHz or 3 PPEs @ 3.2 GHz? And in 5 years?

I'm not going to get into processor A vs B, and it wouldn't be a 1 for 1 trade.

I don't even have a preference, like every other dev you just deal with what you've got.

But there are a lot of "arm chair devs" on these boards who look at numbers and spec sheets and Oooh and Aaaahh and don't realise how much these cores have traded away to get the raw numbers where they are.

Unlike many speculators on the board I do not consider Cell or X360 CPUs to be "miles ahead of Intel and AMD" or architectural leaps. They have chosen a very radical set of tradeoffs.
 
edit:

Missed a sentence that made everything else not make sense and had to put it in.

end edit:

Xenon:

In-order execution
threads flip flop in execution on a core
1 instruction per thread or 2 instructions for 1 thread = 2 instr issue/clock
3 cores
1 MB cache

Cell:
IoE
PPE core - like a Xenon core but not exactly
SPE - IoE
1 thread per SPE - issue 2 instr/clock
512K cache for the PPE
256 KB SRAM local store for each SPE

Quite frankly it's obvious these CPUs are very dissimilar to common desktop parts. It should be obvious they require different approaches to coding for them. There aren't libraries of code sitting around for them, mature highly efficient compilers, optimized tool chains etc....

Was it...or is it even reasonable to expect they would handle x86 code with the same efficiency? I mean really. One could argue Cell nearly requires a change in coding philosophy altogether!

I don't know how anyone could do anything BUT expect less than optimal performance in this situation. It boggles my mind.
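
To make the in-order vs out-of-order difference a bit more concrete, here is a minimal toy sketch, not a model of Xenon, Cell or any real CPU. It assumes a single-issue core, and the instruction names and latencies (a 10-cycle load standing in for a cache miss) are hypothetical. The in-order core may only issue the oldest pending instruction; the out-of-order core may issue the oldest instruction whose inputs are ready.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    name: str
    deps: tuple   # names of instructions whose results this one needs
    latency: int  # cycles until the result is available


def run(program, out_of_order):
    """Return the cycle at which the last result is ready (single-issue toy core)."""
    done_at = {}             # name -> cycle when that result becomes available
    pending = list(program)  # kept in program order
    cycle = 0
    while pending:
        # In-order: only the oldest pending instruction may issue.
        # Out-of-order: the oldest pending instruction with ready inputs may issue.
        candidates = pending if out_of_order else pending[:1]
        for ins in candidates:
            if all(d in done_at and done_at[d] <= cycle for d in ins.deps):
                done_at[ins.name] = cycle + ins.latency
                pending.remove(ins)
                break  # at most one issue per cycle
        cycle += 1
    return max(done_at.values())


# Hypothetical program: a slow load, one dependent use, then independent work.
PROGRAM = [
    Instr("load_a", (), 10),
    Instr("use_a", ("load_a",), 1),
    Instr("b1", (), 1),
    Instr("b2", ("b1",), 1),
    Instr("b3", ("b2",), 1),
]

in_order_cycles = run(PROGRAM, out_of_order=False)
ooo_cycles = run(PROGRAM, out_of_order=True)
print("in-order:", in_order_cycles, "cycles; out-of-order:", ooo_cycles, "cycles")
```

In this toy case the in-order core sits idle behind the slow load while the out-of-order core fills the gap with the independent work. On an in-order part that reordering has to come from the compiler or the programmer, which is the "different approach to coding" being talked about here.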

---------------------------------------------------------------

I have some difficulty with how I think power is being looked at in this discussion. I feel that power is being confused with performance here where what's really being asked is whether a dual core AMD or Intel chip would have been a better performer in the next gen consoles.

Here's how I see it. Power is only an element of performance.

Performance = Power + Efficiency + Effort/Skill/Talent

Performance is the end result...how awesome or underwhelming the results on screen are. (be they games or otherwise but since we're talking about consoles...)

Power is what the hardware can do. It's no more complex than that. Power is what the "honest" numbers tell you the hardware is capable of...it's just the theoreticals.

Efficiency is the other aspect of a hardware's design. It's the measure of how well the HW is put together. It's how much the HW does on its own to perform well when put to task.

Effort, Skill and Talent is the human factor. HW will do nothing without a person directing it to do something useful. HW by itself is worthless. What the HW does for a person or group of individuals, up to what the HW is capable of and does well by default, is driven by the amount of effort put into getting it to do something useful. If the person(s) have skill, it can take less effort and/or define where effort should be focused at all. If the person(s) have talent, even less effort is needed to the same ends, or it can be that the HW is used to do useful things others cannot achieve despite their skilled and best efforts.

---------------------------------------------

MS, Sony and Nintendo each had different results in mind when thinking about the hardware they would obtain for their next consoles. They all sought the best/correct performance for the cost they were willing to expend.

Would a dual core AMD or Intel chip be a better performer in their consoles than what they presently have? Well let's see what Cell and Xenon have going for them.

Power-wise --- yes. We are all familiar with the theoretical capabilities of Xenon and Cell.

Efficiency-wise --- no. In-order execution. A high ratio of threads to available cache, or disjoint, incoherent on-chip memory. Limited branching support tied with single-thread execution on the SPEs in Cell.

Effort/Skill/Talent-wise --- Yes. It is hard to measure the human factor but I think history and the environment in which these CPUs will be placed give us a guideline. It was only last generation, I believe, that console CPUs featured OoOe processors, and before then developers still got good results in their games. In a more general sense history shows console developers code to the HW's capabilities and are successful in bringing the HW's potential out. This is particularly true of PS2 developers in recent history. This is driven by competition for good results on a fixed spec and, for the more respectable, a drive to advance their craft...while they're making money. It is very significant that console developers have no hard limit on how far their efforts to push the hardware can go. PC developers do push the boundaries but overall must temper their efforts to coincide with an acceptable level of power the average PC gamer will have in their rig. PC developers also would have to expend much more effort to optimize for dozens and dozens of different PC parts and then a nearly endless number of different configurations at the HW and SW level. This is why history shows PC developers do not push HW to its max...doing so is a daunting task.

This is not the PC domain. There is no giant code base to go against. There is no huge HW installed base to consider. MS, Sony and Nintendo have as always the option to start completely anew with everything this generation and developers fall in line. I would say this has not been the case in the PC domain for a very very long time.

The way I see it in relative terms it's:

AMD/Intel
less performance = less power + greater efficiency + equal effort (The necessary effort will be made)

vs.

Xenon/Cell
more performance = more power + less efficiency + equal effort (The necessary effort will be made)

I think Sony/MS have gambled that the human factor is greater than the efficiency of the hardware so that it was better to go for more powerful HW that should be able to hold its own several years down the road vs. PC CPUs. This at a cost comparable to or less than using a dual core AMD/Intel chip that would be outclassed six months to a year from now tops in the PC domain.
 
ERP said:
Unlike many speculators on the board I do not consider Cell or X360 CPUs to be "miles ahead of Intel and AMD" or architectural leaps. They have chosen a very radical set of tradeoffs.

Careful, soon enough someone will come out and brand you a lazy bastard if you continue to shatter those people's preconceptions. :|
 
ban25 said:
At various points, AMD has also had deals with IBM (K6) and UMC (Athlon) for manufacturing, though these have never really gone through.
And they currently have one with Chartered, with which they share/export their manufacturing techniques, i.e. process and automated precision manufacturing. However it seems that Chartered is focusing on a low-power process so the only chips we're likely to see from that plant (IF we get to see them at all) are mobile/Turion.
 
maaoouud said:
And they currently have one with Chartered, with which they share/export their manufacturing techniques, i.e. process and automated precision manufacturing. However it seems that Chartered is focusing on a low-power process so the only chips we're likely to see from that plant (IF we get to see them at all) are mobile/Turion.

For the record though, 'Turion' chips aren't actually different architecturally from Athlon 64 - Turion is just a platform name as Centrino is. Unlike Intel's Pentium M and Pentium 4, AMD's mobile and desktop (and server) chips are all K8.
 
ERP said:
I'm not going to get into processor A vs B, and it wouldn't be a 1 for 1 trade.

I don't even have a preference, like every other dev you just deal with what you've got.

But there are a lot of "arm chair devs" on these boards who look at numbers and spec sheets and Oooh and Aaaahh and don't realise how much these cores have traded away to get the raw numbers where they are.

Unlike many speculators on the board I do not consider Cell or X360 CPUs to be "miles ahead of Intel and AMD" or architectural leaps. They have chosen a very radical set of tradeoffs.
Hmmm, I see many of these AMD/Intel CPUs choking (single digits, 1-2 fps) on the physics of a few cloned objects (Half-Life 2). I'm not sure how modern the CPUs were, but probably not too old (in any case, a few months isn't going to give a 15-30x performance jump...). Anyway, Cell has shown itself able to deal with what looked like far, far more: about a hundred objects, which seemed to me to be of similar geometric complexity, plus fluid physics, projectile physics, cloth physics and advanced EyeToy functionality. I don't know if it was handling the graphics calculations during that demo as well, but it probably was, all at a super smooth, super high framerate... all while significantly underclocked compared to the final PS3 version.

The trade-offs may be radical, but it seems to me they're worth it when you look at the results in the apt application. Throw programmer ease out the window, and compromise performance/features here and there, but deliver the power, that's the right way to do things. People have to learn they're gonna have to earn it ;)
 
Just 1 ?

Will the massive use of middleware like UE3 make life easier in this regard (assuming it is optimised for in-order execution on Cell/Xenon), or will devs need to do all the work again (or even just part of the work) for each game?

I also wonder how the CPUs at the end of the year will do for games, considering that we will receive new architectures ...
 
Blazkowicz_ said:
PowerPC is just an instruction set. G5 is not faster than Athlon64 for instance. And x86 have been RISC underneath for ten years, giving them the speed of RISC with the code compactness of CISC. But it doesn't matter that much in the end.

Keep in mind, X360 core is not G5. Imagine a VIA C3 at 3.2ghz, or a 486 at 3.2ghz with SSE2.. not that sexy looking eh?

Now that the run to higher frequency has ended, we went from meaningless MHz to meaningless gigaflops..
though the new consoles have at least the required bandwidth.

in-order is a 12 year leap backwards, though the chip will really fly with a heavy number of threads (say 6). But to have so many threads well balanced and coordinated for gaming.. what a headache it will be. Looks like a good CPU for servers or supercomputing.

OK, let's go step by step:

a) "PowerPC is just an instruction set". According to William Stallings in his "Computer Organization and Architecture" (I'm translating since my book isn't in English): "The PowerPC architecture descends directly from the IBM 801, the RT PC and the RS/6000, which is also called an implementation of the POWER architecture. The first implementation of the PowerPC architecture, the 601, has a superscalar design very similar to the RS/6000"

b) "G5 is not faster than Athlon64 for instance. And x86 have been RISC underneath for ten years, giving them the speed of RISC with the code compactness of CISC". Certainly the custom PowerPCs of the 360 are far more recent processors than G5s. It is true that AMD and Intel have implemented RISC techniques, but the x86 architecture is not specifically designed to take full advantage of RISC, while PowerPC does just that.

c) "Keep in mind, X360 core is not G5. Imagine a VIA C3 at 3.2ghz, or a 486 at 3.2ghz with SSE2.. not that sexy looking eh? :)" That's a useless comparison.

d) "Now that the run to higher frequency has ended, we went from meaningless MHz to meaningless gigaflops..though the new consoles have at least the required bandwidth." Neither of them were or are meaningless, they are a matter of fact, and if you can come up with better ways of measuring theoretical performance I would be happy to hear them.

e) "in-order is a 12 year leap backwards :), though the chip will really fly with a heavy number of threads (say 6). But to have so many threads well balanced and coordinated for gaming.. what a headache it will be. Looks like a good CPU for servers or supercomputing"
Multicores unfortunately aren't just the future for supercomputers but for processors in general. How is this something that is so bad with the 360 processor?
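
For what it's worth, the "headache" with many threads can be sketched with Amdahl's law, which bounds the speedup when only part of a workload scales across threads. This is a minimal illustration only: the 6 threads come from the quoted post, while the parallel fractions are hypothetical and say nothing about any real game engine.

```python
def amdahl_speedup(parallel_fraction, threads):
    """Upper bound on speedup when only `parallel_fraction` of the work scales with threads."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / threads)


# 6 threads as in the quoted post; the parallel fractions are hypothetical.
for fraction in (0.5, 0.8, 0.95):
    print(f"{fraction:.0%} parallel -> {amdahl_speedup(fraction, 6):.2f}x on 6 threads")
```

The serial part dominates quickly, so how well the work is split across threads matters as much as how many hardware threads the chip offers.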

I own an A64 2800+. I think it could barely do 2 gigaflops even overclocked.
 
Looking at this, one could easily deduce that a top-of-the-line AMD/Intel chip would be a very bad option: we can see a chip with ~30M transistors in logic (+ ~120M in cache (2 MB) = ~150M total) at 2 GHz beating one that has ~110M in logic (+ 120M in cache = ~230M) at 3.2 GHz, and still being almost as good as one with even more transistors in logic (probably in the memory controller, I guess), although with less cache, and more bandwidth.

This probably means that each of those CPUs uses a lot of silicon that is useless for games, so it would be a total waste. Maybe we can still ask if they did the best design.

PS: more benchmarks
 
...but then Carmack could sing the praises for the console he had always dreamed for, and then make/port games effortlessly! :p
 
> a) "PowerPC is just an instruction set". According to William Stallings in his "Computer Organization and Architecture" (I'm translating since my book isn't in English): "The PowerPC architecture descends directly from the IBM 801, the RT PC and the RS/6000, which is also called an implementation of the POWER architecture. The first implementation of the PowerPC architecture, the 601, has a superscalar design very similar to the RS/6000"

What's your point?

> b) "G5 is not faster than Athlon64 for instance. And x86 have been RISC underneath for ten years, giving them the speed of RISC with the code compactness of CISC". Certainly the custom PowerPCs of the 360 are far more recent processors than G5s. It is true that AMD and Intel have implemented RISC techniques, but the x86 architecture is not specifically designed to take full advantage of RISC, while PowerPC does just that.

The cores in the 360 are far less complex than the PPC 970...2-issue, no OOOE, weak branch predictor, etc. Their IPC is significantly lower, and I think if you look, you'll find plenty of benchmarks to support this.
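
As a rough illustration of why clock speed alone doesn't settle this, here is a minimal sketch of the usual run-time relation, time = instructions / (IPC x clock). The IPC values and the two "cores" below are hypothetical, picked only to show the shape of the tradeoff; they are not measurements of Xenon or the PPC 970.

```python
def run_time_seconds(instructions, ipc, clock_hz):
    """Time to retire `instructions` at a sustained `ipc` and `clock_hz`."""
    return instructions / (ipc * clock_hz)


# Hypothetical numbers, for illustration only.
WORK = 1_000_000_000  # instructions in the workload

narrow_fast = run_time_seconds(WORK, ipc=0.5, clock_hz=3.2e9)  # simple in-order core
wide_slow = run_time_seconds(WORK, ipc=1.0, clock_hz=2.0e9)    # wider out-of-order core

print(f"narrow/fast core: {narrow_fast:.3f} s")
print(f"wide/slow core:   {wide_slow:.3f} s")
```

With these made-up inputs the lower-clocked core finishes first, which is why sustained IPC on real code is the number worth arguing about.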

Secondly, this RISC/CISC debate is very much a grey area. Now clearly, a modern x86 implementation like the Athlon 64 will very easily beat the pants off any other architecture in single-threaded integer workloads. If you look at floating-point, the POWER5+ is king of the hill in SPECfp, but Opteron is really not that far off -- especially from a price/performance ratio.

> c) "Keep in mind, X360 core is not G5. Imagine a VIA C3 at 3.2ghz, or a 486 at 3.2ghz with SSE2.. not that sexy looking eh? :)" That's a useless comparison.

No, but imagine something like a Pentium MMX (P55) at 3.2 GHz -- i.e. superscalar and in-order.

> d) "Now that the run to higher frequency has ended, we went from meaningless MHz to meaningless gigaflops..though the new consoles have at least the required bandwidth." Neither of them were or are meaningless, they are a matter of fact, and if you can come up with better ways of measuring theoretical performance I would be happy to hear them.

Hz is not a measure of performance, only frequency.

> I own an A64 2800+. I think it could barely do 2 gigaflops even overclocked.

Theoretical peak is 7.4 single-precision GFLOPS.
 
ban25 said:
> a) "PowerPC is just an instruction set". According to William Stallings in his "Computer Organization and Architecture" (I'm translating since my book isn't in English): "The PowerPC architecture descends directly from the IBM 801, the RT PC and the RS/6000, which is also called an implementation of the POWER architecture. The first implementation of the PowerPC architecture, the 601, has a superscalar design very similar to the RS/6000"

What's your point?

> b) "G5 is not faster than Athlon64 for instance. And x86 have been RISC underneath for ten years, giving them the speed of RISC with the code compactness of CISC". Certainly the custom PowerPCs of the 360 are far more recent processors than G5s. It is true that AMD and Intel have implemented RISC techniques, but the x86 architecture is not specifically designed to take full advantage of RISC, while PowerPC does just that.

The cores in the 360 are far less complex than the PPC 970...2-issue, no OOOE, weak branch predictor, etc. Their IPC is significantly lower, and I think if you look, you'll find plenty of benchmarks to support this.

Secondly, this RISC/CISC debate is very much a grey area. Now clearly, a modern x86 implementation like the Athlon 64 will very easily beat the pants off any other architecture in single-threaded integer workloads. If you look at floating-point, the POWER5+ is king of the hill in SPECfp, but Opteron is really not that far off -- especially from a price/performance ratio.

> c) "Keep in mind, X360 core is not G5. Imagine a VIA C3 at 3.2ghz, or a 486 at 3.2ghz with SSE2.. not that sexy looking eh? :)" That's a useless comparison.

No, but imagine something like a Pentium MMX (P55) at 3.2 GHz -- i.e. superscalar and in-order.

> d) "Now that the run to higher frequency has ended, we went from meaningless MHz to meaningless gigaflops..though the new consoles have at least the required bandwidth." Neither of them were or are meaningless, they are a matter of fact, and if you can come up with better ways of measuring theoretical performance I would be happy to hear them.

Hz is not a measure of performance, only frequency.

> I own an A64 2800+. I think it could barely do 2 gigaflops even overclocked.

Theoretical peak is 7.4 single-precision GFLOPS.

a) Isn't what I mean obvious? He said that PowerPC was "just an instruction set". It is a completely different architecture.

b)"The cores in the 360 are far less complex than the PPC 970...2-issue, no OOOE, weak branch predictor, etc. Their IPC is significantly lower, and I think if you look, you'll find plenty of benchmarks to support this" I would like to see some evidence for this.

c)"Hz is not a measure of performance, only frequency." That's something you should tell whoever put MHz and gigaflops on the same level.

d) According to the ScienceMark test I just ran, single precision would be around 5.9 (double precision 1.8) gigaflops, but even if it were 7.8 that doesn't compare to over 30 gigaflops for EACH core.
 
thekey said:
a) Isn't what I mean obvious? He said that PowerPC was "just an instruction set". It is a completely different architecture.

b)"The cores in the 360 are far less complex than the PPC 970...2-issue, no OOOE, weak branch predictor, etc. Their IPC is significantly lower, and I think if you look, you'll find plenty of benchmarks to support this" I would like to see some evidence for this.

c)"Hz is not a measure of performance, only frequency." That's something you should tell whoever put MHz and gigaflops on the same level.

d) According to the ScienceMark test I just ran, single precision would be around 5.9 (double precision 1.8) gigaflops, but even if it were 7.8 that doesn't compare to over 30 gigaflops for EACH core.

Evidence of the architectural details or the performance? Because the architectural details are common knowledge so you really should find that yourself. And with knowledge of those details certain conclusions do seem obvious.

p.s. you should be comparing the Pentium 955EE in SP SSE3 FLOPs to Cell or Xenon. Then you should consider what Carmack said about them not approaching their theoretical peaks in anything but trivial benchmarks. I think the results should end up fairly similar, and that's in the area where Cell/Xenon excel. But after all, common sense could have predicted that, since how on earth could the trailing horse suddenly take a commanding lead when faced with the challenges of low power/heat output, low cost and huge mass production with a brand new architecture?
 
> a) Isn't what I mean obvious? He said that PowerPC was "just an instruction set". It is a completely different architecture.

I don't think you understand the terminology here. Xenon is an implementation of the PowerPC Instruction Set Architecture (i.e. the part visible to the programmer).

> b)"The cores in the 360 are far less complex than the PPC 970...2-issue, no OOOE, weak branch predictor, etc. Their IPC is significantly lower, and I think if you look, you'll find plenty of benchmarks to support this" I would like to see some evidence for this.

Take an EE class or at least inform yourself of the Xenon architecture.

> d) According to the ScienceMark test I just ran, single precision would be around 5.9 (double precision 1.8) gigaflops, but even if it were 7.8 that doesn't compare to over 30 gigaflops for EACH core.

I said *theoretical* performance. Your ScienceMark test (GEMM, I suppose) is measuring actual performance with all the constraints that includes (like memory bandwidth).
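
The gap between theoretical and measured numbers can be sketched with a simple bound: peak FLOPS is clock x FLOPs per cycle x cores, while a memory-bound kernel can go no faster than bandwidth x FLOPs done per byte moved. The sketch below is only an illustration; every input value is hypothetical and none is a spec for the A64, Xenon or Cell.

```python
def peak_gflops(clock_ghz, flops_per_cycle, cores=1):
    """Theoretical peak: every unit busy on every cycle."""
    return clock_ghz * flops_per_cycle * cores


def attainable_gflops(peak, bandwidth_gb_s, flops_per_byte):
    """Upper bound once memory traffic is taken into account."""
    return min(peak, bandwidth_gb_s * flops_per_byte)


# Hypothetical chip and kernel, for illustration only.
peak = peak_gflops(clock_ghz=2.0, flops_per_cycle=4)
bound = attainable_gflops(peak, bandwidth_gb_s=6.4, flops_per_byte=0.5)
print(f"theoretical peak: {peak:.1f} GFLOPS, bandwidth-limited bound: {bound:.1f} GFLOPS")
```

A benchmark result sitting well under the peak figure is therefore expected, and the same caveat applies to any chip whose peak number is being quoted.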
 
But after all, common sense could have predicted that, since how on earth could the trailing horse suddenly take a commanding lead when faced with the challenges of low power/heat output, low cost and huge mass production with a brand new architecture?

X86s are designed for branchy integer code and they're faster than anyone else on it.
These processors were not designed for that sort of work. For vector processing it's the x86 processors that are the trailing horse.

The developers who do vector work on Macs are not exactly happy with the switch to Intel as even optimised SSE only gets around half the performance of VMX code.
 
you should be comparing the Pentium 955EE in SP SSE3 FLOPs to Cell or Xenon. Then you should consider what Carmack said about them not approaching their theoretical peaks in anything but trivial benchmarks. I think the results should end up fairly similar, and that's in the area where Cell/Xenon excel. But after all, common sense could have predicted that, since how on earth could the trailing horse suddenly take a commanding lead when faced with the challenges of low power/heat output, low cost and huge mass production with a brand new architecture?

The P4 you're describing can pull off 12-15GFlop/s while the Xenon can pull off 83GFlop/s and the Cell even an astounding 155GFlop/s.
So how on Earth would you call this fairly similar?
Bear in mind these are sustained max results by both Cell and Xenon and not peak results.
Cell's peak was even 199-200GFlop/s, which could only be reached under certain conditions, hence peak.
For Xenon there weren't any peak measurements; I only read they got 83GFlop/s, but I think they could reach something like 100GFlop/s in peak performance.

Thinking the Xenon and especially Cell are not much better than their PC and PowerPC counterparts is really dumb.
Xenon is so powerful it calculates normal gaming worlds with complex AI, physics, 5.1 sound (on its own) and on top of that assists the GPU.
Just do 5.1 sound on your dual core PC and see how performance comes crashing down.
Or just let your CPU run some terrain demos like the CPU demos in 3DMark 2006 and see your CPU crash down to 1fps or even less.
Now let's move on to Cell.
There are videos floating out there of Cell processing a medical image (I forgot the name of the company but frequent visitors know what I'm talking about) against a supercomputer, and while Cell processes the image very quickly the supercomputer takes a very long time.
Also on IBM there are benchmarks of 1 SPE against a PowerPC 970 at 2.7GHz where the SPE is always much faster and in some things is even 50 times faster.
Cell or Xenon worse than an Amd 64 or P4?
I don't think so.
 
It would be trivial to create a situation where Cell and Xenon would perform 50x slower than a P4 or A64.. one example would be running Windows XP. ;)

To all without a funny bone I'm kidding!!
 