AMD to announce Phenom X3 during IDF?

ShaidarHaran · Sep 19, 2007

pjbliverpool said:
You've got to be kidding? Just because Xenon runs at a higher clock speed doesn't mean anything for its performance. I am pretty sure a Xenon core is a LOT slower per clock than even a P4, so a comparison to a Phenom is way out there. You only have to look at its lack of cache, lack of OOOE, lack of execution resources (other than SIMD) and 2-issue width to see why clock speed is a poor measure of comparison to any modern desktop CPU.

I recall from a pretty old set of tests Xenon benching at around 1.6GHz G5 territory (at best) for single-threaded code, which is actually pretty good but clearly far slower than a 2GHz Phenom with a much higher IPC (than G5).

Sure, Xenon will be heavily optimised for in a console environment, but if Phenom were in its place the same would hold true for it too. I'm sure all that cache and those SSE3 units can be made to do some pretty nifty things if coded to specifically.

That's not to suggest Phenom would be better as a console CPU, of course. As we all know, the impracticalities of heat, power, cost and, most obviously of all, time prevent that from being remotely possible.

Good job failing to see the forest for the trees. Look at the big picture, then come back and re-post.

Of course clockspeed has an impact on performance. This shouldn't even have to be explained...

OoO has little impact on non-branchy code performance (i.e., the vast majority of game code), or did you forget about Cell? Remember, we're talking about a console here, not a general-purpose PC.

Single-thread performance means very little in a closed architecture platform designed around a multi-thread MPU (6!), with thread-friendly software optimized further for thread parallelization.

pjbliverpool · Sep 19, 2007

ShaidarHaran said:
Of course clockspeed has an impact on performance. This shouldn't even have to be explained...

Clock speed has an impact on performance when looking within the same architecture. Using it as a comparison point across architectures is next to useless, and given how different Xenon's architecture is to a desktop CPU it should be obvious that it's a very poor measure. Much worse, in fact, than trying to claim a P4 is faster than a Core 2 because it runs at a higher clock speed.

ShaidarHaran said:
OoO has little impact on non-branchy code performance (i.e., the vast majority of game code), or did you forget about Cell? Remember, we're talking about a console here, not a general-purpose PC.

What does Cell have to do with it? The vast majority of Cell's power comes from its SPUs and LS. The PPU is by far its weakest component.

And who said game code isn't branchy? We aren't talking about rendering graphics here like the Cell might be tasked to do. In the 360 that's Xenos's job. Some of Xenon's biggest tasks will be AI and physics, both of which can be extremely branchy by their very nature. Lack of OOOE, small cache and deep pipelining will all hurt a Xenon core here, while a Phenom core would be in its element.

ShaidarHaran said:
Single-thread performance means very little in a closed architecture platform designed around a multi-thread MPU (6!), with thread-friendly software optimized further for thread parallelization.

Of course single-threaded performance still means something. You can still be limited by a primary thread. But that's beside the point since both CPUs have 3 cores. Surely you're not suggesting that Xenon has double the effective power per core simply because each core is capable of two hardware threads?

I'm quite certain that 360 games could benefit a lot more from 3 very fast, out-of-order threads than 6 very slow in-order threads. How many games even use all 6 threads, and then how many of them are just helper threads or threads for non-critical, low-performance tasks like audio?

I just can't believe that you're suggesting a 2 years more advanced, far larger, far more expensive state-of-the-art desktop CPU would actually be slower than Xenon if properly utilised.

ShaidarHaran · Sep 19, 2007

pjbliverpool said:
Clock speed has an impact on performance when looking within the same architecture. Using it as a comparison point across architectures is next to useless, and given how different Xenon's architecture is to a desktop CPU it should be obvious that it's a very poor measure. Much worse, in fact, than trying to claim a P4 is faster than a Core 2 because it runs at a higher clock speed.

I'm giving you a "big picture" example. You were implying IPC is the only measure of performance. I'm saying that IPS is a better measure, given that a low-IPC, high-clock CPU can outperform the opposite, if the clock is high enough.
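
To put the IPS point in concrete terms, here is a minimal sketch of the arithmetic, assuming the usual simplification that throughput is IPC times clock. The function names and every IPC figure are made up for illustration; they are not measurements of Xenon, Phenom, or any real chip.

```python
def instructions_per_second(ipc: float, clock_hz: float) -> float:
    """Throughput modelled as instructions per cycle times cycles per second."""
    return ipc * clock_hz


def break_even_clock(ipc_low: float, ipc_high: float, clock_high_hz: float) -> float:
    """Clock the low-IPC chip needs in order to match the high-IPC chip."""
    return ipc_high * clock_high_hz / ipc_low


# Hypothetical chips: the IPC values below are placeholders, not benchmarks.
chip_a = instructions_per_second(ipc=0.5, clock_hz=3.2e9)  # low IPC, high clock
chip_b = instructions_per_second(ipc=1.0, clock_hz=2.0e9)  # high IPC, low clock

print(f"chip A: {chip_a:.2e} IPS, chip B: {chip_b:.2e} IPS")
print(f"chip A needs {break_even_clock(0.5, 1.0, 2.0e9) / 1e9:.1f} GHz to match chip B")
```

Which chip wins depends entirely on the IPC ratio you plug in, which is the part nobody in this thread has hard numbers for.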

pjbliverpool said:
What does Cell have to do with it? The vast majority of Cell's power comes from its SPUs and LS. The PPU is by far its weakest component.

What does Cell have to do with it? Please. Stop and read what I'm saying and think about it in context before you respond again. Cell was used as an example of another In-Order console processor that is capable of achieving very high IPS, despite the lack of OoO. You can't see the relevance?

pjbliverpool said:
And who said game code isn't branchy? We aren't talking about rendering graphics here like the Cell might be tasked to do. In the 360 that's Xenos's job. Some of Xenon's biggest tasks will be AI and physics, both of which can be extremely branchy by their very nature.

Physics is not branchy, otherwise GP CPUs would be the only thing capable of physics processing in real-time. Clearly this is not the case, since Cell, GPUs, and PhysX hardware all excel at physics calculations, none of which are known for their branching performance relative to GP CPUs.

pjbliverpool said:
Lack of OOOE, small cache and deep pipelining will all hurt a Xenon core here, while a Phenom core would be in its element.

And run hotter, and be missing half the hardware threads of Xenon, and be more expensive to manufacture... Big picture, man.

pjbliverpool said:
Of course single-threaded performance still means something. You can still be limited by a primary thread.

In a closed architecture, software developers can count on a specific set of hardware. In this case, that's 6 thread concurrent execution. A tri-core phenom can only "emulate" this behavior with time slicing, and it's still not concurrent execution.

pjbliverpool said:
But that's beside the point since both CPUs have 3 cores. Surely you're not suggesting that Xenon has double the effective power per core simply because each core is capable of two hardware threads?

I'm suggesting that unless a tri-core Phenom were the original design target for XB360, software would not run as well on a redesigned 360 with a tri-core Phenom in place of Xenon. Seeing as how tri-core Phenoms are not yet available and XB360 has been out for almost 2 years, I'm going to say this is pointless mental masturbation.

pjbliverpool said:
I'm quite certain that 360 games could benefit a lot more from 3 very fast, out-of-order threads than 6 very slow in-order threads.

Speculation.

pjbliverpool said:
How many games even use all 6 threads, and then how many of them are just helper threads or threads for non-critical, low-performance tasks like audio?

You'd have to sample XB360 devs to get a meaningful answer to that question. XB360's design team targeted 6 threads and told devs to expect to utilize them. Any dev that doesn't code to this target is leaving performance on the table and not following XB360 software development guidelines.

pjbliverpool said:
I just can't believe that you're suggesting a 2 years more advanced, far larger, far more expensive state-of-the-art desktop CPU would actually be slower than Xenon if properly utilised.

I'm not saying it'd be slower, what I am saying is that your idea, while interesting, is rather pointless when taken in context of reality, for all the reasons already mentioned.

pjbliverpool · Sep 19, 2007

ShaidarHaran said:
I'm giving you a "big picture" example. You were implying IPC is the only measure of performance. I'm saying that IPS is a better measure, given that a low-IPC, high-clock CPU can outperform the opposite, if the clock is high enough.

Sure it can. But your assumption that Xenon's clock speed is sufficient to overcome its significantly lower IPC is clearly baseless.

ShaidarHaran said:
What does Cell have to do with it? Please. Stop and read what I'm saying and think about it in context before you respond again. Cell was used as an example of another In-Order console processor that is capable of achieving very high IPS, despite the lack of OoO. You can't see the relevance?

Just because they are both IOE does not mean you can even remotely use one to gauge the performance of the other.

Like I said, Cell's performance comes mainly from its SPUs and LS, and even then it's performance in workloads that Xenon may not necessarily be handling in the 360 due to the split of work with Xenos. I'm speaking of course about Cell being used as a crutch to RSX for graphics rendering.

Cell's performance in the workloads we have seen it excelling in has pretty much zero bearing on Xenon's performance in its own workloads within the 360.

ShaidarHaran said:
Physics is not branchy, otherwise GP CPUs would be the only thing capable of physics processing in real-time. Clearly this is not the case, since Cell, GPUs, and PhysX hardware all excel at physics calculations, none of which are known for their branching performance relative to GP CPUs.

Of course physics is branchy. One thing hits another thing; are you saying the thing that was hit can only fly off in one direction? Of course not.

Modern GPUs (the ones touted for physics performance) can handle branching just fine, and you have no evidence whatsoever that the Ageia PPU doesn't also have decent branching performance. Cell is the exception but partially makes up for that with huge number-crunching power.

Huge number-crunching power that Xenon lacks.

ShaidarHaran said:
And run hotter, and be missing half the hardware threads of Xenon, and be more expensive to manufacture... Big picture, man.

I suggest you read my posts more carefully. I already said at the start of this line of conversation, and at least once again since, that I'm not suggesting Phenom could be used in place of Xenon. Time, cost and power all make that impossible. This is purely a theoretical discussion about performance. One I didn't seriously expect anyone to actually argue in favour of Xenon for.

As for the hardware threads, they are still using the same resources. Doubling your threads does not double your resources. Is the P4 faster than a Core 2 because it has Hyperthreading? No. And the difference between a Phenom and a Xenon core is far greater than between a Core 2 and a P4 core.

ShaidarHaran said:
In a closed architecture, software developers can count on a specific set of hardware. In this case, that's 6 thread concurrent execution. A tri-core phenom can only "emulate" this behavior with time slicing, and it's still not concurrent execution.

I'm suggesting that unless a tri-core Phenom were the original design target for XB360, software would not run as well on a redesigned 360 with a tri-core Phenom in place of Xenon. Seeing as how tri-core Phenoms are not yet available and XB360 has been out for almost 2 years, I'm going to say this is pointless mental masturbation.

That's why I said originally that I'm not suggesting replacing Xenon today. I was speculating what would have been possible on the 360 if it originally used a Phenom X3, and I have zero doubt that its total potential performance would have been much greater. The lack of SMT notwithstanding.

ShaidarHaran said:
Speculation.

Based on the fact that Phenom is quite obviously a vastly superior architecture on a per-core basis to Xenon. Based also on the fact that it's 2 years more advanced and contains more than double the transistors. Based also on the fact that previous benchmarks and developer comments about Xenon have placed its performance below that of an Athlon X2, which has only 2 cores, each of which is slower than a Phenom core.

Need I go on?

Any assertion that Xenon would be faster on the other hand is speculation based on nothing.

ShaidarHaran said:
You'd have to sample XB360 devs to get a meaningful answer to that question. XB360's design team targeted 6 threads and told devs to expect to utilize them. Any dev that doesn't code to this target is leaving performance on the table and not following XB360 software development guidelines.

No analysis of thread usage I have ever seen has claimed to be using 6 threads of equal power maxing out the cores. Generally you are talking about 2 or 3 primary threads and then 2 or 3 low-performance helper threads, which is probably exactly what Xenon allows. It's not like 2 threads = 2x the performance of 1 thread, and thus using 6 threads in no way means you are accessing twice the power you would if you only used one thread per core.
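
A rough sketch of that last point, assuming a deliberately crude model in which the second hardware thread on a core adds only a fraction (`smt_gain`) of a full thread's throughput because both threads share the same execution resources. The function names and the 0.25 figure are hypothetical, picked only to show the shape of the argument, not taken from any Xenon measurement.

```python
def core_throughput(single_thread: float, threads: int, smt_gain: float) -> float:
    """Throughput of one core running `threads` hardware threads.

    Assumed model: each thread beyond the first contributes only `smt_gain`
    of a full thread, since the threads share one set of execution units.
    """
    return single_thread * (1 + smt_gain * (threads - 1))


def chip_throughput(cores: int, single_thread: float,
                    threads_per_core: int, smt_gain: float) -> float:
    """Total throughput across all cores under the same assumed model."""
    return cores * core_throughput(single_thread, threads_per_core, smt_gain)


# Hypothetical tri-core chip, normalised so one thread on one core = 1.0.
one_thread_per_core = chip_throughput(3, 1.0, threads_per_core=1, smt_gain=0.25)
two_threads_per_core = chip_throughput(3, 1.0, threads_per_core=2, smt_gain=0.25)

print(one_thread_per_core, two_threads_per_core)
```

With any `smt_gain` below 1.0, going from 3 to 6 threads yields less than double the throughput; how much less is workload-dependent.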

ShaidarHaran said:
I'm not saying it'd be slower, what I am saying is that your idea, while interesting, is rather pointless when taken in context of reality, for all the reasons already mentioned.

I stated it as an interesting idea to begin with, not something to start an argument of Xenon vs Phenom which I agree is totally pointless given the inevitability of the conclusion.

ShaidarHaran · Sep 19, 2007

pjbliverpool said:
Sure it can. But your assumption that Xenon's clock speed is sufficient to overcome its significantly lower IPC is clearly baseless.

Just because they are both IOE does not mean you can even remotely use one to gauge the performance of the other.

Like I said, Cell's performance comes mainly from its SPUs and LS, and even then it's performance in workloads that Xenon may not necessarily be handling in the 360 due to the split of work with Xenos. I'm speaking of course about Cell being used as a crutch to RSX for graphics rendering.

Cell's performance in the workloads we have seen it excelling in has pretty much zero bearing on Xenon's performance in its own workloads within the 360.

Of course physics is branchy. One thing hits another thing; are you saying the thing that was hit can only fly off in one direction? Of course not.

Modern GPUs (the ones touted for physics performance) can handle branching just fine, and you have no evidence whatsoever that the Ageia PPU doesn't also have decent branching performance. Cell is the exception but partially makes up for that with huge number-crunching power.

Huge number-crunching power that Xenon lacks.

I suggest you read my posts more carefully. I already said at the start of this line of conversation, and at least once again since, that I'm not suggesting Phenom could be used in place of Xenon. Time, cost and power all make that impossible. This is purely a theoretical discussion about performance. One I didn't seriously expect anyone to actually argue in favour of Xenon for.

As for the hardware threads, they are still using the same resources. Doubling your threads does not double your resources. Is the P4 faster than a Core 2 because it has Hyperthreading? No. And the difference between a Phenom and a Xenon core is far greater than between a Core 2 and a P4 core.

That's why I said originally that I'm not suggesting replacing Xenon today. I was speculating what would have been possible on the 360 if it originally used a Phenom X3, and I have zero doubt that its total potential performance would have been much greater. The lack of SMT notwithstanding.

Based on the fact that Phenom is quite obviously a vastly superior architecture on a per-core basis to Xenon. Based also on the fact that it's 2 years more advanced and contains more than double the transistors. Based also on the fact that previous benchmarks and developer comments about Xenon have placed its performance below that of an Athlon X2, which has only 2 cores, each of which is slower than a Phenom core.

Need I go on?

Any assertion that Xenon would be faster on the other hand is speculation based on nothing.

No analysis of thread usage I have ever seen has claimed to be using 6 threads of equal power maxing out the cores. Generally you are talking about 2 or 3 primary threads and then 2 or 3 low-performance helper threads, which is probably exactly what Xenon allows. It's not like 2 threads = 2x the performance of 1 thread, and thus using 6 threads in no way means you are accessing twice the power you would if you only used one thread per core.

I stated it as an interesting idea to begin with, not something to start an argument of Xenon vs Phenom which I agree is totally pointless given the inevitability of the conclusion.

You're discussing performance, I'm discussing feasibility. At no point have I stated Xenon is faster than tri-core Phenom. You're arguing against an argument of your own creation. I'm arguing against your misunderstanding of my argument. 'Tis a rather vicious and pointless exercise.

ninelven · Sep 19, 2007

Many posts earlier...

pjbliverpool said:
The size, cost, and heat of Phenom are the primary reasons for its being a poor console chip.

ShaidarHaran said:
But then you have to account for Xenon's higher clock. In the end Xenon still comes out on top.

ShaidarHaran said:
You're discussing performance, I'm discussing feasibility.

/laughing on the inside

ShaidarHaran · Sep 19, 2007

ninelven said:
Many posts earlier...

/laughing on the inside

Yes, I remember that now. Poor choice of words on my part. I didn't mean performance-wise, just that in the context of Xenon already being the target of XB360 code it is the best choice for XB360 code, of course. I still maintain that 6 threads > 3 when you're developing code with 6 threads in mind and maximizing utilization of the available hardware.

Of course Phenom tri-core is faster than Xenon at just about any workload. It's not in the XB360 nor will it ever be, so an exact comparison cannot be made. Thus this is just mental masturbation. If anyone wants to create an Xbox 360 with a tri-core Phenom in it and come back with some performance numbers though, be my guest.

Arnold Beckenbauer · Sep 21, 2007

Tim Sweeney on triple-core CPUs: http://extreme.pcgameshardware.de/showthread.php?t=1183

Neb · Sep 21, 2007

ninelven said:
Many posts earlier...

/laughing on the inside

You nailed that one down!
