Is the 360 cpu limited?

Pozer · Mar 13, 2007

*let's not turn this into a Xenon vs Cell flame thread*

I've noticed while playing lots of GRAW2 that the textures and geometry look terrific but the frame rate stays in the 20-30fps range. The game almost feels like it's CPU limited. Now, one unoptimized game does not mean a system design failure. But I can't help remembering John Carmack's remarks about the Xenon being performance-wise close to an Athlon XP 3200.

Just want to spark some discussion.

Cheezdoodles · Mar 13, 2007

Every console is CPU limited....

John Carmack also said that the PS3 was slightly more powerful than the X360. So that would put the Cell slightly faster than an AMD XP 3200.

Shifty Geezer · Mar 13, 2007

A bad thread with a terrible answer...

Consoles are either CPU limited or GPU limited depending on what you're trying to do. If you want to add more graphics, but your GPU is already saturated, you're GPU limited. If you're wanting to run more code but the CPU is already full up, you're CPU limited. If you're wanting to create more graphics but the CPU can't create the graphics content fast enough, you're CPU limited.

As for Cell being as fast as an AMD XP 3200, let's ignore all the real-world benchmarks out there shall we, and just extrapolate a rough equivalency from one developer talking about entire systems instead of individual components.

Pozer · Mar 13, 2007

I guess a better title would be... "Can the Xenon keep up with the Xenos?"

It certainly didn't take long for this thread to degrade.

Shifty Geezer · Mar 13, 2007

It depends what you're trying to do. If you're creating procedural content on the CPU, and shading it simply on the GPU in maybe a cel shader, then chances are the GPU will spend a lot of time twiddling its fingers. Alternatively, if your models and animation are simple enough that the CPU doesn't need to do too much work, but your graphics are shaded extensively with advanced lighting processes, you'll find the CPU hanging around waiting for the GPU to draw the frame. The games that make the most of the hardware will balance the workload so that both are occupied, where you're neither GPU-bound nor CPU-bound, but hardware-bound, at the limits of what the hardware can do.
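To put rough numbers on that, here is a minimal sketch. It assumes the CPU and GPU work on a frame overlap perfectly, so the frame time is simply whichever side takes longer. The function name `classify_frame`, the 10% `tolerance`, and the millisecond figures are all made up for illustration and are not measurements from any game.

```python
def classify_frame(cpu_ms, gpu_ms, tolerance=0.1):
    """Classify one frame as CPU-bound, GPU-bound, or balanced.

    Assumption: CPU and GPU run fully in parallel, so the frame
    takes as long as the slower of the two.
    """
    frame_ms = max(cpu_ms, gpu_ms)
    fps = 1000.0 / frame_ms
    # Treat the two as balanced when they differ by no more than
    # `tolerance` (a hypothetical 10%) of the frame time.
    if abs(cpu_ms - gpu_ms) <= tolerance * frame_ms:
        verdict = "balanced"
    elif cpu_ms > gpu_ms:
        verdict = "CPU-bound"
    else:
        verdict = "GPU-bound"
    return frame_ms, fps, verdict


# Hypothetical workloads: heavy procedural content, heavy shading,
# and a well-balanced frame.
for cpu_ms, gpu_ms in [(40.0, 20.0), (10.0, 33.0), (30.0, 31.0)]:
    print(cpu_ms, gpu_ms, classify_frame(cpu_ms, gpu_ms))
```

In the first case the GPU idles for half of every frame, so adding more shading work would cost nothing, while any extra CPU work lowers the frame rate directly.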

one · Mar 13, 2007

How does GRAW2 compare with GRAW1?

[maven] · Mar 13, 2007

Favourably.

Pozer · Mar 13, 2007

one said:
How does GRAW2 compare with GRAW1?

The explosions look a lot better. The animations look terrific. When an explosion goes off next to an enemy or in multiplayer it looks like something out of a Rambo movie as they get a little air (not in an overdone ragdoll physics way). Shoot someone in the leg and watch them fall onto it.

The levels are a lot more intricate and are huge! Lots of little nooks and crannies, lots of vehicles and vegetation. The lighting system is a mixed bag. It looks terrific but makes it near impossible to see enemies in dense jungle levels. Too much detail, with a strange LOD system and moving shadows on weird-looking bushes, makes everything look like peppered coleslaw far away. On non-jungle levels it's a great effect.

There is a Dam level that has so much geometry to it, it's amazing. But when you play it at 15fps you kinda wish they had removed some stuff. The netcode is improved too.

Deleted member 7537 · Mar 13, 2007

Is the 1MB of L2 Cache enough for the three G5 cores?

Carl B · Mar 13, 2007

jayco said:
Is the 1MB of L2 Cache enough for the three G5 cores?

The XeCPU is not composed of G5 cores.

almighty · Mar 13, 2007

jayco said:
Is the 1MB of L2 Cache enough for the three G5 cores?

They're not G5 cores, and IMO I don't think 1MB of cache between 3 cores is enough at all.

Rangers · Mar 13, 2007

almighty said:
They're not G5 cores, and IMO I don't think 1MB of cache between 3 cores is enough at all.

It's enough, considering the whole point of Xenos is lots and lots of execution units, but the tradeoff is to make things hard on the programmers.

Cell even moreso.

I'm rather becoming interested in GRAW2, is it easy? I thought the first one was too hard, therefore not enjoyable. I keep hearing mixed messages about the difficulty of part 2.

pjbliverpool · Mar 14, 2007

Rangers said:
It's enough, considering the whole point of Xenos is lots and lots of execution units, but the tradeoff is to make things hard on the programmers.

Cell even moreso.

I'm rather becoming interested in GRAW2, is it easy? I thought the first one was too hard, therefore not enjoyable. I keep hearing mixed messages about the difficulty of part 2.

Xenon isn't really flush with execution units as I understand it. In fact all 3 cores combined are only roughly equal to an A64 in terms of separate units (I think, it's been a while since I looked).

Rangers · Mar 14, 2007

pjbliverpool said:
Xenon isn't really flush with execution units as I understand it. In fact all 3 cores combined are only roughly equal to an A64 in terms of separate units (I think, it's been a while since I looked).

Hmm well, it's a pretty big chip so what is all the size for?

Anyway, I'm just going by what I've been told by reading Beyond 3D, etc :smile:

chris100 · Mar 14, 2007

pjbliverpool said:
Xenon isn't really flush with execution units as I understand it. In fact all 3 cores combined are only roughly equal to an A64 in terms of separate units (I think, it's been a while since I looked).

From Capcom's Lost Planet technology document: they say that if one uses the XCPU's six threads well, one can get about 4x the performance of a Pentium 4 HT 2.8GHz. A fixed architecture always leaves more opportunity and room for improving code performance and tuning specifically for ONE system.

However, I think the most disappointing aspect is that the XCPU does not use OOO.

pjbliverpool · Mar 14, 2007

chris100 said:
From Capcom's Lost Planet technology document: they say that if one uses the XCPU's six threads well, one can get about 4x the performance of a Pentium 4 HT 2.8GHz. A fixed architecture always leaves more opportunity and room for improving code performance and tuning specifically for ONE system.

However, I think the most disappointing aspect is that the XCPU does not use OOO.

Do you have a link to that?

Anyway, if they did say it, it's obviously a massive distortion of the truth. Perhaps Xenon could get that kind of performance in one super specific and highly optimised scenario, but in general it wouldn't even approach that level of performance.

3dilettante · Mar 14, 2007

pjbliverpool said:
Xenon isn't really flush with execution units as I understand it. In fact all 3 cores combined are only roughly equal to an A64 in terms of separate units (I think, it's been a while since I looked).

Perhaps in unit count, but not in functionality.
Since Xenon has an additional int unit, there is a total of six full integer units for three cores.
A64 has three int units and three AGUs, which limits the full range they can be applied to.

For scalar FP, each VMX unit can issue one math op and one memory op.
A64 can issue one ADD + MUL + MEM.
Over three cores, Xenon can manage 3 math and 3 store ops.
A64 in one core could handle a max of 3 ops of the prescribed mix, period.

The load/store unit on A64 can handle two ops. I don't know about Xenon, but if each core's load/store can only handle one op, it's still more than A64.

Each Xenon core has its own L1, so from a cache perspective, things could be interesting. If Xenon is fully dual or pseudo-dual ported in its data caches, it would have three times the cache porting of an A64. Otherwise, a single-ported cache would leave Xenon with an additional port. Since bank conflicts can restrict A64 to a single access, Xenon's advantage is probably greater. (for the single-ported or true dual-ported instances)

On fully threaded code, Xenon is also capable of a sustained instruction issue of six instructions per clock, while A64 can only manage three.
The instructions aren't equivalent, since reg/mem ops would count as two instructions under PowerPC, but code that tries to avoid such traffic would lead to a mix more favorable to Xenon.

Of course, the catch is that A64 can throw all that hardware at one thread, while Xenon can't. Similarly, hiccups in execution are more easily hidden by A64.
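A back-of-the-envelope sketch of that catch, using only the peak issue figures quoted above (six per clock over three Xenon cores, three per clock for A64). The helper names are made up, and the sketch assumes threads are spread one per core before doubling up. It counts peak issue slots only and says nothing about sustained throughput.

```python
XENON_CORES = 3
XENON_ISSUE_PER_CORE = 2  # six per clock over three cores, per the post
A64_ISSUE = 3             # single core, three per clock, per the post


def peak_issue_xenon(threads):
    """Peak instructions issued per clock with `threads` runnable threads.

    Assumption: threads are placed one per core first, so a core's
    issue slots are only available once it has a thread to run.
    """
    busy_cores = min(threads, XENON_CORES)
    return busy_cores * XENON_ISSUE_PER_CORE


def peak_issue_a64(threads):
    """A64 can aim its whole issue width at a single thread."""
    return A64_ISSUE if threads >= 1 else 0


for n in (1, 2, 3, 6):
    print(n, "threads:", peak_issue_xenon(n), "vs", peak_issue_a64(n))
```

With one thread the A64 comes out ahead on paper (3 vs 2). Xenon only pulls ahead once at least two cores have work, which is the whole argument for threading on a fixed platform.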

inefficient · Mar 14, 2007

pjbliverpool said:
Do you have a link to that?

Anyway, if they did say it, it's obviously a massive distortion of the truth. Perhaps Xenon could get that kind of performance in one super specific and highly optimised scenario, but in general it wouldn't even approach that level of performance.

It was a misunderstanding. This is the actual quote:

Though some say the performance of the Xbox 360 CPU is not very good, according to Capcom, the performance of a single core of the Xbox 360 CPU is 2/3 of the Pentium 4 with the same clock speed. When SMT is fully exploited, about 4 times larger performance can be observed. In terms of PC it's comparable with 4 SMT threads in a dual-core Pentium 4 Extreme Edition 840 (3.2GHz).

http://forum.beyond3d.com/showpost.php?p=919807&postcount=116

So, a single Xenon core ~= 2/3 the speed of a Pentium 4 at the same clock rate.

And Xenon w/ 6 threads on all 3 cores (fully exploited) ~= 4 threads on a dual-core 3.2GHz Pentium 4 EE.

The 4x number was the speed of all 3 cores running 6 threads in SMT vs a single solo Xenon core running a single thread. The significance of that is that 3 cores is not just 3x faster, it is about 4x faster thanks to SMT.

Of course, these are just the results Capcom was able to obtain, not necessarily the absolute limits of the hardware.
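For anyone who wants the arithmetic spelled out, here is a minimal sketch using only the two ratios from the Capcom quote. Treating them as cleanly multipliable is an assumption made for illustration, and the variable names are invented.

```python
from fractions import Fraction

# Figures as relayed in the quote above (Capcom's results, not hard limits).
single_core_vs_p4 = Fraction(2, 3)  # one Xenon core vs a Pentium 4, same clock
full_chip_vs_single_core = 4        # 3 cores / 6 threads vs 1 core / 1 thread
cores = 3

# How much each core gains from running its second SMT thread.
smt_gain_per_core = Fraction(full_chip_vs_single_core, cores)

# Whole chip expressed in "single-threaded Pentium 4 at the same clock" units.
full_chip_vs_p4 = single_core_vs_p4 * full_chip_vs_single_core

print("SMT gain per core:", smt_gain_per_core)       # 4/3
print("Full chip vs one P4 thread:", full_chip_vs_p4)  # 8/3
```

So the quoted figures work out to roughly a one-third gain per core from SMT, and a fully loaded Xenon at around 2.7 times a single Pentium 4 thread at the same clock, which is a long way from the "4x Pentium 4" reading.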

pjbliverpool · Mar 14, 2007

Rangers said:
Hmm well, it's a pretty big chip so what is all the size for?

Anyway, I'm just going by what I've been told by reading Beyond 3D, etc :smile:

If I'm correct, Xenon is a 165m-transistor chip with 1MB of L2.

Compare that to an A64 with 106m at 1MB L2 and I guess it's fairly big, although some of that size is going to come from inter-core communication, I would expect.

However, compared to the 233m transistors of a 1MB Athlon X2, it's pretty small.

3dilettante · Mar 14, 2007

inefficient said:
The 4x number was the speed of all 3 cores running 6 threads in SMT vs a single solo Xenon core running a single thread. The significance of that is that 3 cores is not just 3x faster, it is about 4x faster thanks to SMT.

Of course that depends on whether the code being run behaves well with SMT. As part of a fixed platform, Xenon can probably expect this more often.
