AnandTech article up (X360 vs PS3)

I would like to know how well people (the media more than the public - those who are reiterating MS's 'ours does useful stuff better than Cell' mantra) rate XeCPU's 'general purpose' computing capability. From accounts I've heard it's not too hot (powerful by single-processor standards, but not leaps and bounds; not 3x a G5 by any stretch), and it'll need the same sort of low-level management (such as cache management) as Cell to max it out. Don't know if anyone working on the hardware is allowed to say though, so we'll only hear rumours from unspecified sources.

Theorising really needs more details on the techs, and where Cell's architecture is openly documented, XeCPU's isn't, so we don't know the specific ins and outs needed to make a fair comparison. All we have is MS telling us it's far better than Cell for general purpose, and no real-world general-purpose figures for their own chip. Not much to go on.

cobragt: PS2 didn't do 1080i. It did a 640x540 display and stretched it, interlacing the vertical - a clever trick to double vertical resolution, but no higher resolution than any current console can already manage. This shouldn't be confused with a true 1080i display.
 
mckmas8808 said:
scooby said:
Couple that with the fact that the single main PPE is fairly weak, and I think that's a big potential weakness there: you have a weak PPE with SPEs that are quite limited in what they can and cannot help with.

So Scooby what I get from you is that the CELL processor is pretty much crap. I mean you named almost everything that the CELL is and said it's either weak or limited.

So let me ask you: what should Sony have gone with? Would going with a more MS-like approach have helped? What if they put in a 3.2 GHz P4 chip with 512 MB of memory (kind of like a super beefed-up Xbox 1) - would that have been better? :?:

Not crap at all.

Just not the most wonderful thing since sliced bread, that's all.

It has weaknesses, Xenon does as well.
 
Shifty Geezer said:
cobragt: PS2 didn't do 1080i. It did a 640x540 display and stretched it, interlacing the vertical - a clever trick to double vertical resolution, but no higher resolution than any current console can already manage. This shouldn't be confused with a true 1080i display.

There was no stretching. There was interlacing. The vertical resolution of GT4's 1080i is as high as any 1080i. Only the horizontal resolution was paltry.
 
I just skimmed it. It was blah. If he is right and it will take more work to get the performance out of the Cell than the X360 CPU, we may see a lot of smaller devs using the X360 as their development platform and a lot of the bigger devs using it as their base console.

Might make the console race more interesting
 
The article has little additional data in it and some specific points he made are questionable at best. So I really don't see the point in it other than "me too" journalism. That is, Anand had to have an article about it because the lack of such an article would be a black eye.


Points I dispute:
1) "With each core being able to execute two threads simultaneously, you effectively have a worst case scenario of 6 threads splitting a 1MB L2 cache. As a comparison, the current dual core Pentium 4s have a 1MB L2 cache per core and that number is only expected to rise in the future."

The comparison is flawed. P4s have how many competing threads to run? And the OS interrupting them all the time? Worst case, each thread gets ~170 KB of L2 cache, which is more than the X1 has, and that seems to do okay (the L1 caches are much bigger too).


2) "Regardless of how it is done, obviously the Epic team found the SPEs to be a good match for their physics code, if structured properly, meaning that the Cell processor isn’t just one general purpose core with 7 others that go unused."

Epic is using a physics middleware layer, so the point is kind of dulled. If Sweeney expects SPEs to perform physics work (which I think is reasonable) it's more because Ageia (sp) can make it work, not Epic.


3) "So with the Xbox 360 Microsoft used three fairly simple IBM PowerPC cores, while Sony has the much publicized Cell processor in their PlayStation 3. Both will perform absolutely much slower than even mainstream desktop processors in single threaded game code..."

So... categorizing the XeCPU as he does is a little flippant, but I digress. The main point here is that his claim is very objectionable. Given a single threaded game, I don't see my P4 2.8 GHz creaming the XeCPU or Cell. He needs way more support for this claim to ring true.


4) "With in-order execution as well as a small amount of high speed local memory, memory access becomes quite predictable and code is very easily scheduled by the compiler for the SPEs."

Memory access becomes predictable when the algorithm implemented makes predictable memory accesses. In-order + LS does not do that.
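
For what it's worth, here's a minimal C sketch of the distinction I mean (purely illustrative, nothing SPE-specific): streaming through an array is predictable on any core; pointer chasing isn't, local store or not.

#include <stddef.h>

/* Predictable: streaming through a contiguous array. The addresses are known
   in advance, so the data can be fetched (or DMAed into local store) ahead of
   the math with no guesswork. */
float sum_array(const float *data, size_t n)
{
    float sum = 0.0f;
    for (size_t i = 0; i < n; ++i)
        sum += data[i];
    return sum;
}

/* Unpredictable: chasing pointers through a linked list. Each load depends on
   the previous one, so neither in-order execution nor a local store tells you
   the next address any sooner. */
struct node { float value; struct node *next; };

float sum_list(const struct node *head)
{
    float sum = 0.0f;
    for (const struct node *p = head; p != NULL; p = p->next)
        sum += p->value;
    return sum;
}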


5) "Compilers are horrendously difficult to write; getting a compiler to work is a pretty difficult job in itself, but getting one to work well, regardless of what the input code is, is nearly impossible."

Compilers are definitely complex beasts these days, but I think he overstates the difficulty a bit. Compiling for a single, known CPU has got to be simpler than compiling for the whole range of x86 parts. Between IBM and MS, I would imagine a decent compiler will be available for launch and will get better as time goes on.


6) "On the other hand, looking at all of the early demos we’ve seen of Xbox 360 and PS3 games, not a single one appears to offer better physics or AI than the best single threaded games on the PC today."

Launch games are always fractional representations of what the hardware can do. And anyway, I was pretty amazed at what Heavenly Sword's physics can do. As an aside, is HL2's AI a huge step up from HL1's? This parallels ERP's point.


7) "A single thread is used for all game code, physics and AI and in some cases, developers have split out physics into a separate thread, but for the most part you can expect all first generation and even some second generation titles to debut as basically single threaded games."

Animation is a big deal in terms of CPU time and it's not specifically referenced, which makes me wonder if MS expects it to be spun off into another thread. I was under the impression that at least some PC games do this already, and I would expect it to be a huge win on the 360 where it could have its own core.
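
As a very rough idea of what spinning animation off onto its own thread could look like, here's a toy sketch using plain pthreads (hypothetical function names; not how any actual 360 title is structured, just the shape of the idea):

#include <pthread.h>
#include <stdio.h>

/* One worker thread handles animation while the main thread runs game logic,
   synchronising once per frame. All names here are made up. */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cond = PTHREAD_COND_INITIALIZER;
static int current_frame = 0;    /* frame the worker should process */
static int anim_done_frame = 0;  /* last frame the worker finished */
static int quit = 0;

static void update_animation(int frame) { (void)frame; /* skinning, blending... */ }
static void update_game_logic(int frame) { (void)frame; /* AI, physics, scripts... */ }

static void *anim_thread(void *arg)
{
    (void)arg;
    int last = 0;
    for (;;) {
        pthread_mutex_lock(&lock);
        while (current_frame == last && !quit)
            pthread_cond_wait(&cond, &lock);
        if (quit) { pthread_mutex_unlock(&lock); return NULL; }
        int frame = current_frame;
        pthread_mutex_unlock(&lock);

        update_animation(frame);          /* runs on its own core, in parallel */

        pthread_mutex_lock(&lock);
        anim_done_frame = last = frame;
        pthread_cond_broadcast(&cond);
        pthread_mutex_unlock(&lock);
    }
}

int main(void)
{
    pthread_t worker;
    pthread_create(&worker, NULL, anim_thread, NULL);

    for (int frame = 1; frame <= 3; ++frame) {
        pthread_mutex_lock(&lock);        /* kick off animation for this frame */
        current_frame = frame;
        pthread_cond_broadcast(&cond);
        pthread_mutex_unlock(&lock);

        update_game_logic(frame);         /* meanwhile, game logic on this core */

        pthread_mutex_lock(&lock);        /* wait for animation before rendering */
        while (anim_done_frame != frame)
            pthread_cond_wait(&cond, &lock);
        pthread_mutex_unlock(&lock);
        printf("frame %d done\n", frame);
    }

    pthread_mutex_lock(&lock);
    quit = 1;
    pthread_cond_broadcast(&cond);
    pthread_mutex_unlock(&lock);
    pthread_join(worker, NULL);
    return 0;
}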


8 ) "This means that all the ops per clock could either be dedicated to geometry processing in truly polygon intense scenes. On the flip side (and more likely), any given clock cycle could see all 240 ops being used for pixel processing."

I'm not sure he realizes that the bottleneck on the GPU shifts on an intraframe basis. Games aren't necessarily fillrate bound the entire frame. And even if that's the case, the 360 has a lot of render target bandwidth.


9) "At 720p, the G70 is entirely CPU bound in just about every game we’ve tested, so the RSX should have no problems running at 720p with 4X AA enabled, just like the 360’s Xenos GPU. At 1080p, the G70 is still CPU bound in a number of situations, so it is quite possible for RSX to actually run just fine at 1080p which should provide for some excellent image quality."

This is almost a worthless point. Today's games? Today's PC games are likely to have a fraction of the demand PS3 games will have, especially after launch. I mean, did we judge the Xbox according to Quake 3 and that was it?
 
_phil_ said:
On what authority do you assume that?
Novodex has the fastest collision detection on the market, and is multithreaded. Mark Rein said Novodex told him that, basically, the Cell matches their PPU.
You should not try to be technical if you haven't the knowledge, imo...

I understand the problem isn't that it's multi-threaded but that the SPEs have no branch prediction.

Here's what I'm referring to:

"Collision detection is a big part of what is commonly referred to as “game physics.â€￾ As the name implies, collision detection simply refers to the game engine determining when two objects collide. Without collision detection, bullets would never hit your opponents and your character would be able to walk through walls, cars, etc... among other things.

One method of implementing collision detection in a game is through the use of a Binary Search Partitioning (BSP) tree. BSP trees are created by organizing lists of polygons into a binary tree. The structure of the tree itself doesn’t matter to this discussion, but the important thing to keep in mind is that to traverse a BSP tree in order to test for a collision between some object and a polygon in the tree you have to perform a lot of comparisons. You first traverse the tree to find the polygon you want to test for a collision against. Then you have to perform a number of checks to see whether a collision has occurred between the object you’re comparing and the polygon itself. This process involves a lot of conditional branching, code which likes to be run on a high performance OoO core with a very good branch predictor.

Unfortunately, the SPEs have no branch prediction, so BSP tree traversal will tie up an SPE for quite a bit of time while not performing very well as each branch condition has to be evaluated before execution can continue. However it is possible to structure collision detection for execution on the SPEs, but it would require a different approach to the collision detection algorithms than what would be normally implemented on a PC or Xbox 360."

Anyway, my point was just that once again this is another area that will require more man-hours and development time to exploit.
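
To make the BSP point concrete, here's a rough C sketch of the kind of data-dependent branching such a query does (hypothetical node layout, not any engine's actual format):

/* Hypothetical BSP node layout - purely illustrative.
   Convention: internal nodes have both children >= 0, leaves have them set to -1. */
struct bsp_node {
    float plane[4];               /* splitting plane: ax + by + cz + d = 0 */
    int   front, back;            /* child node indices, or -1 at a leaf */
    int   first_poly, poly_count; /* polygons to test once we reach a leaf */
};

/* Signed distance from a point to the node's splitting plane. */
static float plane_dist(const struct bsp_node *n, const float p[3])
{
    return n->plane[0]*p[0] + n->plane[1]*p[1] + n->plane[2]*p[2] + n->plane[3];
}

/* Walk down to the leaf containing the point. */
int bsp_find_leaf(const struct bsp_node *nodes, int root, const float point[3])
{
    int idx = root;
    while (nodes[idx].front >= 0) {
        if (plane_dist(&nodes[idx], point) >= 0.0f)
            idx = nodes[idx].front;
        else
            idx = nodes[idx].back;
    }
    return idx;   /* caller then tests the leaf's polygons for an actual hit */
}

Every trip around that loop ends in a branch whose direction depends on data that was just loaded, which is exactly the pattern the article says hurts without prediction.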
 
phat said:
Shifty Geezer said:
cobragt: PS2 didn't do 1080i. It did a 640x540 display and stretched it, interlacing the vertical - a clever trick to double vertical resolution, but no higher resolution than any current console can already manage. This shouldn't be confused with a true 1080i display.

There was no stretching. There was interlacing. The vertical resolution of GT4's 1080i is as high as any 1080i. Only the horizontal resolution was paltry.
The image was stretched horizontally to fill the frame: 720 columns (or whatever it is) instead of 1080i's 1920 columns. And though GT4's vertical output was as high as 1080i's, it's not by some magic of pushing the hardware. 1080i's vertical resolution just isn't a huge amount more than SDTV's, which current consoles can already output.

In essence outputting GT4 to 1080i requires little more effort than outputting to progressive SDTV. So it's not a full 1080i image (1920x540 @ 60 fps) but a scaled SDTV image (720x540 or whatever it is @ 60 fps).
 
Novodex told him that, basically, the Cell matches their PPU.

That's what's important to note, I think. And it's not official PR.


You are telling me that without branch prediction you can't climb a tree?
Nice analogy. :D
We need monkey coding then, I guess. Else our evolution is compromised.

Btw, who is still using BSP?
Even in UE3 you won't use it much, if at all. The last BSP-based engine was what? Quake 3?
 
So are you trying to say not having branch prediction is a good thing for implementing collision detection?

I'm really missing your point here.

Lacking branch prediction will force devs to spend MORE time to make the SPEs work well at collision detection, as well as many other tasks. I'm not a dev but this is what I understand.

That was my original point, what's yours?
 
_phil_ said:
I'm not a tech guy, nor are you ;), but the BSP example is a stretch into the tech of the past.

Now, this is my point, for the 3rd time:

For the whole Novodex physics, CELL = PPU.

What PPU though?

We don't know much about the PPUs except that there will be different configurations.

Does Cell = the high-end PPU or the low-end one?

Too cloudy the future is!
 
_phil_ said:
Btw, who is still using BSP?
Even in UE3 you won't use it much, if at all. The last BSP-based engine was what? Quake 3?
Don't confuse BSP as a rendering technique with BSP as a form of spatial hierarchy used to speed up collisions.

A BSP has been in most of the collision systems I've written (last one shipped was Sudeki) and Heavenly Sword inherits one from Havok.
 
ERP said:
Stupid observation of the week.

Better physics and AI and all that.....
Do we really see it much on PCs versus consoles?
I mean a 3+ GHz Pentium is quite a lot faster than a 733 MHz Celeron, but is the gameplay experience that much better?

I guess my point is that all this comparing to PC technology is moot; the single PPE is significantly faster than the PS2 or Xbox processor, so even if the games were completely single-threaded (and decoupling graphics from logic is relatively simple) you should see a pretty significant jump.
Technology has little to do with how fun a game is to play. They can go hand in hand, but are mutually exclusive.
 
Alpha_Spartan said:
Technology has little to do with how fun a game is to play. They can go hand in hand, but are mutually exclusive.

Orthogonal, the word you want is orthogonal.

Mutually exclusive means one comes at the expense of the other. While this is sometimes true, it's not really true in the general case, since you can have good technology used in a fun game. :)
 
DeanoC said:
_phil_ said:
Btw, who is still using BSP?
Even in UE3 you won't use it much, if at all. The last BSP-based engine was what? Quake 3?
Don't confuse BSP as a rendering technique with BSP as a form of spatial hierarchy used to speed up collisions.

A BSP has been in most of the collision systems I've written (last one shipped was Sudeki) and Heavenly Sword inherits one from Havok.

Last I looked Havok was using a KdTree.


But yes, BSP trees have their place.
 
gurgi said:
Blah, blah, blah, all this techno crap is pointless
Then no offense, but you might not enjoy this community very much. Not that gameplay is irrelevant here, but b3d is very much a tech site.
Yeah, a long, long time ago, this place used to be about console development minutiae.
But it's not anymore, so a random poster calling it "techno crap" is more than normal lately. It's actually the new standard, the new wave! Just don't expect Godard, Malle or Truffaut stuff, though.
jvd said:
I just skimmed it. It was blah. If he is right
Since when is Anand a reference when it comes to consoles, new architectures and embedded architectures?
More exactly, since when is Anand a reference when it comes to anything but moneyhats?
 
DeanoC: OK, thanks, I learned something.

JVD: there is only one PPU.


Anand publishes a console article every generation. There was heavy discussion here back then. They don't have a good track record of credibility in the console world.
 
JVD: there is only one PPU.
But the board configurations will be different. Just like video cards.

We may get a budget one with 64 megs, one with 128, one with 256 megs. As I said, it's hard to know which one.
 
Unfortunately, the SPEs have no branch prediction, so BSP tree traversal will tie up an SPE for quite a bit of time while not performing very well as each branch condition has to be evaluated before execution can continue.
Assuming that SPE branch hints are dynamic (which they may very well be ;)), they actually offer potential for better optimization than a hardware predictor in certain situations - because a programmer (and even a compiler, to a lesser extent) has a heck of a lot more domain knowledge of each particular branching situation than a chip does.
As far as branching alone goes, an SPE could potentially outperform a PPE in situations like tree traversal - so long as the data structures fit inside local store, that is.
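
As a small illustration of what programmer-supplied hints buy you, assuming a GCC-style toolchain (hypothetical function; the idea being that a static hint like this can become an explicit hint-for-branch on the SPE rather than relying on hardware guesswork):

/* Sketch only: assumes a GCC-style compiler that understands __builtin_expect. */
#define LIKELY(x) __builtin_expect(!!(x), 1)

/* The programmer knows that in this data set nearly every entry passes the
   test; a hardware predictor would have to learn that, the hint just states it. */
float accumulate_hits(const float *values, const int *flags, int n)
{
    float total = 0.0f;
    for (int i = 0; i < n; ++i) {
        if (LIKELY(flags[i]))
            total += values[i];
    }
    return total;
}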
 