Book: The Race for a New Game Machine

I would say the opposite here, but oh well. :)

Smart as in they got the benefit of some of the Cell R&D, as well as insider knowledge (however unintentional), without forking over a huge investment. In the end, I guess IBM was probably the biggest winner, as it got two revenue streams out of R&D that was largely already done (namely the PPE).

I do remember that MS wanted an OOE core, but was more time-constrained since they wanted to beat the PS3 to market. In retrospect, especially since the PS3 was delayed for a year, I wonder if MS would've waited had they known the PS3 wasn't going to launch in 2005.

Performance-wise, would the gains have been marginal? If I understand correctly, OOE makes it much easier to program for.

Carl B, from other posts I've read here, it really seems that you dislike the Xenon design. Is this mostly related to the PPC-derived architecture?

I'm curious as to what you think should have been done differently with the X360 architecture as a whole. More eDRAM with increased logic, I believe, is a given to eliminate the need to tile. Would the EIB in the PS3 be something that could have been incorporated into the X360?

I'm also curious as to what could have been different for the PS3. Would eDRAM in the PS3 have made sense? Maybe a more advanced GPU and/or shared memory?

To be clear: what could/should have been different with the technology that was available at the time of development?

EDIT: Sorry, you answered some of my questions while I was slowly pondering this post.
 
I'm curious as to what you think should have been done differently with the X360 architecture as a whole. More eDRAM with increased logic, I believe, is a given to eliminate the need to tile. Would the EIB in the PS3 be something that could have been incorporated into the X360?

Two more MB of eDRAM definitely seems like it would have been an obvious spec uptick, so yes to that. Beyond that and a different processor design, I think the 360's system architecture is plenty fine/elegant. And even though I'm sort of 'meh' on the XeCPU, it's all relative, and it's getting the job done today, so honestly I'm not as down on it as I might seem. It just seems like something that was built to take on Cell, when I don't think Cell itself was focused on gaming per se to begin with.

If the STI team had been designing the 'ultimate game CPU,' I don't think Cell is what would have emerged from that at all. The Cell project was more about coming up with a 'Future Processing Demonstrator,' at which I think it was very successful. The problem for Sony was the hard reality of the console economics/development cycle, which by necessity forced the development community to embrace said 'Future Processing' model before much of the industry was ready. Needless to say, a lot of the community doesn't feel compelled to change their practices for the sake of the console with the lowest install base right now, and so Cell best practices have been something emerging from within the Sony core studios and radiating outwards in the form of tools and such. And, yeah... a better PPE would help.

I'm also curious as to what could have been different for the PS3. Would eDRAM in the PS3 have made sense? Maybe a more advanced GPU and/or shared memory?

Well, I definitely would have thought and hoped that, with the year-long delay that came up, the GPU would have been spruced up in the interim. An XDR memory controller on the GPU wouldn't have seemed so impossible either, with a shared memory pool, yes... though the GDDR is cheaper.
 
On the flip side, if the STI team had been designing the 'ultimate game CPU,' I don't think Cell is what would have emerged from that at all. The Cell project was more about coming up with a 'Future Processing Demonstrator,' at which I think it was successful. The problem for Sony was the hard reality of the console economics/development cycle, which by necessity forced the development community to embrace said 'Future Processing' model before much of the industry was ready. Needless to say, a lot of the community doesn't feel compelled to change their practices for the sake of the console with the lowest install base right now, and so Cell best practices have been something emerging from within the Sony core studios and radiating outwards in the form of tools and such.

Wasn't it also IBM et al.'s bet that compilers would be a lot better at optimizing code across multiple cores by now? I was in grad school around the time Cell was announced, and I seem to remember a whole bunch of articles on optimizing/multithreaded/speculative compilers coming out in 2005/2006, but I haven't heard of much concrete since then.
 
Wasn't it also IBM et al.'s bet that compilers would be a lot better at optimizing code across multiple cores by now? I was in grad school around the time Cell was announced, and I seem to remember a whole bunch of articles on optimizing/multithreaded/speculative compilers coming out in 2005/2006, but I haven't heard of much concrete since then.

Yeah, there was the 'Octopiler' project and some other stuff going on; good memory. And yeah, that stuff wasn't where anyone was hoping it would be when the architecture actually launched. In their own SDKs and such, IBM has actually made some pretty significant gains over the last year in terms of their offerings, and the Sony-side tools have improved a lot since launch as well. But certainly this is stuff that would have been nice (and, I think, was hoped for) at launch, and even now the tools don't realize the level/promise of those R&D compiler projects from back in the day.
 
Maybe from a strategy standpoint it seemed smart, but the Cell PPE and the XeCPU cores are so horrid that if that's MS' reward, they're welcome to it. The XeCPU isn't easy to code for because MS "won" these terrible cores; it's easy to code for because it's a traditional homogeneous multi-core layout with MS' dev tools on top.

I think you grossly overstate how "horrid" the chips are to develop for. I'm curious, what development experience on Xenon do you have?

Ironically, MS' 'acquisition' of the PPC tech might actually have been more a benefit to Sony than to MS in the long run, as it allows for easy/basic porting from the 360 to the PS3. If the 360 had been rocking a modern OOE x86 processor, ports from 360 to PS3 could have been quite cumbersome/burdened... likely to Sony's detriment, as general parity in 3rd-party support is to their definite advantage right now.
This is also a huge stretch. The ISA isn't that big a deal when porting video games, as the vast majority of a game is developed in higher-level languages. The OOE facet would make some difference, but I'm not sure it's as big as you think it is, given the compilers and development experience game console developers have now. In fact, I think the 360 may benefit from the in-order cores to some extent, as they permitted additional computational cores and higher clock speeds, and the "penalty" of lacking OOE has been diminished by better compilers (and a closed-box system) with more experienced developers. It was the right decision, in my opinion...
 
Wasn't it also IBM et al.'s bet that compilers would be a lot better at optimizing code across multiple cores by now? I was in grad school around the time Cell was announced, and I seem to remember a whole bunch of articles on optimizing/multithreaded/speculative compilers coming out in 2005/2006, but I haven't heard of much concrete since then.

The gains made since 2004 on this in particular are pretty astounding. Back in '03-'05 I worked at the IBM Toronto software lab with the compiler group. While I wasn't on the optimizing backend team, I worked with them in tuning customers' code (disclaimer: the vast majority was not game-related) and was continually amazed at how much difference even six months of development and research on the optimizing backends made. When I left in '05 it was pretty impressive, and I hear it's gotten even better since then.

This is why I'm of the opinion that the OOE/in-order debate is kind of overblown. Both Cell and Xenon are really fast with the latest compilers...
 
This is why I'm of the opinion that the OOE/in-order debate is kind of overblown. Both Cell and Xenon are really fast with the latest compilers...

That was always my impression: at least the intention was to make simpler cores, get rid of the OOE, and let compilers handle most of the heavy lifting. It's not really my area, but I do recall that there were some things that were really hard to figure out at compile time.
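To make "hard to figure out at compile time" concrete, here's a minimal C sketch (names made up) of the classic culprit, pointer aliasing: unless the programmer promises that the pointers don't overlap, the compiler can't reorder or interleave the memory operations the way an OOE core would at run time.

```c
/* Sketch only: a case where compile-time scheduling is blocked by
 * possible pointer aliasing.  If 'dst' and 'src' may overlap, the
 * compiler must assume every store to dst[i] could change src[i+1]
 * and so cannot interleave iterations; an OOE core hides some of
 * that at run time.  Declaring the pointers 'restrict' (C99)
 * promises they don't alias, letting the compiler schedule and
 * vectorize freely. */
void scale(float *restrict dst, const float *restrict src,
           float k, int n)
{
    for (int i = 0; i < n; ++i)
        dst[i] = src[i] * k;
}
```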

I was also under the impression that a lot of that research wasn't lost, but rolled into optimizing OOE.

Still, I seem to remember that the general posture was trying to put less emphasis on coders learning new techniques and more on compilers taking that weight off their backs, but it's been years and I'm undoubtedly remembering it wrong.
 
That was always my impression: at least the intention was to make simpler cores, get rid of the OOE, and let compilers handle most of the heavy lifting. It's not really my area, but I do recall that there were some things that were really hard to figure out at compile time.
A (very) rough rule of thumb is that adding OOE, with all else being equal, will on average bump performance by 50%.

On poorly optimized code or some favorable parallel loads, OOE does much better, while on highly optimized and straightline code, the advantage can evaporate or in some cases turn into a penalty.
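As a rough, hypothetical illustration of why heavily optimized straight-line code narrows the gap: on an in-order core the compiler (or the programmer) has to interleave independent work so a load's result isn't needed right after it issues, which is exactly the reordering an OOE core does for free in hardware. A minimal C sketch, with made-up names:

```c
/* Sketch only: manual unrolling with multiple accumulators so that
 * loads are issued well before their results are consumed.  On an
 * in-order core this kind of schedule keeps the pipeline busy; an
 * OOE core would reorder a naive one-accumulator loop to much the
 * same effect in hardware.  Assumes 'n' is a multiple of 4 for
 * brevity. */
float dot4(const float *a, const float *b, int n)
{
    float s0 = 0.0f, s1 = 0.0f, s2 = 0.0f, s3 = 0.0f;
    for (int i = 0; i < n; i += 4) {
        /* four independent chains: each multiply-add can proceed as
         * soon as its pair of loads completes, instead of
         * serializing on a single accumulator */
        s0 += a[i + 0] * b[i + 0];
        s1 += a[i + 1] * b[i + 1];
        s2 += a[i + 2] * b[i + 2];
        s3 += a[i + 3] * b[i + 3];
    }
    return (s0 + s1) + (s2 + s3);
}
```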

The closed platform likely helps quite a bit, particularly in refining compilers when it comes to more aggressive optimizations, which themselves can lead to exciting levels of uncertainty when it comes to correctness (there's a nice quote on that somewhere).

There are any number of complicating factors, such as the size of the software-visible register set, the types of prefetch instructions, speculation over branches, the reordering of loads, the size of instruction and data caches, and other wrinkles.

There are some workloads where OOE will always show a performance bump over what compilers can determine at compile time: for example, if dynamic branch behavior has a noticeably better prediction rate in hardware, or if the ISA-visible register pool is small enough that the compiler has difficulty doing any major code optimization.

A dominant x86 console could still have been a stumbling block to porting, OOE or not. Other system-level differences, like the different optimal solutions for register allocation, the stronger x86 memory ordering, different code density, and other subtle changes, could have impacted the more complex threaded software that is becoming prevalent.

One uncounted variable is time.
Improved compilers have helped mitigate the lack of OOE in the current consoles now, a number of years after the design was finalized and released.
They may well be decisively better in all respects a year or so from now.
The question is what developers and software architects could have been doing with the spare time they did not have in the years up until this point.

In the more rapidly evolving non-embedded space, the growth in hardware capability meant that deep optimization for a given chip would only reach parity with the next generation by the time the chip was replaced.
The fixed platform the consoles have does help in this regard.
Then again, if Cell were magically OOE when facing its in-order competition, perhaps some of the growing pains that darkened perception of the platform would have been avoided.
 
I think you grossly overstate how "horrid" the chips are to develop for. I'm curious, what development experience on Xenon do you have?

First of all, I didn't say that the chips (plural) are horrid to develop for; the XeCPU chip is straightforward enough (and I credited as much). I was commenting on the PPE and XeCPU cores from a comparative architectural/performance perspective. I don't have any time developing on Xenon, which you know anyway, but the experiences of those who have tried and judged the XeCPU cores are plenty and numerous on these boards, and I'm certainly not echoing any fringe position here. That said, I do recognize the various parameters MS was working within when they were seeking the design, and for the footprint the chip has, I agree with you that it has admirable computational resources available to it relative to its 2005 contemporaries. Like I said, I give maximum respect to the tools/compilers orbiting the 360 platform and their role in aiding developers in performance extraction; it does seem that even there, however, devs that hand-tune based on the lessons of these architectures are still able to surpass the compilers by a good bit.

For Cell, with its broader scope, it really was a stumble IMO to be reliant on that core, whatever arguments can be made for the final shape of the Xbox chip. As it stands now, in true mixed-hardware environments we see outboard Opterons doing the job that the PPE should be able to do for itself.

This is also a huge stretch. The ISA isn't that big a deal when porting video games, as the vast majority of a game is developed in higher-level languages. The OOE facet would make some difference, but I'm not sure it's as big as you think it is, given the compilers and development experience game console developers have now. In fact, I think the 360 may benefit from the in-order cores to some extent, as they permitted additional computational cores and higher clock speeds, and the "penalty" of lacking OOE has been diminished by better compilers (and a closed-box system) with more experienced developers. It was the right decision, in my opinion...

You can call it a stretch, but I think calling it a "huge" stretch is a stretch itself. If 360 games were architected around, say, some Core 2 Duo running at even ~2.0GHz, I think we would see that getting that code to run at equivalent speeds on Cell would be challenging to say the least, as it would put a huge burden on the PPE and thus require efficient offloading to the SPEs; something that frankly is not 'there' yet on the compiler side, especially for branchy, unoptimized code.
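As a made-up illustration of the kind of restructuring such a port implies (hypothetical data layout, SPE DMA/offload machinery omitted): code architected for a big OOE core tends to chase pointers and branch per object, which the PPE tolerates poorly; to feed the SPEs it generally has to be reshaped into contiguous batches first.

```c
/* Sketch only: the 'before' form is fine on an OOE x86 core; the
 * 'after' form is what maps onto an SPE, which wants a contiguous
 * block it can DMA into local store and chew through without
 * data-dependent branching. */

/* before: pointer-chasing, branchy per-object update */
struct object { struct object *next; float pos, vel; int asleep; };

void update_list(struct object *head, float dt)
{
    for (struct object *o = head; o != NULL; o = o->next)
        if (!o->asleep)
            o->pos += o->vel * dt;
}

/* after: structure-of-arrays batch processed linearly; each batch
 * could be sized to fit an SPE's local store and moved with DMA
 * (offload code omitted here). */
void update_batch(float *pos, const float *vel,
                  const int *awake, int count, float dt)
{
    for (int i = 0; i < count; ++i)
        pos[i] += awake[i] ? vel[i] * dt : 0.0f;  /* select, not branch */
}
```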
 
First of all, I didn't say that the chips (plural) are horrid to develop for; the XeCPU chip is straightforward enough (and I credited as much). I was commenting on the PPE and XeCPU cores from a comparative architectural/performance perspective. I don't have any time developing on Xenon, which you know anyway, but the experiences of those who have tried and judged the XeCPU cores are plenty and numerous on these boards, and I'm certainly not echoing any fringe position here. That said, I do recognize the various parameters MS was working within when they were seeking the design, and for the footprint the chip has, I agree with you that it has admirable computational resources available to it relative to its 2005 contemporaries. Like I said, I give maximum respect to the tools/compilers orbiting the 360 platform and their role in aiding developers in performance extraction; it does seem that even there, however, devs that hand-tune based on the lessons of these architectures are still able to surpass the compilers by a good bit.

For Cell, with its broader scope, it really was a stumble IMO to be reliant on that core, whatever arguments can be made for the final shape of the Xbox chip. As it stands now, in true mixed-hardware environments we see outboard Opterons doing the job that the PPE should be able to do for itself.



You can call it a stretch, but I think calling it a "huge" stretch is a stretch itself. If 360 games were architected around, say, some Core 2 Duo running at even ~2.0GHz, I think we would see that getting that code to run at equivalent speeds on Cell would be challenging to say the least, as it would put a huge burden on the PPE and thus require efficient offloading to the SPEs; something that frankly is not 'there' yet on the compiler side, especially for branchy, unoptimized code.
OK, you're looking at 2005 from 2009 and saying that MS would have been way better off going Intel. However, you have to look at 2005 from 2001/2002 to see the landscape the way MS did, and when you look at it like that, Intel was entirely the wrong horse to bet on. IBM was doing amazing things with its POWER architecture, with promises of 3GHz and above, something Intel was having huge trouble with, and the Intel chips before Core had absolutely horrendous power/performance ratios.

Now, sure, IBM lost steam, and Intel unveiled Core (in January 2006, months after the Xbox 360 shipped), a 32-bit dual-core chip with about the same number of transistors as Waternoose, which could be run at only about 2/3 the clock speed.
With Xenon, MS got an extra core, hyperthreading (only now being reintroduced in the Intel line with the i7), and significant IP rights that were denied to them by Intel in the last generation.

(The Core 2 Duo you mention in your last paragraph was released in July 2007, with a parts cost of around $150 in bulk, so sure, it would have made an awesome XBox if it had been 2 years older, but no one would have been able to afford it)
 
Carl B and others - thanks for your insight.

I guess it really shows how much forward-looking perspective is needed when designing next-gen consoles.

With regards to the X360, although I hear that Memexport and tessellation are used by developers, they seem to be underutilized, or at least not used the way originally envisioned.

The PS3, on the other hand, seems to be using the Cell/RSX relationship as planned.

Both consoles have met or exceeded most of my expectations this generation, other than load times. While it's not a dominant issue, it nonetheless is a pain.

For example, I'm playing through MGS4 right now (this board convinced me to give it another go); you'd think that after a mandatory install for each chapter you'd have little to no loading after each cutscene. Although not long by this generation's standards, I still find it a little frustrating.

On topic: I think the decision for MS to go with IBM occurred after the Sony-Toshiba-IBM announcement. The Intel-MS fiasco may have had something to do with them going IBM, but I think a large part of it was them knowing they could leech off the Cell R&D.

With all the lessons learned with multi-core processing this generation, the next iteration of these machines should really be amazing. The geek in me still longs for that PS9 concept that was advertised when the PS2 launched.
 
OK, you're looking at 2005 from 2009 and saying that MS would have been way better off going Intel. However, you have to look at 2005 from 2001/2002 to see the landscape the way MS did, and when you look at it like that, Intel was entirely the wrong horse to bet on. IBM was doing amazing things with its POWER architecture, with promises of 3GHz and above, something Intel was having huge trouble with, and the Intel chips before Core had absolutely horrendous power/performance ratios.

Now, sure, IBM lost steam, and Intel unveiled Core (in January 2006, months after the Xbox 360 shipped), a 32-bit dual-core chip with about the same number of transistors as Waternoose, which could be run at only about 2/3 the clock speed.

I don't think it's even about betting on horses or anything else; MS simply was burned by the whole Intel/NV nonsense of the prior generation, and wasn't going to go that route again: they were going to control their IP this time. If it was about betting on horses, then honestly AMD would have seemed the obvious horse in that context, since for MS it was a very familiar environment for their team to work in, and the performance was certainly considered 'very good' at the time.

But Sony themselves looked poised to change the game by once again being the presumed context-definer for the next era, and they were going to be pushing floating point like it was going out of style. So MS felt they had to have a competent offering in an area that seemed like it might define the next cycle. Well, nothing turned out as expected, but that's neither here nor there. I don't think I've expressed any confusion here as to why MS went with IBM, have I? I've simply expressed that the chain of events may ironically have ultimately benefited Sony more than MS, as MS went into it planning on following the leader of the generation and wound up essentially being the leader of the HD console segment. Had MS achieved that while having ignored the original architectural/paradigm threat Sony posed, Sony might find themselves further isolated at the moment in terms of developer ease of support.

(The Core 2 Duo you mention in your last paragraph was released in July 2007, with a parts cost of around $150 in bulk, so sure, it would have made an awesome XBox if it had been 2 years older, but no one would have been able to afford it)

July of 2006, actually. And no, I'm not using the Core 2 as the premise for the Xbox, but as a point to highlight the complexity of porting code from x86 to Cell in a modern context. Maybe I should have used the example of an A64 X2 instead, but to clarify, the second part of the post is unrelated to the 360; it's rather an effort to make obvious that you can't just expect to port code from architecture to architecture without fear of penalty, because what you might have come up with on one will suffer greatly going to the other, high-level languages or no.
 
That was always my impression: at least the intention was to make simpler cores, get rid of the OOE, and let compilers handle most of the heavy lifting. It's not really my area, but I do recall that there were some things that were really hard to figure out at compile time.
That's why a lot of compiler optimizations in recent years, especially at IBM, have come by way of performance profiling. You get both run-time and compile-time information, and the second compilation pass takes advantage of that information.
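A minimal sketch of that two-pass flow, using GCC's profile-guided-optimization flags purely as an illustration (the IBM XL compilers have an analogous -qpdf1/-qpdf2 pair, if memory serves): compile with instrumentation, run a representative workload, then recompile using the recorded profile.

```c
/* hot.c -- toy function whose branch bias the profile run records.
 *
 * Illustrative build steps (GCC):
 *   gcc -O2 -fprofile-generate hot.c -o hot   # pass 1: instrumented build
 *   ./hot 1000000                             # run-time profile written out
 *   gcc -O2 -fprofile-use hot.c -o hot        # pass 2: recompile with profile
 *
 * With the profile in hand the compiler can lay the common path out
 * as the fall-through, inline hot call sites, and so on. */
#include <stdio.h>
#include <stdlib.h>

int classify(int x)
{
    if (x < 0)            /* the profile tells the compiler how often */
        return -1;        /* each arm is actually taken */
    if (x == 0)
        return 0;
    return 1;
}

int main(int argc, char **argv)
{
    long sum = 0;
    int n = (argc > 1) ? atoi(argv[1]) : 1000000;
    for (int i = 0; i < n; ++i)
        sum += classify(i % 7 - 1);
    printf("%ld\n", sum);
    return 0;
}
```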
 
Did someone mention that the PPE in Cell was weaker in its original design? Anyway, the current PPE is enough to run apps like the web browser I'm typing this in (though it should have had some SPE optimizations by now).
 
Did someone mention that the PPE in Cell was weaker in its original design? Anyway, the current PPE is enough to run apps like the web browser I'm typing this in (though it should have had some SPE optimizations by now).
Um, my 386/16 was good enough to run a web browser; I'd think a dual-issue, 3.2GHz PowerPC CPU wouldn't need SPE optimizations to do it...

We used Memexport extensively in the HD DVD code, but you're probably right that it doesn't get used a lot in game code.
 
An SPU is used to help speed up the Flash plugin (the H.264 decoding portion): http://www.n4g.com/News-222451.aspx

"For a sense of how computationally intensive some areas of Flash 9 can be, you need only consider this little bit. According to Takase, the PS3 implementation of Flash-based playback for H.264 videos makes use of an SPU. This allows for loading up of web pages to be separate from video playback, improving framerate.

The entire web experience on the PS3 has been improved with the new firmware update. Noda cited a 2.8 times increase in Javascript speed according to Sony's own benchmarks. Takase believes PS3's Javascript performance, while not up to the level of Google Chrome, beats Internet Explorer 7."

Don't know what PC they were using to compare.
 
Carl B and others - thanks for your insight.

I agree, Carl especially has been fantastic as usual.

On topic: I think the decision for MS to go with IBM occurred after the Sony-Toshiba-IBM announcement. The Intel-MS fiasco may have had something to do with them going IBM, but I think a large part of it was them knowing they could leech off the Cell R&D.

I don't know that this isn't true; I just don't know what you are basing this opinion on. The article (I haven't read the book) doesn't indicate this, nor does any of the information in this thread.

In fact, the only information I've seen from prior books and the insight in this thread is that the primary reason MS went with IBM was because they wanted to own the IP due to getting fleeced by Intel.

Making the jump from that knowledge to the conclusion that they were going to benefit from getting the same tech that Sony & Toshiba had invested so much R&D expense in is a HUGE leap.

Even this article proposes the idea that the book reveals that MS almost 'lucked' into this advantageous circumstance, not that they had planned it.
 
Oh, I don't know it as a fact.

It just seems that MS was looking for parity regarding the chips used, perhaps thinking its software/tools and Live service advantage would provide a distinctive edge. It wasn't until Sony announced its partnership that MS decided to also go with IBM. There were other manufacturers out there that were willing to give MS what it was looking for; if memory serves me right, AMD was the hot chip maker at the time.

Additionally, MS has been relentless this gen in pursuing previously PS-only exclusives to be released at the same time on the X360. The mantra seemed to be "we can provide the same as the PS brand, but for less money." FFXIII, GTAIV, DMC4, Tekken, and others bear this out. The one big franchise that they didn't get, but was rumored for a long time, was the MGS series.

All of the above is just speculation, but I find it curious how this generation has turned out. Of course, I could be way off and IBM may indeed have been the only makers willing to give MS IP ownership rights.
 
Asher said:
I think you grossly overstate how "horrid" the chips are to develop for.
That depends entirely on how low you can set the performance targets. I've seen tons of in-order CPUs over the years, and for all their faults, programming none of them felt like walking through a minefield of performance gotchas, unlike the 360/PS3 PPCs.
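Purely as an illustration of the sort of gotcha that tends to get cited for these cores: the load-hit-store stall, where a freshly stored value is reloaded from the same address before the store has drained from the pipeline. A small C sketch, with made-up names:

```c
/* Sketch only: the load-hit-store pattern.  Storing to an address and
 * reloading from it a few cycles later stalls these in-order PPC cores
 * until the store drains; an OOE core with store-to-load forwarding
 * mostly hides this.  Float-to-int conversion is the textbook case,
 * because on these chips the value typically travels from the FP
 * register file to the integer register file through memory. */
int float_to_int(float f)
{
    return (int)f;              /* often compiles to: store float, reload as int */
}

/* the same hazard spelled out by hand */
int bounce(volatile int *scratch, int x)
{
    *scratch = x;               /* store ...                          */
    return *scratch + 1;        /* ... reloaded immediately: LHS stall */
}
```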
 
That depends entirely on how low you can set the performance targets. I've seen tons of in-order CPUs over the years, and for all their faults, programming none of them felt like walking through a minefield of performance gotchas, unlike the 360/PS3 PPCs.

Why's that? Anything you can mention? I assume you're not referring to the arcane design of the Cell specifically, but something else?
 