Nintendo's (IBM's) Broadway coming to ArsTechnica soon

wireframe

Those of you curious about the IBM chip that will power the Nintendo Revolution, codenamed Broadway, may want to follow this thread while you wait for the article to be published at a later date.
 
Great find Wireframe, but Hannibal's first post in that thread turned me off a little. I was hoping he would be getting a white paper or an engineering sample or something - instead he's just taking stabs at guessing what it is. I question his 65nm estimate as well, but then again I have no idea when Nintendo plans to launch.
 
I know this is still speculation and just one possible design, but if it's true, why would they put in 2MB of L2 cache? Considering what the others are doing, unless they really want to make development very easy, it seems like a waste of performance (or money), IMO.
 
pc999 said:
why would they put in 2MB of L2 cache? Considering what the others are doing, unless they really want to make development very easy, it seems like a waste of performance (or money), IMO.
They'd possibly want to do it because Nintendo is a very ease-of-development-centric company these days, as opposed to one chasing the highest peak performance figures. The Cube was universally recognized as a very forgiving platform, i.e. it ran relatively well even if you did deliberately stupid things with it, and everything Nintendo has said about Rev suggests the same will be true for it, probably even more so in fact. GC had the biggest caches (by far) last time 'round; it would seem logical they'll follow a similar philosophy this time 'round.

I.e.: if they can build a chip with a large amount of dense SRAM cache and a decent amount of processing ability that is easy to harness efficiently, and that chip is overall smaller and cheaper than one with less SRAM and a large amount of computing resources that are difficult to orchestrate efficiently, I think they'll pick the first option with no debate.

In part because Nintendo makes mostly cartoony games with primary colors, so you don't need a gajilloflop chip anyway, and besides, Nintendo couldn't get third-party makers to commit heavily to their machine if it required jumping through as many hoops as Xenon and Cell will...
 
Though I agree with your sentiments Guden, I don't think the choice of primary colours affects the CPU power required. Cartoon shading isn't going to make the demands of AI, physics etc. any lower than in the same game with photorealistic graphics, though there would be savings from procedurally generated graphics.
 
I'm not sure a dev would need to spend that much extra time on Rev if it has two cores with two threads each at only ~2GHz (with lower latency). That would put it about level with Cell's PPE (but with lower latency) and above the XeCPU (again with lower latency). If it also has all the features of the XeCPU (or even Gekko) that help make good use of the L2 cache - which would make it much easier to use than a straight port of a PS3 game - would devs still have difficulty porting a game?

Also, if I'm not wrong, a well-programmed Gekko stood up very well against the Xbox CPU. While I agree Rev won't make much use of procedurally generated graphics, I wonder how much AI/physics/animation/audio would suffer in a port (RTS AI should be easy to scale here, physics scales badly with gameplay, and animation should be quite interesting and challenging considering the controller). We already see GoW running on one core, probably much underused, but still having a hard time keeping its framerate.

Anyway, I'm very interested in seeing a chip that can live up to their comments (no visible difference on a normal TV - and here I include AI/physics/animation/audio, just less rendering) at their case size and price.
 
ok, if we assume

1) that revolution will be gc-compatible
2) it will have a PPE-style ~2GHz cpu
3) the PPE is an in-order cpu

how successfully will an in-order cpu run code meant to run ooo on a cpu clocked ~4 times lower, i.e. ~2GHz in-order vs. 0.5GHz out-of-order?
 
I was looking at the power consumption of the 970FX earlier and noticed how it drops off very fast with only a small speed drop. For instance, at its fastest speed of 2.2GHz the CPU consumes 48 watts, and at 2GHz it consumes 40 watts. Yet at 1.6GHz it consumes only 17 watts! That's a 57.5% drop in power consumption from only a 20% downclock. Could the XCPU in the 360 act similarly? Assuming 3.2GHz is the cutting-edge speed for the CPU (which seems very likely) and the CPU consumes 85W at that speed, could that mean a 2.6GHz version might consume less than half that (under 40W)? Not making any claims, just asking a question.
 
Sure it's possible - I know that for the Athlon 64's I frequently deal with, you have to jack the wattage/voltage immensely (depending on luck) to get past the 2500 MHz barrier, but below say, 2200 MHz, the voltage requirements plummet.

For example, I can run an A64 2800 overclocked at 2100 MHz, yet undervolt it to 1.3 vs 1.5v standard. The wattage drop is very significant. To go further, I can take it down to 1100 MHz and run it at 0.9v, putting it under 20 watts probably.
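The pattern described here follows from the usual first-order model of CMOS dynamic power, P ≈ C·V²·f: power scales linearly with clock but quadratically with voltage, so a downclock that also allows an undervolt compounds. A rough sketch of Teasy's 2.6GHz scenario; the nominal 1.0V and the assumption that voltage can drop in proportion to frequency are illustrative guesses, not real XCPU figures:

```python
def dynamic_power(base_power, base_freq, base_volt, freq, volt):
    """First-order CMOS dynamic power model: P ~ C * V^2 * f.
    Scales a known (power, freq, volt) point to a new freq/volt pair."""
    return base_power * (volt / base_volt) ** 2 * (freq / base_freq)

# Hypothetical XCPU point: 3.2 GHz / 85 W at a nominal 1.0 V.
# If voltage can be cut in proportion to clock (optimistic), power
# falls with roughly the cube of the frequency ratio.
ratio = 2.6 / 3.2
p = dynamic_power(85.0, 3.2, 1.0, 2.6, ratio)
print(f"{p:.1f} W")  # ~45.6 W -- close to half, before any leakage savings
```

That is the voltage-squared term doing the heavy lifting: the 970FX's 2.2GHz-to-1.6GHz cliff suggests exactly such a voltage step, not frequency alone.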
 
Teasy said:
I was looking at the power consumption of the 970FX earlier and noticed how it drops off very fast with only a small speed drop. For instance, at its fastest speed of 2.2GHz the CPU consumes 48 watts, and at 2GHz it consumes 40 watts. Yet at 1.6GHz it consumes only 17 watts! That's a 57.5% drop in power consumption from only a 20% downclock. Could the XCPU in the 360 act similarly? Assuming 3.2GHz is the cutting-edge speed for the CPU (which seems very likely) and the CPU consumes 85W at that speed, could that mean a 2.6GHz version might consume less than half that (under 40W)? Not making any claims, just asking a question.

No, the reason for this power consumption curve is Apple.

If you have used a Macintosh with a G4 CPU at the same clock as a G5 CPU, you can see that in 99% of cases the two CPUs deliver the same performance under the same conditions.

Apple first cancelled the low-consumption PowerBook G5, and after the jump to Intel, IBM has no interest in adapting the 2GHz-and-beyond G5 into a mobile processor version.
 
I can't speak to IBM's Power chips, but in all likelihood there aren't actually any physical differences between the desktop chips and the mobile chips. Again, just using the AMDs as an example, the chips are for the most part the same. The only difference is that the mobile chips are binned parts found to work at a certain frequency at a lower voltage than the standard CPUs. Thus the only difference is the voltage and clock they are set at, not that it's a 'different chip' or anything like that.
 
Teasy, though not dealing directly with PPE-style cores or other POWER derivatives like the 'G'-class chips used in Apple products, here's the schmoo plot from Cell and its SPEs:

[Image: Cell SPE schmoo plot - frequency vs. supply voltage vs. power]


If viewed in terms of percentages - or just watts in absolute terms - it becomes readily apparent that wattage skyrockets as higher voltage is applied. The 'sweet spot' for these chips, depending on what you need them for, is to get as high a clock as you can at as low a voltage as you can. If nothing else, it's clear why Sony will be running Cell at 3.2GHz: at that speed they might be able to pull it off at 0.9V. I don't know what the PPE needs, though; they may need to raise it higher anyway, but hopefully not above 1.0V.
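The 'sweet spot' reading of a schmoo plot can be put into numbers. The voltage-to-frequency points below are invented for illustration (not Cell's real schmoo data), but with the same V²·f dynamic-power model the trend is the point: performance per watt falls off quickly once you have to raise voltage to buy clock.

```python
# Hypothetical schmoo-style points: supply voltage (V) -> max stable GHz.
# Invented numbers for illustration, not measured Cell/SPE data.
schmoo = {0.9: 3.2, 1.0: 4.0, 1.1: 4.6, 1.2: 5.0, 1.3: 5.2}

def rel_power(volt, freq, v0=0.9, f0=3.2):
    # Dynamic power ~ C * V^2 * f, normalized to the lowest-voltage point.
    return (volt / v0) ** 2 * (freq / f0)

for volt, freq in sorted(schmoo.items()):
    power = rel_power(volt, freq)
    perf_per_watt = (freq / 3.2) / power
    print(f"{volt:.1f} V  {freq:.1f} GHz  power x{power:.2f}  perf/W x{perf_per_watt:.2f}")
```

On these made-up numbers the lowest-voltage point delivers roughly twice the performance per watt of the highest one, which is exactly why pinning the clock at whatever the lowest stable voltage supports is attractive for a console's thermal budget.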
 
Teasy said:
I was looking at the power consumption of the 970FX earlier and noticed how it drops off very fast with only a small speed drop. For instance, at its fastest speed of 2.2GHz the CPU consumes 48 watts, and at 2GHz it consumes 40 watts. Yet at 1.6GHz it consumes only 17 watts! That's a 57.5% drop in power consumption from only a 20% downclock. Could the XCPU in the 360 act similarly? Assuming 3.2GHz is the cutting-edge speed for the CPU (which seems very likely) and the CPU consumes 85W at that speed, could that mean a 2.6GHz version might consume less than half that (under 40W)? Not making any claims, just asking a question.

I doubt 3.2GHz is the cutting-edge speed. It's most likely a sweet spot for price/performance/power; the cutting-edge speed would most probably be higher.

Power is definitely lower as you lower voltage, but how good a performer is the PPE at lower clock speeds? I mean, some cores like the P4 are just designed for high clock speeds to get their performance.
 
Guden Oden said:
They'd possibly want to do it [2MB L2] because Nintendo is a very ease-of-development-centric company these days, as opposed to highest peak performance figures.
Given that you could probably trade 1MB of L2 for another PPE core, and that most developers are going to be targeting 512K of L2 or less per core, it doesn't seem very likely that they'd make that choice.
 
darkblu said:
how successfully will an in-order cpu run code meant to run ooo on a cpu clocked ~4 times lower, i.e. ~2GHz in-order vs. 0.5GHz out-of-order?
Given they are instruction level compatible I don't see any problems there. Not to mention 750x series OOOe isn't exactly earthshakingly efficient to begin with.

Still, if Nintendo opts for an in-order CPU I'll be kinda disappointed. I don't see the reasoning for 2MB of cache either; I'd imagine a 970FX even at a slightly lower clock (1.6GHz?) with 512KB-1MB of cache would tend to perform better in game apps.
Thermal/cost characteristics may not be in favour of that though - I wouldn't know.
 
Fafalada said:
Given they are instruction level compatible I don't see any problems there. Not to mention 750x series OOOe isn't exactly earthshakingly efficient to begin with.

well, the problem is that the 750x's oooe does not need to be earthshakingly efficient to give a 4x-clock in-order sibling a run for its money, to begin with : ) of course, it's all a big game of statistics: how does existing gc code utilize gekko's oooe-ness? the way i see it, there's a non-zero probability that there's some extreme-case code somewhere that meets its deadline mainly thanks to the cpu's oooe-ness. *shrug* anyway, we'll surely see.

Still, if Nintendo opts for an in-order CPU I'll be kinda disappointed. I don't see the reasoning for 2MB of cache either; I'd imagine a 970FX even at a slightly lower clock (1.6GHz?) with 512KB-1MB of cache would tend to perform better in game apps.

yep, i totally share your view here. there should be at least one player on the market who does things differently than the rest (in a way which is not an evolutionary dead-end*)

* yep, i'm a big proponent of the GA way of doing things.
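The in-order vs. OOOe question is indeed a game of statistics, but a toy issue model shows why the 4x clock usually wins. This is a deliberately crude single-issue sketch with made-up latencies and a perfect infinite-window scheduler for the OOOe side - not a model of any real PowerPC core:

```python
# Each op: (name, list_of_dependency_names, latency_in_cycles).
# Two loads feeding an add, plus independent work an OOOe core
# could slip into the load-latency shadow.
OPS = [
    ("ld_a", [], 4),                # load, 4-cycle latency (assumed)
    ("ld_b", [], 4),                # second independent load
    ("add", ["ld_a", "ld_b"], 1),   # stalls until both loads complete
    ("x1", [], 1), ("x2", [], 1), ("x3", [], 1),  # independent ALU ops
]

def in_order_cycles(ops):
    """1-wide in-order issue: each op stalls until its inputs are done."""
    done, t = {}, 0
    for name, deps, lat in ops:
        t = max([t + 1] + [done[d] for d in deps])
        done[name] = t + lat
    return max(done.values())

def ooo_cycles(ops):
    """Idealized 1-wide OOOe: any op whose inputs are done may issue."""
    pending, done, t = list(ops), {}, 0
    while pending:
        t += 1
        for i, (name, deps, lat) in enumerate(pending):
            if all(done.get(d, float("inf")) <= t for d in deps):
                done[name] = t + lat
                pending.pop(i)
                break  # one issue per cycle
    return max(done.values())

io, ooo = in_order_cycles(OPS), ooo_cycles(OPS)
print(io, ooo)              # 10 vs 7 cycles on this trace
print(io / 2.0, ooo / 0.5)  # wall time in ns: 5.0 vs 14.0
```

Even granting the 0.5GHz OOOe core a perfect scheduler, the 2GHz in-order core finishes this trace nearly 3x faster in wall time; the in-order chip only loses when stalls inflate its cycle count past the 4x clock advantage, which is exactly the extreme-case code to worry about.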
 
Fafalada said:
Given they are instruction level compatible I don't see any problems there. Not to mention 750x series OOOe isn't exactly earthshakingly efficient to begin with.

Still, if Nintendo opts for an in-order CPU I'll be kinda disappointed. I don't see the reasoning for 2MB of cache either.
What if the cache can be locked and used directly like an SPE's LS? Could this speed up access beyond normal RAM addressing + cache management? Though at the L2 level the speed won't be there for that to be as beneficial as the SPE's LS, and the work will still be done in the D and I caches. So no, I guess that's a useless idea.
 