Nintendo's hardware choice philosophy *spawn

Nintendo could probably have delivered decently better hardware for a similar BOM cost, but what about NRE? A few hundred million in R&D can make a pretty big difference, and then there's the expense of tool creation and the developer learning curve. The Wii U CPU hardware sounds like something Nintendo developers could migrate to really easily. And it avoids the cost of maintaining backwards compatibility.

On the other hand, I really have no idea what the $1b they allegedly gave IBM was for.

Frankly, I don't think Nintendo likes software emulation; they've never provided a primary previous-gen backwards compatibility solution in software. Highly efficient emulators are difficult to write (the guy who led the Xbox emulator called it the most challenging project of his career, and I would agree) and always accept at least some compromise in compatibility or performance. Both Xbox 360 and PS3 gave up on software emulation entirely; I'd consider that pretty glaring testimony.
 
Frankly, I don't think Nintendo likes software emulation; they've never provided a primary previous-gen backwards compatibility solution in software.

Wut?
The Wii plays N64 titles in the Virtual Console, and the 3DS plays GameBoy Advance titles without GBA's hardware.
They're doing it when there's a 2-gen (10-year) gap between consoles, which the Wii U effectively has relative to the GameCube, because the Wii is just a GameCube*1.5.

Had the Wii U been built with proper hardware for late 2012 (nothing too extravagant, but some 100W power envelope together with ~350mm^2 worth of 40nm chips), software emulation with higher resolution would've been easy for Nintendo.

How easy? The Dolphin emulator can run Wii titles flawlessly at 3x their original resolution with an Nvidia G92 and a fast enough x86 CPU.
 
Nintendo could probably have delivered decently better hardware for a similar BOM cost, but what about NRE? A few hundred million in R&D can make a pretty big difference, and then there's the expense of tool creation and the developer learning curve. The Wii U CPU hardware sounds like something Nintendo developers could migrate to really easily. And it avoids the cost of maintaining backwards compatibility.

On the other hand, I really have no idea what the $1b they allegedly gave IBM was for.
Well, to me the issue here is that ultimately Nintendo developers still have to deal with novelty. The system will in no way spare them the effort of dealing with multiple CPU cores and programmable shaders. From my point of view they now face the same challenge most other developers faced at the beginning of this generation. So ultimately I don't really get the argument that, for example, the extra core, higher clock speed, and better single-thread performance per cycle that a design based on (or a plain) PPC 47x would have delivered would have made their life more difficult. If I were snarky I would question to what extent the shipping Nintendo games make use of the three cores of the design. Extra cache, a higher clock speed, and overall quite a bit more performance might be enough for what they are doing now versus what they were doing on the Wii.
Then there is what seems like a pretty complex memory hierarchy.

The whole thing is that Nintendo doesn't seem that far from having delivered something sitting in between the PS360 and their successors. But they made what seem like crippling decisions: low core count, low clock speed, not much of an increase in RAM accessible to applications, and no built-in workaround for optical drive seek/read speeds (Sony addressed this with the last PS3 and its 16GB of flash, of which 12 are free for partial installs; Nintendo may have reserved a given amount for caching instead, as I can't see them supporting installs, and actually I agree with them on that matter).

For the $1 billion paid to IBM, I hope you are right and they did more than simply put three Broadways and more cache together. I wish they had tweaked a PPC 47x.
Either way I wonder if they could have paid IBM to mostly "pilot" the whole project.
Frankly, I don't think Nintendo likes software emulation; they've never provided a primary previous-gen backwards compatibility solution in software. Highly efficient emulators are difficult to write (the guy who led the Xbox emulator called it the most challenging project of his career, and I would agree) and always accept at least some compromise in compatibility or performance. Both Xbox 360 and PS3 gave up on software emulation entirely; I'd consider that pretty glaring testimony.
Indeed they do not like it, but it is starting to badly limit their freedom to move to a more efficient design (and a simpler one from a software POV; SoC + UMA is pretty straightforward) :S

Really, I don't know how tough it would have been for them to move to emulation. MSFT and Sony gave up on it, you have an indisputable point there, but they were also less "consistent" with their design choices.
With the PS3, Sony moved from an old MIPS CPU and massive vector units to the Cell, which was quite different, with a different ISA for both kinds of unit. And on the GPU side the GS was quite a UFO.
MSFT had a more straightforward architecture with the Xbox: a pretty standard CPU setup, a GPU that was a close relative of existing PC parts, and a UMA. Still, they moved from x86 to PowerPC, with a new CPU that was not a match for their previous CPU in every way (in some cases lower single-thread performance), and they also moved from Nvidia to ATI, making sure there was no resemblance between the two designs.

On the other hand, had Nintendo moved from what was mostly a PPC750 to a PPC 47x, would that have been that much of a challenge? I mean, those are really close CPUs as far as the ISA is concerned, but the latter outperforms the former in every way (per cycle, and it operates at more than twice the clock speed).
Then there is the GPU; I wonder how much hardware is packed into that GPU for the sake of BC, but they could have done the same in a more straightforward SoC fed on DDR3.

I really can't tell, but I have this lasting feeling that, assuming some level of support on the GPU side, it was not the mountain it was for MSFT and Sony when they moved to this gen with radically different designs.
Wut?
The Wii plays N64 titles in the Virtual Console, and the 3DS plays GameBoy Advance titles without GBA's hardware.
They're doing it when there's a 2-gen (10-year) gap between consoles, which the Wii U effectively has relative to the GameCube, because the Wii is just a GameCube*1.5.

Had the Wii U been built with proper hardware for late 2012 (nothing too extravagant, but some 100W power envelope together with ~350mm^2 worth of 40nm chips), software emulation with higher resolution would've been easy for Nintendo.

How easy? The Dolphin emulator can run Wii titles flawlessly at 3x their original resolution with an Nvidia G92 and a fast enough x86 CPU.
I'm not sure everything runs flawlessly, but the truth is that with a PPC CPU and some hardware support on the GPU, Nintendo should have invested in making that happen.

Actually, I wonder: are those emulators even legal? Under what kind of license are they distributed?
It would be funny if Nintendo could use that tech for free as a basis for their own, as long as they keep the result open source (which would be irrelevant to them, as there is no point in anyone pirating something that runs on PPC and relies on some form of hardware support at the GPU level => no hardware but the Wii U would be compatible). :LOL:
 
This is the "have your cake and eat it too" problem. Basically, why can't Nintendo release cutting-edge hardware and also sell it cheap? I don't think that's possible.
It doesn't have to be so extreme. I'll use the Wii as an example. Prior to its reveal we were looking at possible tech. A cheap box could have sported a DX9 ATi RV3xx (RV360)-class GPU and would have produced a far better console than the Wii's GC*2. Nintendo might then have attracted a much larger audience, and at no cost. The choice for Wii seemed to centre on BC.

There's a compromise being made between price and performance, of course. I'm just saying Nintendo's philosophy always favours cheap at the moment, and not even that. I seriously question their choices in terms of price/performance/cheapness. Certainly at the prices they sell at they could favour more potent hardware and appeal to a somewhat larger audience. It's as though Nintendo feel that you have to be the most powerful, most expensive hardware to attract the core audience, and that's not a route they want to take, so they go as far as possible in the opposite direction!
 
1) there are many, many millions of households without an HD console

2) surely Nintendo want HD console owners to upgrade to Wuu so Nintendo make far more money selling to a far larger market?

You're basically saying that Nintendo have given up on any audience that isn't already a Nintendo fan who'll buy a Nintendo console to play Nintendo games. If instead they had really strong 3rd-party support, with amazing added value from the unique Wuublet experience, they could win over more than a few HD gamers. It's Nintendo themselves who said they were going after the core gamer with the Wuu; if they were serious about that (debatable!) then they need 3rd-party support.

They might have been serious about the core gamer, but based on what we're seeing so far they have missed their mark. I still think that from a purely marketing perspective they needed to ensure that the ports in development had AT LEAST identical performance, ideally with a noticeable bump of some sort, even if that was true 720p or higher framerates. It's hard to do because the port job is in the hands of the developer/publisher, but a machine that can't even brute-force current-gen ports (even if it's super powerful when used properly) doesn't look good on the surface.

It's early days yet and I don't think Nintendo's success rests on the core gamer but I do feel they have been a little disingenuous regarding their core gamer support.


But the question is well put - personally I certainly don't think the screen in the controller is nearly as interesting as good motion control is, and I can't see it having the same ability to pull in the curious.
I think Nintendo tried too hard this generation to provide a new experience, and threw out the baby with the bath water. We'll see how the market responds. The overall console market is in decline and all players there tread a fine line going forward. As I mentioned in another thread, if Nintendo face difficulties, I think it is wise to assume that Sony and Microsoft won't find consumers any easier to convince.

Based on my experience (I've had one for a week now) I see the GamePad as something vastly superior to motion controls. The only downside is that it's not immediately obvious. Motion was obvious, but it was rather limited in its implementation and not many pieces of software took advantage of it. The Wii U (and GamePad), while not something new per se, is quite compelling when you look at it as part of a whole system.

TouchScreen controller for games. TV Controller, Media controller, Web browser and social networking via MiiVerse all optionally usable while the TV is off or used by someone else.

The problem is that it's greater than the sum of its parts, so it requires people to try it out before they understand. The Wii Remote was really only one part (OK, IR and motion) and was easy to demonstrate.
 
Well, to me the issue here is that ultimately Nintendo developers still have to deal with novelty. The system will in no way spare them the effort of dealing with multiple CPU cores and programmable shaders. From my point of view they now face the same challenge most other developers faced at the beginning of this generation. So ultimately I don't really get the argument that, for example, the extra core, higher clock speed, and better single-thread performance per cycle that a design based on (or a plain) PPC 47x would have delivered would have made their life more difficult. If I were snarky I would question to what extent the shipping Nintendo games make use of the three cores of the design. Extra cache, a higher clock speed, and overall quite a bit more performance might be enough for what they are doing now versus what they were doing on the Wii.

Then there is what seems like a pretty complex memory hierarchy.

I agree that moving to multi-core is going to be a new challenge for Nintendo, although they have some level of experience with the DS and 3DS (and while the DS is hardly an SMP setup and its second CPU seems pretty auxiliary, the libraries do have a fair bit of concurrency control). But I don't agree that this means the challenge isn't reduced vs other hardware.

Compare with Xbox 360 and PS3. There you'd be targeting 6-8 threads instead of 3. Hand-coded assembly needs to be much more aggressively scheduled, and a lot of critical code needs to be hand-vectorized for a new ISA (they'd have paired-singles libraries going back forever, and that stuff's easier to write to begin with). There are gotchas a mile long for things to avoid, like branches and load-hit-stores, and you have to prefetch a lot more conscientiously.

I don't know about the cache hierarchy being complex, but I'm not aware of anything that this would do to contribute to software difficulty. If there's a generous amount of cache for this level of compute that just makes things easier.

PowerPC476FP is relatively new, possibly too new to have been a viable candidate for Nintendo who probably likes sourcing out parts far, far in advance. Backwards compatibility would require software emulation or support built into the core (not just for the functionality but also decoding, the ISAs are not totally compatible). AFAIK this core doesn't have any FP SIMD at all so it'd be even worse in that regard. Or were you thinking about a different processor?

The whole thing is that Nintendo doesn't seem that far from having delivered something sitting in between the PS360 and their successors.

Not with PPC47x, unless I'm not doing a good job understanding its specs.

But they made what seem like crippling decisions: low core count, low clock speed, not much of an increase in RAM accessible to applications, and no built-in workaround for optical drive seek/read speeds (Sony addressed this with the last PS3 and its 16GB of flash, of which 12 are free for partial installs; Nintendo may have reserved a given amount for caching instead, as I can't see them supporting installs, and actually I agree with them on that matter).

Honestly, most of these things are ways it falls behind its current competitors, not things that keep it from being next gen :/ The RAM part is probably a non-issue for this level of technology, though, even if it's weird that so much of it is inaccessible.

For the $1 billion paid to IBM, I hope you are right and they did more than simply put three Broadways and more cache together. I wish they had tweaked a PPC 47x.
Either way I wonder if they could have paid IBM to mostly "pilot" the whole project.
Indeed they do not like it, but it is starting to badly limit their freedom to move to a more efficient design (and a simpler one from a software POV; SoC + UMA is pretty straightforward) :S

Really, I don't know how tough it would have been for them to move to emulation. MSFT and Sony gave up on it, you have an indisputable point there, but they were also less "consistent" with their design choices.
With the PS3, Sony moved from an old MIPS CPU and massive vector units to the Cell, which was quite different, with a different ISA for both kinds of unit. And on the GPU side the GS was quite a UFO.
MSFT had a more straightforward architecture with the Xbox: a pretty standard CPU setup, a GPU that was a close relative of existing PC parts, and a UMA. Still, they moved from x86 to PowerPC, with a new CPU that was not a match for their previous CPU in every way (in some cases lower single-thread performance), and they also moved from Nvidia to ATI, making sure there was no resemblance between the two designs.

On the other hand, had Nintendo moved from what was mostly a PPC750 to a PPC 47x, would that have been that much of a challenge? I mean, those are really close CPUs as far as the ISA is concerned, but the latter outperforms the former in every way (per cycle, and it operates at more than twice the clock speed).

I don't know about clock speed, what I'm reading says it's up to 2GHz. A 9-stage pipeline is going to be a big limiter there.

It's possible it wouldn't be a huge effort for emulation with a little bit of hardware glue but I think Nintendo is very paranoid about emulation.

But it wouldn't have come from rearchitecting something like Dolphin, that's too far off what they'd want.
 
Wut?
The Wii plays N64 titles in the Virtual Console, and the 3DS plays GameBoy Advance titles without GBA's hardware.
They're doing it when there's a 2-gen (10-year) gap between consoles, which the Wii U effectively has relative to the GameCube, because the Wii is just a GameCube*1.5.

Well what did you think I meant when I said "previous gen"? There are enough people who did N64 and GBA emulators for platforms that were similar in capability to Wii and 3DS respectively that Nintendo can outsource the engineering. AFAIK they didn't do the VC stuff in-house. Much different from the effort needed for previous gen emulation.

I wouldn't be so sure 3DS doesn't have something resembling GBA hardware either. Since it most likely has complete or nearly complete DS hardware, that pretty much gives it GBA hardware. Even having compatible 2D hardware gives you a massive boost.

Your logic that they should be able to do the Wii because they covered other 10-year gaps, and the Wii is basically a GC (although that 1.5x makes an absolutely enormous difference when it comes to emulation), is flawed. That's because with each generation emulation has gotten harder. You need a pretty high-end x86 computer to even start to do Wii emulation comfortably (my i7 Ivy Bridge laptop can only sort of do it), and that's already way past what's reasonable performance for a Wii U. And they've been working on this for years; of course compatibility problems abound. Nintendo obviously doesn't want to accept compromises in system-level backwards compatibility, where they'd have to deal with games not working or not working well (that's probably why Sony and MS both dropped what they had). This is very different from VC, where they control it on a game-by-game basis and are free to ship only working games, as well as hack the emulator for compatibility and performance. Trying to do this over an entire library all at once is not practical or probably even possible.
 
The only games which really matter from Nintendo's/customers perspectives are Nintendo games and Just Dance. They could easily recompile their games (which is probably the case with virtual console N64 games) to run on a different architecture.
 
The only games which really matter from Nintendo's/customers perspectives are Nintendo games and Just Dance. They could easily recompile their games (which is probably the case with virtual console N64 games) to run on a different architecture.

Yeah, they could do this and sell them all back to you ala VC.

When you offer BC as a console feature it isn't so simple as only caring about some games. When things don't work it looks bad. It's hard to communicate how you're only partially supporting it to people who are considering buying it.

I don't know why you think the N64 games are ports, do you have any evidence to support that? Saying they could "recompile" them is grossly trivializing what's actually a pretty large amount of work, in the case of N64. For Wii it might be somewhat less but still not just a matter of changing the compiler.
 
@ShiftyGeezer, you are talking about a cheap box based on PC hardware? Consoles and PCs are still different, no? Or is it the same thing now?

I think backwards compatibility should not be ignored. Just as the so-called "hardcore" want ports, they also want to keep playing the games they bought last generation. I am sure the PS4/NextBox will have a tough time emulating their previous-gen hardware and games, even if they are 10x faster.

Not to mention all the downloadable games being sold. Backwards compatibility is going to be an even more important consideration now than ever.
 
Yeah, they could do this and sell them all back to you ala VC.

When you offer BC as a console feature it isn't so simple as only caring about some games. When things don't work it looks bad. It's hard to communicate how you're only partially supporting it to people who are considering buying it.

Yes that is true but the majority of the games which will be put into the Wii U for backwards compatibility will be Nintendo games because for various reasons they are generally the better titles with the most longevity. They could easily say 'backwards compatible with Wii games made by Nintendo only'. Though I guess if you're supporting the legacy controllers you may as well support the whole thing.

I don't know why you think the N64 games are ports, do you have any evidence to support that? Saying they could "recompile" them is grossly trivializing what's actually a pretty large amount of work, in the case of N64. For Wii it might be somewhat less but still not just a matter of changing the compiler.

I said that they *could* be ports. You're not running the original cartridge so how would you know if it was working in a virtual console or if it had been subtly changed to facilitate playback on the Wii?
 
Yes that is true but the majority of the games which will be put into the Wii U for backwards compatibility will be Nintendo games because for various reasons they are generally the better titles with the most longevity. They could easily say 'backwards compatible with Wii games made by Nintendo only'. Though I guess if you're supporting the legacy controllers you may as well support the whole thing.

Did you know that Nintendo doesn't even have their entire N64 catalog on Virtual Console? Given that they made far more games for Wii I think you're talking about a ton of support. It doesn't take a lot of difficult cases to turn the whole thing bad.

What Nintendo could be concerned about is dealing with customers who are complaining about missing or badly performing BC. It's not enough to make 98% of your customers happy; reducing the unhappy from 2% to 1% is a big win.

I said that they *could* be ports. You're not running the original cartridge so how would you know if it was working in a virtual console or if it had been subtly changed to facilitate playback on the Wii?

No, you did not say that they could be ports; you said they're probably ports. I don't think I need to explain the difference to you.

And I actually do know for a fact that the VC N64 games aren't ports because people have been able to inject other ROMs into them.
 
What Nintendo could be concerned about is dealing with customers who are complaining about missing or badly performing BC. It's not enough to make 98% of your customers happy; reducing the unhappy from 2% to 1% is a big win.

I guess so, this is one of the pros to support their decision to make the Wii U like it is even though the general consensus here is not supportive of that decision.



No, you did not say that they could be ports; you said they're probably ports. I don't think I need to explain the difference to you.

And I actually do know for a fact that the VC N64 games aren't ports because people have been able to inject other ROMs into them.

There you have it. They use some form of emulation then. I was just following on from comments regarding recompiling Xbox 360 titles for Durango and how that would be an easy solution.
 
I can't see Microsoft offering Xbox 360 compatibility via ports, at least not for free. The development effort is way too high, and it requires having to deal with source code that they, for the most part, would have barely dealt with (if at all). They'd most likely have to contract the original developers to do it as much as possible.

It'd be really impressive if they pulled off an emulator that worked at all, but I think the experience with Xbox 360 BC must have left a really bad taste in their mouth. I know if I bought a launch console along with a previous-gen game because it advertised BC, then found out the game doesn't work, I'd be really pissed. Even more so if the game worked until some big show-stopping bug halfway through. Nintendo doesn't really wow me with any of their decisions (to put it mildly), but I can understand and respect their foolproof, paranoid approach to backwards compatibility. And this is coming from someone who loves emulators and has used them more than real consoles over a span of many years.
 
I agree that moving to multi-core is going to be a new challenge for Nintendo, although they have some level of experience with the DS and 3DS (and while the DS is hardly an SMP setup and its second CPU seems pretty auxiliary, the libraries do have a fair bit of concurrency control). But I don't agree that this means the challenge isn't reduced vs other hardware.
--------------------------------------
Compare with Xbox 360 and PS3. There you'd be targeting 6-8 threads instead of 3. Hand-coded assembly needs to be much more aggressively scheduled, and a lot of critical code needs to be hand-vectorized for a new ISA (they'd have paired-singles libraries going back forever, and that stuff's easier to write to begin with). There are gotchas a mile long for things to avoid, like branches and load-hit-stores, and you have to prefetch a lot more conscientiously.
I do agree that three (though it could have been four) OoO CPUs are more developer-friendly, for Nintendo or anyone else. Xenon was quite an unforgiving bitch that needed to be nursed to get any form of sane performance.
I don't know about the cache hierarchy being complex, but I'm not aware of anything that this would do to contribute to software difficulty. If there's a generous amount of cache for this level of compute that just makes things easier.
I was speaking of the whole memory hierarchy, not the CPU caches; it is still unclear whether the CPU can access the eDRAM located on the GPU die. I think it's possible that it can, for the sake of emulating the embedded RAM in their previous designs. But that's pretty iffy; let's wait to learn more (the most likely scenario being that the eDRAM, for all intents and purposes, acts like a limited amount of VRAM).
PowerPC476FP is relatively new, possibly too new to have been a viable candidate for Nintendo who probably likes sourcing out parts far, far in advance. Backwards compatibility would require software emulation or support built into the core (not just for the functionality but also decoding, the ISAs are not totally compatible). AFAIK this core doesn't have any FP SIMD at all so it'd be even worse in that regard. Or were you thinking about a different processor?
------------------------------
Not with PPC47x, unless I'm not doing a good job understanding its specs.
-----------------------------
I don't know about clock speed, what I'm reading says it's up to 2GHz. A 9-stage pipeline is going to be a big limiter there.
I was indeed thinking of that CPU. I can't tell when IBM finalized the product or when it became available to customers, but the product brief is dated August 2010.
The ISA is indeed different. I went back to those papers, and thanks to your comment I realized I was making a wrong assumption: since the CPU handles FP in double precision, I assumed it might, like previous designs, use a paired FPU. Reading through the white paper, that was a faulty assumption on my part.
Whereas the CPU can apparently achieve ~2 DP FLOPs per cycle, it doesn't automatically follow that it would achieve ~4 in SP. I guess I'm not the only one with that misconception; many other posters here may have gone by the same assumption.
So it is always nice to have people like you clearing up misconceptions.

If that CPU doesn't have paired FPUs (like Broadway and the PPC750) but instead a single FPU that "natively" handles DP calculations (a bit like how the DP version of the Cell doesn't provide twice the SP throughput of the original Cell), it indeed makes it a bad target for Nintendo.
I think this alone could explain why Nintendo discarded it, more than timelines.

But that gets me back to your question about what IBM did for the billion they were given (a figure I had never read until recently here, so it may well be wrong).

It is a lot of money; one could wonder whether IBM could have tweaked such a CPU (the 476FP) and replaced the DP FPU with a paired FPU a la Broadway. The basis of the CPU is pretty good: the pipeline is a bit longer than on Broadway/PPC750, so it clocks a bit higher; it is a 4-issue design, and I think you told us Broadway is a 2-issue design. It has more advanced OoO execution (up to 32 instructions in flight), it can dispatch up to 6 instructions at a time to the functional units, it most likely has a more modern branch unit, and so on. As the CPU is meant to handle DP calculations proficiently, I would assume the data paths are already 64 bits wide, so feeding a paired FPU, SIMD style, would not have necessitated a rework of those data paths (vs pushing to a 4-wide SIMD).

I don't know why I go through all this; you read the paper better than I did :oops:

Indeed, what IBM did for its money is the billion-dollar question.
The clock speed of Espresso does point to a pretty short pipeline. One could assume from the clock speed alone that it is mostly the same 5-stage pipeline as in Broadway, pushed to its limits on a newer process.
It's possible it wouldn't be a huge effort for emulation with a little bit of hardware glue but I think Nintendo is very paranoid about emulation.

But it wouldn't have come from rearchitecting something like Dolphin, that's too far off what they'd want.
Well, actually I thought (and I am not alone) that Nintendo had an option for an off-the-shelf CPU that was a modernized version of the PPC750 that Broadway is built upon. That premise turned out to be wrong, based on a misunderstanding by me and others.

Maybe Nintendo would have been willing to work some "software magic" to make BC happen with a CPU like the one I described above (which, indeed, was not the PPC 476FP), but they never had the chance. They would not have gone as far as switching to another ISA, that's for sure.
The CPU they may have been searching for might simply not exist in IBM's portfolio. A bit like how the pretty sexy, well-rounded CPU that Jaguar seems to be would not have been an option if MSFT or Sony had shipped this year. Either way they would have had to develop their own CPU, and that is costly. There was also the POWER A2, but back to your first point: it might not be the most developer-friendly option around, even if Nintendo were to set a limit of 2 threads per core. Dealing with 8 threads, really poor single-thread performance, etc. would have affected Nintendo's teams.

From now on, I will discard the hypothesis that an unmodified PPC 476FP was an option for Nintendo.
I would also discard the hypothesis that Espresso could be a customized PPC 476, as its (really low) clock speed hints at a really short pipeline.


We are still left with the $1 billion question???
 
@ShiftyGeezer, you are talking about a cheap box based on PC hardware? Consoles and PCs are still different, no? Or is it the same thing now?
The hardware is basically the same now, but that's immaterial. You don't make a console different just to be different. ;) In the past there were benefits to consoles having customised hardware, but Wii didn't feel them other than BC. Nintendo could have gone with PC parts, or at least a PC GPU and PPC CPU, for a sweet little, well balanced box. BC would have been pretty pointless - how many Wii buyers valued the ability to play GameCube games on their shiny new waggle-box?

Not to mention all the downloadable games being sold. Backwards compatibility is going to be an even more important consideration now than ever.
Yep. But download games will be able to sit on middleware layers like Windows 8 apps. There's a whole other thread for the discussion of BC's value. Including BC is certainly one reason for Nintendo's choices.
 
The ISA is indeed different. I went back to those papers and, thanks to your comment, realized I was making a wrong assumption: since the CPU handles FP in double precision, I assumed that, like previous designs, it uses a paired FPU. Reading through the white paper, that assumption of mine was faulty.
While the CPU can mostly achieve ~2 DP FLOPS per cycle, that doesn't automatically imply it would achieve ~4 in SP. I guess I'm not the only one with that misconception, as many other posters here may have gone by the same assumption.
So it is always nice to have people like you clearing up misconceptions.

If that CPU doesn't have paired FPUs (like Broadway) but instead a single FPU that "natively" handles DP calculations (a bit like how the DP version of the Cell doesn't provide twice the throughput of the original Cell in SP mode), it indeed makes it a bad target for Nintendo.
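To put rough numbers on why the paired-FPU question matters, here's a back-of-envelope peak-throughput comparison. All figures are illustrative assumptions (Espresso's often-reported ~1.24 GHz clock, a hypothetical higher clock for a 476FP-class part, and idealized FMA throughput), not measured specs:

```python
def peak_gflops(clock_ghz, flops_per_cycle):
    """Theoretical peak = clock rate * FP operations retired per cycle."""
    return clock_ghz * flops_per_cycle

# Paired-single FPU (Gekko/Broadway style): one fused multiply-add on a
# 2-wide SP pair = 2 ops * 2 lanes = 4 SP FLOPS/cycle.
paired_sp = peak_gflops(1.24, 4)   # assumed ~1.24 GHz clock

# DP-native FPU with FMA but no SP pairing (the PPC476FP case discussed
# above): 2 FLOPS/cycle whether SP or DP.
dp_only_sp = peak_gflops(1.6, 2)   # assumed higher clock for the 476FP

print(f"paired singles: {paired_sp:.2f} GFLOPS SP")
print(f"DP-native FPU:  {dp_only_sp:.2f} GFLOPS SP")
```

Even with a clock advantage, the non-paired design trails in SP under these assumptions, which is the workload consoles mostly care about.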

It's a pretty natural assumption that a processor with fully pipelined FP64 would repartition the FPU to allow for paired singles, since it's not that much more logic (Loongson does this for instance). I figured that the reason the 750 didn't do this in the first place is because they had VMX/AltiVec planned and there was no point polluting the instruction set space with both. So Gekko was free to do this.

Not totally sure why PPC476FP doesn't, except that it might involve ISA extensions they want to avoid.

It is a lot of money; one could wonder if IBM could have tweaked such a CPU (the 476FP) and replaced the DP FPU with a paired FPU a la Broadway. The basis of the CPU is pretty good: the pipeline is a bit longer than on Broadway/PPC 750, so it clocks a bit higher; it is a 4-issue design, and I think you told us that Broadway is a 2-issue design. It has more advanced OoO execution (up to 32 instructions in flight). It can dispatch up to 6 instructions at a time to the functional units, most likely has a more modern branch unit, and so on. As the CPU already handles DP calculations efficiently, I would assume the data paths are already 64 bits wide, and that feeding a paired FPU, SIMD style, would not have necessitated a rework of those data paths (vs pushing to a 4-wide SIMD).

The cores are really different, I think a good comparison would be pretty nuanced, but this is what I'm getting out of PPC476FP..

I don't think it's as OoO as you think it is. There's an 8-entry issue queue for integer instructions; each cycle three instructions can be issued to it and three dispatched from it to three execution ports. So while it has an 8-wide window to schedule from only one instruction can occupy each execution port at a time, and there's no register renaming. In a way this makes it even more limited than Broadway, which at least has a reservation station per execution port (or two for the load/store pipe) and has register renaming. Not really sure where the 32 instructions in flight window comes from - there's an 8 instruction queue for integer and FP respectively so it'd just be 16 total.

The issue is certainly wider with 3 integer plus 1 FP per cycle, but I think it doesn't actually buy you that much because the integer dispatch ports are just branch, ALU/mem, and ALU/mul. While Broadway only has 2-wide issue it actually resolves branches in the frontend. So in practice you'll be bottlenecked by the two non-branch integer issue ports the same way you would in Broadway. You'd still benefit by being able to dispatch an FP instruction as well, but that's not a huge win because you don't often mix a lot of FP code with integer code (you'd be using FP execute + FP load/store a lot, same as Broadway, but you get an integer execution port that would typically not be used for much more than address generation and loop counting; assuming the latter isn't folded into the branch handling already)
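As a sanity check on that bottleneck argument, here's a toy structural-limit model. The port layouts and instruction mix are my simplifying assumptions sketched from the descriptions above, not measured data, and no stalls are modeled:

```python
def cpi_lower_bound(mix, port_caps, issue_width, free_branches=False):
    """Crude lower bound on cycles-per-instruction from issue-port
    contention and total issue width alone."""
    useful = {k: v for k, v in mix.items()
              if not (free_branches and k == "branch")}
    # The most-contended port group sets the pace...
    per_port = max(useful.get(k, 0.0) / n for k, n in port_caps.items())
    # ...unless the overall issue width is the tighter limit.
    return max(per_port, sum(useful.values()) / issue_width)

# A generic integer-heavy mix (fractions of the instruction stream).
mix = {"intmem": 0.8, "branch": 0.1, "fp": 0.1}

# PPC476FP-ish: two ALU/mem ports, a branch port, an FP port, 4-wide.
cpi_476 = cpi_lower_bound(mix, {"intmem": 2, "branch": 1, "fp": 1}, 4)

# Broadway-ish: 2-wide issue, branches resolved in the frontend (free).
cpi_bdwy = cpi_lower_bound(mix, {"intmem": 2, "fp": 1}, 2,
                           free_branches=True)

print(round(cpi_476, 3), round(cpi_bdwy, 3))  # roughly 0.4 vs 0.45
```

Under these assumptions both designs hit the same two-ALU/mem wall, which matches the point above: the wider issue doesn't buy much for this kind of code.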

Branch prediction isn't much better here and the mispredict penalty is worse, and since there's just a standard branch target cache instead of PPC750's branch target + instruction cache there might be a fetch bubble on taken branches. Load to use latency is much worse at four cycles instead of one cycle.
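To illustrate why the load-to-use gap matters, here's a toy cycle count for a dependent-load chain (e.g. walking a linked list), using the 1-cycle vs 4-cycle figures above and assuming every load hits L1, with everything else idealized away:

```python
def chase_cycles(nodes, load_to_use_latency):
    # Each load's address depends on the previous load's result, so the
    # loads serialize: one load-to-use latency per node visited.
    return nodes * load_to_use_latency

n = 1_000_000
broadway_like = chase_cycles(n, 1)  # 1-cycle load-to-use
ppc476_like = chase_cycles(n, 4)    # 4-cycle load-to-use
print(ppc476_like / broadway_like)  # the chain runs 4x slower
```

Pointer-heavy game code (scene graphs, entity lists) looks a lot like this, so the penalty isn't hypothetical.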

I'm probably missing some important details since I didn't have a lot of time to look through the manual, but the overall impression I get is that for the type of code Wii U would run, PPC476FP isn't necessarily much of a win at similar clock speeds. Of course, you could clock much higher so that might be moot.. but I don't really know what the theoretical limits for Wii U's CPU clocking are. Not that PPC476FP looks bad by any means, but it's probably a lot more optimized for power consumption, which doesn't really matter that much for Nintendo unless they were going for far more cores. They probably don't really want more than the 3 threads they have.

Regarding the glaring "what could Nintendo have paid IBM for" question..

Right now we're really taking for granted the claim that it's just Broadway with minor tweaks. If the guy who said this really found all of his information through reverse engineering then it's limited how much he could have analyzed the processor's performance in just a few days. It's possible this really is a generational improvement. Nintendo would have to pay quite a bit more money to have Broadway improved than to use an already designed core like PPC476FP. $1b is about right for a major CPU revision that only Nintendo will use.

But it probably still has the compatible instruction set and therefore probably doesn't offer more SIMD than paired singles (or other newer POWER instructions that would cause conflicts), and that is what it is. But if it can dual issue them that changes the equation a lot.

I think we need to see a detailed analysis; if the aforementioned Wii hacker has done it we need a revelation of what was actually found...
 
I still find it quite odd that the three cores are identical to each other, but one of them has four times the amount of L2 cache. Maybe that core is enhanced with respect to the others.

Moreover, Gekko was 43 mm2 at 180 nm and Espresso is 33 mm2 at 45 nm. Even with three cores, inefficient scaling and the increased L2 cache size, I find it hard to believe that the cores are exactly the same Gekko cores of 11 years ago.
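Running the numbers on that die-area claim (a rough sketch: the figures are the ones quoted above, ideal quadratic scaling is assumed, which real processes never quite hit, and Gekko's 43 mm² includes its own caches):

```python
gekko_mm2, gekko_nm = 43.0, 180.0
espresso_mm2, espresso_nm = 33.0, 45.0

ideal_shrink = (gekko_nm / espresso_nm) ** 2   # 16x area reduction
gekko_core_at_45nm = gekko_mm2 / ideal_shrink  # one ideally-shrunk Gekko
three_cores = 3 * gekko_core_at_45nm

print(f"one shrunk Gekko: {gekko_core_at_45nm:.1f} mm^2")
print(f"three of them:    {three_cores:.1f} mm^2 of {espresso_mm2} mm^2")
```

That leaves roughly 25 mm² for the enlarged L2 plus the usual scaling losses, so "three barely-touched Gekkos" is tight but not obviously impossible.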

I seem to remember that Broadway was rumored to have some small enhancements with respect to Gekko. Did we ever manage to find out if this was true?
 
It's more concerning it has 3 cores and they don't share an L2, though I'd guess that's an artifact of jamming multicore onto an existing design. It affects how efficient synchronization primitives can be, things get flushed all the way to main memory if multiple cores are touching the data.
 
It's more concerning it has 3 cores and they don't share an L2, though I'd guess that's an artifact of jamming multicore onto an existing design. It affects how efficient synchronization primitives can be, things get flushed all the way to main memory if multiple cores are touching the data.

Coherency updates could be communicated directly from cache to cache; they don't necessarily have to go through main memory. Like what Intel provides with the F state in the MESIF protocol.
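A toy cost model shows why this distinction matters for synchronization. The latencies here are illustrative guesses, not measured Wii U numbers, and the scenario (one core picking up cache lines another core just wrote) is simplified to the extreme:

```python
# Assumed per-cache-line latencies in cycles (illustrative only).
LAT = {
    "cache_to_cache": 60,   # assumed cross-core snoop/forward transfer
    "main_memory": 200,     # assumed full DRAM round trip
}

def handoff_cost(lines, path):
    """Cycles for a consumer core to pick up `lines` cache lines that
    a producer core has just written."""
    return lines * LAT[path]

# With no shared L2 and no cache-to-cache forwarding, every shared line
# pays the DRAM round trip:
print(handoff_cost(16, "main_memory"))
print(handoff_cost(16, "cache_to_cache"))
```

Under these guesses the memory-flush path is several times more expensive per handoff, which is the efficiency concern raised above.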

Is there any really definitive source on the alleged L2 arrangement? The only one I'm aware of is a pretty old rumor..
 