Wii U hardware discussion and investigation

So, interesting tidbit from wsippel over on GAF (not sure if it's already known or not): the Wii U has a multi-core ARM as well as the main PPC processor.

That pretty much (potentially) frees up the main CPU from OS related tasks, no?


Edit: Sorry, not really GPU related at all...
 
Since the thread has stalled a bit, I'd like to reintroduce eDRAM as a topic.

Gubbi said:
DRAM and logic processes are optimized differently. For DRAM you want very low static power, for logic you want very high dynamic performance. DRAM processes typically have less than 1% of the leakage power of a typical logic process.

Integrating the DRAM, you end up compromising both DRAM and logic performance. You also get a more costly process. Steps are needed to create the trench capacitor for the DRAM cells, and all the metal layer steps needed for the logic area are wasted on the DRAM cell.

The compromised performance has extra consequences in a console where you cannot bin and sell slower units at a discount. In order to maximize yield you'll need to provision for the higher power consumption of your lower quality bins. This impacts the cost of the entire system (cooling, reliability, PSU). I'm guessing that's why MS hasn't opted for integrating the eDRAM of Xenos; they don't need the performance, so it is cheaper overall to have a separate die and spend a few dozen cents on adding a substrate to connect the CPU/GPU to the eDRAM die.
To which AlStrong added
AlStrong said:
Indeed. A not-so-insignificant factor when it comes to fab times or yield...
And to those two I'd add software issues.
Ports of software developed on other platforms are likely to underutilize the eDRAM (or even not use it at all!), since those platforms encourage a way of doing things (lots of read-modify-write passes) that is a bad model when bandwidth is more limited. To some extent the same goes for upcoming multi-platform titles - lowest-common-denominator coding will penalize the eDRAM approach of Nintendo.

You don't do a custom design, with its associated issues, just to reach parity with a more traditional way of doing things at a given cost in R&D, gates and power. It has to be decidedly better than the more straightforward approach to justify itself.

So what I'm missing is an in-depth discussion about how the speculated amount of eDRAM would affect what techniques you could use, what the gains would be, then how targeting multiple platforms might compromise what you can do, and so on and so forth. I can't believe Nintendo would have chosen that path if they didn't feel certain the gains would be substantial. Otherwise they could have saved themselves the risk by spending their budget elsewhere, or simply saved time, money, power and gates and dropped eDRAM entirely from the GPU. Shifty made a valiant effort to get a good thread going on eDRAM back in January, but it got bogged down in the shortcomings of the Xbox 360 implementation. A fresh take would be appreciated.
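
To seed that discussion with some concrete numbers, here's a trivial back-of-the-envelope sketch (plain C++; every figure is an assumption on my part, and the 32 MB pool is only the rumoured amount) of which 720p render-target setups would even fit on-chip:

Code:
#include <cstdio>

int main() {
    // All figures are assumptions for illustration only.
    const double kEdramMiB = 32.0;   // rumoured eDRAM pool size
    const int    kWidth    = 1280;   // 720p render target
    const int    kHeight   = 720;
    const int    kColorBpp = 4;      // RGBA8
    const int    kDepthBpp = 4;      // D24S8

    const int sampleCounts[] = {1, 2, 4};
    for (int samples : sampleCounts) {
        double colorMiB = double(kWidth) * kHeight * kColorBpp * samples / (1024.0 * 1024.0);
        double depthMiB = double(kWidth) * kHeight * kDepthBpp * samples / (1024.0 * 1024.0);
        double totalMiB = colorMiB + depthMiB;
        std::printf("%dxMSAA: colour %.1f MiB + depth %.1f MiB = %.1f MiB (%s)\n",
                    samples, colorMiB, depthMiB, totalMiB,
                    totalMiB <= kEdramMiB ? "fits" : "does not fit");
    }
    return 0;
}

By that count even a 4xMSAA 720p colour + depth setup (roughly 28 MiB) would squeeze in, which is exactly the kind of "what does it actually buy you" question I'd like to see argued properly.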
 
Since not much activity has been happening in this thread, I'll take the opportunity to point out some statements I noticed in the DF analyses of the Wii U, and, hopefully, encourage some discussion.

In this article, Richard made some interesting statements. Like:
On the plus side, Wii U benefits from a significantly more modern graphics core, equated by many with an entry-level enthusiast GPU a couple of generations old, provided by AMD. Our sources tell us that the hardware is rich in features compared to the Xenos core within the Xbox 360 (also supplied by AMD) but somewhat lacking in sheer horsepower: still a useful upgrade overall though.
I suppose the question now, though, is lacking in sheer horsepower compared to what? Xenos or some kind of expectation for a next-gen 2012 console?

[. . .]the tri-core IBM "Espresso" CPU is an acknowledged weakness compared to the current-gen consoles - the processors consisting of revised, upgraded versions of the Wii's Broadway architecture, in itself an overclocked version of the main core at the heart of the ancient GameCube. Nintendo clearly hoped that tripling up on cores, upping clock speed and adding useful features such as out of order execution would do the trick[. . .]
I don't know exactly whether this came from a source (or sources) or whether it is just from the other rumors that were floating around, but maybe Espresso is a derivative of the PowerPC 7xx architecture.

[. . .]We also know that the silicon is manufactured in the 40-45nm range[. . .]
Again, I don't know whether this is from a source or not, but it could potentially be a nice tidbit.


Moving on to this older article, I don't think anybody commented on this statement.

One developer working on a Wii U launch game told us he was able to optimise his game's CPU main core usage by up to 15 per cent throughout the course of development, and further optimisations are expected before the game comes out.

We've heard about the asymmetrical L2 cache, and someone (I believe it was Iherre on GAF, in one of its countless & infamous Wii U Speculation Threads) hinted a couple of times at an asymmetrical design with one "master core" and two "slave cores", I believe is how he put it. Perhaps a GAF member would be willing to track down exactly what he said in his posts. The above post seems to support that, referring to a "main CPU core." It's all just speculation, but how might such a system be set up? Would the game run on the main core, with devs distributing certain instructions to the other cores without them explicitly running their own threads? Maybe it solves some of the latency issues that multithreaded games tend to have?
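
Purely as a thought experiment (standard C++ threads, nothing Nintendo-specific, and every name below is invented), a "main core plus helpers" model might look something like this: the game loop stays on one core and only farms out small, self-contained jobs, so the helper cores never run full game threads of their own:

Code:
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Hypothetical sketch only: one "master" game thread feeding two helper
// cores with small, self-contained jobs. No Nintendo API is used or implied.
class HelperPool {
public:
    explicit HelperPool(unsigned helpers) {
        for (unsigned i = 0; i < helpers; ++i)
            workers_.emplace_back([this] { Run(); });
    }
    ~HelperPool() {
        { std::lock_guard<std::mutex> lock(m_); done_ = true; }
        cv_.notify_all();
        for (auto& w : workers_) w.join();   // workers exit only once the queue is empty
    }
    void Submit(std::function<void()> job) {
        { std::lock_guard<std::mutex> lock(m_); jobs_.push(std::move(job)); }
        cv_.notify_one();
    }
private:
    void Run() {
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lock(m_);
                cv_.wait(lock, [this] { return done_ || !jobs_.empty(); });
                if (done_ && jobs_.empty()) return;
                job = std::move(jobs_.front());
                jobs_.pop();
            }
            job();   // e.g. skinning, audio mixing, particle updates
        }
    }
    std::vector<std::thread> workers_;
    std::queue<std::function<void()>> jobs_;
    std::mutex m_;
    std::condition_variable cv_;
    bool done_ = false;
};

int main() {
    HelperPool helpers(2);                       // the two "slave" cores
    // The main-core game loop stays serial; only leaf work is farmed out.
    helpers.Submit([] { /* animate background crowd */ });
    helpers.Submit([] { /* decompress a streamed texture */ });
    return 0;
}

Whether Espresso actually works anything like that is pure guesswork on my part, but it would fit the "optimise the main core, feed the others" phrasing in that quote.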
 
In this article, Richard made some interesting statements. Like: I suppose the question now, though, is lacking in sheer horsepower compared to what? Xenos or some kind of expectation for a next-gen 2012 console?
I expect lacking compared to a GPU expected for the price range and a 2012 console, in the same way Wii was lacking horsepower irrespective of what PS360 had to offer.

The above post seems to support that, referring to a "main CPU core."
It's ambiguous. You could parse it with the 'usage' attached - "main CPU core usage" - and interpret it as the main CPU core (symmetric MP) as opposed to the ancillary cores (DSP, possible ARM/s), or even read it as a clumsy way of saying that the game's main code on the CPU has been optimised but the extra code is still struggling.

The only reason I can see for an asymmetric core would be cheapening out. ;) Provide one full-fat, full-taste proper CPU core, and provide a couple of lite, sugar-free variations on the side to support parallel processing of simpler jobs. Given the die is so flippin' tiny, this would be a bit of a slap in the face to devs IMO. There'd be no need to make such a limited system given that three full-fledged cores with all the trimmings could be included in a very small, low power part.

Apart from an asymmetric cache, I don't believe the CPU design is asymmetric.
 
512k L2 isn't terrible, if that's what the "sugar-free" cores are equipped with; current Intel desktop CPUs only have 256k, and the need for the L3 they're paired with is generally due to the bloated nature of today's desktop and server software.

This is probably the least concerning point about the Wuu's hardware, IMO, along with things like the number of USB ports etc (which is surprisingly generous, akshully.)
 
512k L2 isn't terrible, if that's what the "sugar-free" cores are equipped with;
No, by Lite cores I mean reduced-performance asymmetric cores, such as in-order cores or ones without SIMD units or whatever. If the cores aren't different, then there's not really a 'main' core. 512 KB of cache versus 2 MB is neither here nor there except for some particular data-heavy functions.

So I think we have three symmetric cores, with no 'main' core beyond a difference in cache.
 
Nintendo recently released a financial report that said, among other things, that they will initially be selling the Wii U at a loss.

Do you think that says anything about the hardware they could have included, or does it just tell us that the GamePad was really bloody expensive?
 
In this article, Richard made some interesting statements. Like: I suppose the question now, though, is lacking in sheer horsepower compared to what? Xenos or some kind of expectation for a next-gen 2012 console?
I was ready to give that article a fair show, even after I saw what website it was coming from.

But upon actually reading the article, it seems they're still quoting "anonymous sources". This is the same website that published this charming little piece, which has since become famous for the following line: "There aren't as many shaders, it's not as capable."

HURR LESS SHADERS

I'm not trusting anything from that website, whether I'd like to believe it or not, if they're going to continue to make claims without any sources.
 
I'm not trusting anything from that website, whether I'd like to believe it or not, if they're going to continue to make claims without any sources.
:???: They're a (the?) leading industry website, not a consumer website. Of course their sources are going to be valid; they're not going to randomly make stuff up and have those in the industry in the know discredit them to their audience. That doesn't mean they'll be accurate (a developer could be an artist or producer without technical knowledge), but if you'll only take info from named sources, you're out of the rumour game and should just wait for official specs (which you won't get). As ever, all info should be collated and evaluated, except where the source is disreputable, as some blacklisted 'tech' websites are.
 
Strange that no one has brought up that the Wii U hardware apparently has some type of onboard texture compression technique which allows for 100 MB of texture memory savings. Although Two Tribes says that it isn't ASTC, perhaps it's a proprietary method dictated to AMD by Nintendo? One can only guess as to what the ratio is, if in fact it is even using a block-based compression format.

https://twitter.com/TwoTribesGames/status/260385727286751233

Toki Tori is hardly a texture-resource-heavy game (unlike, say, Trine 2), so they're using it as a cache for faster loading times. Another observation: Two Tribes has had Wii U development kits for quite some time. Even with the final dev kits & more mature SDKs, have they only just stumbled upon this now?
 
Strange that no one has brought up that the Wii U hardware apparently has some type of onboard texture compression technique which allows for 100 MB of texture memory savings. Although Two Tribes says that it isn't ASTC, perhaps it's a proprietary method dictated to AMD by Nintendo? One can only guess as to what the ratio is, if in fact it is even using a block-based compression format.

https://twitter.com/TwoTribesGames/status/260385727286751233

Toki Tori is hardly a texture-resource-heavy game (unlike, say, Trine 2), so they're using it as a cache for faster loading times. Another observation: Two Tribes has had Wii U development kits for quite some time. Even with the final dev kits & more mature SDKs, have they only just stumbled upon this now?
It was brought up and discussed on other forums, though all we can do is guess at what Two Tribes is using. Since it is under NDA for some reason, I'm also guessing that it is some proprietary method. It is interesting how the devs only realized this so late in the development cycle.

BTW, Li Mu Bai, how is the Nintendo middleware SDK thread developing? ;)
 
Strange that no one has brought up that the Wii U hardware apparently has some type of onboard texture compression technique which allows for 100 MB of texture memory savings.
It's hardly strange if no-one's heard about it. ;)

Looking at Toki Tori 2, it's a 2.5D game, so I guess this texture compression is some lossless format that's not very aggressive but saves space for 2D artwork, recovering 20% of RAM versus raw bitmaps (100 MB out of a guessed 500 MB of artwork). Maybe it even halves texture size? I don't understand the comment about load times though, as the data could be compressed on disc and decompressed on load.
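
For scale, here's the same arithmetic spelled out (generic numbers only; whatever format Two Tribes are actually referring to is under NDA, so none of this is Wii U-specific):

Code:
#include <cstdio>

int main() {
    // Purely illustrative figures; nothing here is Wii U-specific.
    const double rawMiB   = 500.0;  // guessed uncompressed artwork footprint
    const double savedMiB = 100.0;  // saving reported in the tweet
    std::printf("Reported saving: %.0f%% of the raw footprint\n",
                100.0 * savedMiB / rawMiB);

    // For comparison, a common lossy 4 bpp block format stores a 4x4 texel
    // block in 8 bytes, versus 64 bytes of raw RGBA8.
    const double rawBlockBytes   = 4 * 4 * 4;
    const double lossyBlockBytes = 8;
    std::printf("Typical lossy block format: %.1f%% saving\n",
                100.0 * (1.0 - lossyBlockBytes / rawBlockBytes));
    return 0;
}

A 20% saving is nowhere near the ~87% a conventional lossy block format delivers, which is part of why a mild lossless scheme looks like the better guess to me.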
 
Looking at Toki Tori 2, it's a 2.5D game, so I guess this texture compression is some lossless format that's not very aggressive but saves space for 2D artwork, recovering 20% of RAM versus raw bitmaps (100 MB out of a guessed 500 MB of artwork). Maybe it even halves texture size? I don't understand the comment about load times though, as the data could be compressed on disc and decompressed on load.
Or it gets decompressed on the fly when textures (or parts of them) get loaded into the texture L1 cache (as is done on quite a few GPUs), so it stays compressed in RAM? It would operate like bog-standard texture compression, just with some proprietary format.
 
Or it gets decompressed on the fly when textures (or parts of them) get loaded into the texture L1 cache (as is done on quite a few GPUs), so it stays compressed in RAM?
That's what's happening, but that shouldn't impact load times. They'll either be loading 400 MB of compressed textures off disc and storing them compressed, or loading 400 MB of compressed textures off disc and decompressing them during load to 500 MB. Decompression is so fast that you'll have decompressed one texture in the time it takes to load the next, so it shouldn't add anything to the load times. Maybe they used a monolithic texture file and spent 10+ seconds decompressing it, which can now remain compressed?
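
To illustrate why the overlap makes decompress-at-load nearly free, here's a rough sketch (hypothetical types and stub functions, nothing engine- or Wii U-specific): the disc read of texture N+1 hides the CPU decompression of texture N.

Code:
#include <future>
#include <utility>
#include <vector>

// Hypothetical types and stubs, just to sketch the overlap described above.
struct CompressedTexture {};
struct Texture {};

CompressedTexture ReadFromDisc(int /*index*/) { return {}; }  // slow, I/O bound in practice
Texture Decompress(CompressedTexture) { return {}; }          // fast, CPU bound in practice

std::vector<Texture> LoadAll(int count) {
    std::vector<Texture> out;
    if (count == 0) return out;

    CompressedTexture pending = ReadFromDisc(0);
    for (int i = 1; i < count; ++i) {
        // Kick off the next disc read asynchronously...
        auto next = std::async(std::launch::async, ReadFromDisc, i);
        // ...and decompress the previous texture while that read is in flight.
        out.push_back(Decompress(std::move(pending)));
        pending = next.get();
    }
    out.push_back(Decompress(std::move(pending)));
    return out;
}

int main() {
    std::vector<Texture> textures = LoadAll(8);
    return textures.size() == 8 ? 0 : 1;
}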

Anyone know anything about AMD supporting lossless texture formats?
 
That's what's happening, but that shouldn't impact load times. They'll either be loading 400 MB of compressed textures off disc and storing them compressed, or loading 400 MB of compressed textures off disc and decompressing them during load to 500 MB. Decompression is so fast that you'll have decompressed one texture in the time it takes to load the next, so it shouldn't add anything to the load times. Maybe they used a monolithic texture file and spent 10+ seconds decompressing it, which can now remain compressed?

Sounds like it'll be one of those options. There are 360 games that store compressed textures in an even more compressed format on the disc so they take up less space and load faster. I think I've read of some 360 games decompressing textures on the CPU on the fly during gameplay so they can keep as much as possible as compressed as possible for as long as possible.

Given its puny CPU, the Wii U might not be able to do this on the CPU, so it could be that these developers were loading from optical media into a 100 MB cache and decompressing as fast as possible on the CPU, but now they find they have hardware that will decompress on the fly, either as texture data is coming from the disc or as it is passed to the GPU.

Something like that maybe.
 
Egads, the Wuu CPU isn't THAT puny. Come on now!

Even the GameCube could do real-time load-time decompression, which was implemented in the Metroid Prime series, for example. As the GameCube already featured texture compression, that means textures were transcoded in real time by a sub-500 MHz CPU that was also running a game at the same time, with a 3D engine featuring software skinning, bones and ragdoll physics.

Fairly impressive stuff actually.
 
Isn't the GPU's hardware texture decompression enough for modern games?
GPU hardware decompression is typically lossy, like JPEGs. It's not bad for some content, but it does destroy detail like JPEG does, and isn't good for UI graphics or 2D sprites when you want the best quality.

It's worth noting this tweet doesn't say it's hardware decompression, but a hardware feature. At the moment, relating that to saving load times and saving texture memory leaves texture compression as the only explanation I can see, but maybe someone else has other ideas? Like use of a flash cache??
 
Flash would be way too slow to be usable as a cache for 3D rendering data. Also, you'd wear out your memory cells filling them up repeatedly with spurious game data...
 