Wii U hardware discussion and investigation

So now it's thought that both the GPU and CPU have a separate pool of eDRAM?

The speculation seems to be that the L2 cache on the CPU is EDRAM instead of the more conventional SRAM.
Though it's certainly possible, I'd be kind of surprised if it were true given the size of the cache. It's interesting, but doesn't change anything, other than the L2 having potentially slightly higher latency, and the CPU being marginally more compact and slightly lower power draw.
 
mmhm... well, FWIW, the new Tekken Tag 2 trailer still exhibits that dynamic scaling implementation.

I was referring to PS4 and Xbox3 in that regard, though I'd still rather wait for non-PS360 ports (or lower budget 1st party titles) to assess the console's potential.

Sounds more like this chip has squat to do with POWER7. It's a tricore PPC with 2 MBs eDRAM cache.

3MB. Main core 2MB, 512KB for the other two.

Do we know if the GPU can access the eDRAM in any way? If the rumored amount is true, wouldn't that be a lot for a CPU?

Considering the GC's CPU (3MB) and the Wii's CPU (both the 24MB and 3MB pools) could, I don't see that changing.
 
16-bit HDR = 2 bytes x 4 channels = 8 bytes per pixel; x ~1 million pixels ≈ 8 MB for a non-AA FB. x4 = 32 MB for 4 samples per pixel.

You also don't just want the backbuffer in there. A full 32 MB scratchpad of eDRAM would be superb for many graphics tasks. You could put your particle textures in there and read/write to your heart's content. You can render to multiple render targets and have direct access to those buffers with no need to export/import from system RAM. I would anticipate Wii U being strong in graphical special FX if not raw polygon and pixel power. Making the most of that BW would also require a specialist engine, meaning ports won't do as well on the hardware at first.
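For anyone who wants to play with those numbers, here's a quick back-of-envelope sketch of the framebuffer/scratchpad budget. The resolution, formats and the example render-target list are purely my own illustrative assumptions, not anything confirmed about the hardware:

```python
# Back-of-envelope framebuffer budget for a 32 MB eDRAM scratchpad.
# Resolution, formats and the example target list are illustrative assumptions.

EDRAM_MB = 32
WIDTH, HEIGHT = 1280, 720          # ~0.9 million pixels at 720p
PIXELS = WIDTH * HEIGHT

def target_mb(bytes_per_pixel, samples=1):
    """Size of one render target in MB."""
    return PIXELS * bytes_per_pixel * samples / (1024 ** 2)

# 16-bit (FP16) HDR colour: 2 bytes x 4 channels = 8 bytes per pixel
print(f"FP16 colour, no AA:   {target_mb(8):.1f} MB")     # ~7.0 MB
print(f"FP16 colour, 4x MSAA: {target_mb(8, 4):.1f} MB")  # ~28.1 MB

# A hypothetical multi-render-target set that could live in the scratchpad
targets = {
    "FP16 colour":       8,   # bytes per pixel
    "depth/stencil":     4,
    "normals (RGBA8)":   4,
    "particle buffer":   4,
}
total = sum(target_mb(bpp) for bpp in targets.values())
print(f"Example MRT set: {total:.1f} of {EDRAM_MB} MB")   # ~17.6 MB
```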

Are the 16-bit HDR and other buffers in eDRAM confirmed, or is that just supposition?
 
To state things again: we now believe that the CPU has a small amount of eDRAM and the GPU has a large amount, used for different purposes.

The GPU only needs to talk to its own eDRAM and system memory; the CPU should be able to access anything. But its eDRAM is just L2 cache, which usually means it's transparent and programs don't even know it's there.

The simplest reason for eDRAM L2 is that it's what IBM uses for its power-efficient CPUs (PowerPC A2, BlueGene). The CPU cores in question want to be fast, but not as fast as an i5 2500K, Phenom II or Bulldozer; they may be compared to AMD Bobcat/Jaguar.


3MB. Main core 2MB, 512KB for the other two.

Yes. It would be nice if true; this would be the only long-known reliable information other than three cores and eDRAM.

Are the 16-bit HDR and other buffers in eDRAM confirmed, or is that just supposition?

It's the developer's choice. They may use FP16 (64 bits per pixel), long supported in PC GPUs; RGBA8 (32 bits per pixel), the simple, default thing; or some format that crams HDR into 32 bits per pixel, such as FP10 or a more advanced format.

Even on PS3 you could already use an FP16 framebuffer, but it uses a lot of bandwidth and fillrate/blending operations are slower.
On Wii U it's a usable choice, but you may choose to use something else.
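To put rough numbers on the bandwidth side of that choice, here's a small sketch of pure colour-write traffic at 720p/60. The overdraw factor and the format list are just assumptions for illustration, not measured figures:

```python
# Rough colour-write traffic per format at 720p/60 with an assumed overdraw factor.
# Ignores depth, texture reads, blending read-back and any compression.

PIXELS = 1280 * 720
FPS = 60
OVERDRAW = 2.5                      # assumed average overdraw, purely illustrative

formats = {
    "RGBA8 (32 bits/pixel)":     4,   # bytes per pixel
    "FP10-style HDR (32 bits)":  4,   # HDR packed into 32 bits
    "FP16 (64 bits/pixel)":      8,
}

for name, bpp in formats.items():
    gb_per_s = PIXELS * bpp * OVERDRAW * FPS / (1024 ** 3)
    print(f"{name}: ~{gb_per_s:.2f} GB/s of colour writes")
```

Even with these toy numbers, FP16 doubles the colour traffic versus the 32-bit formats, which is essentially the fillrate/blending cost mentioned above.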
 
To state things again: we now believe that the CPU has a small amount of eDRAM and the GPU has a large amount, used for different purposes.
It's perhaps worth recapping why this is the current belief.

We have the EG rumours, reportedly from devs, giving a description of a tricore PPC CPU and 32 MB of eDRAM on the AMD GPU. Large amounts of eDRAM make sense on the GPU and that's where we all expected it to be. If that were all, we'd believe the CPU was using SRAM. However, IBM PR statements have claimed use of eDRAM on the CPU. There are several ways that can fit in with the current picture:

1) The rumours are wrong, and there's a load of eDRAM on the CPU
2) The PR is wrong and there's no eDRAM at all on the CPU
3) The eDRAM is being used as cache

Point 1 makes little sense. Lots of eDRAM on a low-power CPU is pointless, and the die is tiny.
Point 2 requires the PR to be all-out wrong. Whereas that might happen with tweets, I don't believe a full press release would be that inaccurate, so I'm happy to accept there is eDRAM on the CPU.
Point 3, replacing the tried and tested SRAM with slower eDRAM, seems a bit odd, but it has precedent with IBM. There are cost and power benefits there, and that seems to be Nintendo's focus with Wii U, so this explanation fits all the info so far.
 
1) The rumours are wrong, and there's a load of eDRAM on the CPU
2) The PR is wrong and there's no eDRAM at all on the CPU
3) The eDRAM is being used as cache

Point 1 makes little sense. Lots of eDRAM on a low-power CPU is pointless, and the die is tiny.
Point 2 requires the PR to be all-out wrong. Whereas that might happen with tweets, I don't believe a full press release would be that inaccurate, so I'm happy to accept there is eDRAM on the CPU.
Point 3, replacing the tried and tested SRAM with slower eDRAM, seems a bit odd, but it has precedent with IBM. There are cost and power benefits there, and that seems to be Nintendo's focus with Wii U, so this explanation fits all the info so far.

The Wii U CPU is tiny, probably 40mm2 or less. Assuming 10mm2 per MB of SRAM on IBM's 45nm process, there would be almost no room left for the 3 CPU cores if it were to use 3MB of SRAM.
As for why Nintendo would do so, your points about cost (die size) and power are probably right.
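To put numbers on that: the 10 mm²/MB figure is the assumption from the post above, while the eDRAM density (roughly 3x denser than SRAM) is only my own ballpark, so treat the split as illustrative:

```python
# Rough area budget for a ~40 mm^2 CPU die with 3 MB of L2, SRAM vs eDRAM.
# 10 mm^2/MB SRAM is the assumption from the post above; the eDRAM density
# (roughly 3x denser) is only a ballpark assumption, not a measured figure.

DIE_MM2 = 40.0
CACHE_MB = 3.0
SRAM_MM2_PER_MB = 10.0
EDRAM_MM2_PER_MB = SRAM_MM2_PER_MB / 3.0

for name, density in [("SRAM", SRAM_MM2_PER_MB), ("eDRAM", EDRAM_MM2_PER_MB)]:
    cache = CACHE_MB * density
    print(f"{name} L2: {cache:.0f} mm^2, leaving {DIE_MM2 - cache:.0f} mm^2 for cores and I/O")
```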
 
You also get a more costly process. Steps are needed to create the trench capacitor for the DRAM cells, and all the metal layer steps needed for the logic area are wasted on the DRAM cell.

Indeed. A not-so-insignificant factor when it comes to fab times or yield...
 
DRAM and logic processes are optimized differently. For DRAM you want very low static power, for logic you want very high dynamic performance. DRAM processes typically have less than 1% of the leakage power of a typical logic process.

Integrating the DRAM, you end up compromising both DRAM and logic performance. You also get a more costly process. Steps are needed to create the trench capacitor for the DRAM cells, and all the metal layer steps needed for the logic area are wasted on the DRAM cell.

The compromised performance has extra consequences in a console, where you cannot bin and sell slower units at a discount. In order to maximize yield you'll need to provision for the higher power consumption of your lower quality bins. This impacts the cost of the entire system (cooling, reliability, PSU). I'm guessing that's why MS hasn't opted for integrating the eDRAM of Xenos; they don't need the performance, so it is cheaper overall to have a separate die and spend a few dozen cents on adding a substrate to connect the CPU/GPU to the eDRAM die.

Cheers

And still, Nintendo apparently thought it worthwhile to include eDRAM on the CPU as well as the GPU. On the CPU it allows a smaller die, lower power draw and lower cost. If the core clocks aren't very high, the timing on IBM's 45nm eDRAM is just fine for L2 cache, no excuses needed.

I'm far more interested in the data path between the CPU and the GPU and the justification of eDRAM on the GPU. Going the eDRAM route isn't a given by any means, but apparently Nintendo is confident in the benefits or they would have saved themselves the cost and risk. If the assumptions and measurements aren't too far off, we're looking at a roughly 150mm2 GPU with a design optimised for console use. There are performance levels it needs to hit in order to be a worthwhile investment at all. Put another way, in the console environment it must be capable of achieving roughly E6760 levels of performance in order to justify its existence.
 
And still, Nintendo apparently thought it worthwhile to include eDRAM on the CPU as well as the GPU. On the CPU it allows a smaller die, lower power draw and lower cost. If the core clocks aren't very high, the timing on IBM's 45nm eDRAM is just fine for L2 cache, no excuses needed.

No one's really disputing that. Clearly the extra cost is worth the die reductions resulting from it (if it makes the die half as big yet yields are slightly worse, then it sort of cancels out, and then there are also the power savings). We already know the CPU is 45nm, so it's a moot point.

Let's start thinking about the GPU now: The greater implication is that it needs to have been in production for quite a while now if they're hoping to hit any decent supply, and using the latest nodes doesn't seem particularly feasible at the moment.

Even IBM's Power7+ seems to be having production issues at the moment with reports of initial shipments having 3 or 4 cores disabled (along with the associated eDRAM).

Given how tight 28nm capacity has been at TSMC, it's hard to expect a design with a significant chunk of eDRAM added to have favourable production results this year (let's not forget inflated production costs, since we are talking about a supply-limited process to begin with). GloFo is only just starting to ramp up as well, so that can hardly be counted upon.

If the thing is on 28nm, it won't be particularly inexpensive. Guess it depends on how much they're willing to spend for a severely limited supply.

it must be capable of achieving roughly E6760 levels of performance in order to justify its existence.
Sure, there have been a couple of theories about that particular level of specs for a long time now.

Redwood - 400:20:8 - 104 mm^2
E6760 (Turks) - 480:24:8 - 118 mm^2

(Those two basically came out of TDP considerations).

Obviously, we don't know how much space is taken up by other HW packed into the GPU, such as the north bridge/Starlet etc.

TSMC's 40nm eDRAM is supposedly 0.145mm^2/Mbit (macro) so 32MB will be around 37mm^2 (+overhead).
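For reference, the arithmetic behind that last figure (macro density as quoted; array overhead not included):

```python
# eDRAM area from the quoted TSMC 40nm macro density of 0.145 mm^2 per Mbit.
MM2_PER_MBIT = 0.145
MBITS = 32 * 8                      # 32 MB = 256 Mbit
print(f"{MM2_PER_MBIT * MBITS:.1f} mm^2 for 32 MB, before overhead")  # ~37.1 mm^2
```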
 
I'm far more interested in the data path between the CPU and the GPU and the justification of eDRAM on the GPU.
Justification, same as always... Simpler/cheaper external memory subsystem; more bandwidth, lower latency, higher efficiency. The CPU-GPU interconnect is probably the simplest possible, but details will probably be very scarce. I wonder if Nintendo would even release details to their devs... After all, they pretty much refused to document the microcode for the Reality Coprocessor in the N64. If the CPU really is based on good ol' Gekko from the GameCube, it very well might be the same CPU bus as used there, just modified (if necessary) to handle multiple cores.

Going the eDRAM route isn't a given by any means, but apparently Nintendo is confident in the benefits or they would have saved themselves the cost and risk.
Not sure what "risks" you're talking about; eDRAM has been used in consumer GPUs for close to a decade and a half. There's just more of it in the Wuu GPU than has been the case previously, but such is life in the semiconductor industry; it's always more, faster, better (well, not always, with Nintendo... :rolleyes: :LOL:)
 
Using the eDRAM is a way to lower power consumption, because moving data over external buses is energy expensive. I believe that Durango's rumoured 64MB store on the GPU is there for similar reasons.
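As a rough sense of scale, here's a sketch of what that traffic could cost. The energy-per-bit figures are generic order-of-magnitude assumptions, not Wii U or Durango specifics:

```python
# Order-of-magnitude power cost of frame traffic, on-die vs external DRAM.
# The pJ/bit figures are generic ballpark assumptions, not measured numbers.

PJ_PER_BIT = {"on-die eDRAM": 1.0, "external DRAM (incl. I/O)": 20.0}

BYTES_PER_FRAME = 32 * 1024 ** 2    # say the working set touched is the full 32 MB
FPS = 60
bits_per_second = BYTES_PER_FRAME * 8 * FPS

for name, pj in PJ_PER_BIT.items():
    watts = bits_per_second * pj * 1e-12
    print(f"{name}: ~{watts:.2f} W for that traffic alone")
```

Fractions of a watt either way, but on a low-power console that adds up quickly once you count depth, textures and CPU traffic as well.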
 
Hello, I am not an expert, but I usually read up on technical topics, so I would like to ask whether in the end the Wii U is 2x or more over Xbox 360 and PS3. I read on another website that it is even less powerful than those, hence my question.
 