Wii U hardware discussion and investigation *rename

Status
Not open for further replies.
I assume the GPU LSI uses Renesas' UX8 process (40nm) and UX8GD eDRAM. UX8GD supports up to 256Mbit, which happens to be exactly the amount the Wii U is supposed to have, and, according to Renesas, targets game consoles. A single cell of UX8 eDRAM is 0.06 square micron, half the size of the previous generation UX7LSeD eDRAM for 55nm.
 
Saw this on neogaf :LOL:

captureasq.jpg
 

The 40nm eDRAM density is slightly higher than IBM's on 45nm SOI (0.067 um^2). On IBM's process, that works out to 0.24 mm^2 for a finished 1Mbit macro, or 32MByte in 61 mm^2, so the same amount on UX8 might be 55 mm^2 or so, leaving on the order of 100 mm^2 for the rest of the GPU if initial size estimates are correct.
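For anyone who wants to redo that estimate, here is a minimal sketch using only the figures quoted above. It assumes the macro overhead is the same on both processes, so area simply scales with cell size; that is an assumption, not something Renesas or IBM has stated, and the function names are made up for illustration.

```python
# Back-of-the-envelope eDRAM area, using only figures quoted in the thread.
IBM_CELL_UM2 = 0.067    # IBM 45nm SOI eDRAM cell size (um^2)
IBM_MACRO_MM2 = 0.24    # finished 1 Mbit IBM macro (mm^2)
UX8_CELL_UM2 = 0.06     # Renesas UX8 eDRAM cell size (um^2)
MBIT_PER_MBYTE = 8


def ibm_area_mm2(mbytes):
    """Area of `mbytes` of eDRAM built from 1 Mbit IBM macros."""
    return mbytes * MBIT_PER_MBYTE * IBM_MACRO_MM2


def scaled_area_mm2(mbytes, cell_um2):
    """ASSUMPTION: identical macro overhead, so area scales with cell size."""
    return ibm_area_mm2(mbytes) * cell_um2 / IBM_CELL_UM2


ibm_32mb = ibm_area_mm2(32)
ux8_32mb = scaled_area_mm2(32, UX8_CELL_UM2)
print(f"IBM 45nm, 32MB: {ibm_32mb:.1f} mm^2")
print(f"UX8 40nm, 32MB: {ux8_32mb:.1f} mm^2")
```

This lands on roughly 61 mm^2 and 55 mm^2, the same ballpark as the numbers in the post.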

For whatever reason, bgassassin seems adamant that the GPU is made using finer lithography. I have no idea what he bases that on.
 
It seems there are no proper estimates of the die sizes. I guess our best bet would be to estimate the size of the MCM (by comparison with the USB port, which is easy to spot and not hidden by anything) and then to estimate the size of the chips.
From here the width of the connector is 15.7mm. The picture is tiny, though, and the error can be significant... Wish I had Photoshop, as Paint is a bit sucky...

Using Paint I found:
1) 27 pixels for the connector width (in the picture, from bottom to top)
2) 66 pixels for the MCM height (in the picture, from bottom to top)
3) 84 pixels for the MCM width (in the picture, east-west measurement made at the bottom)
I made the same measurement at the top of the chip and found 81/82. Perspective is not distorting it that much.
So translated into mm I get:
1) 15.7 mm
2) 37.8 mm
3) 48.8 mm
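
The conversion above is just a scale factor taken from the USB connector. A minimal sketch, assuming the picture has a uniform scale (no perspective correction); the names are illustrative only:

```python
# Pixel-to-mm conversion using the USB connector as the reference length.
USB_WIDTH_MM = 15.7   # connector width quoted in the post
USB_WIDTH_PX = 27     # the same connector measured in the picture

MM_PER_PX = USB_WIDTH_MM / USB_WIDTH_PX


def px_to_mm(pixels):
    """ASSUMPTION: uniform scale across the picture, perspective ignored."""
    return pixels * MM_PER_PX


mcm_height_mm = px_to_mm(66)
mcm_width_mm = px_to_mm(84)
print(f"scale:      {MM_PER_PX:.3f} mm/px")
print(f"MCM height: {mcm_height_mm:.1f} mm")
print(f"MCM width:  {mcm_width_mm:.1f} mm")
```

With these inputs the height comes out nearer 38.4 mm than the 37.8 mm quoted. That is about one pixel of difference, well within the measurement error on such a small picture.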

Thing is, I made measurements on another picture and found that the MCM must be square (slide #4 in the original link). Perspective accounts for something in fact... :LOL:
As the best picture is slide #4, which is taken diagonally (and Paint makes measurements inconvenient), I will round the whole thing to a 50mm x 50mm MCM (I could write 47.5 +/- 2.5mm, but it's not worth the headache on such a sucky picture and using Paint...).

So the MCM is <= 2500mm^2.

So, to the chips... here it gets tough with Paint; if somebody has Photoshop...
I made a hack job here using squares (to grossly correct for perspective), so that three corners of each chip touch three sides of the square.
I got:
166x166 for the mcm relates to 2500mm^2
55x55 for the big chip =< 280mm^2
25x25 for the tiny chip =< 60mm^2
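
Those upper bounds follow from scaling the pixel squares against the 50mm MCM edge. A rough sketch, assuming both dies are square and the rounded 50mm edge is right (both are simplifications from the post, not measured facts):

```python
# Die area estimates from the bounding squares, scaled against the MCM.
MCM_SIDE_MM = 50.0   # rounded MCM edge from the post
MCM_SIDE_PX = 166    # MCM bounding square in the picture


def die_area_mm2(side_px):
    """ASSUMPTION: the die is square and fills its bounding square."""
    side_mm = side_px * MCM_SIDE_MM / MCM_SIDE_PX
    return side_mm ** 2


big_die = die_area_mm2(55)
small_die = die_area_mm2(25)
mcm_area = MCM_SIDE_MM ** 2
print(f"big die:   {big_die:.0f} mm^2")
print(f"small die: {small_die:.0f} mm^2")
print(f"big dies per MCM area: {mcm_area / big_die:.1f}")
```

This gives about 274 mm^2 and 57 mm^2, which round up to the 280 and 60 bounds, and the big die fits into the MCM area roughly 9 times.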

Eyeballing it after the fact, I could fit the big chip about 9 times within the MCM. Try it for yourself: both the MCM and the main chip are squarish, and 9 kind of works.

It's quite shockingly big. Even accounting for errors and optimistic rounding I'm left with something quite big. Too big, in fact.

So I dug a bit more.
* Renesas doesn't seem to produce eDRAM at 40nm. Their own site lists their 40nm tech as under development.
* IBM stated that the CPU uses their 45nm process.

* Nintendo stated this:
Takeda: First of all, adoption of a multi-core CPU for the first time. By having multiple CPU cores in a single LSI chip, data can be processed between the CPU cores and with the high-density on-chip memory much better, and can now be done very efficiently with low power consumption.
&
Takeda: This time we fully embraced the idea of using an MCM for our gaming console. An MCM is where the aforementioned Multi-core CPU chip and the GPU chip are built into a single component. The GPU itself also contains quite a large on-chip memory. Due to this MCM, the package costs less and we could speed up data exchange among two LSIs while lowering power consumption. And also the international division of labor in general, would be cost-effective.
I found the max size of the GPU to be 280mm^2, which is huge, but it in fact makes sense if there is on-die eDRAM to feed it. Looking at Renesas' website I would say that the big chip is made on their 55nm process.
It also makes sense from the GPU point of view, as Nintendo chose an architecture that was made on a 55nm process (a different one, though).
FYI the RV730 was 146mm^2. With a max size of 280mm^2 that leaves room for quite some eDRAM (though I don't know how Renesas' process compares to TSMC's at 55nm when it comes to density).

The 60mm^2 for the CPU on a 45nm process didn't make sense either. IBM may have designed a custom 470s for Nintendo. Those cores are 4mm^2 without L2; there is no way that would get you to 60mm^2 (or even close, taking into account the gross imprecision of my measurements).

Nintendo gave us the answer: there is a large amount of on-die memory (eDRAM on the chip), more than the 2MB of L2.

Now the real question is how and where Nintendo allocated its eDRAM budget, with BC most likely being an imperative factor. There were multiple chips in Hollywood, and one now has to guess how Nintendo would put those multiple chips into only 2.

It seems that the total amount of eDRAM is 32MB, though I would be cautious.

We know that for BC's sake Nintendo may want to have a "block" of (at least) 24MB of on-chip memory accessible to the CPU and the GPU.
Looking at the Wii, I could see Nintendo having put the two chips that made up Hollywood together into the big chip. I would also assume that the memory controller is in the big chip, as it is part of Hollywood in the Wii. Following that line of thinking and looking at die size, I would say that "at least" 24MB of eDRAM is on the big chip (GPU). This memory has to be accessible to the CPU and the GPU, as in the Wii.

That would leave 8MB of eDRAM on the CPU. Thing is, we heard nothing about it, and since we know the cache sizes, I would expect it to have leaked too (most likely as L3).
The translation (first Takeda quote) could be misleading. There might not be "high-density on-chip memory" on the CPU die; he may have been speaking of the CPU accessing the memory on the GPU.
It does match IBM's initial PR well, though. 8MB sounds reasonable in size; it could be a match.

Still unsatisfying. I could see Gubbi being right: the three cores have their own local L2 (SRAM, 256 KB) and there is an extra L3 cache (2MB). That would still not explain the size of the chip. To me it could mean that what is called an enhanced Broadway, while most likely based on PPC 470s, could be more custom than one thought. First, the cache hierarchy would be different: no half-speed L2, and an L3. And I could see the CPU having a 4-wide SIMD unit using a superset of the Broadway instruction set. The data path may have been doubled, etc.
That could explain the size. Overall, as I don't think Nintendo's engineers are stupid, I would bet on that rather than on the old paired FPU from Broadway and the PPC 470s plus more eDRAM on top of plenty of eDRAM already.

Back to the GPU (big chip) it would indeed include 32MB of EDRAM. In emulation mode it would serve as the Wii 24MB memory pool and the 3MB of texture ram.
---------------------

Now, as to why the system is cheap to produce, I would say the magic could be in the 55nm GPU manufactured by Renesas, while they keep the part that uses a high-performance process really tiny (below 60mm^2).
 
So from there the system could look like this:
tiny die:
3 CPU cores,
32KB I$ 32KB D$
256 KB of L2 per core
2 MB of L3 (eDRAM running at half speed)
4-wide SIMD
running in the 1.6 GHz range.
IBM 45nm process

big die:
RV730 class of GPU (I would see it more as half an RV740 than an RV730, as the RV730 has its weirdness).
Includes the same functional units as Hollywood (ARM cores, DSP, etc.)
Includes the memory controller (acts as the north bridge for the CPU).
Includes 32MB of eDRAM (accessible to both the CPU and the GPU, at different speeds though, I hope).
Low clock speed.
Made on Renesas 55nm
 

What of this? http://www.neogaf.com/forum/showpost.php?p=43089261&postcount=6359

@lioli, why would it not just be AMD/TSMC fabbing the GPU on 40nm?
 

That calculation is just taking the size of the basic element and multiplying it by 256 million. For instance, the cells need to be connected... The way it works out is that they design a macro that takes care of signalling, power supply and refresh, yada yada yada, which then serves as the basic building block. In IBM's case the area per effective bit in the finished macro is 3.5 times higher than the basic structure. That's a huge difference, implying that the macro layout is actually more important than the basic cell size. I made a big assumption that this would be identical between companies and processes when I back-of-the-enveloped the corresponding size of 32MB 40nm eDRAM by Renesas. Use it for entertainment purposes only.
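
To put numbers on the difference between raw cells and finished macros, here is a small sketch using the IBM figures quoted earlier in the thread. Treating 1 Mbit as 1024 x 1024 bits is my assumption; with a decimal Mbit the ratio comes out slightly higher, so take the result as "roughly 3.5" either way.

```python
# Raw cell area vs. finished macro area for IBM 45nm SOI eDRAM.
IBM_CELL_UM2 = 0.067           # single cell (um^2), quoted in the thread
IBM_MACRO_MM2 = 0.24           # finished 1 Mbit macro (mm^2), quoted in the thread
BITS_PER_MBIT = 1024 * 1024    # ASSUMPTION: binary Mbit
UM2_PER_MM2 = 1_000_000


def raw_cell_area_mm2(mbits, cell_um2):
    """Naive area: cell size times bit count, no wiring or refresh logic."""
    return mbits * BITS_PER_MBIT * cell_um2 / UM2_PER_MM2


raw_1mbit = raw_cell_area_mm2(1, IBM_CELL_UM2)
overhead = IBM_MACRO_MM2 / raw_1mbit
naive_256mbit = raw_cell_area_mm2(256, IBM_CELL_UM2)
macro_256mbit = 256 * IBM_MACRO_MM2
print(f"macro overhead factor: {overhead:.1f}x")
print(f"256 Mbit, cells only:  {naive_256mbit:.0f} mm^2")
print(f"256 Mbit, macros:      {macro_256mbit:.0f} mm^2")
```

The cells-only figure is about 18 mm^2 against about 61 mm^2 for finished macros, which is why multiplying the cell size by the bit count badly underestimates the die area.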
 
why in the heck would you use gddr5 and 32mb edram? you would use one or the other. come on people.
I agree. The most rational explanation has been exactly the one we're seeing. 32 MBs eDRAM, on the GPU, no insane monster GPU or CPU - why would you load the platform up with fast, expensive RAM where its BW would be mostly idling?
 
The WiiU GPU is supposed to be much faster than the 360, CPU about on par, so why did the 360 need that kind of memory?
Okay I understand it makes sense if the CPU has access to the eDRAM. (It looks like it does, would that be the reason for the MCM?)
Or does the CPU look really bad?
 
Inside the Wii U

http://www.eurogamer.net/articles/digitalfoundry-what-is-inside-the-wii-u

EDRAM is built into the GPU and not the CPU
Well there is nothing in that article that the original article doesn't state, actually there is less.

Prediction: It's not 28nm or this

Mario-Luigi level was so predictable when they pulled this thing out last year
I'm not sure what you mean, but I would bet with 99% confidence that Nintendo would not use 32nm/28nm parts, for many reasons: cost and availability being at the top of the list, with implementation costs next.

I would put most of the units that were part of Hollywood (an MCM too) in the GPU, because using a 55nm process (optical shrink / half node of 65nm) they could simply have shrunk the ARM cores, DSP, etc. instead of coming up with something brand new. They stated it in the interview: the design process was about adding things to the Wii.
---------------------------------------------------

I don't get why people wonder about the RAM; it is obviously DDR3. As for the bus size, well, I don't know: either 64 bits or 128 bits. I would bet on the latter, as the GPU is big and the MCM even bigger, so they are nowhere near being limited by the physical IO size.
WRT the DDR3, I would bet on the cheapest available, possibly slower than DDR3-1600.
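
For a sense of what the bus width means in practice, here is a quick sketch of theoretical peak bandwidth (transfer rate times bus width). The DDR3-1333 row is only an example of a "slower than DDR3-1600" grade, not a claim about what the console actually uses.

```python
# Theoretical peak DDR3 bandwidth for the bus widths discussed above.
def peak_bandwidth_gbs(mega_transfers_per_s, bus_bits):
    """Peak bandwidth in GB/s (10^9 bytes/s); real throughput is lower."""
    bytes_per_transfer = bus_bits / 8
    return mega_transfers_per_s * 1e6 * bytes_per_transfer / 1e9


for grade in (1333, 1600):          # DDR3-1333 is a hypothetical slower grade
    for bus_bits in (64, 128):
        bw = peak_bandwidth_gbs(grade, bus_bits)
        print(f"DDR3-{grade}, {bus_bits}-bit bus: {bw:.1f} GB/s")
```

A 128-bit bus doubles the peak figure over a 64-bit one at the same speed grade, e.g. 25.6 GB/s against 12.8 GB/s for DDR3-1600.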
 