Wii U hardware discussion and investigation *rename

Status
Not open for further replies.
Looking at the die shot, it seems that there are 8 blocks of (probably) 20 shaders for a total of 160. Directly above it there are what looks like 3 repeated blocks that I would guess are TMUs, so 12 of them. Then two blocks below the shaders, so probably 8 ROPs. The arrangement of TMUs to shader blocks seems odd, but that's going by the few other die shots I've looked at.
[strike]I'm more inclined to think it's 40 per block given the size, so 320.[/strike]
 
320 SPs (352 GFLOP/s) would at least provide one point that actually fits the vague "50% more powerful than Xbox 360" claims that seemed to be going around, AFAIK even from Nintendo themselves. Although Nintendo also said the ports look dramatically better >_>
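
For anyone who wants to check the 352 figure: it reads as GFLOP/s and falls out of shader count times FLOPs per clock times clock speed. A rough sketch follows. The 550 MHz clock, the 2 FLOPs per SP per clock (one multiply-add), and the Xbox 360 GPU numbers (240 ALU lanes at 500 MHz) are assumptions of mine, not something stated in this thread.

```python
def peak_gflops(shader_count, clock_mhz, flops_per_clock=2):
    """Peak single-precision throughput in GFLOP/s.

    flops_per_clock=2 assumes one multiply-add per SP per clock.
    """
    return shader_count * flops_per_clock * clock_mhz / 1000.0


# Assumed clock: 550 MHz (not given in the thread).
wii_u_320 = peak_gflops(320, 550)   # the struck-through 320 SP guess
wii_u_160 = peak_gflops(160, 550)   # the 160 SP guess

# Assumed Xbox 360 GPU: 48 units x 5 lanes = 240 lanes at 500 MHz.
xbox_360 = peak_gflops(240, 500)

print(f"320 SPs: {wii_u_320:.0f} GFLOP/s")
print(f"160 SPs: {wii_u_160:.0f} GFLOP/s")
print(f"320 SPs vs Xbox 360: {wii_u_320 / xbox_360:.2f}x")
```

Under those assumptions only the 320 SP case lands near the "50% more" claim; the 160 SP case comes out below the Xbox 360 figure on paper.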

The ~3MB of eDRAM above the 32MB eDRAM seems to fit the 2MB framebuffer + 1MB texture buffer used on Gamecube and Wii. I wonder if that means more Hollywood stuff is near it. The main 32MB could be used to emulate the 24MB of 1T-SRAM in Wii mode.
 
I searched for GPU die shots and didn't find anything more useful than the HD4870 die shot already referenced on the first page of Google hits. Something from the Evergreen family would have been useful.

Kudos to function for sticking his neck out for a first interpretation. It bothers me that the vast majority of logic on the die is unaccounted for. I'm not sure that Gipsel's interpretation is correct though, I can't seem to match the regular areas that I assume are register banks/local storage for the shader processors with his hypothesis. But just what is all that unaccounted for die space otherwise?
 
I'm not sure that Gipsel's interpretation is correct though, I can't seem to match the regular areas that I assume are register banks/local storage for the shader processors with his hypothesis.
The registers can be clearly seen. They simply look like SRAM arrays around the SPs (that's what they actually are). There are 64 banks in total in each SIMD engine, each one holds 4kB. And I've mentioned already that I'm missing the LDS (it is easy to spot on the RV770 as well as the Tahiti die shot, for instance). That's why I mentioned the possibility Nintendo may have gimped the architecture even further (by removing the LDS). On the other hand, I'm also not sure where the LDS (or even the TMUs for that matter) is on the Brazos die, which also shows a quite dense and somewhat irregular layout. :LOL:
 
We don't even know what process the GPU is on. Renesas supposedly manufactures. Could it be 55nm? Would that make more sense of the size?
 
We don't even know what process the GPU is on. Renesas supposedly manufactures. Could it be 55nm? Would that make more sense of the size?
I thought it was agreed that the CPU is produced at IBM in 45nm and the GPU is manufactured at some 40nm process (I don't recall a Fab was explicitly mentioned by Nintendo) assumed to be TSMC's.
 
:?:
RV710 had two half size SIMDs and therefore 8 TMUs. As I said, starting from the R700 generation, each CU/SIMD has to have exactly 4 TMUs. The Wii U has 4 SIMDs (maybe even full size) which are just laid out in an unusual way to save the last fraction of a mm² on the die. It is basically a halved RV740 (8 SIMDs) with eDRAM on die and some other changes (64-bit DDR3 instead of 128-bit GDDR5, different external interfacing to the CPU and some southbridge functions).

I haven't seen a die shot of Luigi but on RV770 each of the repeating shader blocks (as seen in the die shot) contains 20 shaders - and there are four such blocks for each of the 10 TMU blocks, giving a total of 40 x 20 = 800. If the same holds true for the Wii U, then 8 x 20 = 160.

My guess is that RV710 has one TMU block (containing 4 TMUs) for two shader blocks (2 x 20) giving an arrangement of 2 x (4:40) = 8:80.

I can't Google up either RV770 or RV710 die shots though, which would make this a lot easier ...
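
The block counting in this post can be written out as a quick sketch. The 20-per-block figure is the reading of the RV770 die shot given above and the 40-per-block figure is the alternative guess from earlier in the thread; the function and variable names are just illustrative.

```python
TMUS_PER_SIMD = 4  # "each CU/SIMD has to have exactly 4 TMUs" (R700 onward)


def shader_total(blocks, shaders_per_block=20):
    """Total shaders given a count of repeating physical blocks."""
    return blocks * shaders_per_block


# RV770: 10 TMU blocks, four shader blocks each, 20 shaders per block.
rv770_shaders = shader_total(10 * 4)

# Wii U: 8 visible blocks, under both per-block hypotheses.
wii_u_if_20 = shader_total(8, 20)
wii_u_if_40 = shader_total(8, 40)

# RV710 as guessed above: 2 x (4 TMUs : 40 shaders).
rv710_shaders = 2 * 40
rv710_tmus = 2 * TMUS_PER_SIMD

print(rv770_shaders, wii_u_if_20, wii_u_if_40, rv710_shaders, rv710_tmus)
```

The whole 160 vs 320 question comes down to which per-block number is right, which the counting alone can't settle.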

Without eDRAM, the die would measure around 100mm². As a comparison, the mentioned RV740 measures 137mm² and has twice the compute resources, a significantly higher clock target, twice the memory interface width at a higher standard (128bit GDDR5 PHY measures up to 20mm²) and lacks basically 4 years of experience at 40nm. It appears not impossible to fit half of it together with the eDRAM and some chipset functions in a ~150mm² die, especially if Nintendo opted to gimp it even further (no LDS, some restrictions with the amount of usable texture formats, whatever).

What appear to be the 8 shader blocks take up approximately 12 mm^2. Unless I'm missing something I don't see how there could be half of RV740's shaders in there.
 
We don't even know what process the GPU is on. Renesas supposedly manufactures. Could it be 55nm? Would that make more sense of the size?

The process could be estimated by measuring the eight very regular structures, and comparing their size to the HD4870 die shot.
 
I thought it was agreed that the CPU is produced at IBM in 45nm and the GPU is manufactured at some 40nm process (I don't recall a Fab was explicitly mentioned by Nintendo) assumed to be TSMC's.

The big chunk of edram, which appears to be 32 MB as there are 32 repeating blocks, takes up approximately 38 mm^2 which is pretty much bang in line with 40nm, iirc, coming in at about 1.2 mm^2 / MB.

Edit: Yeah, I've been pixel counting.
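
For reference, the density figure is just area over capacity. A minimal sketch using the two numbers from this post (38 mm^2 and 32 MB); nothing else is assumed.

```python
def mm2_per_mb(area_mm2, capacity_mb):
    """eDRAM area cost in mm^2 per MB."""
    return area_mm2 / capacity_mb


density = mm2_per_mb(38.0, 32)
print(f"{density:.2f} mm^2 / MB")  # rounds to the ~1.2 quoted above
```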
 
The process could be estimated by measuring the eight very regular structures, and comparing their size to the HD4870 die shot.

The big chunk of edram, which appears to be 32 MB as there are 32 repeating blocks, takes up approximately 38 mm^2 which is pretty much bang in line with 40nm, iirc, coming in at about 1.2 mm^2 / MB.

Edit: Yeah, I've been pixel counting.

Haha, well done. If you guys have questions which might be answered with a closer look at the image, Chipworks sent me a copy of their software. It's basically like a Google Earth, so I can zoom very close in and get a detailed look. Let me know...
 
The big chunk of edram, which appears to be 32 MB as there are 32 repeating blocks, takes up approximately 38 mm^2 which is pretty much bang in line with 40nm, iirc, coming in at about 1.2 mm^2 / MB.

Edit: Yeah, I've been pixel counting.


Do you still think it's 160, or have you changed your mind to 320?
 
Haha, well done. If you guys have questions which might be answered with a closer look at the image, Chipworks sent me a copy of their software. It's basically like a Google Earth, so I can zoom very close in and get a detailed look. Let me know...

I'd just like to say "well done" Fourth Storm. You have done most excellently in pursuing this! I don't post at NeoGaf but I've been lurking in the Wii U thread and I think you deserve a round of handshakes and cheers for doing a service to gaming. :)
 
More pixel counting:

In this RV770 die shot a row of 80 shaders (arranged to the right of the TMUs in a 4 x 20 fashion) takes up ~13 mm^2:

[Image: die-shot.jpg]

(Got link from AIStrong post on the GAF).

In the Wii U die shot a similar row of four physical blocks of shaders takes up ~6 mm^2.

RV770 was on 55nm, and Wii U is almost certainly on 40nm (going by the edram). This appears to show perfect (or perhaps slightly better than) scaling from 55nm to 40nm. I *think* this means that it is safe to say that the Wii U has 2 rows of 4 x 20 shaders.

So, in summary, I think:
- 40 nm
- 32 MB edram
- 16 TMUs
- 160 shaders
- 8 ROPs
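
The scaling argument behind that summary, as a rough sketch. It assumes area shrinks with the square of the node name (55 -> 40), which is an idealisation of mine and not something real processes guarantee; the 13 mm^2 and 6 mm^2 figures are the pixel-counted ones from this post.

```python
def ideal_area_scale(old_nm, new_nm):
    """Ideal area scaling factor: area goes with the square of feature size."""
    return (new_nm / old_nm) ** 2


scale = ideal_area_scale(55, 40)

rv770_row_mm2 = 13.0   # row of 4 x 20 shaders on RV770 (55nm)
wii_u_row_mm2 = 6.0    # similar row of four blocks on the Wii U die

predicted_row_mm2 = rv770_row_mm2 * scale

print(f"ideal scale factor: {scale:.3f}")
print(f"predicted 40nm row: {predicted_row_mm2:.2f} mm^2")
print(f"measured Wii U row: {wii_u_row_mm2:.2f} mm^2")
```

The measured row comes out a little smaller than the ideal prediction, which is what "perfect (or perhaps slightly better than) scaling" refers to.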
 
I'd also like to congratulate the people on NeoGaf who actually did something constructive to increase our knowledge of what goes into the WiiU.

Personally, I hope that you simply get a discreet message from one of the many who actually know the specifics of the GPU and whose leak is covered by the die shot which can then be interpreted with remarkable accuracy (*cough*).
 
In this RV770 die shot a row of 80 shaders (arranged to the right of the TMUs in a 4 x 20 fashion) takes up ~13 mm^2:
...
In the Wii U die shot a similar row of four physical blocks of shaders takes up ~6 mm^2.

hm... I'm getting ~1.6-1.7mm^2 per shader block on rv770... and 1.48mm^2 in WiiU. [strike] I think you're off by 2x...[/strike]

*rv770 die shot is 600x589 pixels, shader block is ~ 62x38 -> 1.7mm^2/256mm^2

WiiU GPU is 3000x3098, shader block is 290x320 -> 1.49mm^2/150mm^2

It seems like it goes beyond ideal scaling, but they can probably get a bit denser with 40 shaders per block (not unlike Llano).
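
The pixel counting above as a small sketch, in case anyone wants to plug in their own measurements. It assumes each image is cropped exactly to the die, so the pixel fraction times die area gives block area. The ideal (40/55)^2 scaling factor is also an assumption; the pixel sizes and die areas are the ones quoted in this post.

```python
def block_area_mm2(block_px, die_px, die_mm2):
    """Block area from its pixel size, assuming the image spans the whole die."""
    block_w, block_h = block_px
    die_w, die_h = die_px
    return (block_w * block_h) / (die_w * die_h) * die_mm2


rv770_block = block_area_mm2((62, 38), (600, 589), 256.0)
wii_u_block = block_area_mm2((290, 320), (3000, 3098), 150.0)

# Assumed ideal shrink from 55nm to 40nm.
ideal_scale = (40 / 55) ** 2

# What a 20-shader and a 40-shader block would measure at 40nm, ideally.
ideal_20 = rv770_block * ideal_scale
ideal_40 = 2 * ideal_20

print(f"RV770 block: {rv770_block:.2f} mm^2")
print(f"Wii U block: {wii_u_block:.2f} mm^2")
print(f"Wii U vs ideal 20-shader block: {wii_u_block / ideal_20:.2f}x")
print(f"Wii U vs ideal 40-shader block: {wii_u_block / ideal_40:.2f}x")
```

With these inputs the Wii U block sits between the two ideal cases: well above an ideally shrunk 20-shader block, and somewhat below an ideally shrunk 40-shader block.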

----
 
So it's only a bit more than half of a mid-range RV730 from 2008, but with lower clock speed?
 