Wii U hardware discussion and investigation *rename

Status
Not open for further replies.
Sorry but that is not the correct method. If you want to estimate the ALU count then you should only measure the logic of a block without the memory (SRAM).

I used your illustration and the ALU area of the Brazos is ~28% bigger than "Latte".

So 1 Wii U ALU block is ~72% of 1 Brazos (Bobcat) ALU block, when using your scaling.

Even your scaling is not pointing to 160sp (if Latte is 40nm TSMC).

But is your scaling correct?:LOL:
wtf are you talking about. y is 28% bigger than x means x is 78% the size of y. Not only that but how does that not point to 20ALU?
 
Actually taking your posted image and assuming 228mm² for Llano (which makes your Wii U die size sit slightly on the low side) I get for one visible SIMD block:
Brazos top: 1.86 mm²
Brazos bottom: 1.83 mm² (and both numbers are already slightly on the generous side)
Wii U: 1.44 mm²

So two Wii U blocks are 2*1.44/1.85 - 1 = 56% larger than one Brazos block. Or the other way around, one Wii U block takes 22% less area than a Brazos block. The reason I originally arrived at just 16%-17% less area was that I assumed a 72mm² die size for Brazos (taken just from memory, I didn't look it up) and actually a slightly larger die size for the Wii U (your picture assumes ~145mm²), namely the 150mm² posted directly on the Chipworks site (I don't know exactly where the 146.48mm² comes from, which appears to be the most popular number lately). This exactly explains the difference from the numbers in my prior posts and is covered by the few percent of possible deviation I mentioned. I don't arrive at your numbers even taking your scaled images.
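
For anyone who wants to re-run the arithmetic, here is a minimal Python sketch of the two ratios. It uses only the block areas quoted above; the helper names are made up for illustration, and 1.85 mm² is simply the rounded Brazos figure used in the formula above.

```python
# Block areas in mm², as read off the scaled image in the post above.
BRAZOS_BLOCK = 1.85  # rounded value between the 1.86 (top) and 1.83 (bottom) blocks
LATTE_BLOCK = 1.44   # one visible Wii U ("Latte") SIMD block


def percent_larger(a, b):
    """Return by how many percent area a exceeds area b."""
    return (a / b - 1.0) * 100.0


def percent_smaller(a, b):
    """Return by how many percent area a falls short of area b."""
    return (1.0 - a / b) * 100.0


# Two Latte blocks compared with one Brazos block.
two_latte_vs_brazos = percent_larger(2 * LATTE_BLOCK, BRAZOS_BLOCK)

# One Latte block compared with one Brazos block.
one_latte_vs_brazos = percent_smaller(LATTE_BLOCK, BRAZOS_BLOCK)

print(f"two Latte blocks are {two_latte_vs_brazos:.1f}% larger than one Brazos block")
print(f"one Latte block is {one_latte_vs_brazos:.1f}% smaller than one Brazos block")
```

Note that "x% larger" and "x% smaller" are not symmetric, which is why the two helpers are kept separate.
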
Llano does not really matter for that comparison. The die area I used for Brazos is 75mm². The area used for the Wii U was the number from NeoGAF. If you check my post history you can see the numbers for the calculations and the size scaling yourself.
 
wtf are you talking about. y is 28% bigger than x means x is 78% the size of y. Not only that but how does that not point to 20ALU?

My mistake. The 28% was a mathematical error of mine.

The important thing is that 1 Wii U ALU block is ~72% of 1 Brazos(Bobcat) ALU block, when using your scaling.
 
Llano does not really matter for that comparison. The die area I used for Brazos is 75mm². The area used for the Wii U was the number from NeoGAF. If you check my post history you can see the numbers for the calculations and the size scaling yourself.
The Llano number just provides a reference to pin absolute numbers on the sizes. The relations between them don't depend on it at all (those are determined by the scaling of the Latte and Brazos images, but as I said I just used your image). It also does not change that your numbers don't work out. Even in your scaled picture two ALU blocks of "Latte" take way more area than just 30% more than a Brazos ALU block. It's about 56% in your image, which makes a Latte ALU block ~22% smaller than a Brazos block.
 
y is 28% bigger than x means x is 78% the size of y. Not only that but how does that not point to 20ALU?
28% (the growth needed to get from a Latte ALU block [supposedly supporting only DX SM4] to a Brazos one [supporting DX SM5]) is smaller than 56% (the growth you assume for subtracting SM5 support and the other enhancements Evergreen got compared to the R700 generation to stay compatible with just 160 SPs). Reason enough? :LOL:
 
Hmmm, I seem to have miscalculated or mis-scaled the sizes in my original calculations. It seems like 2 Wii U blocks are 50% larger than 1 Brazos block. That still does not make up for the density, which would need to increase by 33% on the same node for the Wii U to have 40 SPs. Still way, way too much density to be within the realm of possibility.
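
A rough sketch of where a 33% figure can come from. The assumption here (mine, for illustration) is that "40 SPs" means 40 SPs per Wii U block, i.e. each Wii U block would have to hold as many SPs as one Brazos block. The function name is hypothetical.

```python
def required_density_gain(two_blocks_area_ratio):
    """Fractional density increase needed if each Latte block had to hold
    as many SPs as one Brazos block.

    two_blocks_area_ratio: area of two Latte blocks divided by the area
    of one Brazos block (1.5 means "50% larger").
    """
    sp_ratio = 2.0  # assumption: two Latte blocks carry twice the SPs of one Brazos block
    return sp_ratio / two_blocks_area_ratio - 1.0


gain_at_50_percent = required_density_gain(1.50)  # the 50% figure from this post
gain_at_56_percent = required_density_gain(1.56)  # the 56% figure measured earlier in the thread

print(f"needed density gain at 50% larger: {gain_at_50_percent:.0%}")
print(f"needed density gain at 56% larger: {gain_at_56_percent:.0%}")
```

Under that assumption the required gain is about a third with the 50% figure and somewhat less with the 56% figure, so the conclusion is sensitive to which area ratio you trust.
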


Anyways, if 55nm is still in the air then it will fit in perfectly. If not then nothing really fits from what I can tell.
 
Looks like the CPU die photos are going to be released soon. Might be a good idea to start a new thread on the CPU so we can keep this one about the GPU.
 

Not trying to be rude, but we should keep all hardware discussion pertaining to the Wii U within this thread. Definitely want to give much thanks to Chipworks for providing the CPU die photo free of charge.

I don't think the CPU is as mysterious as the GPU is, however. For example:

http://forum.beyond3d.com/showpost.php?p=1571368&postcount=43

Espresso said:
It's not a Power 7 derivative. It's directly descended from the CPU core in the Wii, there are just more of them and they are clocked a little faster. It does come up about the same as Xenon for processing power, but the clock is much, much closer to Wii than X360.

This 'Espresso' knew what the CPU was back in August, 2011. A follow-up post from them even mentioned the cache.
 
To get back to the textures and bandwidth a bit, how much bandwidth is saved by being able to use compressed textures? On PC I noticed that no matter how relatively crappy my GPU was (I tried a number of low-profile cards before I got my new PC) I could basically always enable high-res textures without incurring a framerate hit. I could imagine that on 360, textures are probably the main bandwidth consumer?
 

I guess they'd reuse S3TC, so DXT1 = 64 pixels/32 bytes. I'd guess they have a scheme for normal maps too, which is probably something like 32 pixels/32 bytes.
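
To put numbers on the savings, here is a small Python sketch. The two pixel/byte figures are the ones from this post; the 32-bit uncompressed baseline and the 1024x1024 example texture are assumptions picked for illustration, and mipmaps are ignored.

```python
UNCOMPRESSED_BPP = 32.0  # assumption: plain 8-bit-per-channel RGBA as the baseline


def bits_per_pixel(pixels, num_bytes):
    """Storage cost in bits per pixel for a format that packs
    `pixels` pixels into `num_bytes` bytes."""
    return num_bytes * 8 / pixels


def texture_bytes(width, height, bpp):
    """Size in bytes of a single mip level at the given bits per pixel."""
    return width * height * bpp / 8


dxt1_bpp = bits_per_pixel(64, 32)        # DXT1 figure from the post
normal_map_bpp = bits_per_pixel(32, 32)  # guessed normal map scheme from the post

dxt1_ratio = UNCOMPRESSED_BPP / dxt1_bpp
normal_map_ratio = UNCOMPRESSED_BPP / normal_map_bpp

print(f"DXT1: {dxt1_bpp} bpp, {dxt1_ratio}:1 vs 32-bit")
print(f"normal maps: {normal_map_bpp} bpp, {normal_map_ratio}:1 vs 32-bit")
print(f"1024x1024 DXT1: {texture_bytes(1024, 1024, dxt1_bpp) / 1024} KiB")
print(f"1024x1024 RGBA8: {texture_bytes(1024, 1024, UNCOMPRESSED_BPP) / 1024} KiB")
```

The same ratio applies to the data fetched from memory on a cache miss, since the texture stays compressed in memory and is decoded on the GPU.
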
 

That's a very interesting point. The FPS hit from increasing texture quality in PC gaming on a low-end card is often negligible compared to other parameters like shadow quality/depth, particle density or even enabling SSAO options.

Why is this so? (In regards to texture quality, I mean.)

Do PC graphics cards have superior texture format management/compression methods that consoles have no access to as part of the DirectX API?

Textures consume memory bandwidth, and yet cards with minuscule 64/128-bit interfaces don't suffer much of a hit when the quality is upped.
 
I remember that on Quake 3, 10 years ago, compressed textures were a nice means of gaining free framerate, even if only something like 5% or more. You would hardly see a quality difference, except if you had that old nvidia bug that made the sky look weird.

lol, I had the Voodoo5 and on this card you had to tweak your options well to get something nice looking but not too slow. special 16bit rendering (that looks like 22bit, in practice you could play with 16bit and have good looking smoke), 32bit textures with compression.
 
http://www.reedbeta.com/blog/2012/02/12/understanding-bcn-texture-compression-formats/

May be interesting as a background for discussion here as well. I certainly thought it was enlightening. DX11 seems to have some good improvements but they don't seem to be used a lot yet, and I vaguely remember some differences in DXT1-5 support between 360 and PS3 but also that with some work it didn't matter much. Suggests there probably aren't big differences for Wii U vs 360? Perhaps only better compression support for various stuff like height and other map types? In which case I have no idea of the impact for that.
 
Anyways, if 55nm is still in the air then it will fit in perfectly. If not then nothing really fits from what I can tell.

It isn't.
The eDRAM on the WiiU GPU is too dense.
If it had been done on IBM's 45nm SOI it would have been significantly larger. Renesas 40nm eDRAM cells are roughly the same density (cell sizes: Renesas 0.06 µm², IBM 0.067 µm², TSMC 0.0583 µm²).

Cell size numbers are usually from the first public presentation, and actual production silicon is of course a better indicator of true density. Time does allow refinement, and tailoring to a specific application can also yield benefits (TSMC has a denser variety of eDRAM suitable for mobile applications and lower clocks, for instance). In fact, such refinement/customization is necessary in order to achieve the density of the WiiU GPU eDRAM at 40nm. I know of no example of eDRAM at 55nm, anywhere, at any speed, that could allow the eDRAM density of the WiiU.

So, even assuming 40nm, the eDRAM density is really very high. Whether we can generalize from this and assume that they have been able to achieve good density in other areas is, of course, an open question. Doesn't seem implausible though.
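
As a quick sanity check on those cell sizes, here is a Python sketch that converts them into raw cell density. This is cell area only: sense amplifiers, decoders, redundancy and wiring are ignored, so real macro density will be noticeably lower. The dictionary and function names are just for illustration.

```python
# Published cell sizes in µm² as quoted in the post above.
CELL_SIZE_UM2 = {
    "Renesas": 0.06,
    "IBM": 0.067,
    "TSMC": 0.0583,
}

UM2_PER_MM2 = 1_000_000  # 1 mm² = 10^6 µm²


def raw_mbit_per_mm2(cell_um2):
    """Raw bit density in Mbit/mm² from the cell size alone (no array overhead)."""
    bits_per_mm2 = UM2_PER_MM2 / cell_um2
    return bits_per_mm2 / 1_000_000


# Cell size of each process relative to the TSMC cell.
relative_to_tsmc = {
    name: size / CELL_SIZE_UM2["TSMC"] for name, size in CELL_SIZE_UM2.items()
}

for name, size in CELL_SIZE_UM2.items():
    print(
        f"{name}: {raw_mbit_per_mm2(size):.2f} Mbit/mm² raw, "
        f"cell {relative_to_tsmc[name]:.2f}x the TSMC cell"
    )
```

On these first-presentation numbers the three processes sit within about 15% of each other, which is why the observed density matters more than the quoted cell size.
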
 
Maybe it's actually 32nm? ;)

Nah, Chipworks says it's an advanced 40nm TSMC process. No idea what "advanced" is supposed to mean. TSMC 40LPG, maybe?
 
That's a very interesting point. The FPS hit from increasing texture quality in PC gaming on a low-end card is often negligible compared to other parameters like shadow quality/depth, particle density or even enabling SSAO options.

Why is this so? (In regards to texture quality, I mean.)

Do PC graphics cards have superior texture format management/compression methods that consoles have no access to as part of the DirectX API?

Textures consume memory bandwidth, and yet cards with minuscule 64/128-bit interfaces don't suffer much of a hit when the quality is upped.

Well there is no magic, so obviously in those cases the texel rate is still within the limits. Some points to consider:
- Are we talking about a crappy GPU or crappy memory bandwidth?
- Do pixel programs fetch texels for every operation they execute?
- [speculative and short cornered] If the cache line size is 64 bytes as stated before, a single texel fetch results in 64 bytes being loaded (EDIT: in case of a cache miss). Therefore, the requirement to single-texture a complete 720p screen is about 60MB, regardless of the texture resolution used.
- Dynamic shadow mapping requires rendering the map first, so higher resolution results in a bigger load on the ROPs, which in turn are the limiting factor.

About the 64/128-bit buses, I noticed the same. Either the clock speeds must be very high, or it has multiple 64/128-bit buses and multiple memory banks (discussed before).
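
The ~60MB figure in the cache line point can be reproduced with a few lines of Python. This is a worst-case sketch in which every texel fetch misses the cache and pulls in a full line; in practice neighbouring pixels usually share lines, so real traffic should be lower. The 60 fps frame rate is an assumption added purely for illustration.

```python
CACHE_LINE_BYTES = 64     # cache line size mentioned in the post
WIDTH, HEIGHT = 1280, 720  # 720p render target


def worst_case_bytes_per_frame(width, height, line_bytes, fetches_per_pixel=1):
    """Bytes loaded per frame if every texel fetch misses and loads a whole line."""
    return width * height * fetches_per_pixel * line_bytes


frame_bytes = worst_case_bytes_per_frame(WIDTH, HEIGHT, CACHE_LINE_BYTES)
frame_mb = frame_bytes / 1e6

ASSUMED_FPS = 60  # assumption, not from the post
bandwidth_gb_per_s = frame_bytes * ASSUMED_FPS / 1e9

print(f"worst case per frame: {frame_mb:.1f} MB")
print(f"worst case at {ASSUMED_FPS} fps: {bandwidth_gb_per_s:.2f} GB/s")
```

The texture resolution never appears in the calculation, which is the point being made: in this model the cost depends on the number of fetches, not on how big the texture is.
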
 
So... has NFS:MW even been mentioned here, or is it only because of the extra RAM?

EDIT: Someone here said that Aliens:CM should be the last benchmark used, and now with that looking like shit even on PC and most likely cancelled for Wii U, I think this should take its place. (And with Wii U's sales, it might end up being the console's final notable third-party game anyway lol.)
 