Wii U hardware discussion and investigation *rename

I don't think so.

Renesas mentioned that the Wii U's eDRAM uses the latest technology they have, and that is 8192-bit eDRAM at 40nm and 1.1V... On their official website it is still listed as "under development", yet it is used in the Wii U, so we can conclude that the Wii U is being used as a lab rat and Nintendo agreed to it.

http://www.renesas.com/products/soc/asic/cbic/ipcore/edram/

http://hdwarriors.com/wii-u-specs-f...gpu-several-generations-ahead-of-current-gen/

http://hdwarriors.com/general-impression-of-wii-u-edram-explained-by-shinen/

http://hdwarriors.com/wii-u-has-a-lot-of-power-to-unleash-power-for-years-to-come/

Adaptive tessellation (DirectX 11 / OpenGL 4.3?)
http://hdwarriors.com/shinen-on-the-practical-use-of-adaptive-tessellation-upcoming-games/

You may not believe it, yet the Xbox 360 was the first to have 4096-bit eDRAM back in 2005, and now, 7-8 years later, most fabs can achieve it. If it is 4096 bits for the Wii U, then the bandwidth would be 281GB/s...

So the Wii U's eDRAM being 8192 bits wide is not far-fetched at all; it is very realistic, especially since Renesas has been developing eDRAM technology since 2010, and Nintendo leans towards Japanese companies, so they could have funded the research/development of eDRAM at Renesas.
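
A quick back-of-the-envelope check of those numbers (a sketch only; it assumes Latte's widely reported ~550MHz clock, and the bus widths themselves are pure speculation):

```python
# Peak bandwidth for a given eDRAM bus width and clock (speculative inputs).

def edram_bandwidth_gbps(bus_width_bits: int, clock_mhz: float) -> float:
    bytes_per_clock = bus_width_bits / 8
    return bytes_per_clock * clock_mhz * 1e6 / 1e9

print(edram_bandwidth_gbps(4096, 550))  # 281.6 GB/s (the 4096-bit case)
print(edram_bandwidth_gbps(8192, 550))  # 563.2 GB/s (the 8192-bit claim)
```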
 
So WiiU's eDRAM has more than twice the bandwidth of Xbox One's eSRAM? :oops:

Yes... Xbox One's eSRAM has much lower latency/higher speed because it is eSRAM and because it is 28nm, yet it is more expensive than all of the eDRAM in the Wii U...

Why would Nintendo want so much BW? Makes zero sense to have such a wide bus.

http://hdwarriors.com/general-impression-of-wii-u-edram-explained-by-shinen/

"In general, development for Wii U CPU+GPU is simple. You don’t need complicated setups or workarounds for things like HDR(High Dynamic Range)or linear RGB (Color Modeling). What we also like is that there are plenty of possibilities for speeding up your rendering and code, but you don’t have to dig deep for them to get proper performance."


We know that both Espresso and Latte have eDRAM, and Shin'en hints that Espresso can access Latte's eDRAM and (I think) vice versa.


The die shot of the Wii U's GPU suggests wiring, and that means there could be a wire connecting the CPU and GPU, and also the RAM; the Wii U is an MCM...
 
Bollocks. That's all I have to say. Pure fantasies from beginning to end.

Tell us the one about Goldilocks too while you're at it...

Edit: btw, Xenos did NOT have 4kbit-wide eDRAM, don't be ridiculous. On-board eDRAM bit width was 512-bit IIRC, off-die was a quarter of that.
 
Bollocks. That's all I have to say. Pure fantasies from beginning to end.

Tell us the one about Goldilocks too while you're at it...

This is a pure guess, but are you a local troll? :oops:

Or do you have some agenda and/or bias towards some company, e.g. loving company A but hating company B? lol :rolleyes:
 
I have a bias against raving lunatics who make shit up and call it fact. So to continue the fairy tale analogy, if the shoe fits...
 
I have a bias against raving lunatics who make shit up and call it fact. So to continue the fairy tale analogy, if the shoe fits...

You only responded/made a reply because you don't like the information, because it does not fit your bias/agenda/interests... :devilish:

So you are going to taunt and insult people who don't share your opinion, expectations, bias, agenda and interests? Wow... There goes my hope for humanity, out of a window from the roof of a hundred-floor building... :cry:

I guess it was a fairy tale too for the Xbox 360 and its 4096-bit 10MB eDRAM in 2005... Nowadays almost every fab can do 4096-bit eDRAM, and some can do 8192-bit eDRAM, like Renesas, which specialised in eDRAM; Intel can do that also, if I'm correct... Right?
 
I think he means you have to show some shred of evidence and not just your tin-foil hat conspiracy theories.

Besides, the Xbox 360 has 10 MB of eDRAM with an internal bandwidth of 256GB/s. It was only 32GB/s between the eDRAM and the GPU.
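
For reference, converting those two figures into bus widths at the commonly cited 500MHz Xenos clock (a sketch; the clock is an assumption on my part):

```python
# Bus width implied by a bandwidth figure at a given clock.

XENOS_CLOCK_HZ = 500e6  # commonly cited Xenos/eDRAM clock (assumed here)

def bus_width_bits(bandwidth_gbps: float, clock_hz: float = XENOS_CLOCK_HZ) -> float:
    bytes_per_clock = bandwidth_gbps * 1e9 / clock_hz
    return bytes_per_clock * 8

print(bus_width_bits(256))  # 4096.0 bits, internal to the daughter die
print(bus_width_bits(32))   # 512.0 bits, between the GPU die and the eDRAM die
```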
 
I guess it was a fairy tale too for the Xbox 360 and its 4096-bit 10MB eDRAM in 2005...
That's untrue. Check your facts. Although Grall is harsh, your reasoning is completely out there, and you present it as fact instead of a theory. Quote:
Latte's bandwidth is 563.2GB/s thanks to 8192-bit 32MB eDRAM
No speculation or discussion, but an outright, unsubstantiated, unrealistic claim.

You present little evidence either, beyond spurious links - an eDRAM page that doesn't mention the product you are discussing, and some comments from Wii U game developers telling potential consumers that Wii U is great. That there's supposedly lots of easily tapped power is not evidence of any particular technical feature, so even if those devs' comments are true, they don't point to there being massively wide eDRAM.

Posters like yourself generally get removed from the board pretty quickly. I'm half tempted to evict you now and remove noise, but as there's nothing else to say about Wii U, I'll give you a chance. If you want to contribute at B3D's level, present your theory and the reasoning behind it in a coherent fashion.

"It is my belief that eDRAM..."

Actually, sod it. You can't manage that. You'll just post fanboy quotes from devs. If someone else wants to champion this theory, go ahead, but we don't want another crazy poster. My values of giving people chances need to be tempered with a bit more common sense and realism.

Plus it appears you're an already banned member with a puppet account to try your hand at evangelising once again. Good-bye.
 
btw, Xenos did NOT have 4kbit-wide eDRAM, don't be ridiculous. On-board eDRAM bit width was 512-bit IIRC, off-die was a quarter of that.
Off-die (between the main GPU die and the eDRAM die with the ROPs) it was 32GB/s (64 bytes per clock = 512 bits); on-die internal bandwidth to the ROPs was indeed stated as 256GB/s (under select circumstances), equating to an aggregate width of 4kbit. The 8 ROPs are supposed to be capable of full-speed blending into a 32-bit color target with 4xMSAA. That alone (without Z updates) requires 256 bytes/clock (2kbit). With Z updates (taking a 32-bit Z/stencil buffer) it is double that. And even with Z only, it is supposed to do up to 64 Z updates per clock (with 4xMSAA), which then theoretically ends up at the 512 bytes per clock (4kbit) of bandwidth needed for that.
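
To make the arithmetic explicit, here's a rough sketch of the per-clock traffic those rates imply (the ROP count and formats are as stated above; treating blends and Z updates as a read plus a write of the destination is my assumption):

```python
# Per-clock eDRAM traffic implied by Xenos's stated ROP rates (a sketch,
# not official figures). 8 ROPs, 4xMSAA, 32-bit color, 32-bit Z/stencil;
# blends and Z updates are assumed to read the destination and write it back.

ROPS, MSAA = 8, 4
COLOR_BYTES = Z_BYTES = 4  # 32-bit color, 32-bit Z/stencil

color_blend = ROPS * MSAA * COLOR_BYTES * 2             # 256 B/clk (2 kbit)
color_plus_z = color_blend + ROPS * MSAA * Z_BYTES * 2  # 512 B/clk (4 kbit)
z_only = 64 * Z_BYTES * 2                 # 64 Z updates/clk -> 512 B/clk (4 kbit)

for name, b in [("color+blend", color_blend),
                ("color+Z", color_plus_z),
                ("Z only", z_only)]:
    print(f"{name}: {b} bytes/clock = {8 * b} bits wide")
```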
 
According to a WiiU render coder, the 160 SP rumour is wrong; he said it has 192 shader units. Not a big difference of course, and still pretty low.
 
I love this thread. Every so often the hordes of crazies breach the walls and swarm the courtyard with tales of 320 SPs and 8192-bit bus widths, only to be fought off by the mods and the enraged locals.

Comedy. Gold.

I can't really add anything to the debate, and I'm so OT it's not even true, but I just cannot understand the mentality of Nintendo fanbois rowing over power when Nintendo themselves haven't fought on that field since the GCN days; for them it really is all about the games.
 
According to a WiiU render coder, the 160 SP rumour is wrong; he said it has 192 shader units.
I'm curious, what was counted there to arrive at 192?
 
I wonder how 192 could fit with the visual evidence shown by the (annotated) acid-etched die photo that was published quite a while back now.
 
Not sure where they got that info from, but the WiiU GPU has 192 shader units, not 160. It also has 32MB of eDRAM (the same amount as Xbox One), so comparing just the number of shader units against a PC card doesn't give a representative performance comparison. On the CPU side, WiiU also supports multi-threaded rendering that scales perfectly with the number of cores you throw at it, unlike PC DX11 deferred contexts which don't scale very well. The current WiiU build runs around 18-25fps with 5 AI with all post (FXAA/motion blur etc.) enabled, which is fairly good given only the fairly cursory optimisation pass that it's had.
Hard numbers finally. My life is now complete.
 
192 shaders would imply that it's not VLIW5, even though that's what was rumored to be the starting point for the GPU. You have 8 groups of shaders on the die, for 24 shaders apiece? That's really unlike any AMD GPU.

Perhaps two of those groups are redundancies and it only has 6 groups of 32 shaders.
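
As a quick sanity check, here's which of the claimed ALU counts divide evenly into an R700-style VLIW5 layout, assuming the usual SIMD of 16 VLIW units x 5 ALUs (the counts themselves are the rumored figures, not confirmed):

```python
# Which claimed ALU totals fit SIMDs of 16 VLIW5 units (80 ALUs per SIMD)?

ALUS_PER_SIMD = 16 * 5  # standard R700 SIMD organization (assumed)

for total in (160, 192, 320):
    simds, rem = divmod(total, ALUS_PER_SIMD)
    verdict = "fits" if rem == 0 else "does not fit"
    print(f"{total} ALUs: {verdict} VLIW5 ({simds} full SIMDs, remainder {rem})")
```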
 
192 shaders would imply that it's not VLIW5, even though that's what was rumored to be the starting point for the GPU. You have 8 groups of shaders on the die, for 24 shaders apiece? That's really unlike any AMD GPU.

Perhaps two of those groups are redundancies and it only has 6 groups of 32 shaders.

Bgassassin said over on GAF that 192 is the number of threads, not shaders:

GPU7 has a total of 192 threads which must be divided across all shader types

He also wrote:

the docs list Latte as having 32 ALUs and is a VLIW5 architecture

Source thread:
http://www.neogaf.com/forum/showthread.php?p=89468599

Meaning it has to be 5x32 = 160 shaders. HOWEVER, there was something more interesting later in that thread, which has now been removed (NDA-breaking stuff), which might explain why the SIMD blocks are larger than they should be at 40nm. They wrote that each of the 32 "parts" contains 5 ALUs and 4 GPRs. Isn't 4 general purpose registers something like 64KB? That would mean the GPU has 2MB in registers when it should only have 512KB. Unless I'm wrong (and I could be). There was also this earlier inside leak:

http://beyond3d.com/showpost.php?p=1668212&postcount=2552

about the GPU... it is modeled on the R700 series, but it has significantly more GPRs. However, it seems to have fewer GPRs than the E6760, so... make your own conclusions

It could also explain the R740 and R770 rumors. They also had 2MB of registers because they had 640 total shaders. I noticed this last night and figured somebody else would have picked up on it. Since they didn't, I thought I'd register and post it here. What would be the benefit of such a design?
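
For what it's worth, here's the arithmetic behind those register figures, assuming the usual R700 baseline of 256KB of GPRs per 80-ALU SIMD and reading the leak's "4 GPRs" as four 16KB register banks per VLIW unit (both readings are my assumptions):

```python
# Register-file arithmetic behind the 2MB claim (speculative readings of a leak).

KB = 1024
BANK = 16 * KB  # assumed size of one GPR bank per VLIW unit

parts = 32                                # VLIW5 units claimed (32 x 5 = 160 ALUs)
leak_total = parts * 4 * BANK             # 4 banks per part -> 2048 KB = 2MB
r700_baseline = (160 // 80) * 256 * KB    # normal ratio for 160 ALUs -> 512 KB
rv740_style = (640 // 80) * 256 * KB      # 640 ALUs at the normal ratio -> 2048 KB

print(leak_total // KB, r700_baseline // KB, rv740_style // KB)  # 2048 512 2048
```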
 
Off-die (between the main GPU die and the eDRAM die with the ROPs) it was 32GB/s (64 bytes per clock = 512 bits); on-die internal bandwidth to the ROPs was indeed stated as 256GB/s (under select circumstances), equating to an aggregate width of 4kbit. The 8 ROPs are supposed to be capable of full-speed blending into a 32-bit color target with 4xMSAA. That alone (without Z updates) requires 256 bytes/clock (2kbit). With Z updates (taking a 32-bit Z/stencil buffer) it is double that. And even with Z only, it is supposed to do up to 64 Z updates per clock (with 4xMSAA), which then theoretically ends up at the 512 bytes per clock (4kbit) of bandwidth needed for that.

I remember reading that Xenos' daughter die was clocked at 2GHz to achieve those rates. That would then only require a 128-bit connection to the main GPU and 1024 bits internally to the ROPs. I think that's a more realistic setup.

Here is a link from this forum: http://beyond3d.com/showthread.php?t=20082
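
A quick check of how the bus widths fall out at that clock (a sketch; the 2GHz figure is from memory, as said above):

```python
# Bus widths needed for the same bandwidth figures if the interface ran at 2 GHz.

def width_bits(bandwidth_gbps: float, clock_hz: float) -> float:
    return bandwidth_gbps * 1e9 / clock_hz * 8

print(width_bits(256, 2e9))  # 1024.0 bits internal to the ROPs
print(width_bits(32, 2e9))   # 128.0-bit link to the main GPU die
```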
 