Wii U hardware discussion and investigation *rename

You probably COULD, but it wouldn't make sense as DDR3 and such are designed as off-chip interfaces, to tolerate use with memory board add-in slots (DIMM, SODIMM etc) and so on.

To have eDRAM and NOT have massive on-chip bandwidth would be completely illogical, as the whole - in fact ONLY - point to put DRAM straight on the chip is to provide large amounts of bandwidth.

Actually I disagree - I think another, more likely reason to use edram is to reduce costs. MS did this with the 360 - they clearly wanted to avoid a 256-bit bus and 8 memory chips for the life of the console - but they did so in a way that also gave them performance advantages.

The WiiU is a low performance device. It performs worse - significantly worse - than AMD SoCs with a 128-bit bus and DDR3 1600.

I don't think the WiiU's edram is about performance - nothing about the WiiU at all says "performance" - I think it's about cost. Cost to manufacture, and cost to research and develop. Nintendo don't want to be left paying for more DDR3 than they have to in 5 years when the price has quadrupled.

To have a (presumably large) chunk of eDRAM with piddly bandwidth would not be a help, but rather a hindrance, as instead of a big, expensive, fast pool of memory you'd have a big, expensive, SLOW pool of memory. That cost could have been sunk into something else that would have provided a better return on investment.

The cost argument says go with edram and a half size bus, IMO.
 
Actually I disagree - I think another, more likely reason is to use edram is to reduce costs.
Well, it's a cost/performance balance. But if Nintendo are getting only 25 GB/s total BW, why didn't they use a conventional solution? PS3 is available with GDDR3+XDR for well under £200. A single pool of 128 bit GDDR3 would have sufficed using commodity parts.
 
Function, the situation you have laid out provides no benefit to even implementing a solution that includes eDRAM.

This has been outlined above and is why I said no to your original post.
 
Well, it's a cost/performance balance. But if Nintendo are getting only 25 GB/s total BW, why didn't they use a conventional solution? PS3 is available with GDDR3+XDR for well under £200. A single pool of 128 bit GDDR3 would have sufficed using commodity parts.

A single pool of GDDR3 is likely to cost a lot more than the DDR3. Five years from now DDR3 will cost several times what it does now (if it follows DDR2 and DDR1 pricing as volume dropped) and GDDR3 is likely to cost even more than that. GDDR3 would be a bad choice to put in a console launching now.

And even using DDR3 on a 128-bit bus would force you to use 8 memory chips for the life of the machine, and would cause additional costs and complications for that little motherboard (clamshell would only give you a 64-bit bus). It would probably rule out any possibility of a 28nm shrink too, should you ever want that.
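
To put rough numbers on the chip-count and bandwidth trade-off (a quick Python back-of-the-envelope; DDR3-1600 and x16 chips are assumptions on my part, not confirmed specs):

Code:
# Peak DDR3 bandwidth = bus width (bytes) x transfer rate; chip count assumes x16 devices
transfer_rate = 1600e6                  # DDR3-1600: 1600 MT/s per pin (assumed)
for bus_bits in (64, 128):
    chips = bus_bits // 16              # number of x16 DDR3 chips needed to fill the bus
    gbps = (bus_bits / 8) * transfer_rate / 1e9
    print(f"{bus_bits}-bit bus: {chips} chips, {gbps:.1f} GB/s peak")
# 64-bit bus:  4 chips, 12.8 GB/s peak
# 128-bit bus: 8 chips, 25.6 GB/s peak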

On the other hand, 30 mm^2 of silicon on an old process like 40nm is going to be pretty affordable now and only get cheaper and cheaper over the years.
 
Well, it's a cost/performance balance. But if Nintendo are getting only 25 GB/s total BW, why didn't they use a conventional solution? PS3 is available with GDDR3+XDR for well under £200. A single pool of 128 bit GDDR3 would have sufficed using commodity parts.

It's even worse when you can find sub-£35 graphics cards on PC that have 2GB of GDDR3 and a 128-bit bus...

What were Nintendo thinking....
 
It's even worse when you can find sub-£35 graphics cards on PC that have 2GB of GDDR3 and a 128-bit bus...

What were Nintendo thinking....

They were probably thinking they'd like to be able to sell a WiiU Mini for $99 in five years time, and make a profit on it.

PC graphics card manufacturers ditch a memory type when it gets slow or expensive; console vendors are stuck with it, like Sony and their XDR. And GDDR3 for that matter.

I'm trying to find more info on wafer costs, but so far even on 28nm a small amount of edram looks good compared to buying obsolete memory and soldering it to your motherboard.
 
You have a fair argument there. Really disappointing if true. Wuu may not even have a BW advantage from a more flexible eDRAM design than XB360. The only areas it'll compete with the older, cheaper consoles are:

1) more RAM
2) more modern GPU architecture
 
You have a fair argument there. Really disappointing if true. Wuu may not even have a BW advantage from a more flexible eDRAM design than XB360. The only areas it'll compete with the older, cheaper consoles are:

1) more RAM
2) more modern GPU architecture

I could be completely wrong of course. :D But given the performance level of the WiiU and the areas where it seems to struggle on the GPU side (admittedly this is early days) I can't help drawing comparisons to much faster SoCs with their "puny" 128-bit DDR3 memory buses.

There's something I half remember reading about AMD Phenom memory controllers - ganged vs unganged memory. I think you could set the MC to access DIMMs independently, by each 64-bit channel. Slower for some things, faster for others. Putting the edram on the end of one channel (with lower latencies) and the DDR3 on the other might be a quick and dirty way of getting your APU-level bandwidth but without the same long-term exposure to costs. And like the CPU - which has surprised everyone with its lack of performance and evolution - it might save a lot on R&D time and money.

I've no proof though, beyond the apparent contradiction of on-GPU edram and the possible sub APU/PS3 level performance in rendering bandwidth constrained bits of games.
 
Could there be a connection between the performance degradation in CoD:BO2 vs XBox 360 and the fact that it uses 2xMSAA where no other Wii U game is using it (or is this part true)?

Xenos' 8 ROPs, built into the eDRAM, can do 4x MSAA in one cycle. With 32-bit color and 32-bit depth/stencil over 32 samples (or 64 depth-only) we're looking at a really, really wide internal datapath of 2048 bits after the ROPs, or 128GB/s at 500MHz. With only 2xMSAA you would need 64GB/s. But this still leaves a huge range above the main RAM bandwidth of 17GB/s where you could still be bandwidth limited on fill.
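
Just to spell that arithmetic out (the 2048-bit width and 500 MHz clock are the figures from the paragraph above):

Code:
# Xenos post-ROP internal datapath into the eDRAM, per the figures above
datapath_bits = 2048        # 32 samples x (32-bit color + 32-bit depth/stencil)
clock_hz = 500e6            # Xenos eDRAM clock
bw_4x = (datapath_bits / 8) * clock_hz / 1e9
print(bw_4x)                # 128.0 GB/s worth of samples at 4xMSAA
print(bw_4x / 2)            # 64.0 GB/s needed for the 2xMSAA case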

So given a scenario where the external interface on Wii U's eDRAM is just as fast as Xenos's, that gives you 32GB/s (it could be that, like on Xenos, Nintendo went with a synchronous core and RAM clock, which would give you a slightly higher 35.2GB/s). Now let's say that this time around you can texture from eDRAM. With Wii U's limited main RAM bandwidth this is desirable. In fact, it could be entirely possible that you must texture from eDRAM, which AFAICR was the case in Flipper/Hollywood, which would make a big bandwidth hit on it unavoidable. In this case it's easy to see how Wii U's GPU eDRAM bandwidth could end up a limiter long before Xenos's, while still providing tremendously more than what main RAM alone does.

The FXAA in games like Arkham City is probably done in GPGPU; I don't think they'd be able to work that into the already overburdened CPU and main RAM load. This suggests at least a possibility of being able to texture from eDRAM. I don't think they'd add it if it hurt framerates further on a game that is already compromised; in this case the GPU probably had enough spare fillrate and ALU power but there could be additional overhead introduced in the extra resolve plus texturing bandwidth from main RAM that'd be involved if it can't texture from eDRAM.

If Wii U's GPU eDRAM were as capable as Xenos's then we should be seeing at least 2xMSAA on ports that didn't have it on XBox 360; it'd be free given that Wii U's GPU has more than 2x the eDRAM.
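
For a rough sense of scale (the 32 MB figure for Wii U's eDRAM is only the rumoured size, and the layout below is just one plausible framebuffer format):

Code:
# Approximate 720p 2xMSAA framebuffer footprint: 32-bit color + 32-bit depth/stencil per sample
pixels = 1280 * 720
samples = 2
bytes_per_sample = 4 + 4
footprint_mib = pixels * samples * bytes_per_sample / 2**20
print(f"{footprint_mib:.1f} MiB")
# ~14.1 MiB: needs tiling to fit Xenos's 10 MB, but would fit comfortably in a rumoured 32 MB pool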
 
From the guy from Polygon/The Verge (on NeoGAF), replying to a guy who said that next year's first-party Wii U games will look just as good as, if not better than, PS4 / 720 exclusive launch games:

Nope. You might like the style of the Wii U games Nintendo shows next year, which is totally fine, because plenty of people preferred, say, Mario Galaxy, or even Twilight Princess to what was happening in games like Gears or Halo or Uncharted. But the 360/PS3 titles were clearly technically superior by a very wide margin. This will be repeated with the next Xbox and PS4.

I've spent the last year talking to publishers and developers off the record. They've been waiting for Durango in particular. There are things they want to do that they haven't been able to pull off, and many feel that the shrinkage in the market is due to hardware stagnation. The big guys and popular developers that haven't already gone iOS are hitching their wagons to Durango in a big, big way. Which terrifies them, by the way. They are not in any way sure that it's going to turn things around, business wise.

I'm not saying the next Playstation isn't part of the equation, but they've seemed much less clear as to what it will even be. Granted, I haven't talked to everyone, obviously.
 
Could there be a connection between the performance degradation in CoD:BO2 vs XBox 360 and the fact that it uses 2xMSAA where no other Wii U game is using it (or is this part true)?

This is one of the things I was thinking. BO2 might be "sub HD" but at 60fps it's pushing significantly more pixels per second than a comparable 1280 x 720 30fps version would, plus there's the MSAA thing as you say. Along with explosions and other effects generating transparencies it would seem a good place to look for bandwidth issues.

Xenos' 8 ROPs, built into the eDRAM, can do 4x MSAA in one cycle. With 32-bit color and 32-bit depth/stencil over 32 samples (or 64 depth-only) we're looking at a really, really wide internal datapath of 2048 bits after the ROPs, or 128GB/s at 500MHz. With only 2xMSAA you would need 64GB/s. But this still leaves a huge range above the main RAM bandwidth of 17GB/s where you could still be bandwidth limited on fill.

The figure MS give for edram bandwidth is 256GB/s, as it can do a full-speed alpha blend. The PS3 does a surprisingly good job of keeping up, under the circumstances!

So given a scenario where the external interface on Wii U's eDRAM is just as fast as Xenos's, that gives you 32GB/s (it could be that, like on Xenos, Nintendo went with a synchronous core and RAM clock, which would give you a slightly higher 35.2GB/s). Now let's say that this time around you can texture from eDRAM. With Wii U's limited main RAM bandwidth this is desirable. In fact, it could be entirely possible that you must texture from eDRAM, which AFAICR was the case in Flipper/Hollywood, which would make a big bandwidth hit on it unavoidable. In this case it's easy to see how Wii U's GPU eDRAM bandwidth could end up a limiter long before Xenos's, while still providing tremendously more than what main RAM alone does.

I was originally thinking of the 35.2 GB/s figure based on a 550 MHz synchronous clock (it was the starting point for my speculation), but then along with main memory bandwidth that'd give you ~48 GB/s aggregate bandwidth - higher than the PS3's total and double Trinity. The video memory bandwidth alone would be 50% higher than the PS3's vram and Trinity's unified ram. This doesn't fit with what we saw in BO2 (or the missing trees in Darksiders 2, which I suspect may also be bandwidth related).
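
Spelling out where those numbers come from (the 512-bit port width is just inferred from Xenos's 32 GB/s at 500 MHz, and the 12.8 GB/s assumes a 64-bit DDR3-1600 main memory bus - both assumptions):

Code:
# Hypothetical Wii U eDRAM port as wide as Xenos's, but clocked at 550 MHz
bytes_per_cycle = 32e9 / 500e6          # 64 bytes/cycle (512 bits), from Xenos's 32 GB/s @ 500 MHz
edram_bw = bytes_per_cycle * 550e6 / 1e9
main_bw = (64 / 8) * 1600e6 / 1e9       # assumed 64-bit DDR3-1600 main memory
print(f"{edram_bw:.1f} + {main_bw:.1f} = {edram_bw + main_bw:.1f} GB/s")
# 35.2 GB/s eDRAM + 12.8 GB/s DDR3 = 48.0 GB/s aggregate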

This 35.2 GB/s bandwidth would also require Nintendo paying to engineer a high-bandwidth internal bus, more advanced than the 360's unidirectional bus, and to do so for a system with rather unimpressive performance. That's what got me thinking about simple and cheap alternatives based on existing technology that would seem to fit better with what appears to be the WiiU's performance profile.

I hadn't actually considered the possibility of the WiiU only being able to texture out of edram though. That's very interesting, and would seem to demand more bandwidth from the edram than a 1600 MHz 64-bit bus could provide. In such a scenario, where the GPU couldn't see outside its pool of edram, perhaps the "master core" with its much bigger cache might be designed to feed the GPU with any data it requires from main memory? Such a feature might also make sense with the two "unganged" memory channels I was speculating about earlier, actually, if the GPU were to normally be busy on the edram channel ... :???:

The FXAA in games like Arkham City is probably done in GPGPU; I don't think they'd be able to work that into the already overburdened CPU and main RAM load. This suggests at least a possibility of being able to texture from eDRAM. I don't think they'd add it if it hurt framerates further on a game that is already compromised; in this case the GPU probably had enough spare fillrate and ALU power but there could be additional overhead introduced in the extra resolve plus texturing bandwidth from main RAM that'd be involved if it can't texture from eDRAM.

I think you're right and that it can texture from edram. In my speculation on the previous page I was assuming the edram could be read from and written to by the GPU (and perhaps the entire system) like main ram or video ram on a PC GPU. You'll know more about this than me, but I'd have thought that the FXAA would be done after rasterising and texturing is complete and so you wouldn't be contending with other processes for bandwidth (so you'd be shader limited). If so it wouldn't hurt performance in the way MSAA could, and if it's true that the WiiU has 320+ shaders at 550 MHz then it should be faster than the 360 at that form of AA.

Due to its lower resolution BO2 really needs some kind of sub-pixel AA though, I think, so they probably didn't have the option.

If Wii U's GPU eDRAM were as capable as Xenos's then we should be seeing at least 2xMSAA on ports that didn't have it on XBox 360; it'd be free given that Wii U's GPU has more than 2x the eDRAM.

Not all engines support MSAA (like Unreal) but on many games and on Nintendo's first party games it should be a really nice way to improve IQ. Maybe the pixel counters of the IQ thread could draw up a list of MSAA WiiU games?
 
I find it slightly comical that some of the same people who were touting the 360's GPGPU capabilities to help its CPU against Cell are now downplaying the (most likely) better implementation of the same idea for the Wii U.
One can't help but think it's just ill will towards the name.

If the rumored large eDRAM pools on both dies are true, the smallish bandwidth will be more than made up for AND it will allow the design to scale better with improvements in fabbing.

And as already said in other posts in this thread, the CPU should be fine if it is not used for the bulk of vector tasks.
Compare the 740/750 benchmarks @ ca. 1GHz with processors from around 2005 (launch of 360) and you will see very respectable performance.

One thing is for sure: The Wii U GPU is a lot more powerful than either PS3's or 360's.
Seeing how that power is going to be used in various ways will be interesting.
 
I don't remember anyone talking about GPGPU and the X360. I'm not even sure if the term was invented at all in 2005-2006.
 
I think the XB360 talk was just its GPU making up the difference against Cell, rather than GPGPU, since PS3 had a "measly cut-down G71"
 
I would hardly call blops 2 60fps on any platform apart from PC.

From what I've seen, during fire fights the frame rate drops to 30-35fps, and in that case it's not really any higher than a game that's 720p@30fps.

Frame rate during intense action is much more important than the frame rate when jack is happening on screen.
 
The figure MS give for edram bandwidth is 256GB/s, as it can do a full-speed alpha blend. The PS3 does a surprisingly good job of keeping up, under the circumstances!

Right, I meant to say 128GB/s write bandwidth, plus 128GB/s read bandwidth for grabbing color and depth/stencil values. If Wii U's GPU eDRAM isn't optimized for read-modify-write it'd be another deficiency.
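
In other words, restating the arithmetic behind MS's headline figure (same 2048-bit/500 MHz numbers as before):

Code:
# MS's 256 GB/s eDRAM figure = write traffic + read-back traffic during a full-speed alpha blend
write_bw = (2048 / 8) * 500e6 / 1e9     # 128 GB/s of new color/Z samples written
read_bw = write_bw                      # 128 GB/s of existing color/Z read back for the blend/Z-test
print(write_bw + read_bw)               # 256.0 GB/s total read-modify-write bandwidth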

I was originally thinking of the 35.2 GB/s figure based on a 550 MHz synchronous clock (it was the starting point for my speculation), but then along with main memory bandwidth that'd give you ~48 GB/s aggregate bandwidth - higher than the PS3's total and double Trinity.

You can't just add bandwidth together like this. If eDRAM is used both for render targets and textures it's basically like a cache. There's extra bandwidth overhead transferring between main RAM and eDRAM to fill it with textures and resolve render targets (unless the video output comes straight from eDRAM, in which case that takes bandwidth instead). I don't know how big a typical scene texture footprint is, but if you're streaming it into an eDRAM texture cache you'll need to basically double buffer it to have some part reserved for the incoming textures. This could cut down the amount of space by a fair amount.

Of course, it's possible that there's dedicated hardware that'll manage the eDRAM as an outer level texture cache.
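
Purely as an illustration of how much that double buffering could eat into the pool (every number here is an assumption - the 32 MB size is the rumour, and the render target figure is the 2xMSAA example from earlier):

Code:
# Illustrative eDRAM budget if it has to double as a streamed texture cache (all figures assumed)
edram_mib = 32.0                        # rumoured Wii U eDRAM size
render_targets_mib = 14.1               # e.g. the 720p 2xMSAA color+Z example above
texture_space = edram_mib - render_targets_mib
resident_mib = texture_space / 2        # double buffered: half resident, half streaming in
print(f"~{resident_mib:.1f} MiB of textures resident per frame")
# roughly 9 MiB - a fairly tight texture working set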

The video memory bandwidth alone would be 50% higher than the PS3's vram and Trinity's unified ram. This doesn't fit with what we saw in BO2 (or the missing trees in Darksiders 2, which I suspect may also be bandwidth related).

But isn't that what we're seeing, that in BO2 Wii U is typically behind XBox 360 and ahead of PS3?

You'll know more about this than me, but I'd have thought that the FXAA would be done after rasterising and texturing is complete and so you wouldn't be contending with other processes for bandwidth (so you'd be shader limited).

I don't know how it's done in games but ideally I'd expect you to be post-processing frame N while frame N+1 is being rasterized, if you have the space for it.
 