Wii U hardware discussion and investigation *rename

Status
Not open for further replies.
Oh, a wuu could undoubtedly draw a 2D platformer at 1080P with oodles of parallax, you could even use Z-buffering to eliminate overdraw between layers, and 1GB of RAM and 25GB of disk space would let your pixel artists draw graphics until their fingers bleed. The ancient Amiga has nothing on a wuu, for sure.

However, platformers are pretty undemanding as far as hardware goes. If you look at a game like Limbo, you find it runs fine even on the integrated GPU in Intel Sandy Bridge.
 
If the WiiU is capable of running at higher resolution with the same performance due to untapped hardware performance being available then Nintendo would have simply upped the target resolution. It's quite simple to do. The fact that they did not is informative.

It's a predominantly 2D game (as in, graphically) like the rest in the series. Changing the resolution isn't trivial and higher resolution demands higher detailed artwork which requires more time and money.
 
Actually, I was a little tired already ;) But ERP has also mentioned the performance; if there were a lot of extra bandwidth available then the Wii U could run at 1080p most of the time. Yet not even Nintendo's own games - which have relatively low graphical complexity - usually go above 720p.

I'm not a tech insider, but running 1080p would be a hassle when this GPU has the same design as Hollywood, namely a small embedded front buffer that can be dumped to memory. For that very same reason I wonder why it can't have 16 ROPs; the front buffer would have the bandwidth if designed so, and only large textures on small polygons would suffer from texturing bandwidth issues.

With early Z checks and front-to-back drawing I suppose the MEM2 rate doesn't need to be huge either; only about 1 Mpix needs to be textured, so at 60 fps and 64 four-byte texel fetches per pixel MEM2 would be saturated.
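To put numbers on that claim, here's a quick back-of-envelope sketch. All figures are the post's own assumptions (a 720p frame is ~0.92 Mpix, 64 four-byte texel fetches per pixel at 60 fps) plus the common assumption that MEM2 is 64-bit DDR3-1600; none of these are confirmed specs.

```python
# Texel traffic under the post's assumptions.
pixels_per_frame = 1280 * 720       # ~0.92 Mpix at 720p
fps = 60
fetches_per_pixel = 64              # worst-case figure from the post
bytes_per_texel = 4

texel_bw = pixels_per_frame * fps * fetches_per_pixel * bytes_per_texel
print(f"Texel bandwidth needed: {texel_bw / 1e9:.1f} GB/s")

# Assumed MEM2: 64-bit (8-byte) bus at 1.6 GT/s (DDR3-1600).
mem2_bw = 8 * 1.6e9
print(f"Assumed MEM2 bandwidth: {mem2_bw / 1e9:.1f} GB/s")
```

Under those assumptions the texel traffic alone (~14.2 GB/s) exceeds the assumed MEM2 bandwidth (12.8 GB/s), which is the "saturated" scenario the post describes.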

Another thing that caught my attention with Blops 2 is the amount of colour banding in the zombies mode. It wouldn't surprise me if that GPU uses low-precision FP units. Could such a thing be? For example, it could use SPs that can be set up as 2xFP16 or 1xFP32. It could be in the 500 GFLOPS range then. Just speculation of course; I'm nothing compared to others here.
 
It's a predominantly 2D game (as in, graphically) like the rest in the series. Changing the resolution isn't trivial and higher resolution demands higher detailed artwork which requires more time and money.

My sarcasm detector is broken, so take my response accordingly...

Hogwash. Changing the resolution is trivial. You don't need higher detailed artwork for higher resolution.
 
Hogwash. Changing the resolution is trivial. You don't need higher detailed artwork for higher resolution.

This is 2D that is predominantly not scaled or anything. They're shooting more or less for the art assets to be 1:1 with the pixels in the framebuffer. If this is your goal and you change the resolution then that means you're either going for a wider viewport than what the game was designed for or you redo the art. If you think it's trivial then you need to spend some time with more classical 2D game consoles.

This is the difference between traditional 2D raster graphics design and modern 3D vector graphics design. The former is something NSMB has always targeted.

Yes, they could upscale this sort of graphics to 1080p, but what would be the point of that?
 
The characters are polygonal, at least. Some parts of the environment do have edge aliasing as well (most notably on the world map where they don't use their edge filter).
 
I appreciate the response, Grall. I do not presume to be able to look at a scene and immediately identify why Wuu is choking and how that relates to hardware (especially alpha effects and such), so the experience of those here is quite welcome.

Yes, but there's not the slightest shred of evidence that says that information relates in any way whatsoever to the edram integrated into wuugpu. That's where your inference fails. It's just baseless speculation; it has no foundation.

It's an eDRAM technology developed by the company that we know is providing the Wuu GPU and eDRAM. They even say that it's targeted at games consoles! Perhaps there is some other type of RAM that Renesas are hiding, but it seems highly unlikely. It's still speculation, I'll give you, but baseless is selling it a bit short.

Grall said:
Yes, but as mentioned before, that bandwidth exists solely to support single-cycle 4x antialias combined with alpha blend. If wuu isn't designed with that capability in mind there'd be no point to have that much bandwidth at hand, especially if it only has 8 ROPs as suspected. As a comparison, the contemporary (from wuugpu's base technology) radeon 4890 has 16 ROPs and 128GB/s bandwidth.

I've been thinking about this one, and have concluded that you are probably right. After seeing Durango's setup w/ eSRAM at 102.2 GB/s and reading some of the explanations in the other thread, it seems that modern ROPs do not need the type of bandwidth that I was suggesting for framebuffer purposes. The other (and lowest) possible configuration of that UX8GD eDRAM entails 8 MB macros, each with a 256-bit bus. This is actually kind of intriguing, because at 450 MHz (the reported clock of the dev kit GPUs until late 2011), that nets you 57.6 GB/s. This is the exact bandwidth of the RV770LE, some (surely downgraded) version of which was rumored to be used in the first dev kit. It's baffled many why they would choose that card to approximate Wuu's performance in the early days. Perhaps this is the answer.
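The arithmetic behind those figures can be sketched as follows, assuming the UX8GD configuration the post describes (four 8 MB macros, each on a 256-bit bus); the macro count and bus width are the post's speculation, not confirmed hardware details.

```python
# eDRAM bandwidth for the speculated 32 MB UX8GD configuration.
macros = 32 // 8           # 32 MB total / 8 MB per macro = 4 macros
bus_bytes = 256 // 8       # 256-bit bus = 32 bytes per clock per macro

def edram_bw(clock_hz):
    """Aggregate bandwidth across all macros at a given clock."""
    return macros * bus_bytes * clock_hz

print(edram_bw(450e6) / 1e9)   # reported dev-kit clock -> 57.6 GB/s
print(edram_bw(550e6) / 1e9)   # assumed retail clock   -> 70.4 GB/s
```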

Anyway, at 550 MHz, that would put eDRAM bandwidth at 70.4 GB/s. Probably good for 8 ROPs. One lingering question I have, though, is why would Nintendo stick to 8 ROPs? Would it not strike them that rendering an additional scene to the Gamepad would surely increase necessary fillrate? Have ROPs improved that much since Xenos?


Grall said:
I didn't say you should ignore it; that information is years old and is nebulously diffuse. Nothing specific can be derived about the hardware itself beyond what is stated from that tiny fragment of information.

I merely brought that up as you said the claim was never made that Wuu could pull off MSAA, and right there it says it's capable of 4x at 720p. I don't know if that counts as "free" like on Xbox 360, but from what I've heard, that expression was a misnomer to begin with. I also doubt they've scaled back from what's on that sheet, so it should still be capable of as much, although I'm sure with many a sacrifice.


Grall said:
Yeah, but as mentioned, you don't need to emulate the texture cache specifically, in hardware, to run wii games on another system. You just let the game think it moved textures into the cache, while in reality it just sits right where it always was in main memory, and the GPU renders transparently from there with no penalty - since that's exactly what it's designed to do. Flipper on the other hand probably can't texture directly from main RAM at all, IE it would be limited in capability compared to modern GPUs.

Not sure about that last claim, although you may be right. From what I've read on Flipper, textures can be locked into the texture cache, but I'd imagine all other textures would still run through it in what space there is left. However, I don't think that pulling textures from main memory would work as you describe it in Wii mode. Some of those texture reads require very low latency, which would be impossible to match with off-chip DDR3. Additionally, it just seems very unlikely and shortsighted that there would be no render-to-texture and such to the eDRAM. Again, the labeling as MEM1 points to it being used as a main pool if one desires to do so.


Grall said:
Why not. The underlying wii hardware is almost 15 years old by now. The wuu CPU is still code compatible to wii from what we know, it should be able to run the game logic directly with little to no overhead. The differences in GPU, sound and I/O can be offloaded to a software abstraction layer, running on the other CPU cores. Remember that dolphin is a hobbyist amateur project, nintendo obviously has a lot more resources to dedicate to their own emulator, including full hardware documentation of both systems and so on.

You are correct regarding the CPU being compatible, and from what I've read, Dolphin is very CPU intensive. I also doubt they'd even have to go through the trouble of emulating Starlet and the DSP. It's possible Nintendo are using a modification of the same DSP core which was used in Wii (they are clocked similarly at least) and even the ARM cores may be compatible. I have speculated in the past on the use of a Cortex A5, due to its small size and its inclusion in future AMD APUs as a security processor, but looking at Nintendo's recent history and the 3DS especially, I wouldn't put it past them to have stuck in a pair of ARM11s. They'd save a buck and have perfect binary compatibility w/ the ARM926 in Wii. Heck, they could probably even make a "Super 3DS" peripheral for Wuu then, if they wanted.
 
The characters are polygonal, at least. Some parts of the environment do have edge aliasing as well (most notably on the world map where they don't use their edge filter).

Yes, it is predominantly but not entirely 2D, save for the world map which is entirely 3D. Same with all the NSMB games. This one does seem to have more 3D in the level backgrounds, but unless my eyes are fooling me most of the level graphics and enemies are still traditional sprites.
 
You can do "sprites" just fine with polygons. Here you can see that the camera zooms in at the end of the level:

http://www.youtube.com/watch?feature=player_detailpage&v=uHnnclSwS8Q#t=399s

so there is certainly no requirement for 1:1 texel to pixel mapping. Just looks like bilinear filtered textures. The game would certainly look better at 1080p.

Yes, occasionally it zooms slightly in or out, but nominally it's intended to be 1:1. There's a reason why I used words like "shooting for" and "goal" instead of "requirement." And I certainly didn't mean to imply anything about non-polygonal or non-filtered graphics. I don't know how my point is being lost here. It wouldn't look better at 1080p, because for all those times when it's NOT slightly zoomed (which, if it's anything like the others in the series, is far more often than not) the graphics drawn for 720p will look blurrier. Not a great trade-off IMO.
 
Nope. You send work to the GPU - it completes it as quick as it can. More ROPs with adequate BW would mean more drawing. The only bottleneck would be BW, and if Nintendo included more ROPs than BW to support them, then they are stupid. ;)

Hi Shifty, assuming that the area covered by 16 ROPs (4x4) is painted with the same material, which is cached in all TMUs involved, would bandwidth really be a problem? I ask because I thought that, at least in the past, all ROPs load the same block of data into their local caches, so every single texel read from main memory will be distributed to all 16 ROPs at the same time instead of requiring 16 separate reads.
 
I don't know of any GPU that bunches up all ROPs in a 4x4 block. PS2's rasterizer was a bit weird (like in many other ways!) and used 2x8 (or 2x16 IIRC, when not texturing), but modern GPUs typically work in an independent 2x2 block, this certainly includes any GPU derived from a PC/directx heritage like wuugpu.
 
My apologies for derailing the topic temporarily, although some vindication is in order. I stated some time ago that Monolith Soft's project (as well as Retro's) would with certainty be the first graphical showpieces for the Wii U platform. I even said that Takahashi teased a Xenoblade sequel, though I believed he was simply trolling. Here it is: http://www.youtube.com/watch?v=6GxUMMGyZcM

Retro's will most likely debut at this year's E3. I return to catching up, and seeing where I can contribute regarding the Wii U's architecture.
 
I've been partaking in some of the Wii U technical discussions on NeoGAF; they've raised a few more questions I was hoping someone here could weigh in on.

ROPs and shaders are pretty integral to a GPU, given the wide range of processes they handle: Z and frame buffering, vertex, matrix and texture work, etc.

The Wii U has 32 megabytes of eDRAM. That to me doesn't sound like enough to store the data for the above workloads in a complex 3D game at 720p, let alone anything else, like acting as the theorised scratchpad. The Xbox 360's eDRAM wasn't large enough to store much outside of the buffers, and relied on the GDDR3 for bandwidth for vertex, texture, and many other tasks, all of which happen to be very bandwidth dependent.

I can't see how on earth Nintendo can get around the 64-bit bus on the Wii U while still being able to offer even parity with the visual quality and texture resolution of Xenos.

I'll admit GPU architecture isn't my strong point, but is it even possible for a modern GPU architecture to achieve the same results as Xenos but with a significant reduction in overall bandwidth?
 

A couple of pages back a couple of us had a think about ways the Wii U might move BW requirements from main RAM to eDRAM, relative to the 360:

http://forum.beyond3d.com/showpost.php?p=1695285&postcount=4276
http://forum.beyond3d.com/showpost.php?p=1695303&postcount=4280

It's not likely that the Wii U GPU is seeing a big reduction in overall BW relative to Xenos for doing the same jobs, just that the BW is being moved elsewhere (mainly eDRAM and caches). The Wii U probably doesn't have the magic ROPs of the Xbox - with MSAA and transparent overdraw it seems to bang its head and can't do what Xenos does - but it can probably make up for this to some extent through BW-saving features like Z compression and the like.

For Xbox 360 resolution graphics 32 MB is actually a pretty decent amount of embedded memory.
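A rough check of that "decent amount" claim, with my own back-of-envelope buffer sizes (32-bit colour and 32-bit depth/stencil at 720p; the layout is illustrative, not a confirmed memory map):

```python
# How much of 32 MB do 720p buffers actually consume?
MB = 1024 * 1024
w, h = 1280, 720

color = w * h * 4               # 32-bit colour target
depth = w * h * 4               # 32-bit depth/stencil
back_buffers = color + depth    # one render target + Z

print(back_buffers / MB)        # ~7.0 MB, leaving most of the 32 MB free

# Even naively storing 4x samples for both colour and Z still fits,
# unlike Xenos' 10 MB, which forced tiling for 720p 4x MSAA.
msaa4x = back_buffers * 4
print(msaa4x / MB)              # ~28.1 MB
```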
 
Hi guys! Do we know for sure now if the Wii U is more powerful than the current gen, or is it just on par?
On a par. Any performance advantages Wii U may have are offset by limitations (it's small, with a low power draw), such that any increase in overall performance (nigh impossible to measure) above current gen will be fractional rather than a multiple.
 
I just hope it's on par, but with the ability to do "true" 720p even in complex games, and no strange upscaled resolutions.
 
I don't know of any GPU that bunches up all ROPs in a 4x4 block. PS2's rasterizer was a bit weird (like in many other ways!) and used 2x8 (or 2x16 IIRC, when not texturing), but modern GPUs typically work in an independent 2x2 block, this certainly includes any GPU derived from a PC/directx heritage like wuugpu.

OK, given that each 2x2 group works on a different triangle that may use different textures, it increases texel BW requirements. Sorting triangles by texture helps lower it again, I guess. Four 2x2 groups also require four times as much framebuffer access. Is it feasible to assume that eDRAM, with its lower latency and suitability for random access, is a requirement in that case?

Thinking a bit about caches, I kind of wonder what a texture cache's efficiency is and how much a miss actually costs. I guess it has to read a complete cache block from memory in that case. Perhaps it is possible to estimate how much texel BW is needed for 1280x720 anyway.
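One way to frame that estimate is to treat texel bandwidth at 720p as a function of the cache hit rate, folding the cost of a block fill into an effective bytes-per-miss figure. All the numbers here (8 fetches per pixel, 4 bytes per texel) are purely illustrative, not measured values:

```python
# Illustrative texel bandwidth at 720p/60 vs. texture-cache hit rate.
def texel_bw_gb(hit_rate, fetches_per_pixel=8, bytes_per_miss=4,
                fps=60, pixels=1280 * 720):
    # Only cache misses go out to memory; each miss costs bytes_per_miss
    # (a real miss pulls a whole cache block, amortised over its texels).
    misses = pixels * fps * fetches_per_pixel * (1.0 - hit_rate)
    return misses * bytes_per_miss / 1e9

for hr in (0.0, 0.9, 0.99):
    print(f"hit rate {hr:4.2f}: {texel_bw_gb(hr):6.3f} GB/s")
```

The point of the sketch is just that the answer swings by orders of magnitude with cache efficiency, which is why it's hard to pin down a single texel BW figure from the outside.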

By the way, I found this patent (perhaps already mentioned before): http://www.google.com/patents/US20120309526. Not that it gives away much, but it explicitly mentions VRAM, which is supposed to be used to store vertices, textures, etc.
 