Wii U hardware discussion and investigation

Status
Not open for further replies.

Something must conduct the thread scheduling, so yes, there must be a kernel running on each core (which could be implemented in hardware, of course). I would be amazed if all cores literally ran OS tasks.

Couldn't it be that a lot of the vectorizable code such as the physics engine etc is being offloaded onto the GPU to compensate for the weak CPU?

If it's linear code (i.e. code that can be unrolled), yes, it could be. Perhaps the unidentified parts of Latte house something that performs better on code that branches a bit more. It would be wonderful if a shader cluster could run one unique thread per data stream instead of multiple data streams per thread. I'm not sure my understanding is correct, though, but as long as nobody takes the effort to correct me...

The next game confirmed to look better than the PS360 and PC versions seems to be Deus Ex. I can't quite tell whether it really looks worse on PS360/PC... Anyone care to comment?
 
Fourth, doesn't this only apply if the PS360 versions don't come anywhere near maximum fillrate? I don't know how many SPU cycles today's shaders take, but the fact that the PS360 versions don't filter shadow maps either makes me think they don't have an excess of it. 20-30 ops divided over all 5 ALUs at most doesn't impact the fillrate at all with 240 SPUs. But it's a wild guess; sadly, I don't have hands-on experience with modern GPUs.

A custom ISA that provides some 'shortcuts' to accelerate common shading algorithms, going unused due to an immature GX2 API or unwillingness to port code to DX10.1+, may be a reason too. Though I think this is just my own wishful thinking. :)

You sound like you have a better grasp of this stuff than I do, so I'll just sit back and do some more reading on the topic. :LOL:
 
DEHR wasn't exactly a graphical powerhouse in the first place.
It still doesn't run super great on the RV770 that I occasionally play it on (probably needs more VRAM).

Now that I think about it again, I hope there's another Deus Ex in the works.....need moar! lol
 
It still doesn't run super great on the RV770 that I occasionally play it on (probably needs more VRAM).

Now that I think about it again, I hope there's another Deus Ex in the works.....need moar! lol

How much VRAM do you have on that thing? I ran it at high settings, aside from DX11 and the various crappy AA options they built into it, on a GTS 250 with a passable frame rate (about the same as the consoles).
 
It's 512MB. It averages about 30fps at 1680x1050 with everything cranked. Undoubtedly that 4850 dusts the Wii U. But the game engine isn't exactly the smoothest creation, even after their numerous stutter fixes.

On the same note, Dishonored suffers a lot with 512MB: frequent stuttering compared to a 1GB card.
 
Guess what, my wife bought me NFS for my 36th birthday last week haha:)

So I took a look at it, and the reflections and nighttime lighting look really good. It's a pity they don't have proper AA and shadow filtering though. :) Anyway, it made me think back to the Eurogamer face-off, and I realized that the pictures and framerate comparison may give some good performance indications:

- the framerate is 1.25 to 1.3 times the 360's;
- more geometry is drawn in the reflections; even though other images show less of a difference, the first image of the quad comparison clearly shows that only half the screen actually reflects on the 360;
- the reflected texels look like 4 times the resolution.

Could this indicate that the GPU must be at least 1.5-1.9 times the 360's?

Furthermore, it is plausible that they use depth-based reflection mapping, such as RLR. The reflections are too accurate not to use such a technique, and drawing mirrored views for a given maximum number of reflecting planes is less flexible. RLR uses a 'raytracing into an environment cubemap' technique to estimate the correct reflected pixel (as such, it cannot reflect anything around corners).

IMO, the higher-resolution reflections are caused by finer-resolution raytracing, so could this indicate that the GPU has 4 times (or at least 2 times) the texel bandwidth of the 360?
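For what it's worth, the depth-buffer ray march behind techniques like RLR can be sketched in a few lines. This is a toy 1D illustration under my own assumptions, not the game's actual implementation; the function name, the toy scene and all parameters are made up:

```python
# Hedged sketch: ray march against a depth buffer, the core idea behind
# depth-based reflection techniques like the RLR approach described above.
# Simplified to 1D for clarity; real implementations march in 2D/3D and
# fall back to sampling an environment cubemap on a miss.

def march_reflection(depth, start_x, start_depth, step_dx, step_dz, max_steps=64):
    """March a reflected ray through (x, depth) space; return the first
    column where the ray passes behind the stored surface depth, or None."""
    x, z = float(start_x), float(start_depth)
    for _ in range(max_steps):
        x += step_dx
        z += step_dz
        ix = int(round(x))
        if not (0 <= ix < len(depth)):
            return None          # ray left the screen: fall back to cubemap
        if z >= depth[ix]:       # ray went behind the surface: a hit
            return ix
    return None

# Toy depth buffer: open road (depth 10) with a wall at columns 6..9 (depth 3).
depth = [10, 10, 10, 10, 10, 10, 3, 3, 3, 3]
hit = march_reflection(depth, start_x=0, start_depth=0, step_dx=1.0, step_dz=0.5)
# hit == 6: the ray strikes the wall at column 6
```

Halving the step sizes doubles the number of depth/texture samples taken per ray, which is exactly where the extra texel rate would go if they are indeed tracing at finer resolution.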
 
- the framerate is 1.25 to 1.3 times the 360's;
- more geometry is drawn in the reflections; even though other images show less of a difference, the first image of the quad comparison clearly shows that only half the screen actually reflects on the 360;
- the reflected texels look like 4 times the resolution.

Could this indicate that the GPU must be at least 1.5-1.9 times the 360's?

No. In order to say "at least X to Y" you'd need to have solid measurements of performance, not subjective impressions of quality from which to derive quantitative performance. It could be a simple matter of improved texture cache, as the XB360 is rather anemic by today's standards.
 
I came across this on the internet regarding the Wii U amidst a lot of... stuff, and it actually kinda bothers me, this one thing.
The Wii U apparently has a 64-bit bus, and relies on some nice-sized eDRAM pools on its GPU to keep main RAM bandwidth demands manageable on a 64-bit bus.

Ok. Sure, 64-bit bus. I'm a little old, behind the times, but some things don't change, right?
DDR-whatever: you want a 64-bit bus, yeah? 4 chips, with 8 lanes coming from each chip; each lane gets 2 bits per clock (rise and fall).

So 8 x 2 = 16 bits x 4 chips = a 64-bit bus. Seen it a hundred times on countless RAM chips. But even on consoles.

360: 4 chips, 16 lanes coming from each chip.
16 x 2 = 32 bits x 4 RAM chips = 128-bit bus. That's the 360, yeah?

So the Wii U should have 8 little lanes coming from each RAM chip. So then I decided to go google pictures.
Wii-U-hynix.jpg

Why so many? There should be 8 per chip for a 64-bit DDR bus. I've never seen... this. There is an easy explanation, right?
 
The game is double buffered with vsync on. You have to be very careful trying to directly compare frame rates.

At a guess I'd say that the GPU is double buffered, but something on the CPU side is probably triple buffered. The Wii U GPU appears to be doing a little better than the PS360 (fewer drops to 20 fps), but something seems to be causing the game to dip just under 30 fps more often during gameplay, and it seems likely that this is the CPU. IMO.

It's unlikely that the bulk of the console's power is being spent on reflections, so I wouldn't build up too many hopes based on nicer-looking wet roads. Rendering to a fairly low-res texture using low-LOD assets seems the most likely way to do road reflections, so increasing the draw distance shouldn't be too costly. If you've got a lot more power to spare, then throw some at the shadows or even the alpha.

I really should catch up with this thread.
 
No. In order to say "at least X to Y" you'd need to have solid measurements of performance, not subjective impressions of quality from which to derive quantitative performance.

The framerate and reflection resolution are evident. I was being subjective about the amount of additional geometry in the reflection maps and in assuming 4 times the reflection resolution, though. Yet the Wii U's reflections show detailed decals, not just a few more cubes so that less sky is reflected. You think that the extra time it spends on that is only marginal?

It could be a simple matter of improved texture cache, as the XB360 is rather anemic by today's standards.

What exactly could be a matter of improved texture cache? Sampling the textures while tracing a ray at a higher frequency, or writing 4 times as many pixels to the framebuffer to increase environment map quality?

Function said:
The game is double buffered with vsync on. You have to be very careful trying to directly compare frame rates.

Why? V-sync means the RAMDAC (or whatever it's called today) swaps buffers as soon as it finishes reading the previous frame. Why would this behaviour differ between systems?

Function said:
It's unlikely that the bulk on the consoles power is being spent on reflections, so I wouldn't build too many hopes on nicer looking wet roads.
No? Where does it spend most of its time instead, in your opinion? Producing shadow maps? Raycasting doesn't come for free, given that it uses this approach.

Function said:
Rendering to a fairly low res texture using low lod assets seems the most likely way to do road reflections so increasing the draw distance shouldn't be too costly.
I agree that it doesn't need a huge-resolution texture, in contrast to shadow mapping. Though, given the better-quality reflections, this game must be either raycasting at higher quality or rendering the environment maps at a higher resolution. Or both. The first requires better texel rate and sufficient SIMD throughput; the second requires better fillrate. I don't think this can be handled by a 'slightly faster' GPU.
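To put a rough number on the second option: the fill cost of an environment map scales with the square of its per-face resolution. The resolutions, face count and frame rate below are illustrative guesses of mine, not known values for the game:

```python
# Back-of-envelope arithmetic: pixel fill cost of re-rendering reflection
# environment maps every frame. All numbers here are illustrative
# assumptions, not measured values from NFS or either console.

def env_map_fill(face_res, faces, fps):
    """Pixels written per second to the environment map(s)."""
    return face_res * face_res * faces * fps

base   = env_map_fill(face_res=128, faces=6, fps=30)  # hypothetical 360 setup
better = env_map_fill(face_res=256, faces=6, fps=30)  # doubled per-face res

print(better // base)  # -> 4: doubling map resolution quadruples the fill cost
```

So "4 times the reflection resolution" really would mean roughly 4 times the fill spent on the maps, which is why it matters whether the improvement came from map resolution or from finer ray marching.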

Function said:
If you got a lot more power to spare then throw some at the shadows or even the alpha.
They obviously could have chosen to approximate the 360's framerate and leave some room for filtering nearby shadows. But they didn't.
 
Why? V-sync means the RAMDAC (or whatever it's called today) swaps buffers as soon as it finishes reading the previous frame. Why would this behaviour differ between systems?

With v-sync, if you miss 60 fps you drop to 30 fps. If you miss 30 fps you drop to 20 fps. This doesn't mean that a GPU that doesn't drop is 100% or 50% more powerful. You can't do the kind of maths you're attempting to do.
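The quantisation is worth spelling out. Assuming a 60 Hz display and double buffering with v-sync, a finished frame is held until the next vertical refresh, so the effective rate snaps to 60/n:

```python
import math

REFRESH_HZ = 60.0
REFRESH_MS = 1000.0 / REFRESH_HZ  # ~16.67 ms per refresh interval

def vsynced_fps(render_ms):
    """Effective frame rate with double buffering and v-sync: the buffer
    swap waits for the next refresh, so rates quantise to 60/n."""
    intervals = math.ceil(render_ms / REFRESH_MS)
    return REFRESH_HZ / intervals

print(vsynced_fps(16.0))  # 60.0 -- just makes the refresh
print(vsynced_fps(17.0))  # 30.0 -- misses one refresh, waits for the next
print(vsynced_fps(34.0))  # 20.0 -- misses two refreshes
```

A GPU rendering in 17 ms and one rendering in 33 ms both show 30 fps, which is why a 1.25-1.3x measured frame-rate gap puts only a loose bound on the underlying performance gap.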

No? Where does it spend most of its time on instead in your opinion? Producing shadow maps? Raycasting doesn't come for free, given that it uses this approach.

I think it spends most of its time not on reflections.

I agree that it doesn't need a huge-resolution texture, in contrast to shadow mapping. Though, given the better-quality reflections, this game must be either raycasting at higher quality or rendering the environment maps at a higher resolution. Or both. The first requires better texel rate and sufficient SIMD throughput; the second requires better fillrate. I don't think this can be handled by a 'slightly faster' GPU.

Do you have any more info on the way the game produces reflections? I can't find anything, but I'd like to know more. I can think of a couple of ways you might be able to do reflections fairly quickly (using surface normals that you'd already be calculating and the Z and colour buffers that you'd already have), but I'd like to know what they did.
 
Sorry for the slow reply!

I'm not so sure. That 160-shader PC card was a DirectX 11 part. I haven't seen any evidence that running games designed around a DX9 foundation on a comparable DX10.1 chip (which Latte reportedly is) gives performance advantages substantial enough to make up for an 80-shader disparity. And that's to say nothing of the fact that the Wii U doesn't use DirectX.

Well, the Wii U has a 10% clock advantage, and so it probably has 10% more fill and texturing. Or maybe more: it's likely that as well as having faster TMUs, the Wii U has more efficient TMUs. The 360's texture cache is apparently quite small, and memory latency is almost certainly lower on the Wii U.

So the Wii U is probably just an awful lot more efficient: the shaders will be bottlenecked less at either end than on the 360 (assuming ROP bandwidth isn't an issue), the logic feeding the shaders will be better, VLIW5 is probably better than the 360's Vec4+1, and the GPU won't idle whenever it's copying data out from eDRAM to main memory. And thanks to early Z testing it's probably actually working on fewer pixels too.

And maybe it's more efficient on small polys too: a higher ROP-to-shader ratio and all that.

There are a lot of things that, when combined, could stack up strongly in favour of the Wii U GPU just being a hell of a lot more efficient. What was that figure that went round NeoGAF for Xenos vs Durango shader efficiency, 53% vs 100% or something? Possibly that particular example is bollocks, but the Wii U only needs to get about 35% more work done per clock from its SIMD units over the length of a frame and it's past the 360...
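That ~35% figure is easy to reproduce from the commonly cited clocks (Latte at 550 MHz, Xenos at 500 MHz). Bear in mind the 160-ALU count for Latte is, as discussed, an assumption rather than a confirmed spec:

```python
# Reproducing the ~35% per-clock efficiency gap using commonly cited
# clock speeds. Latte's 160-ALU count is an assumption from the thread,
# not a confirmed specification.

XENOS_ALUS, XENOS_MHZ = 240, 500   # Xbox 360 GPU
LATTE_ALUS, LATTE_MHZ = 160, 550   # assumed Wii U GPU configuration

xenos_throughput = XENOS_ALUS * XENOS_MHZ  # 120,000 ALU-MHz
latte_throughput = LATTE_ALUS * LATTE_MHZ  #  88,000 ALU-MHz

# Extra per-clock work Latte needs to match Xenos's raw ALU throughput:
needed = xenos_throughput / latte_throughput - 1
print(f"{needed:.0%}")  # -> 36%
```

So with those assumed specs the break-even point is about 36% more work per clock, in line with the figure quoted above.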

While there are a variety of explanations for the disappointing framerates we've seen in Wii U games, I do agree with your assessment that a 320 shader part should be able to run PS360 games in higher resolutions. However, it is also possible that the amount of eDRAM is limiting the size of the framebuffer, and this is the reason for the lack of jump in resolution. This would be especially true if devs are also using it to store local render targets and perhaps even run CPU code.

This could well be true, and while I think the Wii U probably (but not certainly) has 160 shaders, I wouldn't rule out the possibility of myriad smaller factors getting in the way of a 320-shader part. When I look at the Wii U as a whole, though, I do see an efficient machine: performance per watt is clearly beyond the PS360's, and it achieves this using relatively little silicon and pretty old processes. Even the CPU now seems to be, while weak in absolute terms, very capable given its tiny size and almost certainly tiny power consumption. It kind of makes you sad that IBM CPUs have mostly been forced out of consumer products, actually.

Finally, just because I think the Wii U is weak sauce as a product doesn't mean I think that Nintendo's engineers or IBM or AMD suck. They don't; in fact, with probably quite limited R&D (compared to MSony) they've done an impressive job. I just think the product can't compete with last gen for the attention of core gamers (or anyone, actually), and that failure to move beyond the PS360 in terms of performance is an issue for it.

With me now on the Durango "always only online" h8 train (like a freight train, but full of h8ers) more than ever before, I'd really like to see what Nintendo's engineers could have done with 100 watts and IBM's 32nm process.
 
You think that the extra time it spends on that is only marginal?
How much of the screen is taken up by these reflections? It can't be all that much, I'd think, seeing as you need to leave room for the road, scenery and other vehicles.
 
I'm not so sure. That 160-shader PC card was a DirectX 11 part. I haven't seen any evidence that running games designed around a DX9 foundation on a comparable DX10.1 chip (which Latte reportedly is) gives performance advantages substantial enough to make up for an 80-shader disparity. And that's to say nothing of the fact that the Wii U doesn't use DirectX.

While there are a variety of explanations for the disappointing framerates we've seen in Wii U games, I do agree with your assessment that a 320 shader part should be able to run PS360 games in higher resolutions. However, it is also possible that the amount of eDRAM is limiting the size of the framebuffer, and this is the reason for the lack of jump in resolution. This would be especially true if devs are also using it to store local render targets and perhaps even run CPU code.

I still think Latte is based on RV670 (3850-3870): 320 shaders, DX10.1, GL 3.3 (as in earlier leaks of the Wii U GPU), and possible performance similarities if we can somehow put 1GB of 12.8GB/s DDR3 RAM into the equation...
 
performance per watt is clearly beyond PS360, and it does this using relatively little silicon and pretty old processes.

The GPU, all told, is 150mm^2 if I recall. That's a significant GPU die, and you should be getting a whole lot more out of it than "competes with current gen", IMO.

Sure, that die is presumably on 45nm, but that raises the question of why they didn't go 28nm as well.

I think the Wii U's bad engineering starts with backwards compatibility. A lot of GPU area is apparently taken up by legacy Wii GPU hardware, and BC likely necessitated the underpowered CPU. Nintendo surely could have done better with a clean-slate design; I would hope much better.

Then you can probably argue that 32MB of eDRAM is overkill for this design too. It's the same amount as the rumored Durango, which is a machine in another power class. The games don't appear to surpass 720p anyway.
 
I came across this on the internet regarding the Wii U amidst a lot of... stuff, and it actually kinda bothers me, this one thing.
The Wii U apparently has a 64-bit bus, and relies on some nice-sized eDRAM pools on its GPU to keep main RAM bandwidth demands manageable on a 64-bit bus.

Ok. Sure, 64-bit bus. I'm a little old, behind the times, but some things don't change, right?
DDR-whatever: you want a 64-bit bus, yeah? 4 chips, with 8 lanes coming from each chip; each lane gets 2 bits per clock (rise and fall).

So 8 x 2 = 16 bits x 4 chips = a 64-bit bus. Seen it a hundred times on countless RAM chips. But even on consoles.

360: 4 chips, 16 lanes coming from each chip.
16 x 2 = 32 bits x 4 RAM chips = 128-bit bus. That's the 360, yeah?

So the Wii U should have 8 little lanes coming from each RAM chip. So then I decided to go google pictures.
Wii-U-hynix.jpg

Why so many? There should be 8 per chip for a 64-bit DDR bus. I've never seen... this. There is an easy explanation, right?
You got this wrong. It's 16 bits on 16 single-ended data pins per chip. That they use DDR to transfer data doesn't factor in.
And you need quite a few more connections to a DRAM chip than just the 16 data lines. You need a clock (some memory types like GDDR5 even use two different ones, or differentially signalled ones [needing two pins]), some pins to transfer commands, and supply voltage and ground connections too.
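To make the correction concrete: bus width is simply data pins per chip times chip count, while DDR signalling doubles transfers per clock (affecting bandwidth, not width). The chip configurations and clock below are the commonly cited figures (four x16 DDR3-1600 chips for the Wii U, four x32 GDDR3 chips for the 360):

```python
# Bus width vs bandwidth: DDR doubles transfers per clock, which shows up
# in bandwidth, not in the bus width. Chip counts and the 800 MHz I/O
# clock are commonly cited figures, not official specs.

def bus_width_bits(chips, data_pins_per_chip):
    """Total bus width is just data pins per chip times chip count."""
    return chips * data_pins_per_chip

def bandwidth_gb_s(bus_bits, io_clock_mhz, transfers_per_clock=2):
    """Peak bandwidth in GB/s for a double-data-rate bus."""
    return bus_bits / 8 * io_clock_mhz * transfers_per_clock / 1000

wii_u = bus_width_bits(chips=4, data_pins_per_chip=16)  # 64-bit bus
x360  = bus_width_bits(chips=4, data_pins_per_chip=32)  # 128-bit bus

print(bandwidth_gb_s(wii_u, io_clock_mhz=800))  # -> 12.8 (GB/s, DDR3-1600)
```

That 12.8 GB/s result matches the main-RAM bandwidth figure quoted elsewhere in the thread, which is a decent sanity check on the x16-per-chip reading.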
 
You got this wrong. It's 16 bits on 16 single-ended data pins per chip. That they use DDR to transfer data doesn't factor in.
And you need quite a few more connections to a DRAM chip than just the 16 data lines. You need a clock (some memory types like GDDR5 even use two different ones, or differentially signalled ones [needing two pins]), some pins to transfer commands, and supply voltage and ground connections too.

Yeah, that's in line with the PDFs I have on the 4Gb modules from every manufacturer (there are like... 3 or 4). They have a pin list, and only so many pins contribute to data I/O... But even in light of this information, this system bus still just seems... weird to me compared to other pictures of system RAM or chips on SIMM/DIMM cards.

Thank you for your time.
 
Actually, I have the manufacturer's wiring diagram.

On the top are the VDDQ/VDD power supply lines in, 2 lines; on the right side there is a lot of data strobe stuff (differential only?), DQU, DQL, DQS, etc.; on the left side is stuff like clock pins, reset, etc., 8 out on that side. On the bottom are VSS, VSSQ, 2 pins. On the right is data strobe stuff (differential only?), DQS, DQ, TDQS, etc., going to one pin out, showing 25 ohms chipset voltage, VDDQ/2.

But even with this diagram... it's not in line with what I see in that picture... It's really got me over a barrel. Stupid Wii U.
 