Wii U hardware discussion and investigation *rename

Why do you think it's raycasting? These consoles have hardware rasterizers. Surely reflections are done using (for example) render targets, cube maps or screen-space reflections like in Crysis 2, as has been common ever since titles like HL2: Lost Coast and so on.
 

I'll have a look at what techniques they use. But take a look at this screen-space reflection technique:

Short answer: imagine the image above as a reflection map. It's a 2D image, without any depth information. Now imagine a reflection ray spawned somewhere on the road. How long must the ray be to reflect the correct pixel? Traditional reflection mapping uses a fixed ray length; too long a ray reflects the sky, too short a ray reflects the road itself. Therefore, traditional reflection mapping only works correctly for distant objects (mountains on a background image, for example).

To approximate the correct length, we have to check at what position the ray intersects the geometry. That's what normal raytracing does, but it's too expensive. That's where the depth map comes in. Imagine that the reflection map also stores depth information. Then we could iterate over every pixel between the ray's start and end, compare the ray's depth to the corresponding depth-map texel's depth, and decide whether it intersects. That way the correct texel is reflected.
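A minimal sketch of that search, in CPU-side C++ for clarity (my own illustration, not the paper's code; the names and the "larger depth = farther" convention are assumptions): project the ray's start and end into the map once, then walk the texels in between, comparing depths.

[code]
#include <algorithm>
#include <cstdlib>

// Hypothetical reflection map that also stores per-texel depth.
struct DepthMap {
    int width, height;
    const float* depth;                          // larger value = farther away
    float at(int x, int y) const { return depth[y * width + x]; }
};

// Walk the texels between the projected ray start (x0, y0, d0) and the
// projected ray end (x1, y1, d1), comparing the interpolated ray depth with
// the stored depth. The first texel where the ray falls behind the stored
// surface is taken as the intersection.
bool marchRay(const DepthMap& dm,
              int x0, int y0, float d0,
              int x1, int y1, float d1,
              int& hitX, int& hitY)
{
    int steps = std::max(std::abs(x1 - x0), std::abs(y1 - y0));
    for (int i = 1; i <= steps; ++i) {
        float t = float(i) / float(steps);
        int x = x0 + int((x1 - x0) * t);
        int y = y0 + int((y1 - y0) * t);
        if (x < 0 || y < 0 || x >= dm.width || y >= dm.height)
            return false;                        // ray left the map: no hit
        float rayDepth = d0 + (d1 - d0) * t;     // a real SSR interpolates 1/z
        if (rayDepth >= dm.at(x, y)) {           // ray passed behind geometry
            hitX = x;
            hitY = y;
            return true;
        }
    }
    return false;
}
[/code]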

Since checking every texel is expensive too, there are several ways to reduce the cost. We could, for example, iterate over a very low-res depth map to get an idea of where the ray intersects and then use a binary search in a higher-res map, as sketched below.
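Sketching that refinement (again my own illustration of the idea, not the paper's code): sample(t) is an assumed helper returning the ray's depth minus the stored depth at the ray's projected texel for parameter t, so a sign change marks the intersection.

[code]
#include <functional>

// Coarse march with a big stride (e.g. against a low-res depth map), then
// binary-search the stride in which the ray first crossed behind the surface.
float findHit(const std::function<float(float)>& sample,
              int coarseSteps, int refineSteps)
{
    float prevT = 0.0f;
    for (int i = 1; i <= coarseSteps; ++i) {
        float t = float(i) / float(coarseSteps);
        if (sample(t) >= 0.0f) {                 // crossed behind the surface
            float lo = prevT, hi = t;            // hit lies inside [lo, hi]
            for (int j = 0; j < refineSteps; ++j) {
                float mid = 0.5f * (lo + hi);
                if (sample(mid) >= 0.0f) hi = mid;   // still behind: step back
                else                     lo = mid;   // in front: step forward
            }
            return hi;                           // refined hit parameter
        }
        prevT = t;
    }
    return -1.0f;                                // no intersection found
}
[/code]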

The paper I referred to in my previous post explains it in more detail, and also shows the huge errors that occur when not raycasting.

[EDIT] BTW, doesn't HL2 run on the id Tech engine? I think even Doom 3 draws mirrored views to do reflections. Quake does for sure (I have some hands-on experience with those engines, as some may remember).
 
I'm sorry, but I'm 95% certain that you are incorrect here.

Several parts of this image suggest to me this isn't a form of screen-space reflection (and if you don't trust me, look up the first webpage result for "screen space reflections" on Google or Bing).

The simplest example I can give you is the tree to the right; the tree in the background has very little aliasing - few holes through to the background - and it has a building behind it. This isn't the case with the significantly lower-resolution reflected tree. (There are other examples I could point out, but most are the same thing: missing background detail.)

So what is it then?
The obvious answer is almost always the correct one, and the obvious answer (when talking about a game developed by highly skilled professionals) usually means the simplest. Anything with 'Ray' in it immediately disqualifies itself. :mrgreen:

So what is one of the simplest (and oldest) ways to do a decent reflection? A planar reflection. Simple, bog-standard planar reflection. Assume the ground is an infinitely flat plane and reflect the camera in that plane. Have it 'look up' through the ground, and render that to a texture. Reproject from the point of view of the reflected camera, and boom - you have a reflection.

Here is the first example I could find (from 2001):
http://www.bluevoid.com/opengl/sig00/advanced00/notes/node167.html

It looks to me like exactly this. Add a mip-chain (allowing for glossy fake blur) and a bit of displacement to fake up rough surfaces, and you have an extremely cheap and convincing effect.
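For the record, here is roughly what that setup looks like - a generic sketch assuming a ground plane at y = 0, not this game's actual code (the engine calls in the usage comment are hypothetical):

[code]
struct Vec3 { float x, y, z; };

struct Camera {
    Vec3 position;
    Vec3 forward;   // look direction
    Vec3 up;
};

// Mirror the camera across the plane y = 0: flip the y components of its
// position and orientation so it 'looks up' through the ground. Rendering
// the scene from this camera into a texture gives the planar reflection.
Camera reflectAcrossGround(const Camera& cam)
{
    Camera r = cam;
    r.position.y = -r.position.y;
    r.forward.y  = -r.forward.y;
    r.up.y       = -r.up.y;
    return r;
}

// Usage sketch (hypothetical engine calls):
//   Camera mirrored = reflectAcrossGround(mainCamera);
//   bindRenderTarget(reflectionTexture);
//   renderScene(mirrored);
//   generateMipChain(reflectionTexture);   // lower mips give the glossy blur
//   ...the ground shader then samples reflectionTexture at the screen
//   position, offset slightly by the surface normal to fake roughness.
[/code]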

What makes me think it's this?

A) It's what I'd do.
B) In the distance the road changes angle, and the reflection changes angle too (in the wrong way - infinite plane assumption broken)
C) You can see clear evidence of LODing (e.g. the tree, the lack of buildings, etc)

So it's a really basic, cheap effect - but that's the point: as long as you can work around the limitations (I bet the road doesn't reflect as much when changing elevation angle), you have everything you want; it looks good and it's fast. That's graphics programming: cheating to make something simple appear complex.

As for the resolution difference (as minor as it is?), I suspect it's unlikely to be performance related. A texture that low-res, with that little detail, is unlikely to be a significant bottleneck in a complex deferred renderer. Memory use or texture cache size, I'd expect, would be a bigger factor. Either that, or in the time since the 360 version's release they have been optimising their engine.
 

No need to be sorry, I appreciate your detailed answer. The link I provided to Grall is the third hit in that Google search, btw. I agree that a mirrored view is quick; I think Unreal 1 featured it (probably not as a texture though, and really upside down). :) Yet this solution doesn't cover everything seen here, so I have a few questions before I take it for granted - hope you have some time to look at them:

- LODing and other missing stuff isn't really restricted to one type of reflection mapping, right?
- I don't see any changes of angle, except for the T intersection on the left side. I think the road in front of the car is completely flat (I actually drove to the place to check it out, but I can look again). I don't see why the reflection would be wrong there (I suppose you mean it's either too high or too low). Could you please indicate what point I should look at?
- When using the bumper cam at that exact spot in the pic I posted, the tree doesn't look that scattered; it looks much more dense. I thought of this as evidence for raycasting: if it were a mirrored view it would still look like a low-res reflected tree, but with a ray you'd get better sample resolution because the ray is less steep. I'll have another look to make sure it's not just due to perspective (but it shouldn't be). How would you explain this?

I'll have a look and see if I can find those pointers you gave. I agree, drawing such a low-res image doesn't use much fillrate. And if it is just a mirrored view, then my theory is debunked. :) And in that case 320 shaders would be out of the question, right?
 
Well, I think you guys have officially brought me over to the dark side. More and more it seems to me we are looking at a 160:8:8 configuration. No, that's not a typo. I'm convinced that DF has mislabeled the texture units. Except for being aligned with the shader blocks, they don't look anything like the TMUs in any Radeon design I've seen, and not being adjacent to any L1 cache makes no sense.

I'll refer to the breakdown here: http://www.conductunbecoming.ie/wp-content/uploads/2013/02/wiiudie_blocks.jpg

The T blocks much more closely resemble the TMUs on RV770, and I feel like the real giveaway is the position of the S blocks adjacent to them. Meanwhile, if we figure that each of those SRAM banks is actually 4kB (as I mentioned in a previous post), then the S blocks start looking a lot like the 8kB L1 caches in the R700 series.

Immediately this raises the question of how Nintendo are getting the performance they are getting out of merely 8 TMUs. Well, there are the CoD4 benchmarks that function referenced (yes, I'm doing a complete flip-flop on them). They seem to show that a 160 shader/8 TMU part is capable of that sort of performance, especially when we take into account the pool of fast eDRAM (which should allow Wii U to perform closer to the GDDR5 version). There's also the more sophisticated texture cache setup compared to the 360. The internal bandwidth of the RV770 was doubled over the previous generation (and who knows how much over the 32kB texture cache of Xenos) to 480GB/s for an L1 fetch and 384GB/s for an L2 fetch. I'm thinking the U blocks might be L2s with somewhere between 32kB and 64kB each. This would seem like a balanced setup taking everything into account. Nintendo have described the system as having a focus on memory, so I can definitely see them trying to squeeze Xbox 360 performance out of less logic.

Blu has shown that the CPU may not be as much of a bottleneck as previously thought. Yes, it seems to choke when there are a ton of characters on screen, but it doesn't seem like it's constantly gimping the rest of the system. That would just be poor design and although Wii U may have less horsepower than many of us would like, we haven't heard any reports of it being unbalanced. Quite the contrary, in fact.

Meanwhile, we must take into consideration how Wii BC is being achieved. It doesn't appear that there is a straight-up GX on the die, nor is there any translation being done on the CPU, so software emulation is out. Rather, we have Shiota's comments that they adjusted the new components so that they could operate as the old ones in Wii mode. This might make sense of the size of the shader blocks. Basically, I'm suggesting that Nintendo/Renesas have modified the VLIW5 architecture at its very core, beefing up the individual SPs so that they can run TEV code natively. They may have made other alterations as well. I have good reason to believe Jim Morrison when he emphasized that the chip is "custom" and should not be compared to any existing Radeon.

Rounding out this hypothesis, I will suggest there is some other fixed-function logic on there. Nintendo can only be described as hesitant in adopting a modern unified shader architecture, and I can see them not wanting to go "all in" with it on Wii U. For Wii compatibility (and possibly to help mitigate the low TMU count), it seems possible they have included a couple of the texture coordinate processors described in this patent from the GameCube days: http://www.google.com/patents/US200...a=X&ei=eqltUdSMLs-p4APl8oD4BQ&ved=0CDcQ6AEwAA

Meanwhile, block I looks a heck of a lot like the thread dispatch processors identified on Tahiti. This placement would make sense, as I'd have the D block containing the Instruction Cache and Constant Cache. The J blocks would then be the fixed-function interpolators found in the R700 series, perhaps beefed up a bit with some additional logic (something to perform the role of filling up the 1MB texture cache in Wii mode, which was formerly done by the hardware T&L unit).

I'll stop here, since this post is getting long. Some of this may seem outlandish perhaps, but it is the best sense I've been able to make of this ridiculous chip. haha. I welcome any comments/critiques from you guys.
 
The 750GX is based on the 750FX, and both are still pretty similar to the 750CXe. (IIRC, aside from the cache increase, they increased the cache bus size, added something like 5 registers, changed the branch handling, and fixed a handful of exceptions out of a large list.)

A fair amount of code doesn't even need to be recompiled to go from a CXe to an FX or GX, etc.

The FX and GX are pin-compatible, so if some Nintendo-customized variants in Espresso were based off these, all three cores should use the same socket. I'd imagine this would be good for production, yes?


Would it help if I posted some documentation detailing the differences?

Yeah, I'll just post links to the info so I don't miss/twist anything, and see if it's of any use to you guys.

https://www-01.ibm.com/chips/techlib/techlib.nsf/techdocs/291C8D0EF3EAEC1687256B72005C745C

(This one talks about the 750 series' SMP capability - the first mention I've ever seen.)

https://www-01.ibm.com/chips/techlib/techlib.nsf/techdocs/2A584218A2E9696187256B0600032778

'Preliminary. This application note describes the programming model, performance, package, and power differences among the PowerPC 750FX, the PowerPC 750, and the PowerPC 750CXe processors.'

Hope this information is helpful.
 
By GX, I was actually talking about the graphics processor portion of Hollywood (as opposed to the ARM core, north bridge, etc). Actually, I've only heard Marcan refer to it as that - I always thought GX was the API. My apologies for the confusion. But those links seem interesting regardless, so thanks a bunch!
 

Oh yeah, GX was also the name of the Cube and Wii's graphics API.

http://devkitpro.org/wiki/libogc/GX

I do believe I have heard repeated rumblings of Wii U using GX2.

But nothing tangible. Honestly, if that is what Wii U's graphics API is named, then after recent information from devs, I don't even know if third parties working on launch Wii U games even had documentation on it...

So I'm not entirely surprised I haven't been able to scrounge anything solid up.
 


Well, if the VGLeaks stuff is to be believed (not sure if it is), then the Wii U's API is indeed called GX2 and the GPU itself was being referred to as GPU7.
 

VGLeaks is right often, and wrong often. I imagine they do the best they can, but I prefer hard documents, or at least tangible evidence.

At the time, there was no gx2 directory at warioworld.com - it returned a 404 page.

HOWEVER, chalk up a point for VGLeaks: gx2 now brings up a login page.

https://www.warioworld.com/gx2

It's confirmed.
 


Yup :smile: Well, in that case, we can maybe take the rest of that particular leak a bit more seriously (i.e., more confidently assume it's correct).


Edit: Although it doesn't give much more detail than that, really; only the Shader Model 4.0 & DirectX 10.1 bit, which everybody seems to be assuming is correct anyway.
 

I don't know what your link was supposed to go to, but mine goes to a 401 page, not 404.

However, if you have access to WarioWorld and/or reason to know, and you're telling me it's no bueno, I'll have to amend my personal confirmation.

www.warioworld.com/bogus

This is the kind of 404 page I would typically get for a directory on WarioWorld that doesn't exist. Since this one didn't ask for authorization, I figured the one that did was legitimate. But it looks like that's not necessarily the case.
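For what it's worth, this kind of probing can be scripted. A minimal sketch using libcurl, issuing a HEAD request per URL and printing the status code (404 = directory doesn't exist, 401 = exists but wants credentials; the URLs are just the ones above):

[code]
#include <curl/curl.h>
#include <cstdio>

// Issue a HEAD request and return the HTTP status code (-1 on failure).
long probe(const char* url)
{
    CURL* curl = curl_easy_init();
    if (!curl) return -1;
    curl_easy_setopt(curl, CURLOPT_URL, url);
    curl_easy_setopt(curl, CURLOPT_NOBODY, 1L);          // HEAD, no body
    curl_easy_setopt(curl, CURLOPT_FOLLOWLOCATION, 0L);  // keep the raw status
    long code = -1;
    if (curl_easy_perform(curl) == CURLE_OK)
        curl_easy_getinfo(curl, CURLINFO_RESPONSE_CODE, &code);
    curl_easy_cleanup(curl);
    return code;
}

int main()
{
    curl_global_init(CURL_GLOBAL_DEFAULT);
    std::printf("/gx2   -> %ld\n", probe("https://www.warioworld.com/gx2"));
    std::printf("/bogus -> %ld\n", probe("https://www.warioworld.com/bogus"));
    curl_global_cleanup();
    return 0;
}
[/code]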

I may be taking back the praise I gave VGLeaks?

Oh well, I suppose I'm back to wondering what Wii U's graphics API is.
 
Apparently Wii U specs have received a dramatic upgrade.

Technical Specifications

Processors

The CPU and GPU are built on the same package.
CPU: IBM PowerPC 7xx-based tri-core processor "Espresso" clocked at 1.24 GHz before the 3.0.0 update, 3.24 GHz after the 3.0.0 update. This is an evolution of the Broadway chip used in the Wii, is 64-bit and uses Power6 technology (note: while IBM has said that Nintendo has licensed the Power7 tech from IBM, Nintendo is not using it for the Wii U, explaining its backwards compatibility).

GPU: AMD Radeon High Definition processor codenamed "Latte" with an eDRAM cache built onto the die clocked at 550 MHz before the 3.0.0 update, 800 MHz after the 3.0.0 update.
 