No tiling = why Gears of War looks so good?

jAkUp said:
Jacob- Will UE3.0 support predicated tiling to make use of 4xAA on Xbox 360?

Sweeney- Gears of War runs natively at 1280x720p without multisampling. MSAA performance doesn't scale well to next-generation deferred rendering techniques, which UE3 uses extensively for shadowing, particle systems, and fog.

Oh the PR! Of course he means *his* next-generation deferred rendering techniques, but that would be mincing words :LOL: His comments on MegaTexturing (or lack thereof, in many ways) show they are in full PR mode. Hard to blame them, as they already have something like two dozen UE3 games in the pipe and are key middleware providers to both Sony and MS.
 
lol I didn't catch that. I was thinking that he was only speaking of UE3.0.

ERP said:
they do their shadows in screen space before the downsample
Sorry for the noob question, but why go for screen space? (where are shadows "normally" done?) :oops:
 
Alstrong said:
lol I didn't catch that. I was thinking that he was only speaking of UE3.0.


Sorry for the noob question, but why go for screen space? (where are shadows "normally" done?) :oops:

There are many ways to do it...
But the traditional method is to resubmit the receiving polygons and use the shadow map to determine if they are in shadow.

What Unreal does is reverse-transform the pixels in the frame buffer, which has the advantage that its cost is unrelated to the complexity of the receiving geometry, with the obvious disadvantage that if more samples are affected, the cost goes up linearly with that. You also don't have to worry about which bits of geometry receive shadows in this model.

Whether it's a performance win depends on the likely complexity of the shadow receivers, and on how effectively you can cull pixels and triangles in the first case.

It's an interesting approach. From what I've seen of GoW, I'm a bit surprised Epic thinks it's a win performance-wise, but without trying both methods it's a difficult call.
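ERP's description can be sketched in a few lines. This is a toy illustration, not Epic's actual code: the camera model, the light setup, and the single hard-coded occluder are all my own assumptions. It shows the core idea: reverse-transform a frame-buffer pixel back to world space from its depth value, then project that position into shadow-map space for the depth compare.

```python
# Toy sketch (assumptions mine, not Epic's code): camera sits at the origin
# looking down -Z with focal length F; a directional light points straight
# down -Y, and its orthographic shadow map contains one flat occluder.

F = 1.0          # focal length of the perspective projection
LIGHT_Y = 10.0   # height of the light plane for the ortho shadow map

def project(p):
    """Camera pass: world point -> (ndc_x, ndc_y, linear_depth), i.e. what
    ends up in the frame/depth buffer for that pixel."""
    x, y, z = p
    d = -z                       # linear view-space depth (camera looks down -Z)
    return (F * x / d, F * y / d, d)

def reconstruct(ndc_x, ndc_y, d):
    """The 'reverse transform': recover a pixel's world-space position from
    its screen coordinates and depth-buffer value alone -- no receiver
    geometry is resubmitted."""
    return (ndc_x * d / F, ndc_y * d / F, -d)

def shadow_map_depth(x, z):
    """Ortho shadow map for the downward light: one occluder, a flat square
    at y=2 covering x in [-1, 1] and z in [-4, -2]."""
    if -1.0 <= x <= 1.0 and -4.0 <= z <= -2.0:
        return LIGHT_Y - 2.0     # light-space depth of the occluder
    return float('inf')          # nothing blocks the light here

def in_shadow(pixel):
    """Full-screen shadow pass applied to one G-buffer pixel."""
    x, y, z = reconstruct(*pixel)
    receiver_depth = LIGHT_Y - y            # this pixel's depth in light space
    return receiver_depth > shadow_map_depth(x, z) + 1e-6

# Two points on the ground plane (y=0): one under the occluder, one beside it.
shadowed_px = project((0.0, 0.0, -3.0))     # under the occluder
lit_px      = project((2.0, 0.0, -3.0))     # off to the side
```

Note that the cost of `in_shadow` depends only on how many pixels (or samples) it is run over, which is exactly why the cost multiplies with the MSAA sample count if it runs before the downsample.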
 
I hope it will come as a demo on XBL Marketplace :D. Anyway, besides the impressive visuals it seems to be a very cool game.
 
Sweeney said:
MSAA performance doesn't scale well to next-generation deferred rendering techniques, which UE3 uses extensively for shadowing, particle systems, and fog.

The key part here is deferred rendering; afaik every deferred rendering technique will have problems with MSAA to some extent. So while it's not true that all next-gen rendering engines will have problems with MSAA, it is true that MSAA performance doesn't scale well to deferred rendering techniques, be they next gen or not. (Not sure about any current game using DR, though.)
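A one-liner on why the two clash (my own illustration, not from any shipping engine): an MSAA resolve averages samples, which is fine for final colors but meaningless for G-buffer attributes such as normals.

```python
# Averaging G-buffer samples across a silhouette edge (illustrative example):
# a front-facing and a back-facing normal resolve to the zero vector,
# which no lighting equation can do anything sensible with.

def resolve(samples):
    """Box-filter MSAA resolve over 3-component samples."""
    n = len(samples)
    return tuple(sum(s[i] for s in samples) / n for i in range(3))

front = (0.0, 0.0, 1.0)   # normal of a surface facing the camera
back  = (0.0, 0.0, -1.0)  # normal of the surface behind the edge

degenerate = resolve([front, back])  # zero-length "normal": lighting breaks
```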
 
I was doing a little more searching on deferred rendering, and Google came up with a presentation from NVIDIA from the 6800 era: http://download.nvidia.com/develope...800_Leagues/6800_Leagues_Deferred_Shading.pdf

ERP said:
UE3 works fine with the EDRAM, what it doesn't work well with is MSAA in general, because they do their shadows in screen space before the downsample, the cost of them is multiplied by the amount of AA.

Does that mean the performance hit would be not unlike using XxSSAA? (just different bottlenecks)
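To put rough numbers on ERP's point (sample counts only; this counts shadow evaluations, not measured frame times):

```python
# If the screen-space shadow pass runs per *sample*, before the MSAA
# downsample, its work scales with the sample count -- exactly like
# supersampling would, even though the rest of the frame only pays
# normal MSAA costs.

width, height = 1280, 720
pixels = width * height

for msaa in (1, 2, 4):
    samples = pixels * msaa
    print(f"{msaa}x: {samples:,} shadow evaluations per light")
```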
 
If Gears of War isn't using predicated tiling, does that mean Epic has more flexibility in how they can use the CPU cores and their vector engines?
 
Brimstone said:
If Gears of War isn't using predicated tiling, does that mean Epic has more flexibility in how they can use the CPU cores and their vector engines?
No. Predicated tiling shouldn't be loading the vector engines and it shouldn't require significant CPU time.
 
Alstrong said:
I was doing a little more searching on deferred rendering, and Google came up with a presentation from NVIDIA from the 6800 era:
No need to search far. We've got a great article right here at B3D from Deano Calver:
http://www.beyond3d.com/articles/deflight/
Shadow maps are very easy to support under deferred lighting and have very good performance. The key is using the little-used variant known as forward shadow mapping [16]. In standard shadow mapping the shadow map is projected onto the object and the depths compared. With forward shadow mapping the object's position is projected into shadow-map space and the depths compared there.

When first reading ERP's info about UE3's shadowing technique, I figured it was deferred rendering. I'm still baffled as to why you get a performance win, though, especially on these reduced bandwidth consoles with high speed overdraw reduction.

The only thing I can imagine is getting a speedup by using bounding volumes for the lights. With deferred rendering, using a low poly sphere with the stencil buffer can isolate lighting calculations to lit pixels only. Still, culling via dynamic branching instead should be more effective, especially considering the amount of data transfer needed in deferred rendering. For XB360, just rendering the normals seems like it would require tiling, even without AA.
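Mintmaster's tiling point checks out on a napkin. Assuming 32-bit color, depth, and normal buffers (the formats are my assumption) against Xenos's 10 MB of EDRAM:

```python
# EDRAM budget at 1280x720. Buffer formats (4 bytes each for color, depth,
# and normals) are assumptions for illustration.

EDRAM_MB = 10
W, H = 1280, 720
BYTES_PER_SAMPLE = 4

def footprint_mb(n_buffers, msaa=1):
    """Total EDRAM footprint in MB for n 32-bit buffers at a given MSAA level."""
    return W * H * msaa * n_buffers * BYTES_PER_SAMPLE / (1024 * 1024)

color_depth    = footprint_mb(2)          # ~7.0 MB: fits without tiling
plus_normals   = footprint_mb(3)          # ~10.5 MB: over budget already
color_depth_4x = footprint_mb(2, msaa=4)  # ~28.1 MB: tiling unavoidable
```

So even with no AA at all, adding a single extra 32-bit G-buffer target pushes past the 10 MB.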

Of course, I might be completely wrong in thinking how they're doing deferred rendering.
 
Mintmaster said:
When first reading ERP's info about UE3's shadowing technique, I figured it was deferred rendering. I'm still baffled as to why you get a performance win, though, especially on these reduced bandwidth consoles with high speed overdraw reduction.
Computing a per-pixel occlusion term with a full-screen pass is way more efficient than computing it in your color pass or in any other rendering pass... (even more so on a unified architecture :) )
And obviously you don't need to fully go deferred..
 
Hmm, I don't see what you're saying.

When you say "per-pixel occlusion term", are you just talking about shadowing? If you defer the shadows, then you need at least a z value from the framebuffer to figure out the position of the pixel, and then you have to figure out the coordinates of the shadow map texel to fetch.

That's already more work than ordinary shadow mapping, because in the traditional method the vertex shaders tell you the texture coordinates during the color pass (or other pass) using geometry. It seems like the only way you really save work is if you don't do a full screen, and instead use some bounding volume for a light's extent. Use zfail and backfaces to mark (stencil) screen pixels potentially in the volume, and then with the frontfaces only calculate shadows for marked pixels (and clear the stencil buffer simultaneously).
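The volume-marking idea above reduces, per pixel, to a depth-bounds test. A toy sketch (my own simplification; real implementations use the stencil hardware rather than an explicit comparison):

```python
# Per pixel, the zfail/backface stencil trick answers one question: does the
# stored scene depth fall inside the light volume along this view ray?
# Only pixels for which that is true get the expensive shadow/lighting math.

def inside_light_volume(scene_depth, vol_front, vol_back):
    """True if this pixel's geometry lies between the volume's front and
    back faces, i.e. it can actually be touched by the light."""
    return vol_front <= scene_depth <= vol_back

# A light volume spanning depths [5, 8] along one pixel's view ray:
shade_a = inside_light_volume(6.0, 5.0, 8.0)   # inside -> shade it
shade_b = inside_light_volume(3.0, 5.0, 8.0)   # in front -> skip
shade_c = inside_light_volume(9.5, 5.0, 8.0)   # behind -> skip
```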

And if you don't write out the normals, what's the point in going deferred? It's basically just a multipass renderer then, which has lots of redundant work. Where are the savings coming from?

:???: A bit confused here...
 
Mintmaster said:
Hmm, I don't see what you're saying.

When you say "per-pixel occlusion term", are you just talking about shadowing?
Yes
If you defer the shadows, then you need at least a z value from the framebuffer to figure out the position of the pixel, and then you have to figure out the coordinates of the shadow map texel to fetch.
Yes :)
That's already more work than ordinary shadow mapping, because in the traditional method the vertex shaders tell you the texture coordinates during the color pass (or other pass) using geometry.
That's correct.
It seems like the only way you really save work is if you don't do a full screen, and instead use some bounding volume for a light's extent.
That's not the only way, because the non-traditional way that was proposed here is way more efficient than the traditional way, so even though it requires more work it might be faster in the end. Furthermore, once you have a per-pixel depth value you can use this information to do some other effects (DOF, motion blur without a velocity buffer on static stuff, etc.).
Use zfail and backfaces to mark (stencil) screen pixels potentially in the volume, and then with the frontfaces only calculate shadows for marked pixels (and clear the stencil buffer simultaneously).
This is another interesting optimization. What about marking different objects with different stencil masks and applying different shadowing algorithms to them? :)
And if you don't write out the normals, what's the point in going deferred? It's basically just a multipass renderer then, which has lots of redundant work. Where are the savings coming from?
The point is you get some additional per pixel info that can be reused in many interesting algorithms..
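One concrete example of "additional per-pixel info that can be reused": a minimal depth-of-field driver (the formula is my own simplification, not anything from UE3):

```python
# Depth of field from the depth buffer alone: blur radius (circle of
# confusion) grows with distance from the focal plane. Simplified formula
# for illustration; units are arbitrary.

def circle_of_confusion(depth, focus_dist, strength=1.0):
    """Blur radius for a pixel at `depth` when the camera is focused at
    `focus_dist`."""
    return strength * abs(depth - focus_dist) / depth
```

A pixel exactly at the focal plane gets zero blur; pixels further away get progressively larger blur kernels, all driven by the same per-pixel depth the shadow pass already needed.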

P.S. Has FFXIII impressed you? :)
 
3dcgi said:
No. Predicated tiling shouldn't be loading the vector engines and it shouldn't require significant CPU time.

I thought predicated tiling prevented the use of the vector engines for certain things because it caused incompatibility issues? Since Epic isn't using predicated tiling, they're free to have them generate geometry or whatever.
 
Brimstone said:
I thought predicated tiling prevented the use of the vector engines for certain things because it caused incompatibility issues? Since Epic isn't using predicated tiling, they're free to have them generate geometry or whatever.
I've never heard of it loading the vector engines and I can't think of why it would.
 
3dcgi said:
I've never heard of it loading the vector engines and I can't think of why it would.

I believe he is talking about the VMX units on the Xenon cores being used for data streaming to the GPU. In which case it does not play nice with auto tiling.
 
Acert93 said:
I believe he is talking about the VMX units on the Xenon cores being used for data streaming to the GPU. In which case it does not play nice with auto tiling.
If Brimstone's talking about generating procedural geometry and having Xenos read from the L2 then it's a big stretch to say the vector units are disabled during tiling. It might be that tiling makes using this feature more difficult, but I bet no current game uses it. The only difficulty I see is you might have to recalculate the geometry for each tile.
 