*spin-off* Deferred Rendering & 360

You can always choose deferred light pre-pass rendering as an alternative which would also allow for more material variety.
I disagree that "light pre-pass" allows more material variety - in fact it's just the opposite. The "pre-pass" requires refactoring your BRDFs into separate diffuse and specular components and although you can vary how you combine these, that's rarely useful; all useful variation happens in the computation of those two components themselves, or not even splitting up the BRDF like that at all.

Fully deferred rendering in contrast is practically a compiler transformation of the forward shader... arbitrary inputs and arbitrary outputs. The only assumption at all is that lighting from different sources is added together, and that's hardly an assumption/restriction - every renderer that I know of does that.
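
To make that additive assumption concrete, here is a minimal CPU-side sketch; all the types and the Lambert placeholder BRDF are illustrative, not from any particular engine:

```cpp
#include <cstddef>

struct float3 { float x, y, z; };

static float3 operator+(float3 a, float3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }

struct GBufferSample { float3 albedo, normal; };   // arbitrary inputs in practice
struct Light { float3 direction, color; };

// Placeholder BRDF (simple Lambert); any forward-shader BRDF drops in here.
float3 evalBRDF(const GBufferSample& g, const Light& l) {
    float ndotl = g.normal.x * l.direction.x + g.normal.y * l.direction.y
                + g.normal.z * l.direction.z;
    if (ndotl < 0.0f) ndotl = 0.0f;
    return {g.albedo.x * l.color.x * ndotl,
            g.albedo.y * l.color.y * ndotl,
            g.albedo.z * l.color.z * ndotl};
}

// The one structural assumption fully deferred makes: contributions sum.
float3 shadePixel(const GBufferSample& g, const Light* lights, size_t count) {
    float3 radiance = {0, 0, 0};
    for (size_t i = 0; i < count; ++i)
        radiance = radiance + evalBRDF(g, lights[i]);
    return radiance;
}
```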

Consoles whose branching is significantly more poorly scheduled than their rasterizer output can tip the practical choice for this generation's multi-platform SKUs, but at best light pre-pass is a minor bandwidth optimization (with a flexibility cost!) over fully deferred... and fully tiled/deferred is better than both by a long shot.
 
You can always choose deferred light pre-pass rendering as an alternative which would also allow for more material variety. Can anyone point out the pro/con features of going fully deferred vs. light pre-pass? I personally get the impression that light pre-pass is easier to implement across multiple platforms (GTA4/RDR, Dead Space 2, Crysis 2) or even on the 360 (Halo Reach) - but there must be some disadvantages too.

Well with light prepass you can totally avoid MRT, which helps for certain hardware (either because it's really slow, or because it has a side effect like the Xbox 360's tiling). It can also be a little nicer for offloading to SPUs, since you have significantly less G-buffer data to DMA in. The big downsides are that you draw all of your geo twice, and MSAA becomes tricky and expensive if you want to "do it right". This is because you have to work at a subsample level not just during lighting, but also when you're combining it with the materials in the second geo pass.
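
As a rough sketch of that pass structure, with hypothetical stand-ins for the real render passes:

```cpp
#include <vector>

// Illustrative types/stubs only.
struct Light {};
struct Scene { std::vector<Light> lights; };
enum GeoPassMode { WRITE_NORMAL_DEPTH, APPLY_MATERIALS };

void drawGeometry(const Scene&, GeoPassMode) { /* submit all draw calls */ }
void accumulateLight(const Light&)           { /* additive blend into light buffer */ }

// Light pre-pass frame: note the geometry is submitted twice,
// which is the big fixed cost discussed above.
void renderFrameLPP(const Scene& scene) {
    drawGeometry(scene, WRITE_NORMAL_DEPTH);   // thin G-buffer: normals + depth, no MRT
    for (const Light& l : scene.lights)
        accumulateLight(l);                    // diffuse/specular into the light buffer
    drawGeometry(scene, APPLY_MATERIALS);      // second geo pass: combine with materials
}
```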

The "material variety" thing is overstated, IMO. You don't really get much more variety than traditional deferred, since both techniques have the same problem of making it difficult to use more than one BRDF. Light prepass really just gives you more options with regards to how you combine your lighting with your material properties (for instance you can do multiple layers where you do lighting * albedo for each layer and then blend the results), and also makes it a little more natural/convenient to integrate baked lighting.
 
So basically light pre-pass is actually more expensive than fully deferred (due to the second geometry pass), but fully deferred shading requires more memory for the buffers?

For scenes with few overlapping lights, yeah. Once you ramp up the number of lights in the scene, LPP and fully deferred may even end up with similar frame times as the bottleneck shifts.
 
For scenes with few overlapping lights, yeah. Once you ramp up the number of lights in the scene, LPP and fully deferred may even end up with similar frame times as the bottleneck shifts.
Right. Simply put, LPP adds an extra big constant cost (retransforming/rasterizing the scene) but pays a slightly smaller incremental bandwidth cost per light. Thus if you have enough overlapping lights, LPP will win by a small constant factor, but they both scale the same. In reality, LPP is mostly to work around hardware constraints like slow MRT, etc.
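
A back-of-the-envelope version of that scaling argument; the constants are hypothetical, only the shape of the curves matters:

```cpp
// G      = cost of transforming/rasterizing the scene once
// bFull  = per-light bandwidth cost, fully deferred (fat G-buffer reads)
// bLpp   = per-light bandwidth cost, light pre-pass (thin G-buffer reads)
// Both scale linearly in the light count n; LPP only pulls ahead once
// n * (bFull - bLpp) exceeds the extra geometry pass G.
float deferredCost(float G, float bFull, int n) { return G + n * bFull; }
float lppCost(float G, float bLpp, int n)       { return 2.0f * G + n * bLpp; }
```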

And again, tiled like BF3 is doing is far better than both.
 
I disagree that "light pre-pass" allows more material variety - in fact it's just the opposite. The "pre-pass" requires refactoring your BRDFs into separate diffuse and specular components and although you can vary how you combine these, that's rarely useful; all useful variation happens in the computation of those two components themselves, or not even splitting up the BRDF like that at all.

We're about to ship an Xbox 360 title with a light pre-pass renderer. We use it to have significantly different shading on characters, with more texture maps (e.g. a fake SSS map, a material map switching between skin/cloth/metal shading, a separate environment map mask), and a significantly heavier shader (said skin/cloth/metal splitting, more physically correct specular, custom character lighting, etc.). Deferring all these additional attributes to G-buffer channels and running the heavier shader over the entire screen would be impractical, and now that we're trying to go towards fully deferred (TBH, we're first trying the single-pass LPP variation), we have trouble replicating that functionality.

One significant downside for LPP which I rarely hear mentioned is the need for better than 8-bit precision for the light buffer. Since we work at full 1280x720 (instead of the much more sensible 1152x720 - which we'll use in the future, most likely) we couldn't afford 16 bpp; 8-bit linear is too low-res in the darks, 8-bit sRGB doesn't sum lights properly, and storing exp(-x) and blending via multiply instead of add in the light buffer makes all shaders considerably heavier due to the exp/log encode/decode operations. We went with a dithering "solution" which non-graphics-programmers don't seem to notice, but which grates on my eyes.
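
For reference, a sketch of that exp(-x) trick; the range constant k is a made-up example value:

```cpp
#include <cmath>

// Encode each light's contribution as exp(-k*L); the hardware's
// multiplicative blend then accumulates exp(-k*(L0+L1+...)),
// so summed lighting survives an 8-bit render target.
const float k = 4.0f;   // hypothetical range constant mapping lights into (0,1]

float encodeLight(float contribution) { return std::exp(-k * contribution); }

// Every shader that reads the light buffer pays this decode,
// which is the per-shader cost complained about above.
float decodeLight(float stored) { return -std::log(stored) / k; }
```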
 
Right. Simply put, LPP adds an extra big constant cost (retransforming/rasterizing the scene) but pays a slightly smaller incremental bandwidth cost per light. Thus if you have enough overlapping lights, LPP will win by a small constant factor, but they both scale the same. In reality, LPP is mostly to work around hardware constraints like slow MRT, etc.

And again, tiled like BF3 is doing is far better than both.
You can also do single pass light prepass (I thought Crysis 2 also does this). Just render all your MRTs in your geometry pass. This pass is basically the same as the fully deferred pass, but after the first pass you need to sample one less texture in your lighting shader (color is not needed). So you will gain a bit of performance if you have high light overlap. The only downside is that you need to sample the lighting buffer after you have rendered all your (local) light sources and do the rest of the lighting equation. Usually this doesn't add any draw calls or full-screen passes, since you'll likely apply full-screen sunlight and ambient at some point. So the extra cost is really minimal (just one extra tfetch, basically).
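
A minimal sketch of that combined full-screen pass, assuming the usual sun + ambient resolve (names illustrative):

```cpp
struct float3 { float x, y, z; };

static float3 operator+(float3 a, float3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
static float3 operator*(float3 a, float3 b) { return {a.x * b.x, a.y * b.y, a.z * b.z}; }

// The sun/ambient pass you run anyway also folds in the accumulated
// local lighting, so the only extra cost is the one light-buffer fetch.
float3 resolvePixel(float3 localLight,   // accumulated light buffer (local lights)
                    float3 sunLight,     // evaluated here, full screen
                    float3 ambient,
                    float3 albedo) {
    return (localLight + sunLight + ambient) * albedo;
}
```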

We also planned to use single pass light prepass in our new engine (I tend to experiment with everything Crytek devs use :) ). But when I compared it to our old fully deferred tiled lighting system, the light prepass was 3 percent slower on average... and our old tiled fully deferred didn't have depth culling optimizations either (which really boost the perf). So I have to agree with Andrew. Tiled is the future.
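
For readers unfamiliar with the tiled approach, a minimal sketch of the per-tile light culling it relies on (depth-bounds test only; a real implementation also tests the tile's side planes and runs in a compute shader):

```cpp
#include <vector>

struct Light { float viewZ, radius; };     // view-space depth + bounding radius
struct TileBounds { float minZ, maxZ; };   // from the tile's depth min/max

// Depth-bounds reject: this is the "depth culling optimization" that
// boosts tiled deferred when light overlap is high.
bool intersects(const Light& l, const TileBounds& t) {
    return l.viewZ + l.radius >= t.minZ && l.viewZ - l.radius <= t.maxZ;
}

// Each (e.g. 16x16) tile builds a short light list once; every pixel in
// the tile then loops over only that list, touching the G-buffer once.
std::vector<int> cullLightsForTile(const std::vector<Light>& lights,
                                   const TileBounds& tile) {
    std::vector<int> visible;
    for (int i = 0; i < (int)lights.size(); ++i)
        if (intersects(lights[i], tile))
            visible.push_back(i);
    return visible;
}
```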
 
We're about to ship an Xbox 360 title with a light pre-pass renderer. We use it to have significantly different shading on characters, with more texture maps (e.g. a fake SSS map, a material map switching between skin/cloth/metal shading, a separate environment map mask), and a significantly heavier shader (said skin/cloth/metal splitting, more physically correct specular, custom character lighting, etc.). Deferring all these additional attributes to G-buffer channels and running the heavier shader over the entire screen would be impractical, and now that we're trying to go towards fully deferred (TBH, we're first trying the single-pass LPP variation), we have trouble replicating that functionality.
On 360 I could certainly see more G-buffer parameters being impractical, but moving forward I don't think there's going to be a significant issue. Particularly with tiled techniques you touch your G-buffer a very small number of times (even once), so storing quite a lot of parameters in there is not a big deal.

That said, even on current gen hardware it's just a question of whether or not it's cheaper to just lay down a parameter and read it (ideally once) during rasterization of the scene, or regenerate it by re-rasterizing everything. Depending on the particular parameters you can sometimes just store a material index for indirection into a parameter buffer, or if they really are all interpolated vertex params or similar you can just pay the cost and store them with the knowledge that you'll only sample them for pixels that actually use the "heavy" material (assuming decent branching, which is admittedly not guaranteed on current gen consoles).
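
A sketch of that material-index indirection; struct fields and names are purely illustrative:

```cpp
#include <cstdint>
#include <vector>

// One byte in the G-buffer indexes a shared parameter buffer, instead of
// laying every parameter down per pixel.
struct MaterialParams {
    float sssStrength;   // e.g. a fake-SSS term
    float specPower;
    float envMapMask;
};

std::vector<MaterialParams> materialTable;   // bound once for the lighting pass

MaterialParams fetchMaterial(uint8_t materialIndex) {
    return materialTable[materialIndex];     // one indirection per shaded pixel
}
```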

I'm curious though... you mention "more physically correct specular". That's one of the things that is typically tougher to do with LPP renderers since the specular is done in the "middle" light accumulation pass. If you can handle divergence in that pass, you can by definition handle it the same way with a fully-deferred renderer.

You can also do single pass light prepass (I thought Crysis 2 also does this). Just render all your MRTs in your geometry pass.
Yup that's the variant that I showed in my Beyond Programmable Shading slides - indeed it can be slightly faster for lots of overlapping lights, but it scales similarly. Given that if anything it restricts your options a little bit on BRDFs as I argued above, I'm not sure it's really worth it, but to each his own.
 
Depending on the particular parameters you can sometimes just store a material index for indirection into a parameter buffer, or if they really are all interpolated vertex params or similar you can just pay the cost and store them with the knowledge that you'll only sample them for pixels that actually use the "heavy" material (assuming decent branching, which is admittedly not guaranteed on current gen consoles).
If you have uniquely mapped virtual texturing everywhere and world-space normals stored in the virtual texture, your g-buffer can just store the UV into the VT cache. A single 16-16 integer g-buffer is enough for that (in addition to your z-buffer), since caches are no larger than 4096x4096 (in all current virtual-textured engines).

With that layout you can sample the world-space pixel normal and all the material properties (color, specularity, glossiness, etc.) directly from the VT cache in the (tiled) lighting pass, according to the stored texture coordinate. This setup is practical if you have a high number of material properties, since you do not have to store them in the g-buffer in uncompressed format. You read the DXT-compressed textures directly from the VT cache in your light rendering.

Naturally, if you have per-object modifiers to your material properties, you have to bake these modifiers into the virtual texture (since the g-buffer doesn't have any extra storage). But since we already assumed you have unique mapping everywhere (for the unique world-space normals to work), it's trivial to just modify the affected object area in the VT beforehand (or when the tile is loaded into the VT cache).
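
A sketch of that 16+16-bit G-buffer packing (4096 texels per axis needs only 12 bits, so 16 leaves headroom):

```cpp
#include <cstdint>

// Pack a texel coordinate into the VT cache as the entire G-buffer payload
// (besides the z-buffer). u and v are assumed to be in [0, 65535].
uint32_t packVTCoord(uint32_t u, uint32_t v) {
    return (u << 16) | (v & 0xFFFFu);
}

void unpackVTCoord(uint32_t packed, uint32_t& u, uint32_t& v) {
    u = packed >> 16;
    v = packed & 0xFFFFu;
}
// The lighting pass samples normal, albedo, specularity, glossiness, etc.
// from the DXT-compressed cache at (u, v) instead of fat G-buffer channels.
```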
 
Huh... So the Alan Wake presentation didn't really illuminate much about the renderer, just that it was deferred and 100% dynamic lighting (no static). Why so secret... Would be kind of nice to know how it was set up, considering they managed 4xMSAA (even if at 540p).
 