Artists usually want somewhere between 2-3 lights to hit every surface in the 3D world (so that it doesn't look flat; the exact light count is not important, the coverage is). Of course, lights that are behind a wall (shadowed) do not help much, and neither does sunlight in indoor scenes. We have lots of partially indoor/outdoor scenes in Trials Evolution, and indoor sections (big areas where sunlight is blocked) cost more, because you have to add lots of local lights while the sun still has to be calculated for those same pixels even though it's mostly shadowed out. We have fully dynamic lighting and user-created content, so users can create lots of cases like this in their levels, and we have no way to prevent that.

"Also, I don't think it's a good idea to focus on average performance like this, as you'll get some bad framerate dips when you can't cull shadows well due to the scene. At least with thousands of lights you can cap how many lights could affect each pixel through your art assets."
Improving minimum frame rate is of course the ("only") goal (we run at 60 fps with vsync, after all). We want the 2-3 lights per 3D surface to translate to 2-3 lights processed per pixel. Smaller tile sizes and good culling are very important for reaching a constant frame time (fewer border pixels that require a "random, camera-angle-specific" amount of processing). It's important that the visible amount of lighting translates well to processing cost, so that artists/users can get the amount of light they need on each surface.
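To make the tile-size/culling point concrete, here is a minimal CPU-side sketch of per-tile light culling, the same math a compute-shader culling pass would run per (for example) 16x16 pixel screen tile. All names, the pinhole projection and the sphere-vs-AABB test are illustrative assumptions, not code from Trials Evolution; the point is only that tighter tile bounds (smaller tiles, tight min/max depth) shrink the per-tile light list toward the 2-3 lights the artists actually placed on that surface.

[code]
// Minimal CPU-side sketch of per-tile light culling (hypothetical names/layout).
// A GPU implementation would run the same math in a compute shader, one thread
// group per screen tile, with the tile's min/max depth found by a parallel
// reduction over its depth samples.
#include <algorithm>
#include <cstdint>
#include <vector>

struct PointLight { float x, y, z, radius; };                    // view-space position + radius
struct TileBounds { float minX, maxX, minY, maxY, minZ, maxZ; }; // conservative view-space AABB

// Build a view-space AABB for one screen tile from its pixel rectangle and the
// min/max view-space depth found inside that tile. Assumes a simple pinhole
// camera where viewX = ndcX * tanHalfFovX * viewZ (similarly for Y), +Z forward.
TileBounds BuildTileBounds(int tileX, int tileY, int tileSize,
                           int screenW, int screenH,
                           float tanHalfFovX, float tanHalfFovY,
                           float zMin, float zMax)
{
    auto ndcX = [&](int px) { return 2.0f * px / screenW - 1.0f; };
    auto ndcY = [&](int py) { return 1.0f - 2.0f * py / screenH; };
    const float x0 = ndcX(tileX * tileSize), x1 = ndcX((tileX + 1) * tileSize);
    const float y0 = ndcY(tileY * tileSize), y1 = ndcY((tileY + 1) * tileSize);

    // The tile frustum widens with depth, so take the extremes over both depths.
    const float xs[4] = { x0 * zMin, x0 * zMax, x1 * zMin, x1 * zMax };
    const float ys[4] = { y0 * zMin, y0 * zMax, y1 * zMin, y1 * zMax };
    TileBounds b;
    b.minX = *std::min_element(xs, xs + 4) * tanHalfFovX;
    b.maxX = *std::max_element(xs, xs + 4) * tanHalfFovX;
    b.minY = *std::min_element(ys, ys + 4) * tanHalfFovY;
    b.maxY = *std::max_element(ys, ys + 4) * tanHalfFovY;
    b.minZ = zMin;
    b.maxZ = zMax;
    return b;
}

// Standard sphere-vs-AABB test: clamp the light centre to the box and compare
// the squared distance against the squared radius.
bool LightTouchesTile(const PointLight& l, const TileBounds& b)
{
    const float cx = std::clamp(l.x, b.minX, b.maxX);
    const float cy = std::clamp(l.y, b.minY, b.maxY);
    const float cz = std::clamp(l.z, b.minZ, b.maxZ);
    const float dx = l.x - cx, dy = l.y - cy, dz = l.z - cz;
    return dx * dx + dy * dy + dz * dz <= l.radius * l.radius;
}

// Gather the indices of the lights that can affect any pixel inside the tile.
// Smaller tiles and a tight [zMin, zMax] range keep this list close to the
// handful of lights that actually cover the surfaces in the tile.
std::vector<uint32_t> CullLightsForTile(const TileBounds& b,
                                        const std::vector<PointLight>& lights)
{
    std::vector<uint32_t> list;
    for (uint32_t i = 0; i < lights.size(); ++i)
        if (LightTouchesTile(lights[i], b))
            list.push_back(i);
    return list;
}
[/code]

On the GPU, the per-tile index list would typically live in group-shared memory with a fixed maximum size, which also gives a natural hard cap on how many lights any single pixel can end up paying for.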
"Worst-case performance is the most relevant, which is actually one of the reasons why I'm slightly less enthused with so-called 'Forward+' than the very similar deferred variant. In my experience, running complex shaders at the end of the raster pipeline results in a whole lot more variability (due to triangle sizes, occlusion, scheduling, etc.) than doing it in image space."

I fully agree with you. The more processing you can move away from the raster pipeline to the (screen-space) compute pipeline, the better. Everything in the raster pipeline has a fluctuating cost: a variable number of triangles/vertices on screen, variable overdraw and hi-Z efficiency, variable quad efficiency, no way to control branching granularity (so branching cost fluctuates), and so on. Minimizing that fluctuating cost is the key to solid performance (60 fps with a hard vsync has always been our goal).
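As a rough illustration of why the screen-space half of the work has such a flat cost profile, here is a sketch of a shading loop that consumes the per-tile light lists from the culling sketch above. The types are hypothetical and the Lambert-only shading with linear falloff is a stand-in for a real material model; the point is that every pixel pays one G-buffer read plus a loop over a few lights, regardless of how small the triangles were or how much overdraw the raster pass had.

[code]
// Sketch of the screen-space shading side (hypothetical types; Lambert-only as
// a stand-in for a real material model). PointLight is the same struct as in
// the culling sketch, repeated here so this snippet stands alone. On the GPU
// this would be one compute shader thread per pixel.
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <vector>

struct PointLight { float x, y, z, radius; };  // view-space position + radius

struct GBufferSample                           // what the raster pass left behind
{
    float posV[3];     // view-space position (or reconstructed from depth)
    float normalV[3];  // view-space normal
    float albedo[3];   // material colour
};

// Shade one pixel using only the lights that survived the tile culling.
// The cost is data-independent apart from the (capped) light count, which is
// what makes the frame time predictable.
void ShadePixel(const GBufferSample& g,
                const std::vector<uint32_t>& tileLightIndices,
                const std::vector<PointLight>& lights,
                float outColor[3])
{
    outColor[0] = outColor[1] = outColor[2] = 0.0f;
    for (uint32_t idx : tileLightIndices)      // typically only a few lights per tile
    {
        const PointLight& l = lights[idx];
        const float dx = l.x - g.posV[0];
        const float dy = l.y - g.posV[1];
        const float dz = l.z - g.posV[2];
        const float distSq = dx * dx + dy * dy + dz * dz;
        if (distSq > l.radius * l.radius)
            continue;                          // per-pixel reject inside the tile

        const float dist  = std::sqrt(distSq);
        const float ndotl = std::max(0.0f, (g.normalV[0] * dx +
                                            g.normalV[1] * dy +
                                            g.normalV[2] * dz) / dist);
        const float atten = 1.0f - dist / l.radius;  // simple linear falloff (assumption)
        for (int c = 0; c < 3; ++c)
            outColor[c] += g.albedo[c] * ndotl * atten;
    }
}
[/code]

The only data-dependent part left is the light count per tile, and that is exactly the quantity that the culling pass (and the "2-3 lights per surface" art direction) keeps bounded.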
With Forward+, those far-away objects with tiny triangles (artists always have too little time to create perfect LODs) get hit with plenty of light sources: more distant geometry has more depth fluctuation inside each tile, so light culling produces more false positives. Those tiny triangles also have very bad quad efficiency (and often very bad texture cache efficiency as well). I would personally prefer to do as little as possible in the raster stage. I would be ready to go as far as simply storing the texture coordinate (into the virtual texture cache) for each pixel, instead of sampling the material textures in the rasterization step. Our current Xbox 360 game has a 2000 meter view distance, and we are already getting bad quad efficiency for the more distant geometry. In next-gen titles we of course want more: more draw distance, more distant geometry, more geometry with smaller details. Lots of things must be deferred to make all of this happen at a constant, non-fluctuating 60 fps.
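As a hedged sketch of what "store just the texture coordinate" could look like, here is an illustrative per-pixel record for a deferred-texturing style G-buffer: the raster pass writes only a packed virtual-texture UV and a packed normal, and a later screen-space pass does all the material fetches. The names, bit layout and 16-bit packing are assumptions for illustration, not the engine's actual format.

[code]
// Illustrative deferred-texturing record: the raster pass writes this tiny
// per-pixel payload instead of sampling material textures; a screen-space pass
// later resolves the virtual texture fetches. Layout and packing are assumptions.
#include <algorithm>
#include <cstdint>

struct DeferredTexelRecord
{
    uint32_t packedVirtualUV;  // 16+16 bit UV into the virtual texture cache
    uint32_t packedNormal;     // e.g. two 16-bit octahedron-encoded components
};

// Pack a UV in [0,1] into 16 bits per axis.
inline uint32_t PackVirtualUV(float u, float v)
{
    const uint32_t ui = static_cast<uint32_t>(std::clamp(u, 0.0f, 1.0f) * 65535.0f + 0.5f);
    const uint32_t vi = static_cast<uint32_t>(std::clamp(v, 0.0f, 1.0f) * 65535.0f + 0.5f);
    return (ui << 16) | vi;
}

inline void UnpackVirtualUV(uint32_t packed, float& u, float& v)
{
    u = static_cast<float>(packed >> 16)     / 65535.0f;
    v = static_cast<float>(packed & 0xFFFFu) / 65535.0f;
}
[/code]

The appeal would be that the raster stage only does a tiny fixed-cost write per pixel, while the expensive material sampling runs once per visible pixel in a coherent screen-space pass, so tiny distant triangles no longer drag quad efficiency down during texture fetches.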