Reverend said:
Well... what sort of improvements do we get when we disable shadows in Doom3?
A significant bottleneck I've encountered (6800, AthlonXP 2400+) isn't stencil shadows, but rather the sheer number of draw calls in some areas, which is mainly a product of the number of discrete surfaces and the number of participating lights. I've only looked closely at the multiplayer maps so far, so I can't say how indicative this study is, but it's a good way of analysing problem locations without AI and so on getting in the way. I'll get on to why I think it could be representative though, in a mo'.
As an example, there's one problem spot in d3dm1 where the framerate drops to around 20-25. This framerate is independent of fillrate -- at least up to medium resolutions (1024x768) -- and, specifically, vertex and fragment processing are not bottlenecks: I've tested with a null interaction shader and the framerate is largely unaffected. The framerate also barely increases when shadows are disabled.
So, what's gobbling up the FPS? Well, if one disables lighting (r_skipInteractions), the framerate shoots up to the 60 Hz cap. Now, even without shadows enabled, we might be CPU limited from lots of light-surface testing in software. If one re-enables interactions and looks around with the nulled lighting shader and just the torch light source active, one can see individual faces being 'touched'. If, however, we instead disable the rendering context (r_disableRenderContext) so that driver calls are nulled, the same CPU processing still occurs but the framerate is again up at the 60 cap, so that testing isn't what's costing us.
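To spell out the elimination logic above, here's a rough sketch in Python. It's purely illustrative: the function name, the 15% tolerance and the sample readings are my own assumptions, not anything taken from the engine or from a log.

```python
def diagnose(baseline, null_shader, no_shadows, no_render_context,
             cap=60.0, tol=0.15):
    """Guess the bottleneck from framerates measured at one problem spot.

    All arguments are FPS readings; the names and the tolerance are
    hypothetical and only mirror the tests described in the post.
    """
    def recovered(fps):
        # Framerate is back at (or near) the refresh cap.
        return fps >= cap * (1.0 - tol)

    def unchanged(fps):
        # Framerate did not rise meaningfully above the baseline.
        return fps <= baseline * (1.0 + tol)

    # With driver calls nulled the engine still does all of its own CPU work.
    # If the framerate stays low here, that work is the limit.
    if not recovered(no_render_context):
        return "engine CPU work"

    # A null interaction shader removes vertex/fragment cost and disabling
    # shadows removes the stencil work. If neither helps, what remains is
    # the cost of submitting the calls.
    if unchanged(null_shader) and unchanged(no_shadows):
        return "driver / draw-call submission"

    return "GPU work (shading or shadows)"


# Made-up readings in the spirit of the d3dm1 spot (20-25 FPS, 60 cap).
print(diagnose(baseline=22, null_shader=23, no_shadows=24,
               no_render_context=60))
```

With readings like those, the only explanation left standing is the submission cost, which is the conclusion drawn in the post.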
This all strongly suggests that the bottleneck is driver-related, and one only has to dump a GL log to see a frightening number of glDrawElements calls per frame (around 1150 in this case) with lighting enabled, dropping to a far more reasonable figure when disabled (350 or so). This is an unfortunate reality of the main lighting and shadowing system coupled with lots of texture variation. Texture variation makes it tricky to batch effectively anyway, but the problem is multiplied by the number of lights, which you can't process in groups due to the serial restriction of shadow volumes: each light's shadow volumes have to go into the stencil buffer before that light's surfaces are drawn, so lights are handled one at a time.
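As a back-of-the-envelope model of why the call count balloons, here's a small Python sketch. It assumes one call per surface for the non-lighting passes and one further call for every (light, surface) pair where the light touches the surface. It ignores shadow volume draws and multipassed materials, and the 4 lights x 200 surfaces split is a hypothetical breakdown picked only to land on the totals quoted above, not something read out of the GL log.

```python
def count_draw_calls(surfaces, lights, touches):
    """Estimate draw calls per frame under a simplified per-light model.

    surfaces, lights: sequences of opaque identifiers.
    touches(light, surface): True if the light affects the surface.
    All three are hypothetical stand-ins, not engine structures.
    """
    # Non-lighting passes: assumed to be one call per visible surface.
    base_calls = len(surfaces)

    # Lighting passes: lights are processed one at a time, so every touched
    # surface is drawn again for each light that reaches it.
    interaction_calls = 0
    for light in lights:
        for surface in surfaces:
            if touches(light, surface):
                interaction_calls += 1

    return base_calls + interaction_calls


surfaces = range(350)   # roughly the call count seen with lighting disabled
lights = range(4)       # hypothetical number of participating lights


def touches(light, surface):
    # Hypothetical: each light reaches the same 200 of the 350 surfaces.
    return surface < 200


print(count_draw_calls(surfaces, lights, touches))  # 350 + 4 * 200 = 1150
```

The point is just that the lighting term scales with lights times touched surfaces, so even a handful of overlapping lights can triple the call count.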
In fact, if you try setting all surfaces in the scene to use the same material, you still get roughly the same number of calls (some special materials are multipassed even with arb2) -- the engine doesn't even attempt to optimise this unlikely scenario! Of course, in-game there are other potential bottlenecks, but we're talking about rendering here, and I believe this is a general issue, though perhaps one that the creators of the multiplayer maps paid less attention to.