Chalnoth said:
How is that irrational? It's a reaction to a large number of benchmarks. I mean, that's pretty much the definition of rational, isn't it? A belief brought about and supported by real-world evidence?
Dunno about the real-world evidence. See Quake 3, MDK2, Serious Sam 1st and 2nd Encounters or whatnot. That's OpenGL, too, and ATI is very competitive there. Wouldn't a single falsification suffice to shoot down a general conclusion? That's how it works in the world of maths.
People look too hard for generalizations, and while they make life less complex, they are not always useful. ATI not being competitive in Doom 3 does not mean that they stink at OpenGL. It means just the obvious: they're not that good at Doom 3. If they lost out everywhere and there were no explanation on the hardware level, there'd be a point. I don't see that yet.
I'd rather look at this case by case than quickly jump to conclusions just for the sake of having something that's easy to remember ("ATI+OpenGL=teh suq") but not necessarily true.
Doom 3: stencil fillrate is NVIDIA's stronghold, plain and simple. Also, ATI's hierarchical Z implementation doesn't like the depth test function varying too much over the course of a frame (rough sketch of what I mean below).
Riddick: soft shadows with PCF. Wouldn't surprise me at all if the game took advantage of NVIDIA's hardware acceleration for that, which the Radeon line lacks (dunno if the R5xx series has it; even if it does, it might require a game patch to see the benefit). Sketch below as well.
Etc.
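To illustrate the Doom 3 point: a hypothetical multi-pass frame -- not id's actual renderer, just the kind of depth-function churn I mean. How exactly ATI's hierarchical Z reacts is my assumption about how such schemes generally behave, not something out of their documentation.

[code]
/* Hypothetical multi-pass frame -- NOT Doom 3's actual renderer, just the
 * kind of depth-function churn I mean.  A hierarchical-Z scheme typically
 * tracks one comparison direction per depth buffer; reversing it mid-frame
 * may force the driver back to per-pixel testing until the next clear. */
#include <GL/gl.h>

extern void draw_depth_prepass(void);          /* placeholder app geometry */
extern void draw_lit_surfaces(void);
extern void draw_reversed_test_effect(void);

void draw_frame(void)
{
    glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);

    glDepthFunc(GL_LESS);       /* depth-only pre-pass: Hi-Z friendly */
    glDepthMask(GL_TRUE);
    draw_depth_prepass();

    glDepthFunc(GL_EQUAL);      /* additive passes against the existing Z */
    glDepthMask(GL_FALSE);
    draw_lit_surfaces();

    glDepthFunc(GL_GEQUAL);     /* reversed comparison -- the problematic
                                   direction change */
    draw_reversed_test_effect();

    glDepthMask(GL_TRUE);
}
[/code]

And the Riddick point: the ARB_depth_texture / ARB_shadow setup that NVIDIA hardware of that generation turns into "free" 2x2 PCF when the filter is GL_LINEAR. Whether the game actually does it this way is my guess; texture size and the GL_LEQUAL compare func are arbitrary choices for the sketch.

[code]
/* Depth texture set up for shadow mapping via ARB_depth_texture +
 * ARB_shadow.  Requesting GL_LINEAR on a compared depth texture is the bit
 * that NVIDIA chips of that era turn into free 2x2 PCF; other hardware may
 * simply return an unfiltered compare result.  Size is arbitrary. */
#include <GL/gl.h>
#include <GL/glext.h>

GLuint create_shadow_map(int size)
{
    GLuint tex;
    glGenTextures(1, &tex);
    glBindTexture(GL_TEXTURE_2D, tex);

    /* Depth-component texture the shadow pass renders into. */
    glTexImage2D(GL_TEXTURE_2D, 0, GL_DEPTH_COMPONENT24,
                 size, size, 0, GL_DEPTH_COMPONENT, GL_UNSIGNED_INT, NULL);

    /* Compare the R texture coordinate against the stored depth... */
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_COMPARE_MODE_ARB,
                    GL_COMPARE_R_TO_TEXTURE_ARB);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_COMPARE_FUNC_ARB, GL_LEQUAL);

    /* ...and ask for linear filtering of the compare result. */
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);

    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
    return tex;
}
[/code]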
Then there's the whole issue of texture filtering "optimizations" and related tricks. IIRC one of the recent Catalysts claimed a huge performance increase in IL2 just by forcing the cloud textures to a compressed format. Doing such things is not truly improving the OpenGL driver, but it makes some games run faster.
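To put the IL2 example into perspective: what such a forced substitution roughly amounts to, if the game had done it itself, is asking for a compressed internal format at upload time. Generic ARB_texture_compression request here; 'pixels', 'w' and 'h' are placeholders for whatever the game's loader provides.

[code]
/* Hypothetical texture upload in a game's loader.  'pixels', 'w' and 'h'
 * stand in for whatever the game actually loads. */
#include <GL/gl.h>
#include <GL/glext.h>

void upload_cloud_texture(const void *pixels, int w, int h)
{
    /* Original, uncompressed path:
       glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA8, w, h, 0,
                    GL_RGBA, GL_UNSIGNED_BYTE, pixels); */

    /* "Optimized" path: let the driver compress on upload.  A concrete
     * DXT format via EXT_texture_compression_s3tc would be typical. */
    glTexImage2D(GL_TEXTURE_2D, 0, GL_COMPRESSED_RGBA_ARB, w, h, 0,
                 GL_RGBA, GL_UNSIGNED_BYTE, pixels);

    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
}
[/code]

Saves texture memory and bandwidth, which is where the speedup comes from -- but it's the game's content being changed, not the driver getting faster.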
Who has how many of these "optimizations" in place, and what are the gains? This issue might pretty much bork up any comparison on its own.
And lastly, some people coming off the NV_vertex_array_range path just don't get it. NVIDIA supports some rather peculiar usage models of VBO (allocate a buffer object, lock it, fill it, render it once and throw it away again; might as well use immediate mode instead), probably a holdover from their VAR legacy model. ATI drivers don't handle such stuff all that well and rather go for a purer VBO model (fat storage objects are for reuse).
Both approaches have some benefits over the other, and they are somewhat exclusive. ATI's model bites them more often than not. Btw, if anyone cares, technically it is the correct one IMO. VBOs are overkill and unnecessarily complex for "one shot" usage. Different methods already exist for this case; they use less memory, are more portable, and are equally limited by AGP/PCIe bandwidth. Many developers fancy VBOs so much that they use them anyway, and it always comes back to bite ATI's reputation.
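To make the contrast concrete, a rough sketch of both patterns -- nobody's real code, data layout invented, ARB_vertex_buffer_object entry points assumed to be available:

[code]
/* One-shot VBO vs plain client-side vertex array, 3 floats per vertex. */
#define GL_GLEXT_PROTOTYPES 1
#include <GL/gl.h>
#include <GL/glext.h>

/* The "allocate, fill, draw once, throw away" pattern.  NVIDIA's driver
 * (with its VAR heritage) copes; drivers that expect buffer reuse tend to
 * choke on the per-draw object churn. */
void draw_one_shot_vbo(const float *verts, int vertex_count)
{
    GLuint vbo;
    glGenBuffersARB(1, &vbo);
    glBindBufferARB(GL_ARRAY_BUFFER_ARB, vbo);
    glBufferDataARB(GL_ARRAY_BUFFER_ARB,
                    vertex_count * 3 * sizeof(float), verts,
                    GL_STREAM_DRAW_ARB);

    glEnableClientState(GL_VERTEX_ARRAY);
    glVertexPointer(3, GL_FLOAT, 0, (const void *)0);
    glDrawArrays(GL_TRIANGLES, 0, vertex_count);
    glDisableClientState(GL_VERTEX_ARRAY);

    glDeleteBuffersARB(1, &vbo);
}

/* The simpler, more portable path for genuinely transient data: a plain
 * vertex array, no buffer object at all. */
void draw_transient_array(const float *verts, int vertex_count)
{
    glBindBufferARB(GL_ARRAY_BUFFER_ARB, 0);   /* make sure no VBO is bound */

    glEnableClientState(GL_VERTEX_ARRAY);
    glVertexPointer(3, GL_FLOAT, 0, verts);
    glDrawArrays(GL_TRIANGLES, 0, vertex_count);
    glDisableClientState(GL_VERTEX_ARRAY);
}
[/code]

Both end up pushing the same bytes across AGP/PCIe; the difference is how much object management the driver has to swallow per draw call.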
See Tenebrae, NWN. The Tenebrae technicalities wrt unsatisfactory VBO performance on ATI hardware were discussed on these boards, but I can't find the topic anymore. Might be buried in the old T&H archives. I'm pretty certain it was the issue I just tried to describe.
And finally, there are -- gasp! -- things that ATI's OpenGL driver handles more efficiently than NVIDIA's. I've seen a Radeon 9200/64MB soundly beat a GeForce 3/64MB because the driver coped much better with many (thousands of) small texture objects that were constantly shuffled around and refilled. This is just one scenario, and it does not imply that ATI's OpenGL drivers are better than NVIDIA's, only that this specific case works better. The truth is to be found in the details more often than not, IMO.
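For the curious, the kind of workload I mean looks roughly like this (sizes and counts invented; the object churn is the point, not the numbers):

[code]
/* Thousands of small texture objects, constantly rebound and refilled. */
#include <GL/gl.h>

#define NUM_TILES 4096
#define TILE_SIZE 32

static GLuint tiles[NUM_TILES];

void create_tiles(void)
{
    int i;
    glGenTextures(NUM_TILES, tiles);
    for (i = 0; i < NUM_TILES; ++i) {
        glBindTexture(GL_TEXTURE_2D, tiles[i]);
        glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA8, TILE_SIZE, TILE_SIZE, 0,
                     GL_RGBA, GL_UNSIGNED_BYTE, NULL);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
    }
}

/* Called every frame for whichever tiles changed; this is where the
 * driver's texture management earns (or loses) its keep. */
void refill_tile(int i, const void *pixels)
{
    glBindTexture(GL_TEXTURE_2D, tiles[i]);
    glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, TILE_SIZE, TILE_SIZE,
                    GL_RGBA, GL_UNSIGNED_BYTE, pixels);
}
[/code]

Nothing exotic about it, but it's exactly the kind of detail that blanket statements gloss over.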