John Carmack sounds a bit pissed at both nVidia and ATi in the introduction to the Doom3 benchmark. The general stress level is probably quite high right now.
While Doom3 performs brilliantly on nVidia hardware, it took heavy tuning and tweaking by both John and nVidia to make that happen. As a result the engine is deeply frozen: any change made after the drivers were finalized carries a large performance penalty.
That seems to be the largest difference in how the two handle drivers and performance: ATi tries to make its improvements as general as possible, while nVidia depends heavily on specific changes for a few titles that carry a large PR impact.
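To make that concrete: one plausible way such title-specific tuning works is a driver that recognizes the exact shaders a game shipped with and swaps in hand-tuned replacements. This is only a toy sketch, not actual driver code, and every name and shader string in it is made up; but it shows why the engine has to stay frozen, because the moment a shader changes, the lookup misses and you fall back to the generic, slower path.

    /* Toy sketch of hash-keyed shader substitution; all names hypothetical. */
    #include <stdio.h>

    /* djb2 string hash: good enough as a lookup key for this demo */
    static unsigned long hash_source(const char *s)
    {
        unsigned long h = 5381;
        while (*s)
            h = h * 33 + (unsigned char)*s++;
        return h;
    }

    struct tuned_entry {
        unsigned long source_hash;   /* hash of the shader exactly as the game shipped it */
        const char   *replacement;   /* hand-tuned version baked into the driver */
    };

    /* "Compile" a shader: use the hand-tuned version if we recognize it,
     * otherwise fall back to the generic (slower) compile path. */
    static const char *compile_shader(const char *source,
                                      const struct tuned_entry *table, size_t n)
    {
        unsigned long h = hash_source(source);
        for (size_t i = 0; i < n; i++)
            if (table[i].source_hash == h)
                return table[i].replacement;   /* fast path: known shader */
        return "GENERIC_SLOW_COMPILE";         /* unknown shader: no special tuning */
    }

    int main(void)
    {
        /* the shader as it looked when the driver was finalized... */
        const char *shipped = "MUL r0, light, diffuse;";
        /* ...and the same shader after a later engine tweak */
        const char *patched = "MUL r0, light, diffuse; ADD r0, r0, ambient;";

        struct tuned_entry table[] = {
            { hash_source(shipped), "HAND_TUNED_REPLACEMENT" }
        };

        printf("shipped shader -> %s\n", compile_shader(shipped, table, 1));
        printf("patched shader -> %s\n", compile_shader(patched, table, 1));
        return 0;
    }

Real drivers obviously work at a much lower level than this, but the fragility is the same: the optimization is keyed to the exact workload that existed at the moment of testing.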
While we see this happening, it might be that nVidia has no choice: their chips might be truly powerful, but really hard to program well. So they can perform brilliantly, as we see with Doom3, but it takes work from nVidia itself (and cooperation from the developers) to make that happen.
This is nothing new. Only five years ago it was still common practice to hand-tune critical pieces of program code by writing them in assembly yourself. And the GPUs themselves consist entirely of clever hacks to speed up the rendering process.
While OpenGL and D3D are evolving into general languages that will hopefully remove the need to write a separate execution path for each chip family, at the moment benchmarks seem to be testing the hardware less and less, and the performance of those specific execution paths and the way they are optimized more and more.
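As an illustration of what "an execution path per chip family" looks like in practice, an OpenGL engine of that era would typically inspect the extension string at startup and pick a backend, much like Doom3's NV20/R200/ARB2 paths. This is a rough sketch only: glGetString() is real OpenGL, but the helper and enum names are hypothetical, and a real engine checks far more than this.

    /* Rough sketch: choosing a per-chip-family render path from the GL
     * extension string. Must run after a GL context has been created. */
    #include <string.h>
    #include <GL/gl.h>

    typedef enum {
        PATH_ARB2,             /* generic DX9-class fragment programs (R3xx, NV3x) */
        PATH_R200,             /* ATI_fragment_shader path (Radeon 8500 class)     */
        PATH_NV20,             /* register-combiner path (GeForce3/4 class)        */
        PATH_FIXED_FUNCTION    /* everything older                                 */
    } render_path_t;

    /* naive substring check; fine for a sketch */
    static int has_extension(const char *ext)
    {
        const char *all = (const char *)glGetString(GL_EXTENSIONS);
        return all != NULL && strstr(all, ext) != NULL;
    }

    /* Pick the most capable path the hardware advertises; each path has its
     * own hand-written shader/combiner setup elsewhere in the engine. */
    static render_path_t choose_render_path(void)
    {
        if (has_extension("GL_ARB_fragment_program"))
            return PATH_ARB2;
        if (has_extension("GL_ATI_fragment_shader"))
            return PATH_R200;
        if (has_extension("GL_NV_register_combiners"))
            return PATH_NV20;
        return PATH_FIXED_FUNCTION;
    }

It is exactly these per-path shader sets that get tuned per vendor, and that is what a benchmark then ends up measuring.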
In general, benchmarks on ATi hardware seem to indicate the performance you will see across a broad spectrum of games, while the same benchmarks on nVidia hardware mostly show how far things have been tuned and optimized for that specific card and game.
Now, it's not quite as simple as that, because the DX9 Shader Model 2.0 standard is almost identical to the specifications of the R3xx, with some features removed for compatibility with nVidia hardware. So it is not surprising that the R3xx performs so well in general.
The focus on the specifications of Microsoft's shader models is a bit skewed as well, since Microsoft does not decide on those specifications by itself. It is more like a law stating that in five years' time new cars must have an electric motor and fuel cells, while older models may still be sold, so gas stations have to keep selling gasoline as well as hydrogen.
Those shader models and specifications are drawn up by the whole graphics industry (with the main movers having the largest say, of course) to give all the players a set of guidelines on what to aim for. But as with cars, you can get high performance in different ways: a small, turbocharged engine that revs fast can deliver the same performance as a slow-revving V8.
While it would be great if everyone could just follow the standards and be done with it, that would mean that only the few large players who shape those standards perform optimally. And as we see with GPUs, when you have two fundamentally different designs, any specific set of rules will favour one of them.
At this moment, the best bet for developers seems to be to design their game engine for ATi hardware and let nVidia optimize its drivers for good performance on nVidia hardware. Whether that is because even nVidia has a hard time writing shaders that run optimally on its own hardware, or because its drivers have a hard time compiling HLSL into something that executes fast, does not really matter.
Benchmarks generally use the top ten games to measure performance, as those are the ones most likely to be run on that hardware, and older games will run well anyway. But they are meant to tell potential buyers which card is the best bet, and that seems to depend on whether nVidia makes optimized drivers for the game they want to play.
So with most current benchmarks, if you buy the card to play the tested games, you see how far nVidia had succeeded in optimizing that specific game at the moment of testing, and you can compare that with the general performance of the ATi counterparts.
In and of itself, that is not a bad thing. And this story would be quite different if SM2.0 had been based on the specifications of nVidia hardware instead of ATi hardware. But it wasn't. To complicate matters, this behaviour of both ATi and nVidia seems to be structural: ATi mostly makes changes that improve things in general, while nVidia aims mainly for the highly visible cases.
While that is not very relevant when you just want to play a game and see how many FPS you get at a specific resolution/AA/AF setting, it has become increasingly hard to draw general conclusions from benchmarks, because the actions executed in each case are totally different. As long as the test closely resembles your own setup it is valid, but you can no longer easily extrapolate the results to other setups and games.
So it seems that if you mostly play the top ten games, nVidia is the best bet at this moment, while for consistently good performance across the board your best bet is ATi.