Okay, so first let me admit this is gonna be long.
First, I'm going to assume you're talking specifically about nVidia in 3DMark. Of course, you could generalize it to any other GPU company and any other benchmark. But by doing this, I'm assuming the NV30 architecture and the fact that you can ask Futuremark about specific things to test. So, it makes my job a lot easier.
I'm also going to assume DirectX, so that I can speak in terms of DIP (DrawIndexedPrimitive) calls and such. It can easily be converted to OpenGL talk if needed.
Okay, first, let's discuss object-specific optimizations. To be more accurate, we should think about DIP-call-specific optimizations, because per-vertex optimizations would require hardware support for cheating, and I most sincerely doubt that exists, hehe.
These optimizations aren't about changing the shader; the category includes:
- Ignore specific DIP calls, because you know the object will be barely visible or completely invisible, occluded by something rendered later.
- Completely transform the DIP call into something else. For example, you could use a precalculated sky instead of everything the program computes on the fly.
- Cache the DIP call and render it later in the frame, so you get more efficient front-to-back ordering.
There obviously are several ways to determine whether a DIP call is the one you're waiting for. One way might be based on which vertex buffer is bound, for example. It should prove to be quite a cheap check, too. Of course, knowing which frame we're currently in can also matter, because a cheat might only become visible from a specific frame onward.
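To make that a bit more concrete, here's a minimal sketch of what such a check could look like in a hypothetical D3D9 wrapper (a real driver sits below the API, but the idea is the same); the target vertex buffer, the primitive count and the frame counter are all invented for the example:

```cpp
// Hypothetical check: is this DrawIndexedPrimitive call the one we care about?
// Match on cheap, stable properties of the call: the bound vertex buffer and
// the primitive count (both invented identifiers here).
#include <d3d9.h>

extern unsigned int g_frameCounter;            // bumped once per Present()
extern IDirect3DVertexBuffer9* g_targetVB;     // captured earlier by the shim

bool IsTargetDrawCall(IDirect3DDevice9* dev, UINT primCount)
{
    IDirect3DVertexBuffer9* vb = nullptr;
    UINT offset = 0, stride = 0;
    if (FAILED(dev->GetStreamSource(0, &vb, &offset, &stride)) || !vb)
        return false;
    vb->Release();  // GetStreamSource AddRef'd the buffer

    // The magic primitive count (4242) is a placeholder, not a real value.
    return vb == g_targetVB && primCount == 4242;
}
```

That's a handful of comparisons per draw call, which is why I say it's cheap.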
Okay, so let's review each of those things.
The first is quite easy to implement, but you need frame-by-frame knowledge of the scene, so it's a very annoying job and takes a lot of time. If possible, I'd bet driver teams would try everything else first.
You could also say "from frame 307 to frame 341, do not draw this".
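Something like this, as a sketch, building on the hypothetical IsTargetDrawCall() helper above (the frame numbers are just the ones from the example):

```cpp
// Hypothetical D3D9 wrapper around DrawIndexedPrimitive: silently drop the
// targeted draw during a known frame window.
#include <d3d9.h>

extern unsigned int g_frameCounter;                            // bumped per Present()
bool IsTargetDrawCall(IDirect3DDevice9* dev, UINT primCount);  // see sketch above

HRESULT Shim_DrawIndexedPrimitive(IDirect3DDevice9* dev,
                                  D3DPRIMITIVETYPE type, INT baseVertex,
                                  UINT minIndex, UINT numVertices,
                                  UINT startIndex, UINT primCount)
{
    if (g_frameCounter >= 307 && g_frameCounter <= 341 &&
        IsTargetDrawCall(dev, primCount))
    {
        return D3D_OK;   // claim success without actually drawing anything
    }
    return dev->DrawIndexedPrimitive(type, baseVertex, minIndex,
                                     numVertices, startIndex, primCount);
}
```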
The gain from this method can be fillrate, geometry, or both.
Getting caught using this method is quite easy if you make a mistake, and mistakes are easy to make.
Completely transforming the DIP call into something else can give quite a good performance boost in some cases, precalculation being the obvious one. Remember, the NV3x has amazing texturing performance, so leaning on textures more often would likely increase performance.
In the case of Mother Nature, there's a LOT of procedural shading IIRC, so it would make sense for nVidia to try to precalculate some of it. I'd strongly suggest looking for this.
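Just to illustrate the mechanism (the shader hash, the replacement shader and the baked lookup texture are all made up for this sketch, not anything observed in a real driver), a substitution like that could be hooked in at shader-creation time roughly like this:

```cpp
// Hypothetical substitution at CreatePixelShader time: if the incoming
// bytecode matches a known procedural-noise shader, hand back a cheaper
// shader that just samples a pre-baked lookup texture instead.
#include <d3d9.h>

// FNV-1a hash over the shader token stream (D3D9 bytecode ends with 0x0000FFFF).
static unsigned int HashShader(const DWORD* tokens)
{
    unsigned int h = 2166136261u;
    for (; *tokens != 0x0000FFFF; ++tokens)
        h = (h ^ *tokens) * 16777619u;
    return h;
}

// Precompiled replacement shader and its trigger hash -- both invented here.
extern const DWORD* g_bakedLookupShader;      // e.g. just a texld + mov oC0
static const unsigned int kTargetHash = 0xDEADBEEF;

HRESULT Shim_CreatePixelShader(IDirect3DDevice9* dev, const DWORD* pFunction,
                               IDirect3DPixelShader9** ppShader)
{
    const DWORD* src = (HashShader(pFunction) == kTargetHash)
                           ? g_bakedLookupShader   // swap in the cheap version
                           : pFunction;            // otherwise pass through
    return dev->CreatePixelShader(src, ppShader);
}
```

The real work, of course, is in baking the lookup texture offline and binding it; the point here is just how little the driver has to touch at runtime once it recognizes the shader.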
How to look for this? Well, with procedural shading, everything is calculated exactly per pixel, giving very good IQ. With texturing, you're never gonna get IQ quite that good, but it can sometimes be very close.
Taking screenshots and comparing small details, on things like the terrain, between driver versions is about the only way to check for this.
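If you want something less subjective than eyeballing it, a per-pixel diff of two dumps is trivial; a minimal sketch, assuming you've already saved both frames as same-sized raw RGB8 buffers:

```cpp
// Compare two same-sized raw RGB8 framebuffer dumps (one per driver version)
// and report the largest per-channel difference. 0 means bit-identical output.
#include <algorithm>
#include <cstdint>
#include <cstdlib>
#include <vector>

int MaxChannelDiff(const std::vector<uint8_t>& a, const std::vector<uint8_t>& b)
{
    if (a.size() != b.size())
        return 255;  // different resolutions: treat as maximally different
    int maxDiff = 0;
    for (size_t i = 0; i < a.size(); ++i)
        maxDiff = std::max(maxDiff, std::abs(int(a[i]) - int(b[i])));
    return maxDiff;
}
```

A zero diff between driver versions tells you nothing changed; a big diff concentrated on the terrain, where the procedural shading lives, is exactly the kind of spot you'd then want to zoom into by eye.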
This method may save fillrate, geometry, or both, depending on the case; sometimes you might even spend geometry to save fillrate. It all depends on the situation. In its simplest implementation, precalculation into textures, it only saves fillrate.
Now, the third option, caching and rendering later, is actually the easiest one to implement. And it can give ludicrous performance boosts - IF the benchmark isn't well optimized.
For example, you could realize that in a benchmark the water is rendered first, even though most of it isn't visible. So you could cache the DIP call and issue it after everything else, or after a specific DIP call you're looking for.
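A rough sketch of what deferring one draw could look like in the same kind of hypothetical wrapper; note that everything about device state (the shaders, textures and renderstates the water depends on) is waved away here with a comment, and a real implementation would have to snapshot and restore all of it:

```cpp
// Hypothetical deferral of a single draw call: swallow it when it arrives,
// replay it just before Present() so it benefits from the Z buffer laid down
// by everything drawn in between.
#include <d3d9.h>

bool IsTargetDrawCall(IDirect3DDevice9* dev, UINT primCount);  // see earlier sketch

struct DeferredDraw {
    D3DPRIMITIVETYPE type;
    INT  baseVertex;
    UINT minIndex, numVertices, startIndex, primCount;
    // NOTE: a real implementation must also snapshot every piece of device
    // state the draw depends on (shaders, textures, renderstates, ...).
};

static DeferredDraw g_deferred;
static bool g_haveDeferred = false;

HRESULT Deferring_DrawIndexedPrimitive(IDirect3DDevice9* dev,
                                       D3DPRIMITIVETYPE type, INT baseVertex,
                                       UINT minIndex, UINT numVertices,
                                       UINT startIndex, UINT primCount)
{
    if (IsTargetDrawCall(dev, primCount)) {          // e.g. the water
        g_deferred = {type, baseVertex, minIndex, numVertices, startIndex, primCount};
        g_haveDeferred = true;
        return D3D_OK;                               // don't submit it yet
    }
    return dev->DrawIndexedPrimitive(type, baseVertex, minIndex,
                                     numVertices, startIndex, primCount);
}

// Called by the shim right before Present().
void FlushDeferredDraw(IDirect3DDevice9* dev)
{
    if (!g_haveDeferred) return;
    dev->DrawIndexedPrimitive(g_deferred.type, g_deferred.baseVertex,
                              g_deferred.minIndex, g_deferred.numVertices,
                              g_deferred.startIndex, g_deferred.primCount);
    g_haveDeferred = false;
}
```

That's essentially free on the CPU, which is part of why I call it the easiest of the three.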
I believe that's something that's practically IMPOSSIBLE to check for, but that would be very effective indeed.
The only way to maybe check for this is to modify the executable in a few ways, such as changing the DIP order or changing a few renderstates.
Note that there's no guarantee that'll work, because a really robust implementation could survive it anyway. However, a really robust implementation might take a lot of CPU power (or it might not, depending on the talent of the programmer...), so there's a good chance you could spot it from the CPU overhead.
Okay, so now that's roughly what you can do for DIP-specific optimizations.
Now, let's talk about shader optimizations.
Let's face it: if Futuremark programmed with the DX9 standard in mind, nVidia has a LOT of room to cheat. As I said before, you can precalculate procedural shading through DIP-specific optimizations, but it's obviously easier to do it by simply modifying the shader and render states than by going per-DIP.
Now, the obvious optimizations are using FP16 registers *and* using FX12.
Remember, though, that one in three FP instructions is "free" on the NV3x, so it's unreasonable to think they'd use FX12 everywhere.
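For reference, the FP16 side of that is at least visible at the API level as the partial-precision modifier; here's a tiny, made-up ps_2_0 fragment and what a driver-side rewrite towards FP16 would amount to. FX12, on the other hand, has no DX9 assembly representation at all, so that substitution would happen entirely inside the driver:

```cpp
// Made-up ps_2_0 fragment (not from 3DMark): the original full-precision code,
// and the variant with the _pp (partial precision, i.e. FP16 is acceptable)
// modifier that a driver could silently swap in.
const char* kFullPrecision =
    "ps_2_0\n"
    "dcl t0\n"
    "dcl_2d s0\n"
    "texld r0, t0, s0\n"
    "mul r0, r0, c0\n"
    "mov oC0, r0\n";

const char* kPartialPrecision =
    "ps_2_0\n"
    "dcl t0\n"
    "dcl_2d s0\n"
    "texld r0, t0, s0\n"
    "mul_pp r0, r0, c0\n"   // _pp: FP16 is good enough here
    "mov oC0, r0\n";
```

The key point for the cheating discussion is that the swap changes precision, not instruction count, so you can't see it in the assembly you hand to the API; you can only infer it from output differences or from performance behaving like FP16.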
How to check for this? Simply modify the shader program slightly, for example by removing an instruction, so the output should change. There are two possible outcomes if nVidia cheated:
1. The output is modified as expected, but performance is *down*, not up.
2. The output is not modified, and performance remains identical.
Of course, a very robust detection system could match specific instruction sequences rather than whole programs; in that case the replacement would still kick in after your change, and the test would show nothing. But as always, I'd be surprised if nVidia took the time to do such a thing.
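Here's a minimal sketch of the shader-tweaking part of that test, using D3DXAssembleShader; the specific edit (adding a small visible tint through an otherwise unused constant, c31 here) is just an invented example of "modify it slightly so the output changes":

```cpp
// Hypothetical test-harness step: take the original shader source, insert a
// small, visible modification, assemble it and feed it to the benchmark in
// place of the original.
#include <d3dx9.h>   // D3DXAssembleShader, link with d3dx9.lib
#include <string>

HRESULT AssembleTintedVariant(const std::string& originalAsm,
                              LPD3DXBUFFER* outByteCode)
{
    std::string modified = originalAsm;

    // Insert "add r0, r0, c31" just before the final output move; the harness
    // sets c31 to something small but visible, e.g. (0.05, 0, 0, 0).
    size_t pos = modified.rfind("mov oC0");
    if (pos != std::string::npos)
        modified.insert(pos, "add r0, r0, c31\n");

    LPD3DXBUFFER errors = nullptr;
    HRESULT hr = D3DXAssembleShader(modified.c_str(), (UINT)modified.size(),
                                    nullptr, nullptr, 0, outByteCode, &errors);
    if (errors) errors->Release();
    return hr;
}
```

If the tint shows up and the frame rate drops compared to the untouched run, the untouched shader was almost certainly being replaced (outcome 1); if the tint never shows up at all, the driver is ignoring what you feed it (outcome 2).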
Anyway, that's pretty much all I can think of for the moment. I'll post some more if I get the time and I get more ideas.
Feedback, comments, flames?
Uttar