I hope I don't fail to take into account something in the quoted text I snip...
tb said:
My No, was to this part of the message "Game Test 2 and 3 don't change much between CPU vs GPU skinning because primarily they are Pixel Shader limited"
I think they are more vertex shader and fillrate limited than pixel shader calculation speed limited.
Based on what? The pixel shading performance hit may be the same on every card for the functionality exhibited in the test (I don't know how many clock cycles each pixel shader takes on each card, which is something a 9500 non-Pro versus 9000 Pro comparison could maybe tell us, along with how this has improved), but that doesn't mean the hit is not a significant factor in the benchmark.
Your results from "no txt" show an increase, but your results from "no ps" show more of an increase when you raise the resolution to 1024x768. This supports the idea that the pixel fillrate determines the absolute limit of performance, but pixel shading limitations determine how much of that can be realized.
BTW, does "No txt" mean no textures read, or no pixel output (i.e., "no coloring" to my mind)?
What your analysis (as I understand things, which of course may be in error) seems to ignore is the idea that just because the 9700 can run pixel shaders fast enough to not significantly limit pixel fillrate, that doesn't necessarily mean other hardware can do the same (some questions concerning the GF FX come to mind). I.e., it doesn't mean that pixel shaders are not a limit of the benchmark, just that the 9700 performs them quickly enough so that they are not a limit. If none of the cards is limited in this way, however, it does mean that the benchmark is useless for evaluating comparative performance of that factor...hence my question regarding the 9500 Pro and my curiosity about future nv3x parts, on WHQL drivers, in regards to this benchmark.
The limitation goes from the vertex shader to the fillrate (and a little bit of pixel shader) when you increase the resolution. Test 1 is most of the time vertex shader limited, but fillrate comes into play with some very high resolutions.
But that is the case for any fillrate limited benchmark....it is fillrate limited until you lower the fillrate demands so that the impact is less than that of the other demands. The thing is that the test scales downwards wrt performance from 1024x768 and up, and as far as I can recall that is part of the baseline assumptions of the benchmark...so to say that it is "most of the time" vertex shader limited based on testing at 320x200 seems a distortion.
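To make the point concrete, here is a rough sketch of the kind of model I have in mind: the frame time is set by whichever stage is slowest, so the "limiting" stage changes with resolution and with how fast a given card runs its pixel shaders. Every name and number here is made up purely for illustration (limiting_stage, the per-pixel costs, the fixed vertex cost); none of it is measured from any card or from the benchmark, and real hardware overlaps these stages in messier ways.

```python
def limiting_stage(width, height, vertex_ms, fill_ns_per_pixel, shader_ns_per_pixel):
    """Return (stage_name, time_ms) for the slowest stage in a toy pipeline model.

    Assumption: stages run in parallel, so the slowest one sets the frame time.
    All per-stage costs are hypothetical inputs, not measurements.
    """
    pixels = width * height
    costs_ms = {
        "vertex": vertex_ms,                                  # independent of resolution
        "fillrate": pixels * fill_ns_per_pixel / 1e6,         # grows with pixel count
        "pixel shader": pixels * shader_ns_per_pixel / 1e6,   # grows with pixel count
    }
    stage = max(costs_ms, key=costs_ms.get)
    return stage, costs_ms[stage]


if __name__ == "__main__":
    # Hypothetical "fast shader" card: shading is cheaper per pixel than raw fill.
    for res in [(320, 200), (1024, 768), (1600, 1200)]:
        print(res, limiting_stage(*res, vertex_ms=4.0,
                                  fill_ns_per_pixel=10.0,
                                  shader_ns_per_pixel=6.0))

    # Hypothetical "slow shader" card: same fill cost, more expensive shading.
    print((1024, 768), limiting_stage(1024, 768, vertex_ms=4.0,
                                      fill_ns_per_pixel=10.0,
                                      shader_ns_per_pixel=15.0))
```

With these invented inputs the first card looks vertex limited at 320x200 and fillrate limited from 1024x768 up, while the second card is pixel shader limited at 1024x768 even though the workload is identical. That is all I mean by saying one card's behavior doesn't tell you what the benchmark depends on.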
Tests 2, 3 and 4 are not that heavily limited by the vertex shader. Fillrate is the main limitation in these tests (2, 3, 4), and the pixel shader has very little impact.
On a 9700, bandwidth is not a factor in many situations, but the card exhibiting this behavior for an application does not mean that the application doesn't depend on bandwidth, just that the application isn't a good benchmark for comparing cards that share this behavior.
If the 9500 non-Pro versus the 9000 Pro shows equal performance under the right conditions, this illustrates that GT 2 and GT 3 are fillrate limited for the purposes of that comparison, which should indicate how the ps 1.4 execution characteristics of the 9000 Pro compare to those of the 9500 non-Pro (i.e., they are the same). That would be a good reason to start believing this reflects what the benchmark (as far as GT 2 and GT 3 go) is able to test, pending confirmation from actually evaluating other cards as they are released. (The usefulness of a benchmark in the real world depends on the set of real-world items it can actually be applied to.)
Sorry, don't have a radeon 9500 / 9500 pro
Thomas
Someone who does should get cracking!