Dave H said:
demalion-
I'm not sure I understand exactly what you're getting at...
But if your point is that 5200U and 5600U show large clock-normalized performance differences on some VS tasks and not on others, yes, I'm quite aware of that, but it doesn't really concern me, because we know there are plenty of other differences between NV31 and NV34 that could explain it.
What I am sure of--and AFAIK the review at hardware.fr is the only one to address this--is that comparisons of 5200 to 5200U, and 5600 to 5600U (i.e. differently clocked versions of the same chip), demonstrably show that VS performance scales linearly with clock rate, i.e. is done completely in hardware.
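For concreteness, the scaling check being described amounts to something like the sketch below; the 250 and 325 MHz figures are the published 5200/5200 Ultra core clocks, but the scores are invented placeholders, not hardware.fr's numbers.

```python
# Hypothetical illustration -- the scores are placeholders, not measured data.
def scales_with_clock(score_a, clock_a, score_b, clock_b, tolerance=0.05):
    """True if the score ratio tracks the clock ratio within tolerance,
    i.e. performance is consistent with the work being done in hardware."""
    return abs((score_b / score_a) / (clock_b / clock_a) - 1.0) <= tolerance

# 5200 at 250 MHz vs. 5200 Ultra at 325 MHz; 13.0 / 10.0 == 325 / 250 == 1.3
print(scales_with_clock(10.0, 250, 13.0, 325))  # True -> linear scaling
```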
Well, I thought I made it clear that my issue was with the bolded words (not one or the other set alone, but both together).
You quoted VS 1.1 benchmarks (in fact, my comment that the benchmarks were a good indication and a complex workload was based on thinking they were VS 2.0, before I corrected that in an edit), and one of them doesn't seem likely to use even all of VS 1.1's functionality. But we aren't talking about a vertex shader 1.1 benchmark accelerator; we are talking about a vertex shader 2.0 (and not just benchmark) accelerator, which to me leaves two issues:
1) All we have any indication of (in what you propose as the complete picture) is that on the CPU in question (a very fast one), the CPU workload for the specific (VS 1.1) benchmark in question is not limiting.
2) As far as I've seen, we have no idea how the CPU workload changes when implementing other VS functionality (as I tried to illustrate, among other things, with my charts). That means other VS 1.1 instructions, register counts, macros, VS 2.0...some pretty significant items. (In fact, it would be interesting to see this tested for a whole host of cards at the same time, but reviewers seem to have things like lives and stuff that get in the way of doing some of the testing I'm curious about.)
It is your insistence that this is a complete picture that continues to puzzle me, that's all. What if the behavior isn't the same with a 1 GHz Athlon or P III? Is it conceivable that the limitations might be different from those on a 2.8 GHz P4? And that isn't even the bottom end of the CPU range these bargain cards will address.
An example that comes to mind: watching Quake III scale perfectly with GPU clock speed on a Rage Pro and concluding, in the absence of any benchmarks that vary CPU performance, that the game's graphics speed is determined solely by the GPU. Or, as in my counter-example, observing a change based solely on CPU performance, ignoring the graphics card and resolution at which it is observed, and concluding the game is accelerated completely by the CPU.
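To make that worry concrete, here is a toy bottleneck model (the per-vertex costs are invented constants for illustration, not measurements of NV34 or any CPU): with a fast CPU the score scales perfectly with GPU clock, while the identical workload goes flat on a slow CPU.

```python
# Toy bottleneck model -- cpu_cost and gpu_cost are invented constants chosen
# only to show how the limiting factor can shift with CPU speed.
def toy_vs_throughput(cpu_mhz, gpu_mhz, cpu_cost=40.0, gpu_cost=8.0):
    """Throughput limited by the slower of a CPU-side setup rate and a
    GPU vertex-shader rate (arbitrary units)."""
    return min(cpu_mhz / cpu_cost, gpu_mhz / gpu_cost)

# Fast CPU (2.8 GHz): score tracks GPU clock; 250 -> 325 MHz gives a 1.3x gain.
print(toy_vs_throughput(2800, 250), toy_vs_throughput(2800, 325))  # 31.25 40.625
# Slow CPU (1 GHz): the same GPU clock bump does nothing -- CPU-limited.
print(toy_vs_throughput(1000, 250), toy_vs_throughput(1000, 325))  # 25.0 25.0
```

Under this (entirely hypothetical) model, linear GPU-clock scaling on a 2.8 GHz P4 and a CPU bottleneck on a 1 GHz part are perfectly compatible, which is why both axes would need testing before calling the picture complete.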