400M vertices/s * ~40 bytes/vertex = 16 GB/s. The only way it will ever approach 400M vertices/s is if you are redrawing the same vertices over and over again out of the vertex cache. Even if we assume 24-byte vertices, that is still 9.6 GB/s, so you've got a bandwidth problem.
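To make the arithmetic easy to check, here is a quick sketch in Python. The function name is made up for illustration, and it assumes every vertex is unique and has to be streamed in (no vertex cache reuse), with 1 GB = 10^9 bytes.

```python
def required_bandwidth_gb_s(vertices_per_s, bytes_per_vertex):
    """Bandwidth needed to stream unique vertices, in GB/s (1 GB = 1e9 bytes).

    Assumes no vertex cache reuse: every vertex is fetched exactly once.
    """
    return vertices_per_s * bytes_per_vertex / 1e9


CLAIMED_RATE = 400e6  # the quoted 400M vertices/s

# The two vertex sizes discussed above.
for size in (40, 24):
    gb_s = required_bandwidth_gb_s(CLAIMED_RATE, size)
    print(f"{size} bytes/vertex -> {gb_s:.1f} GB/s")
```

Plug in whatever vertex size you like; the required bandwidth scales linearly with it.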
400M is probably the theoretical max: take the minimal vertex transform, work out how many of them can be done per clock, and multiply by the clock rate. It doesn't take into account triangle setup, bus bandwidth, etc.
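A rough sketch of how a peak figure like that could be derived. The clock speed, pipeline count, and cycles per vertex below are hypothetical numbers picked only because they multiply out to 400M; they are not the specs of any actual card.

```python
def peak_vertex_rate(clock_hz, pipelines, cycles_per_vertex):
    """Theoretical peak vertices/s for a minimal transform.

    Ignores triangle setup, bus bandwidth, and everything else that
    limits real-world throughput.
    """
    return clock_hz * pipelines / cycles_per_vertex


# Hypothetical hardware: 300 MHz core, 4 vertex pipelines,
# 3 clocks for the minimal transform on each pipeline.
rate = peak_vertex_rate(clock_hz=300e6, pipelines=4, cycles_per_vertex=3)
print(f"{rate / 1e6:.0f}M vertices/s")
```

Any shader longer than the minimal transform raises the cycles per vertex and pulls the number down proportionally.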
Someone needs to release a reasonably complicated vertex and pixel shader benchmark, something like SPEC CPU or Dhrystone, and let cards bench against that, e.g. "we get 13,000 VertexStones and 9,000 PixelStones."
The raw performance of a minimal vertex shader is a poor yardstick for gauging real performance.