I agree with you there. But ideally you'd want something that generally gives good performance without having to rewrite things manually too much. AMD's pre-SI chips could get very good performance for the right workloads too, but were (rightly so) criticized because performance for other workloads was just bad.
Well, once again, Kepler sacrifices registers/cache for flops, and sometimes that hurts performance a good bit. If you rewrite programs to take this into account, you can achieve very good results. Example: "Understanding the Efficiency of Ray Traversal on GPUs – Kepler and Fermi Addendum". Clearly, this requires extra work and may not always be possible, but Kepler does have the potential for very high GPGPU performance.
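To put rough numbers on the registers/cache-versus-flops trade-off, here is a back-of-the-envelope sketch in Python. The per-multiprocessor figures are the ones I recall from NVIDIA's public whitepapers for GF110 and GK110, so treat them as assumptions and check them against the official specs; the names `SPECS` and `per_core` are made up for this illustration.

```python
# Per-multiprocessor figures as recalled from NVIDIA's public whitepapers.
# These are assumptions for illustration; verify against the official specs.
SPECS = {
    "Fermi GF110 (SM)": {"cores": 32, "registers": 32768, "shared_l1_kib": 64},
    "Kepler GK110 (SMX)": {"cores": 192, "registers": 65536, "shared_l1_kib": 64},
}


def per_core(spec):
    """Return (32-bit registers per core, KiB of shared memory + L1 per core)."""
    return (
        spec["registers"] / spec["cores"],
        spec["shared_l1_kib"] / spec["cores"],
    )


if __name__ == "__main__":
    for name, spec in SPECS.items():
        regs, kib = per_core(spec)
        print(f"{name}: {regs:.0f} registers/core, {kib:.2f} KiB shared+L1/core")
```

Under those assumed figures, each Kepler core has roughly a third of the register file and a sixth of the shared memory/L1 that a Fermi core had, which is consistent with code tuned for Fermi needing rework to keep Kepler's ALUs busy.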
And as far as I know, none of these OpenCL/DirectCompute benchmarks does anything particularly stupid, so seeing Titan fall behind the "little" 7970 GE is quite disappointing, given that it has a sizable raw power advantage (granted, it's not _that_ big with SP).
Looks like AMD did an awful lot right with GCN indeed. I think there's no question that Teslas are outselling FirePros (probably by an order of magnitude), but on the chip side of things this doesn't really seem justified.
GCN is probably easier to use, but since Teslas are outselling FirePros (to the best of my knowledge), the general feeling must be that NVIDIA's software is better. Whether things will remain that way is another story.