Pete said:
From a consumer's point of view it's not like Nvidia would have sold their cards cheaper if there was no SM3.0 support anyway....
Possibly, though it's also possible nV could have clocked their cards higher (like ATi) if they hadn't had to work SM3.0 support into the design.
And these are valuable points. For instance, SM3.0 is being sold on the fact that some of its features will give performance increases or are easier to support, and that's true enough when you're comparing SM2.0 and SM3.0 paths on the same board; however, when you're comparing different hardware with different properties, you can't just take everything as read.
For instance, SM3.0 introduces vertex instancing, which should provide a performance benefit on NV40. However, NV40's vertex performance appears to be lower than R420's; will the use of vertex instancing allow NV40 to regain that ground? We don't know until we've tested it in a wide variety of scenarios. Similarly, NV40's PS3.0 unit supports dynamic branching, which can be a performance benefit in some cases, although (as we've learnt recently) it also carries some performance penalties. Given that R420's PS2.0 performance is generally higher than NV40's, will the cost of state changes for unrolled PS2.0 shaders on R420 turn out to be higher or lower than the cost of NV40's dynamic branching? The sketches below show what each of these two mechanisms actually involves.
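To make the instancing point concrete, here's a minimal sketch of how geometry instancing is exposed under Direct3D 9, the API path through which NV40's SM3.0 support is used. The vertex layouts and the DrawInstanced helper are hypothetical, purely for illustration; a real engine would bind its own declarations, index buffer, and buffers before this point:

```cpp
#include <d3d9.h>

// Hypothetical vertex layouts; a real application's declarations would differ.
struct MeshVertex   { float pos[3], normal[3], uv[2]; };  // per-vertex data
struct InstanceData { float world[4][4]; };               // per-instance transform

// Assumes the vertex declaration and index buffer are already bound.
void DrawInstanced(IDirect3DDevice9 *device,
                   IDirect3DVertexBuffer9 *meshVB,
                   IDirect3DVertexBuffer9 *instanceVB,
                   UINT numVertices, UINT numTris, UINT numInstances)
{
    // Stream 0 holds one copy of the mesh; ask the runtime to replay it
    // numInstances times within a single draw call.
    device->SetStreamSource(0, meshVB, 0, sizeof(MeshVertex));
    device->SetStreamSourceFreq(0, D3DSTREAMSOURCE_INDEXEDDATA | numInstances);

    // Stream 1 advances once per instance instead of once per vertex,
    // feeding each instance its own world transform.
    device->SetStreamSource(1, instanceVB, 0, sizeof(InstanceData));
    device->SetStreamSourceFreq(1, D3DSTREAMSOURCE_INSTANCEDATA | 1u);

    // One call submits every instance; without instancing this would be
    // numInstances separate DrawIndexedPrimitive calls, each with its own
    // CPU submission overhead.
    device->DrawIndexedPrimitive(D3DPT_TRIANGLELIST, 0, 0,
                                 numVertices, 0, numTris);

    // Reset the stream frequencies to their defaults afterwards.
    device->SetStreamSourceFreq(0, 1);
    device->SetStreamSourceFreq(1, 1);
}
```

The point is that one draw call replaces numInstances calls, which is why instancing is expected to help submission-bound scenes; whether that actually closes NV40's vertex-rate gap with R420 is exactly the open question.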
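And on the branching side, the trade-off being weighed looks roughly like this. Again a sketch, not anyone's actual renderer: g_psUnrolled and g_psLooping are hypothetical stand-ins for shaders compiled offline, with the PS2.0 route keeping one unrolled variant per light count and the PS3.0 route using a single shader that loops at runtime:

```cpp
#include <d3d9.h>

// Hypothetical precompiled shaders (not real assets):
//   g_psUnrolled[n-1] - ps_2_0 variant unrolled for exactly n lights
//   g_psLooping       - single ps_3_0 shader that loops/branches at runtime
extern IDirect3DPixelShader9 *g_psUnrolled[4];
extern IDirect3DPixelShader9 *g_psLooping;

// PS2.0 / R420-style path: whenever the light count differs between draw
// calls, we pay a pixel shader state change to swap in the right variant.
void SetupLightingPS20(IDirect3DDevice9 *device, int lightCount)
{
    device->SetPixelShader(g_psUnrolled[lightCount - 1]);
}

// PS3.0 / NV40-style path: the shader stays bound across draws; only a
// constant changes, and the branching cost moves into the shader itself.
void SetupLightingPS30(IDirect3DDevice9 *device, int lightCount)
{
    device->SetPixelShader(g_psLooping);

    // Feed the light count to the shader's loop register (i0.x); any
    // per-pixel early-out inside that loop is where dynamic branching
    // actually comes into play.
    int i0[4] = { lightCount, 0, 0, 0 };
    device->SetPixelShaderConstantI(0, i0, 1);
}
```

Which of the two costs more in practice, R420's state changes or NV40's in-shader branching, is precisely what we have no numbers for yet.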
These things aren't determined because we've not seen any metrics for them, and even when we do, they're unlikely to be definitive, since they will inevitably change from application to application. From the end user's perspective, then, these questions aren't settled. As trinibwoy points out, R300's PS1.x support showed parity with, or bettered, NVIDIA's PS1.x support in PS1.x applications; but what would you say if it turned out that ATI's PS2.0 support was faster than NVIDIA's PS3.0 with similar IQ?
Factors like these are neither settled nor, frankly, understood yet, as we've not seen sufficient tests, or even titles, that go to this level of complexity.