The vertex program extensions provide almost the same functionality. The ATI hardware is a little bit more capable, but not in any way that I care about. The ATI extension interface is massively more painful to use than the text parsing interface from nvidia. On the plus side, the ATI vertex programs are invariant with the normal OpenGL vertex processing, which allowed me to reuse a bunch of code. The Nvidia vertex programs can't be used in multipass algorithms with standard OpenGL passes, because they generate tiny differences in depth values, forcing you to implement EVERYTHING with vertex programs. Nvidia is planning on making this optional in the future, at a slight speed
cost.