It's unclear. It seems more likely now that NV34 (GFfx 5200) does indeed have at least one hardware vertex shader. This is--at the least--very impressive, considering it also packs a fully functional 4x1 pipeline with full PS/VS 2.0+ support into ~45 million transistors.
It's downright amazing when you consider NV31 (GFfx 5600) is also a 4x1 with full PS/VS 2.0+, but takes 80 million transistors, almost double. The only functional difference between the two that Nvidia has specified is that NV31 does color and z-compression, while NV34 does neither. Fine, and that takes up some amount of transistors...but not 35 million.
Pixel and vertex shader results from the reviews we've seen today seem to indicate NV34 has ~25-50% less shader performance per-clock than NV31. Some of that is surely explained by NV31 having more physical shader units, although some is probably also a result of the lack of z-compression on NV34. I'd hold out for some more "pure" shader tests (i.e. that don't actually display anything useful and thus don't use z at all, taking that variable out of the equation) before making any firm pronouncements, but it's almost certain there are transistors saved here, too. Again, it doesn't seem like near enough to get us from 80 to 45 million.
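To make the per-clock comparison above concrete, here's a minimal sketch (all scores and clocks below are hypothetical, not from any real review; `per_clock` is just score divided by core clock, which factors out the clock-speed difference between the two cards):

```python
# Hypothetical per-clock shader throughput comparison (illustrative numbers only).
# Normalizing a benchmark score by core clock removes raw clock-speed differences,
# leaving a rough proxy for per-clock shader capability.

def per_clock(score: float, core_mhz: float) -> float:
    """Benchmark score per MHz of core clock."""
    return score / core_mhz

# Made-up example scores from some pure shader test:
nv31 = per_clock(score=1400.0, core_mhz=350.0)  # hypothetical NV31 result
nv34 = per_clock(score=810.0, core_mhz=325.0)   # hypothetical NV34 result

deficit = 1.0 - nv34 / nv31  # fraction by which NV34 trails NV31 per clock
print(f"NV34 per-clock deficit vs NV31: {deficit:.0%}")
```

With these invented numbers the deficit lands around 38%, inside the ~25-50% range the reviews suggest; the real value would of course depend on the actual benchmark used.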
Anand's review, interestingly enough, mentions a third source of saved transistors: they claim NV31 is more deeply pipelined than NV34, trading extra transistors for higher attainable clocks. Presumably they didn't come up with this on their own, so Nvidia must have told them, in which case it may be true. Then again, it may not: NV31 does seem to overclock higher than NV34 (reviewers have been able to push reference NV31s above 400MHz, while NV34s don't seem to get above the 325MHz Ultra clock), but NV31 is also made on the .13u process while NV34 is on .15u.
Meanwhile, several reliable members of the forum have been implying for some time not only that NV34 lacks full hardware vertex shader functionality, but that NV31 may also use some part-hardware/part-software scheme.
So what gives?
I dunno. The NV34 vertex shading scores we saw today were not remarkably bad, but considering most reviewers used ~3GHz-class CPUs, that wouldn't be surprising even if software VS were being used. By far the simplest way to sort this out would be to have someone run a vertex-shader-limited test (the 3DMark03 VS test should work very nicely) at two very different CPU speeds. That should tell us all we need to know.
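The logic behind that two-CPU diagnostic can be sketched as follows (the scores, the CPU ratio, and the `tolerance` threshold are all invented for illustration; `likely_software_vs` is not part of any benchmark, just a stand-in for eyeballing the numbers):

```python
# Sketch of the proposed diagnostic: run the same VS-limited test on the same
# card at two different CPU speeds. If the score scales with CPU clock, vertex
# shading is likely done (at least partly) in software on the CPU; if it stays
# roughly flat, it's likely running on dedicated GPU hardware.

def likely_software_vs(score_slow_cpu: float, score_fast_cpu: float,
                       cpu_ratio: float, tolerance: float = 0.15) -> bool:
    """True if the score scaled with CPU speed more than `tolerance` allows."""
    scaling = score_fast_cpu / score_slow_cpu  # 1.0 would mean no CPU dependence
    return scaling > 1.0 + tolerance * (cpu_ratio - 1.0)

# Hypothetical: same card tested on a 1.5GHz and a 3.0GHz CPU (ratio 2.0).
print(likely_software_vs(10.0, 19.0, cpu_ratio=2.0))  # scaled ~1.9x -> True
print(likely_software_vs(10.0, 10.5, cpu_ratio=2.0))  # nearly flat  -> False
```

The threshold matters because even a hardware VS test will scale a little with CPU speed (driver overhead, scene setup), so "flat" really means "scales far less than the CPU did".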
Until then, I don't think we can say one way or the other.
(edit: P.S. - the only difference between Ultra and non-Ultra should be core and memory clock rates.)