No, they're definitely very different things, and the VFP can't run NEON instructions. Also, in the real world the VFP is mostly used in non-vector (scalar) mode, so it has to be fast there, even if it obviously can't be as fast as with 4-wide instructions. The A8's NEON, on the other hand, is a 2-wide FP32 engine which can, like MMX, also run things like 4-wide INT16, 8-wide INT8, etc. The A9's NEON is the same thing but twice as wide (similar to the FPU core of Qualcomm's 'Scorpion' CPU, codenamed 'VeNum').
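To make the "N-wide" figures concrete, here's a rough sketch of the arithmetic: lanes per instruction is just datapath width divided by element width. The 64-bit figure is what the 2-wide FP32 number implies, and the 128-bit figure is only my reading of "twice as wide", so treat both as assumptions. The `lanes` and `print_lanes` names are made up for illustration and aren't part of any ARM API.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: how many elements one SIMD instruction covers
 * when a datapath of datapath_bits is split into element_bits lanes. */
static unsigned lanes(unsigned datapath_bits, unsigned element_bits)
{
    /* The datapath must divide evenly into whole elements. */
    assert(element_bits != 0 && datapath_bits % element_bits == 0);
    return datapath_bits / element_bits;
}

/* Hypothetical helper: print the lane counts for one datapath width. */
static void print_lanes(const char *label, unsigned datapath_bits)
{
    printf("%s (%u-bit): %u x FP32, %u x INT16, %u x INT8\n",
           label, datapath_bits,
           lanes(datapath_bits, 32),
           lanes(datapath_bits, 16),
           lanes(datapath_bits, 8));
}
```

Calling `print_lanes("A8-style", 64)` gives the 2 / 4 / 8 split mentioned above, and `print_lanes("A9-style", 128)` doubles each of those, assuming the wider datapath really is 128 bits.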
Possibly the most interesting part about the VFP is what's happening to it in the Cortex-A9 generation, where they claim that full IEEE is now full-speed and that it's twice as fast. It's not clear whether it can be both at the same time (i.e. maybe it's 1/clock for full IEEE vs less than 1, and an extra 1 for the non-IEEE case?). However, the issue width to the VFP still seems to be 1, so if the claim is true and not just marketing, you'd expect the only way to achieve it to be via those vector instructions, as Farrar described there.
Since Farrar has apparently examined the VFP's ISA even more closely than I have, it would be interesting to hear whether he has any idea of what they could have done there... I've tried to poke ARM about it to see if I could get some info, but no luck so far.