In my experience (I did try both ways), using post-projection z gave significantly worse depth precision than linear z, even with PCF, and several papers and applications recommend the same thing.
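To make the precision point concrete, here is a minimal sketch comparing the two metrics. It assumes a D3D-style [0, 1] depth range and near/far planes of 1 and 1000; the function names and plane values are made up for illustration and are not from any particular engine.

```python
def post_projection_depth(z, near, far):
    """Depth after the perspective divide, assuming a D3D-style [0, 1] range."""
    return (far / (far - near)) * (1.0 - near / z)


def linear_depth(z, near, far):
    """View-space distance rescaled linearly to [0, 1]."""
    return (z - near) / (far - near)


# Hypothetical light frustum; pick your own planes.
near, far = 1.0, 1000.0

for z in (1.0, 2.0, 10.0, 100.0, 500.5, 1000.0):
    pp = post_projection_depth(z, near, far)
    lin = linear_depth(z, near, far)
    print(f"z={z:7.1f}  post-projection={pp:.6f}  linear={lin:.6f}")
```

With these assumed planes, post-projection z has already used up half of its range by z = 2, so everything farther away is squeezed into the remaining half, while linear z spreads the range evenly over distance.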
fp32 is arguably overkill for PCF, but it certainly isn't for VSM, where the second moment is somewhat unstable. Still, fp16 is certainly *not* enough for PCF, so you're stuck with depth formats or fp32. There are a number of reasons why I dislike the depth formats (including the aforementioned inflexibility with different metrics, biasing and a few more), but if I were doing a straight PCF implementation I'd certainly look into them.
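A rough way to see both problems is to round-trip values through half and single precision on the CPU. This sketch uses Python's `struct` module to emulate the storage formats; the helper names and the sample depths are assumptions for illustration, and it ignores filtering and everything else a real shadow map does.

```python
import struct


def to_fp16(x):
    """Round x to the nearest IEEE half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]


def to_fp32(x):
    """Round x to the nearest IEEE single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def stored_variance(depth, quantize):
    """Variance recovered from moments that were quantized when stored.

    For a single constant depth the exact answer is zero, so whatever
    comes back is pure storage error.
    """
    m1 = quantize(depth)
    m2 = quantize(depth * depth)
    return m2 - m1 * m1


# Two nearby occluder depths (hypothetical values in [0, 1]).
a, b = 0.7, 0.7002
print("fp16:", to_fp16(a), to_fp16(b))  # both land on the same half value
print("fp32:", to_fp32(a), to_fp32(b))  # still distinct

print("variance error, fp16 moments:", stored_variance(0.7, to_fp16))
print("variance error, fp32 moments:", stored_variance(0.7, to_fp32))
```

In half precision the two depths become indistinguishable, which is the PCF problem, and the variance error from fp16 moments is several orders of magnitude larger than from fp32 moments, which is the VSM problem.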
That said, I think I've given ample justification (and read Gems 3 for more!) for why VSM is often a better option than PCF, and with VSMs you certainly need to render to a color buffer. The only alternative is to render to the depth buffer, then read that back and write depth and depth^2 to a color buffer. I'm fairly certain this would be slower on most architectures (excepting maybe the 360), and more importantly it eliminates the possibility of using derivatives to represent the variance of a certain pixel, which is often desirable albeit not crucial.
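For reference, here is a CPU-side sketch of what gets written to the color buffer and how it is used at lookup time. The derivative term below is one common way to fold a per-pixel depth gradient into the second moment, and the lookup is the usual one-sided Chebyshev bound; the function names, the 0.25 weighting and the `min_variance` clamp are assumptions for illustration, not a definitive implementation. In a shader, `ddx_depth` and `ddy_depth` would come from the screen-space derivative instructions, which is exactly what you lose if you read back a depth buffer instead.

```python
def compute_moments(depth, ddx_depth=0.0, ddy_depth=0.0):
    """Return (M1, M2) for one shadow-map texel.

    M1 is the depth and M2 is depth^2, optionally widened by the depth
    gradient so that a sloped surface reports non-zero variance.
    """
    m1 = depth
    m2 = depth * depth + 0.25 * (ddx_depth * ddx_depth + ddy_depth * ddy_depth)
    return m1, m2


def chebyshev_upper_bound(moments, receiver_depth, min_variance=1e-6):
    """Upper bound on the fraction of the filter region that is lit."""
    mean, m2 = moments
    if receiver_depth <= mean:
        return 1.0  # receiver is in front of the average occluder
    variance = max(m2 - mean * mean, min_variance)
    d = receiver_depth - mean
    return variance / (variance + d * d)


flat = compute_moments(0.5)               # no gradient: variance is zero
sloped = compute_moments(0.5, 0.2, 0.0)   # gradient widens the variance
print(chebyshev_upper_bound(flat, 0.6))
print(chebyshev_upper_bound(sloped, 0.6))
```

The flat texel relies entirely on the minimum-variance clamp, while the sloped one gets a sensible variance straight from the derivatives.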
Anyways, I'm sure PCF implementations using the depth buffer are quite usable (there wasn't even another option until recently), but in my experience they suffer from quite a few problems, and the depth formats are far too limiting, even in DX10.