The judder they're talking about is from animated elements within the scene. For those, yes, of course you'll get motion artifacts when you have an irregular rendering rate. Head tracking, however, is not disturbed by that when you have positional reprojection, because you have the tracking information and the necessary scene information to construct a frame. It's POV interpolation, not scene interpolation.
If you look at a static cubemap in VR you're essentially looking at a 0 FPS scene that's reprojected to whatever your display rate is. If it were an animated cubemap (say, a room with an object in it moving back and forth), an irregular animation rate would result in the object not being tracked consistently by your eye, producing a judder artifact similar to the one you get on full-persistence displays.
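To make the "POV interpolation" point concrete, here's a minimal sketch (yaw-only, all names made up for illustration) of what the compositor does: it re-samples the last rendered image using the latest head pose, so head tracking stays locked to the display rate even if the source image never updates.

```python
import math

def reproject_direction(direction_xz, yaw_at_render, yaw_at_scanout):
    """Rotate a view-space direction by the head rotation that happened
    after the frame was rendered -- the 'POV interpolation' part."""
    delta = yaw_at_scanout - yaw_at_render        # extra head rotation since render time
    c, s = math.cos(-delta), math.sin(-delta)     # counter-rotate the stale image
    x, z = direction_xz
    return (c * x - s * z, s * x + c * z)

# A static cubemap is the degenerate case: the source frame never changes (0 FPS),
# yet every scanout still gets a correct view because only the pose is updated.
print(reproject_direction((0.0, 1.0), yaw_at_render=0.0, yaw_at_scanout=math.radians(5)))
```

Positional reprojection additionally needs depth to offset for head translation, which is what the "necessary scene information" above refers to.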
I'm not arguing that constantly hovering between 70-90fps on a 90Hz HMD results in a negligible difference and can be ignored by developers, but rather that brief dips below 90fps are often imperceptible, because you don't get that nasty kicked-in-the-head feeling we used to get when a buffer swap was missed.
Yes, I agree.
I don't understand why you think we're in disagreement. The link you posted starts off with "Asynchronous timewarp (ATW) is a technique that generates intermediate frames in situations when the game can’t maintain frame rate, helping to reduce judder.", so clearly what I said above is exactly right. I never said it was perfect; obviously you want to hold full frame rate, but if and when you do drop frames, ATW (aka re-projection) will fill in the blanks, with the result that you never drop a frame to the display.
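In other words, the compositor runs at the display rate and simply reuses the most recent completed application frame whenever a new one isn't ready, re-warping it with the freshest pose. A hedged sketch of that idea (illustrative names only, not any actual SDK API):

```python
from dataclasses import dataclass

@dataclass
class AppFrame:
    image: str          # stand-in for the rendered eye buffers
    render_pose: float  # head yaw (radians) the app rendered with

def compositor_vsync(latest_complete_frame: AppFrame, current_pose: float) -> str:
    # Warp the old image from the pose it was rendered at to the pose we have now.
    pose_delta = current_pose - latest_complete_frame.render_pose
    return f"{latest_complete_frame.image} warped by {pose_delta:+.3f} rad"

# If the app misses vsync, the same AppFrame is simply reused; only the warp changes,
# so the display itself never misses a refresh.
stale = AppFrame(image="frame_41", render_pose=0.10)
print(compositor_vsync(stale, current_pose=0.12))  # app delivered in time
print(compositor_vsync(stale, current_pose=0.15))  # app missed; frame_41 reused, head tracking still smooth
```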
My point was simply that if you're already using ATW to double your frame rate from 60->120 fps, then drops below 60fps (say into the high 50's) are probably going to be a lot more noticeable than drops into the mid 80's when you're outputting at 90Hz. That's why, while Oculus are also pushing for a native 90fps, it's unsurprising that Sony are being more strict with the 60fps requirement.
The only thing I disagree with is the idea that dropped frames have more impact at 60->120 than at 90->90. Sony wouldn't allow a game with a frame rate that occasionally drops to 80; they tell the devs to either optimize until they get a stable 90, or use the 60->120 mode. They say dropped frames are bad and cause discomfort, and so does Oculus in their best practices document.
Oculus says ATW reduces the judder compared to not using reprojection at all. This is about head rotation, not scene movements. For head rotation, it could be rendering at 10fps or 1000fps without any visible change. The problem of mismatched frame rate or frame drops is about translations, as hughj said above.
60->120 dropped frames would break scene movements with a frame being reused for 24ms instead of the normal 16ms. The glitch would last 8ms.
90->90 dropped frames would break scene movements with a frame being reused for 22ms instead of the normal 11ms. The glitch would last 11ms.
So to use the examples we have...
55->120 is 5 dropped frames per second, 8ms error each.
80->90 is 10 dropped frames per second, 11ms error each.
Both may or may not be very noticeable, but with reprojection the error is quantized to the scanout rate, not the rendering rate: a frame's display time becomes the number of scanouts it occupies.
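Here's the arithmetic behind those examples, using exact scanout periods (the figures above are rounded to whole milliseconds); the function name is just for illustration.

```python
def judder_numbers(actual_fps, target_fps, scanout_hz):
    scanout_ms = 1000 / scanout_hz
    normal_scanouts = scanout_hz // target_fps        # scanouts one frame normally covers
    normal_ms = normal_scanouts * scanout_ms
    missed_ms = (normal_scanouts + 1) * scanout_ms    # one extra scanout when a frame is missed
    drops_per_second = target_fps - actual_fps        # frames the app fails to deliver each second
    return drops_per_second, normal_ms, missed_ms, missed_ms - normal_ms

for actual, target, hz in [(55, 60, 120), (80, 90, 90)]:
    drops, normal, missed, glitch = judder_numbers(actual, target, hz)
    print(f"{actual}->{hz}: {drops} misses/s, frame shown for {missed:.1f}ms "
          f"instead of {normal:.1f}ms, error {glitch:.1f}ms each")
```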
Caveat: there is a serious additional artifact from doubling frames 60->120, but it's unrelated to dropped frames. It also doesn't cause discomfort, because this artifact is stable and time-accurate.