That's an interesting point. However, I would say that shadow maps are relatively degenerate in terms of workload -- the vertices are simple.
That's just it though - in the unified shading era, almost all vertices are "simple". BTW, shadow maps still have to do all the same position calculations as scene vertices, including matrix blending.
Also, most modern hardware is very good at tossing out invalid vertices.
Not any faster than one per clock, which is no faster than the setup rate on modern GPUs, so culling them doesn't get you past the setup limit.
It can set up 500M verts/second. For a 1920x1080 monitor that's 1 vert per pixel with an overdraw of 4.
Forget about pixels per polygon, because it doesn't work that way. There are lots of vertices off the screen. You only have a limited granularity in scene culling by the CPU. You have lots of vertices not facing the camera.
On top of that you have to draw more than what's inside the view frustum for cascaded shadow maps and reflection maps.
Finally, vertex loads are very clumpy in nature, so it's extremely rare to have pixel and vertex load balanced for much of a frame. If you have 5M polygons to draw, you'd be setup-limited for, IMO, around 4M of them, because they only cover ~10% of the pixels. At 500M/s, that costs you 8 ms per frame on Xenos. All your other rendering - 90% of the pixels - has to be done in the remaining time, e.g. 8.7 ms for 60fps. Furthermore, no matter how many shader units and ROPs you have, they can only reduce this latter part.
This vertex load is 40% lower than your example, and we're still triangle-limited 48% of the time.
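For anyone who wants to check the arithmetic, here's a rough sketch in Python. The function names are made up for illustration, and it assumes the 500M/s setup rate quoted above, a 60fps target, and my guess of 4M setup-limited polygons; it isn't a model of any real GPU.

```python
# Back-of-the-envelope setup-limit arithmetic.
# Assumptions (not measured): 500M triangles/s setup rate, 60 fps target.
SETUP_RATE = 500e6  # triangles (or verts) set up per second
FPS = 60


def frame_budget_ms(fps=FPS):
    """Total time available per frame, in milliseconds."""
    return 1000.0 / fps


def setup_time_ms(count, rate=SETUP_RATE):
    """Time spent purely on setup for `count` triangles, in milliseconds."""
    return count / rate * 1000.0


def setup_limited_share(count, fps=FPS, rate=SETUP_RATE):
    """Fraction of the frame during which the GPU is setup-limited."""
    return setup_time_ms(count, rate) / frame_budget_ms(fps)


def verts_per_pixel(width, height, fps=FPS, rate=SETUP_RATE):
    """Verts per screen pixel per frame if setup ran flat out all frame."""
    return rate / fps / (width * height)


if __name__ == "__main__":
    limited = 4e6  # polygons assumed to be setup-limited (a guess)
    total = 5e6    # total polygons in the example frame
    peak = SETUP_RATE / FPS  # most the setup unit could do in one frame

    print(f"setup time:      {setup_time_ms(limited):.1f} ms")
    print(f"time left:       {frame_budget_ms() - setup_time_ms(limited):.1f} ms")
    print(f"setup-limited:   {setup_limited_share(limited):.0%} of the frame")
    print(f"verts per pixel: {verts_per_pixel(1920, 1080):.2f} at 1920x1080")
    print(f"load vs. peak:   {1 - total / peak:.0%} lower")
```

The point is that adding shader units or ROPs only shrinks the "time left" part; the setup time stays put unless the setup rate itself goes up.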
Anyway, like I was saying earlier, this only applies to SFR. The best way to look at SFR, then, is that it doubles resolution, not framerate.