NV31/34 (and, I assume, NV36) run as 2x2 when multitexturing and as 4x1 when single-texturing.
I'm not sure how "shader parity" relates to the fillrate numbers one can glean from a 2x2/4x1 pipeline organization. Normally, at equal clock speeds, a 4x1/4x2 card can output twice as many shaded pixels per clock as a 2x1/2x2 card, simply because it has twice as many pixel pipelines (and thus twice as many pixel "shaders," one embedded in each pipeline). But the FX and Radeon DX9 lines don't really have comparable shader hardware, if we take nV's "sea of shaders" line at face value, so the comparison isn't that clear-cut. As it stands, Radeons appear to be more efficient per clock, but who knows what driver updates and special code paths (which may affect more than just brute shader force) will bring?
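
For what it's worth, the "twice as many pixels per clock" arithmetic can be sketched in a few lines of Python. This is only the usual back-of-the-envelope model (pipelines x clock for pixels, pipelines x TMUs per pipe x clock for texels). The function name `peak_rates` and the 300 MHz clock are made up for illustration, not taken from any real card's spec sheet, and the model ignores shader throughput, memory bandwidth and drivers, which is exactly where the FX vs. Radeon comparison gets murky.

```python
def peak_rates(pipelines, tmus_per_pipe, clock_mhz):
    """Return theoretical peak (Mpixels/s, Mtexels/s) for an NxM layout.

    N = pixel pipelines, M = texture units per pipeline.
    Paper numbers only: no shader, bandwidth or driver effects.
    """
    pixels = pipelines * clock_mhz
    texels = pipelines * tmus_per_pipe * clock_mhz
    return pixels, texels


CLOCK_MHZ = 300  # hypothetical clock, same for every layout (assumption)

# Layouts mentioned in the post, written as (pipelines, TMUs per pipeline).
layouts = {"2x1": (2, 1), "2x2": (2, 2), "4x1": (4, 1), "4x2": (4, 2)}

for name, (pipes, tmus) in layouts.items():
    mpix, mtex = peak_rates(pipes, tmus, CLOCK_MHZ)
    print(f"{name}: {mpix} Mpixels/s, {mtex} Mtexels/s")
```

On paper, 2x2 and 4x1 come out with the same texel rate but 4x1 has double the pixel rate, which fits a chip preferring 4x1 for single texturing and 2x2 for multitexturing.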