Of course.
But it's all academic in the context of driver improvements etc., when it only has the potential to improve performance by 1%.
I believe that, in the Nvidia architecture, triangle performance is joined at the hip with other functional blocks.
So if you have, say, 2 triangles per clock for some mid-range device, and you scale it up, you get 4 or 6 depending on the scaling ratio.
That doesn't mean that those extra units for the higher SKUs will be used at full capacity.
Not sure if that's the case for AMD, but I don't think it is.
Not saying that it was a mistake to have up to 11 or 17 or whatever triangles per clock; it may simply be a free benefit of some architectural choices. But I'm just questioning their value in terms of final gaming performance.
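
To put rough numbers on the scaling argument, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the function names are made up, the 2 triangles per clock baseline and the 2x/3x scale factors are the hypothetical ones from the example above, and the 2% share of frame time limited by triangle setup is a guess, not a measurement from any real GPU or game. The gain is estimated with a simple Amdahl-style formula.

```python
# Illustrative only: all numbers below are assumptions, not measurements.

def peak_tris_per_clock(base_tris_per_clock, scale_factor):
    """Peak triangle rate if triangle units scale in lockstep with the other blocks."""
    return base_tris_per_clock * scale_factor


def overall_speedup(setup_bound_fraction, setup_speedup):
    """Amdahl-style estimate: only the setup-bound share of the frame gets faster."""
    return 1.0 / ((1.0 - setup_bound_fraction) + setup_bound_fraction / setup_speedup)


MID_RANGE_TRIS = 2          # hypothetical mid-range part: 2 triangles per clock
SETUP_BOUND_FRACTION = 0.02  # assumed: 2% of frame time is limited by triangle rate

for scale in (2, 3):
    peak = peak_tris_per_clock(MID_RANGE_TRIS, scale)
    gain = overall_speedup(SETUP_BOUND_FRACTION, peak / MID_RANGE_TRIS)
    print(f"{peak} tris/clock -> about {(gain - 1.0) * 100:.1f}% faster overall")
```

Under those assumptions, doubling or tripling the peak triangle rate moves overall performance by roughly a percent, which is the point: the peak number grows with the SKU, but the final frame rate barely notices unless a much larger share of the frame is actually triangle-bound.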