Well, whatever is slowest is where the bottleneck is going to be; it's not really stacking up.
It's just the consequence of having to feed things through in a certain way. Let's say the geometry shader slows you down but you have a fast pixel shader. Well, you really aren't going to go any faster, because you still have the geometry shader to worry about. If it were the other way around, now you'd have to worry about the pixel shader. If everything in the pipeline is slow, lol, well, someone did something wrong.
I can't really understand what you have in mind. A program is not a pipe with a diameter. In a program, slow + slow + slow = 3x slow, not 1x slow. A program is (if you like images) like a mountain range: climbing one hill doesn't free you from climbing all the other hills as well to reach your destination.
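For what it's worth, here is a rough sketch of the two mental models being argued about, just to make the difference concrete. The stage names and per-item costs are made-up numbers, and the "pipelined" formula assumes an idealized pipeline where stages fully overlap with no stalls, which real hardware only approximates.

```python
def pipelined_time(stage_costs, n_items):
    """Idealized pipeline: stages overlap, so after the first item
    has passed through every stage, one item finishes per
    slowest-stage interval."""
    return sum(stage_costs) + (n_items - 1) * max(stage_costs)


def serial_time(stage_costs, n_items):
    """No overlap at all: every item pays for every stage in turn."""
    return n_items * sum(stage_costs)


# Hypothetical per-item costs (arbitrary time units).
geometry, pixel = 3.0, 1.0
n = 1000

print(pipelined_time([geometry, pixel], n))      # slowest stage dominates
print(serial_time([geometry, pixel], n))         # costs add up

# Halving the pixel cost barely moves the pipelined total,
# but it does shrink the serial total.
print(pipelined_time([geometry, pixel / 2], n))
print(serial_time([geometry, pixel / 2], n))
```

In the overlapped model only the slowest stage matters once the pipe is full; in the non-overlapped model every stage's cost adds to the total. Which model is closer to reality depends on how well the stages actually overlap.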
There are bottlenecks that affect program time and there are bottlenecks that affect parallelism, and those two add up as well, often in complex (non-linear) ways. Register space limits parallelism, so throughput suffers; then other bottlenecks, like primitive rate, get hit, and throughput suffers more.
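As a rough illustration of the register point, here is a minimal sketch of how per-thread register use can cap the number of warps (wavefronts) resident at once. The register file size, warp size, and warp limit below are assumed round numbers for illustration, not the specs of any particular GPU, and the model ignores allocation granularity and every other resource limit.

```python
def resident_warps(regs_per_thread,
                   regfile_size=65536,   # assumed registers per compute unit
                   warp_size=32,         # assumed threads per warp
                   max_warps=64):        # assumed scheduler limit
    """Warps that fit at once, limited by either the register file
    or the scheduler's own cap, whichever is smaller."""
    by_registers = regfile_size // (regs_per_thread * warp_size)
    return min(max_warps, by_registers)


for regs in (32, 64, 128):
    print(regs, "regs/thread ->", resident_warps(regs), "resident warps")
```

With these assumed numbers, going from 32 to 128 registers per thread cuts the resident warps from 64 to 16, which leaves less other work available to hide latency. That is the sense in which a register bottleneck can make the other bottlenecks hurt more.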