Rampage was running, and it was running DX. Fixed chips could run OpenGL, but an unfixed chip couldn't (if I recall correctly, it could only support direct writes, so there were issues with the FIFO buffer).
Rampage had some transformation capabilities of its own. There would have been a low-end version without Sage, then a mid-range version with Sage, and a high-end version with two Rampages and one Sage. Sage was extremely powerful, though unfortunately it lacked address ops, so it only supported 1.0 vertex shaders (meaning there wasn't a matrix palette, though our people had come up with some good tricks for getting around the issue).
Socketable? No, I've never heard of it being that way.
HOS and Photoshop-type filters - yes.
SAGE 2
No. I think, at the very least, SAGE2 needed to have its own RAM. But it could also have done some of the binning work, i.e. SAGE2 could have done all the binning and sent the rasteriser only the data that needed to be processed. I don't know if that was how it was meant to operate, but it makes for some interesting thoughts as to exactly where you split the processes.
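For anyone unsure what "binning" means here, below is a rough sketch of the general idea in a tiled renderer: sort triangles into screen tiles up front, so that whatever rasterises a tile only receives the triangles that can touch it. This is purely illustrative and says nothing about how SAGE2 or Fear actually did it; the tile size, the bounding-box test and the name `bin_triangles` are all my own assumptions.

```python
import math
from collections import defaultdict

TILE = 32  # assumed tile size in pixels, not a figure from any real chip


def bin_triangles(triangles, width, height, tile=TILE):
    """Map each tile (tx, ty) to the indices of triangles whose
    screen-space bounding box overlaps that tile.

    `triangles` is a list of three (x, y) vertex tuples in pixel coordinates.
    A bounding-box test is conservative: it may list a triangle for a tile
    it does not actually cover, but it never misses one.
    """
    bins = defaultdict(list)
    max_tx = (width - 1) // tile
    max_ty = (height - 1) // tile
    for idx, tri in enumerate(triangles):
        xs = [x for x, _ in tri]
        ys = [y for _, y in tri]
        # Drop triangles that lie entirely off screen.
        if max(xs) < 0 or max(ys) < 0 or min(xs) >= width or min(ys) >= height:
            continue
        # Clamp the bounding box to the range of valid tile coordinates.
        x0 = max(math.floor(min(xs)) // tile, 0)
        x1 = min(math.floor(max(xs)) // tile, max_tx)
        y0 = max(math.floor(min(ys)) // tile, 0)
        y1 = min(math.floor(max(ys)) // tile, max_ty)
        for ty in range(y0, y1 + 1):
            for tx in range(x0, x1 + 1):
                bins[(tx, ty)].append(idx)
    return dict(bins)
```

The question of where you split the processes is then just which chip runs a loop like this and which one consumes the per-tile lists.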
As for geometry data issues, I'm fairly sure that GP had a Hierarchical Z-Buffer before binning in the first place, which helped alleviate some geometry overhead. I also think SAGE2 had geometry compression, which would also have helped with the binning on Fear.