Yes, it is possible to apply transformations to a subset of vertex programs for mixed-mode execution. If dependencies aren't an issue, the non-fixed-function idioms can be moved to the front of the vertex shader, and the shader can then be executed in two phases.
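To make the dependency condition concrete, here is a rough sketch of how such a split might be checked. Everything in it is an assumption for illustration: the `Instr` tuple, the `split_two_phase` helper, and especially the `FIXED_FUNCTION_OPS` set, which just pretends that a couple of opcodes map onto the fixed-function path. Real hardware would classify idioms, not single opcodes.

```python
from typing import NamedTuple, Optional

class Instr(NamedTuple):
    op: str        # opcode, e.g. "DP4"
    dest: str      # register written
    srcs: tuple    # registers read

# Hypothetical: opcodes we pretend the fixed-function T&L unit can handle.
FIXED_FUNCTION_OPS = {"DP4", "MOV"}

def split_two_phase(program) -> Optional[tuple]:
    """Hoist non-fixed-function instructions to the front.

    Returns (phase1, phase2), or None if hoisting would break a dependency.
    """
    phase1, phase2 = [], []
    ff_reads, ff_writes = set(), set()
    for ins in program:
        if ins.op in FIXED_FUNCTION_OPS:
            phase2.append(ins)
            ff_reads.update(ins.srcs)
            ff_writes.add(ins.dest)
            continue
        # Moving this instruction ahead of earlier fixed-function ones is
        # only safe if it has no RAW, WAR, or WAW hazard against them.
        raw = bool(ff_writes & set(ins.srcs))
        war = ins.dest in ff_reads
        waw = ins.dest in ff_writes
        if raw or war or waw:
            return None
        phase1.append(ins)
    return phase1, phase2

program = [
    Instr("DP4", "oPos.x", ("v0", "c0")),
    Instr("MUL", "r0", ("v1", "c4")),
    Instr("MOV", "oT0", ("r0",)),
]
result = split_two_phase(program)
```

In this example the `MUL` doesn't touch anything the earlier `DP4` reads or writes, so it can run first and the remaining two instructions run as the second phase. If the `MUL` had read a register written by the `DP4`, the function would return `None` and the shader could not be split this way.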
Also, from what I've read, the GF2 T&L pipeline isn't completely fixed; it just isn't flexible enough to do DX8. There are still registers and microinstructions, presumably set up by the device driver or firmware ROM. These could have been altered for the GF4MX to expose more of the underlying pipeline to programmability. Who knows what the deficiencies are? Maybe a lack of registers, constants, swizzling, etc.?
This is similar to the situation with register combiners, where the underlying pixel pipeline was flexible, just not flexible enough.