Not a graphics expert either, so I can't go into specifics, but DX9 was much more constrained in terms of programmable features compared to the versions that came later. You could work around the limitations*, no doubt, but the penalty would be reduced performance, and perhaps also reduced image quality in the form of visible artifacts from precision loss caused by repeated blending operations and so on. What can be done in a single pass in a DX11, DX12 or Vulkan world might require several passes of operations in DX9.
*John Carmack allegedly proved mathematically that any conceivable graphics operation could be performed using just the standard OpenGL set of blending operations. This was before the era where PC graphics cards started getting pixel shading hardware, btw, just to put things into perspective. So you might need a horrific number of blending passes for advanced effects, such as running Crysis, but it would work, in theory.
Of course, back then PC graphics was stuck at 8 integer bits per channel (IE 24/32 bits total per pixel). You can't do too many blending ops at that precision level before the screen looks like a brown muddy mess (IE, your average id Software game....), so this was all theoretical at the time. Even today, with floating point math available and deep color buffers (up to 128 bits/pixel afaik), such an approach would be really slow, because pixel fillrate and memory bandwidth are limited. Realtime effects would also be hard to implement, I suspect, as you might have to recalculate (perhaps multitudes of) texture maps every frame to simulate a given effect.
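To make the precision-loss point concrete, here's a toy sketch (plain Python, not any real graphics API) that repeatedly blends one color channel toward white with 10% alpha, once truncating to 8-bit integers after every pass (like old fixed-point framebuffers) and once in float. The alpha value and iteration count are just illustrative numbers I picked:

```python
# Toy demo: repeated 10%-alpha blend of one channel toward white (255),
# at 8-bit integer precision vs. full float precision.
ALPHA = 0.1
TARGET = 255

c_u8 = 0      # 8-bit channel: truncated back to an integer after each blend
c_f32 = 0.0   # float channel: keeps full precision between passes

for _ in range(100):
    # Truncation throws away up to 1 LSB of progress on every single pass...
    c_u8 = int(c_u8 * (1 - ALPHA) + TARGET * ALPHA)
    # ...while the float version accumulates no such error.
    c_f32 = c_f32 * (1 - ALPHA) + TARGET * ALPHA

print(c_u8)           # stalls at 246 -- permanently 9 levels short of white
print(round(c_f32))   # 255, as expected
```

The integer version gets stuck because once the per-pass increment drops below one quantization step, truncation eats it entirely, so the channel never reaches its target. Stack dozens of such passes across three channels and you get exactly the kind of drift toward mud described above.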
Oh well. Just food for thought. Nobody ever said this approach would be practical.