You need an equal number of backbuffer reads and writes. Blending = read the backbuffer value, combine the existing value with the pixel shader output, write the result back to the backbuffer. One read and one write.
My understanding of alpha blending is that it's basically compositing two or more things, so surely you need more reads than writes!?!
The shader itself obviously reads the particle texture as well. I explained this in my post. However, the particle texture is DXT compressed (1 byte per pixel) and the backbuffer is RGBA16F (8 bytes per pixel), so the particle texture read is insignificant in the usual case. In the tiled case, the particle texture read is much more significant, because the backbuffer reads and writes come mostly from the cache (and thus consume zero bandwidth after the first read and first write). That's why I concluded that in the tiled case you'd likely have roughly 2x reads compared to writes.
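To make the arithmetic concrete, here is a rough sketch in Python. The two byte sizes come from the post. Everything else is an assumption for illustration: the function names are made up, the `overdraw` parameter (number of blended particle layers per pixel) is hypothetical, and the model assumes one texel fetched per shaded pixel with no texture cache reuse.

```python
BACKBUFFER_BPP = 8  # RGBA16F backbuffer, bytes per pixel (from the post)
TEXTURE_BPP = 1     # DXT-compressed particle texture, bytes per pixel (from the post)


def immediate_traffic(overdraw):
    """Bytes (reads, writes) per pixel when every blend goes to memory."""
    # Each layer reads the backbuffer and the texture, then writes the backbuffer.
    reads = overdraw * (BACKBUFFER_BPP + TEXTURE_BPP)
    writes = overdraw * BACKBUFFER_BPP
    return reads, writes


def tiled_traffic(overdraw):
    """Bytes (reads, writes) per pixel when the backbuffer stays in cache.

    Assumption: only the first backbuffer read and the final write reach
    memory; the texture is still fetched once per layer.
    """
    reads = BACKBUFFER_BPP + overdraw * TEXTURE_BPP
    writes = BACKBUFFER_BPP
    return reads, writes


if __name__ == "__main__":
    for layers in (1, 4, 8):
        for name, fn in (("immediate", immediate_traffic), ("tiled", tiled_traffic)):
            reads, writes = fn(layers)
            print(f"{name:9s} overdraw={layers}: reads={reads} writes={writes} "
                  f"ratio={reads / writes:.2f}")
```

Under these assumptions the non-tiled ratio stays at 9:8 regardless of overdraw, while the tiled ratio grows with overdraw and reaches 2:1 at 8 layers. That is only one way to arrive at "roughly 2x"; the real figure depends on actual overdraw and texture cache behaviour.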