The difference between render target writes and MEMEXPORT is that MEMEXPORT uses no coherency patterns to dictate where the output data goes. For example, a vertex shader on Xenon can do this:
MEMEXPORT TO Address(0), Val0
MEMEXPORT TO Address(10000), Val1
MEMEXPORT TO Address(2344), Val2
MEMEXPORT TO Address(9990), Val3
And still write fragments to EDRAM via the pixel shader.
To do the same thing using a conventional rasteriser would involve 5 separate triangles (one for each memory write). That's a vast difference for many GPGPU operations. Any GPU can do a MEMEXPORT-like operation, but only by using lots and lots of triangles...
It basically allows full scatter/gather memory access, which is the major difference between CPUs and GPUs.
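To make the contrast concrete, here is a rough Python model of the two paths. It is only a sketch: plain lists stand in for GPU memory, and every name in it (`example_vertex_shader`, `run_scatter`, `run_raster_emulation`, `demo`) is made up for illustration, not part of any real Xenon or graphics API. It counts one primitive per scattered write, so it covers the four exports above and leaves the EDRAM fragment write out.

```python
MEMORY_SIZE = 10001  # just large enough for the highest address above (10000)


def example_vertex_shader(invocation_index):
    """Hypothetical shader: returns (address, value) pairs it wants written.

    Mirrors the four MEMEXPORT lines above; the addresses are arbitrary
    and unrelated to each other.
    """
    return [(0, "Val0"), (10000, "Val1"), (2344, "Val2"), (9990, "Val3")]


def run_scatter(memory, shader, num_invocations):
    """MEMEXPORT-style path: each invocation picks its own output addresses."""
    for i in range(num_invocations):
        for address, value in shader(i):
            memory[address] = value
    return memory


def run_raster_emulation(target, writes):
    """Conventional path: the output position comes from the primitive.

    Each write needs its own primitive covering exactly one position,
    so the primitive count grows with the number of scattered writes.
    """
    primitives_drawn = 0
    for address, value in writes:
        covered_positions = [address]  # one tiny primitive per write
        for position in covered_positions:
            target[position] = value
        primitives_drawn += 1
    return primitives_drawn


def demo():
    """Run both paths on the same writes and return the results."""
    scattered = run_scatter([None] * MEMORY_SIZE, example_vertex_shader, 1)
    emulated = [None] * MEMORY_SIZE
    primitives = run_raster_emulation(emulated, example_vertex_shader(0))
    return scattered, emulated, primitives
```

Both paths end up with the same memory contents, but the scatter path gets there from a single shader invocation while the emulation has to issue one primitive per write.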