The main problem with GPGPU is random memory access. You effectively get no fast random access: a GPU is a stream processor, and its advantage in massive parallelization depends on not jumping around in memory. You have to work on predictable datasets with no dependencies between elements.
It's the same reason raytracing large scenes on a GPU is hard. The hardware is designed around the assumption that you don't need scattered memory access, so as soon as you do, performance drops significantly. This is why only certain types of tasks benefit from GPGPU, and why you can't simply port arbitrary code to it.
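
To put a rough number on that drop, here is a small sketch in plain Python (not GPU code) that counts how many memory transactions one group of threads would need for sequential versus scattered reads. The figures are my assumptions, not something from a specific card: 32 threads issuing loads together, 128-byte transactions, 4-byte elements. Real hardware also has caches and other transaction sizes, so treat it as a toy model of the effect rather than a benchmark.

```python
import random

WARP_SIZE = 32        # assumed: threads that issue a load together
SEGMENT_BYTES = 128   # assumed: size of one memory transaction
ELEMENT_BYTES = 4     # assumed: one 32-bit value per thread


def transactions_per_warp(indices):
    """Count the distinct memory segments that one warp's loads fall into."""
    segments = {(i * ELEMENT_BYTES) // SEGMENT_BYTES for i in indices}
    return len(segments)


def average_transactions(indices):
    """Average number of transactions per warp over a whole index stream."""
    warps = [indices[i:i + WARP_SIZE] for i in range(0, len(indices), WARP_SIZE)]
    return sum(transactions_per_warp(w) for w in warps) / len(warps)


N = 1 << 16
sequential = list(range(N))                       # thread k reads element k
rng = random.Random(0)                            # fixed seed for repeatability
scattered = [rng.randrange(N) for _ in range(N)]  # thread k reads a random element

seq_cost = average_transactions(sequential)
rand_cost = average_transactions(scattered)
print(f"sequential: {seq_cost:.1f} transactions per warp")
print(f"scattered:  {rand_cost:.1f} transactions per warp")
```

Under these assumptions the sequential pattern needs exactly one transaction per group of 32 threads, while the scattered pattern needs close to 32, one per thread. That is roughly 32 times the memory traffic for the same amount of useful data, which is the kind of penalty a large raytraced scene runs into when neighbouring rays hit unrelated parts of memory.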