Branch prediction in a CPU is very important, since the CPU is mostly single threaded and has nothing else to switch to while a branch resolves.
In a GPU, there are many pixels in flight, and you can consider each pixel, or each group of pixels, to be a thread. When you hit a branch or a texture fetch, you can put that thread to sleep and go work on another batch of pixels.
Consequently, branch prediction in VPUs is not useful at all. What is much more important is resource usage per thread. If each thread uses too many resources, then you can't actually put a thread to sleep when a branch or texture fetch is hit, since you've run out of resources to start another thread. Then you have to stall and wait for the outstanding branch or fetch to complete. The higher the latency of whatever you are waiting for, the longer the stall, and that drops your efficiency. You can virtualize resources and move them into local memory, but then you are still hit by the memory latency.
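To put rough numbers on that, here is a toy model of how per-thread resource usage limits latency hiding. Everything in it is an assumption for illustration: the function names, the single shared register file, and the figures (256 registers, 10 cycles of ALU work per thread, a 100-cycle fetch) do not describe any real part.

```python
def threads_in_flight(register_file_size, gprs_per_thread):
    """Number of threads whose registers fit in the register file at once."""
    return register_file_size // gprs_per_thread


def stall_cycles(register_file_size, gprs_per_thread,
                 work_cycles_per_thread, fetch_latency):
    """Idle cycles while one thread waits on a fetch (toy model).

    While one thread sleeps, each other resident thread is assumed to
    contribute work_cycles_per_thread cycles of useful work. Whatever
    part of the latency is not covered by that work is a stall.
    """
    others = max(0, threads_in_flight(register_file_size, gprs_per_thread) - 1)
    covered = others * work_cycles_per_thread
    return max(0, fetch_latency - covered)


# Hypothetical numbers, not taken from any real hardware.
light = stall_cycles(256, 8, 10, 100)    # 32 threads resident: latency fully hidden
heavy = stall_cycles(256, 64, 10, 100)   # only 4 threads resident: most of it exposed
```

With 8 registers per thread there are plenty of other threads to run and the fetch is hidden entirely. At 64 registers per thread only three other threads exist, and most of the fetch latency turns into a stall.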
There are numerous resources defined per pixel, but probably the main one is GPR usage. In a branching case, the compiler cannot know which branch will be taken, so a thread must be allocated all the resources required for both possible branches. Some union of resources can be done, but only on simple true/false branches; it can't be used on loops and other constructs. So if you skip 100 instructions, it looks good. But if those 100 instructions use lots of resources, you might not be able to start a new thread, and efficiency drops.
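A minimal sketch of that allocation rule, under stated assumptions: the function name, the split into "common" and per-branch registers, and all the numbers are hypothetical, and a real compiler's register allocator is far more involved than a max or a sum.

```python
def gprs_allocated(common, then_branch, else_branch, can_share=True):
    """Registers a thread must reserve for a shader with one branch (toy model).

    common      -- registers live outside the branch
    then_branch -- extra registers needed by the taken path
    else_branch -- extra registers needed by the other path
    can_share   -- True for a simple true/false branch, where both paths
                   can overlay the same registers; False for constructs
                   where that overlay is assumed not to be possible
    """
    if can_share:
        return common + max(then_branch, else_branch)
    return common + then_branch + else_branch


REGISTER_FILE = 256  # hypothetical register file size

# An expensive path that is usually skipped still sets the allocation.
with_branch = gprs_allocated(common=8, then_branch=24, else_branch=0)
no_sharing = gprs_allocated(common=8, then_branch=24, else_branch=4,
                            can_share=False)

threads_without_branch = REGISTER_FILE // 8           # shader with no expensive path
threads_with_branch = REGISTER_FILE // with_branch    # same shader plus the branch
```

The allocation is fixed before the thread runs, so it is the same whether or not the expensive path executes. In this example, adding the branch cuts the number of resident threads from 32 to 8 even if every pixel skips it.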
Also, most VPUs are SIMD, so if a branch goes one way for a pixel and the other way for its neighbors, the whole group has to step through both paths, and you again lose on the efficiency side.
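The SIMD cost can be sketched the same way. This is a simplified model that assumes a batch executes every path that at least one of its pixels needs, with the other pixels idle for the duration. The function names, batch size, and cycle costs are all hypothetical.

```python
def batch_cycles(takes_branch, then_cost, else_cost):
    """Cycles a SIMD batch spends on one if/else (toy model).

    takes_branch is a non-empty list of booleans, one per pixel in the
    batch. The batch pays for a path if any pixel in it needs that path.
    """
    cycles = 0
    if any(takes_branch):
        cycles += then_cost
    if not all(takes_branch):
        cycles += else_cost
    return cycles


def simd_efficiency(takes_branch, then_cost, else_cost):
    """Fraction of lane-cycles that do work some pixel actually needed."""
    useful = sum(then_cost if taken else else_cost for taken in takes_branch)
    spent = batch_cycles(takes_branch, then_cost, else_cost) * len(takes_branch)
    return useful / spent


coherent = [True, True, True, True]       # all pixels agree
divergent = [True, False, False, False]   # one pixel disagrees with its neighbors

eff_coherent = simd_efficiency(coherent, then_cost=100, else_cost=10)
eff_divergent = simd_efficiency(divergent, then_cost=100, else_cost=10)
```

When all four pixels agree, every lane-cycle is useful. When a single pixel takes the expensive path, the batch spends 110 cycles instead of 10, and under 30% of the lane-cycles do needed work.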
Finally, optimizing across branches is very difficult, and the resulting code is generally far from optimal. That can kill efficiency too.
If enough efficiency is lost because you cannot operate SIMD or cannot be multi-threaded, then you become a simple single-threaded, single-pixel machine. Given that a CPU runs 5~10x faster, you've just killed your VPU and are much better off running purely on the CPU. The main advantage of the VPU is its ability to run multiple things in parallel and maintain high throughput by having lots of different things to do. If you take that away, you lose.