Why not? Synchronization works both ways. Contrived example: I'm on the PS3 and want to process a render target with the SPUs. I render into DDR (RSX local memory), then DMA half of the buffer down into XDR (to save memory) and tell the RSX to wait for my SPUs. Once the SPUs are done, they release the RSX, which copies down the second half of the buffer, overwriting the first half.
Now what happens if, say, I forgot to insert a barrier between the SPU computations and the RSX release? Usually, even with compiler reordering, the SPUs will be done long before the RSX gets going, so the bug never shows.
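But you just gave it a clock bump. Now the RSX gets there sooner, the timing margin that was hiding the missing barrier shrinks, and the race can actually bite.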
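
To make the missing-barrier point concrete, here's a rough sketch in portable C++ rather than real Cell/RSX code. Everything in it is a stand-in I made up for illustration: `spu_worker` plays the SPU job, `rsx_consumer` plays the RSX waiting to be released, and `SharedBuffer` is the half of the render target sitting in XDR. These are not the actual SPU DMA or RSX synchronization calls. The release/acquire pair on the flag is the "barrier" from the example above: drop it (say, by using relaxed ordering) and the consumer is only correct as long as the timing happens to work out.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for the buffer half that lives in XDR.
struct SharedBuffer {
    std::vector<int> data;
    std::atomic<bool> released{false};  // the "release RSX" signal
    explicit SharedBuffer(std::size_t n) : data(n, 0) {}
};

// Stand-in for the SPU job: do the computation, then release the consumer.
void spu_worker(SharedBuffer& buf) {
    for (std::size_t i = 0; i < buf.data.size(); ++i)
        buf.data[i] = static_cast<int>(i) + 1;
    // The barrier: a release store guarantees that every write above is
    // visible to whoever observes the flag with an acquire load.
    // With memory_order_relaxed here, nothing orders the data writes
    // before the flag, and correctness would depend on timing alone.
    buf.released.store(true, std::memory_order_release);
}

// Stand-in for the RSX: wait for the release, then consume the buffer.
long long rsx_consumer(SharedBuffer& buf) {
    while (!buf.released.load(std::memory_order_acquire))
        std::this_thread::yield();
    long long sum = 0;
    for (int v : buf.data) sum += v;
    return sum;
}

// Runs one producer/consumer round and returns what the consumer saw.
long long run_once(std::size_t n) {
    assert(n > 0);
    SharedBuffer buf(n);
    long long sum = 0;
    std::thread consumer([&] { sum = rsx_consumer(buf); });
    std::thread producer([&] { spu_worker(buf); });
    producer.join();
    consumer.join();
    return sum;
}
```

With the release/acquire pair in place, `run_once(n)` always returns the sum of 1..n no matter how fast either side runs. That's the whole point: the result shouldn't depend on the consumer being slow enough.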