Certain tasks just don't scale past a certain number of cores: every core you add also adds latency, and eventually that latency penalty outweighs the performance gained from the extra core.
Cores don't inherently add latency. Whether adding cores costs you latency, and how much, depends on things like the cache topology.
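
Here is a rough sketch of the trade-off both posts describe. It is a toy model under assumptions that are not in the posts above: the work splits evenly across cores, and coordination latency grows linearly with core count at a rate `per_core_cost`. That rate is a hypothetical stand-in for whatever the cache topology actually imposes, and the function names and numbers are made up for illustration.

```python
def run_time(cores, parallel_work, per_core_cost):
    """Toy model (assumption): evenly split work plus a coordination
    cost that grows linearly with the number of cores."""
    return parallel_work / cores + per_core_cost * cores


def best_core_count(parallel_work, per_core_cost, max_cores):
    """Return the core count in 1..max_cores with the lowest modeled run time."""
    return min(
        range(1, max_cores + 1),
        key=lambda n: run_time(n, parallel_work, per_core_cost),
    )


# With a per-core latency cost, scaling stops paying off well before 64 cores.
print(best_core_count(100.0, 1.0, 64))  # 10

# With no per-core cost (cores add no latency), more cores always helps.
print(best_core_count(100.0, 0.0, 64))  # 64
```

In this model the sweet spot sits near `sqrt(parallel_work / per_core_cost)`, so the limit comes from the size of the per-core cost and not from the core count as such. Real hardware rarely has a cost that is this cleanly linear.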