Yeah, I have no sympathy for the "chicken and egg" problem at this point. It is well known by now how to write software that scales up to large core counts. There are no more excuses to be made in terms of the market... software is still just way too sequential.
Of course it's a hard problem - a very hard problem. There is a ton of legacy code that doesn't get rewritten overnight. Fresh starts like what Oxide is doing probably have the best chance of succeeding. So far we're still pretty much in subsystem-parallel hell (audio on a thread, physics on a thread, etc.) and unlikely to get out of it until people start ditching a lot of code, libraries and, arguably, languages.
But yeah, there's no real debate at this point: software is the gating factor. As rpg.314 says, if there were much scalable software, you'd already have consumer CPUs with more cores. But for now, using the power budget to run at silly frequencies (4-5GHz...) is still going to be better for the vast majority of users.
It's not really a chicken and egg problem per se. Software with sufficient expressed parallelism will not pay any penalty running on fewer cores. People just have to stop doing parallelism by "moving X into another thread" and stopping when they get adequate use of 2-4 cores...
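To make "expressed parallelism" a bit more concrete, here is a rough sketch of the shape I mean, not anyone's actual engine code. The names (`simulate_particle`, `step_chunk`, `step_all`) and the particle update itself are made up for illustration. The work is cut into many more tasks than there are cores, and the worker count is just a parameter. Caveat: in standard CPython the GIL keeps threads from speeding up CPU-bound work like this, so read it as a picture of the structure only; a native language with a work-stealing scheduler is where the approach actually pays off.

```python
from concurrent.futures import ThreadPoolExecutor


def simulate_particle(p):
    # Hypothetical per-item work: advance one particle by half a time unit.
    x, v = p
    return (x + v * 0.5, v)


def step_chunk(chunk):
    # One task = one chunk of independent items.
    return [simulate_particle(p) for p in chunk]


def step_all(particles, workers, chunk_size=256):
    # Express the parallelism as many small tasks, far more than the core
    # count, and let the pool map them onto however many workers exist.
    chunks = [particles[i:i + chunk_size]
              for i in range(0, len(particles), chunk_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in submission order, so the output does not
        # depend on how many workers ran the tasks.
        results = pool.map(step_chunk, chunks)
        return [p for chunk in results for p in chunk]


particles = [(float(i), 1.0) for i in range(10_000)]
serial = step_all(particles, workers=1)   # behaves like a plain loop
wide = step_all(particles, workers=16)    # same code, more workers
```

Nothing in `step_all` knows or cares how many cores there are. Compare that with "audio on a thread, physics on a thread", where the thread count is baked into the design and the scaling stops as soon as you run out of subsystems.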