Well, take the limit case where Intel has a 60 E-core CPU. According to the efficiency curves in the article, the E-cores at 1W provide around 45% of their performance at 6W, where they can't scale further. (On the 12900K, AnandTech saw max power consumption of 48W for all 8 E-cores at 3.9 GHz, i.e. 6W per core:
https://www.anandtech.com/show/1704...hybrid-performance-brings-hybrid-complexity/4). We also know that 8 E-cores give around 25% of the total multithreaded performance of the 12900K at stock (
https://www.anandtech.com/show/1704...ybrid-performance-brings-hybrid-complexity/10). So 60 E-cores at stock would be ~1.875x a 12900K (60/8 * 0.25). Scale that by 0.45 and you get 0.84 of a 12900K. That's for a desktop CPU on Intel 7. So moving to a laptop CPU on Intel 4, you would only need about an extra 20% performance per watt for rough parity with the 12900K in multithreading.
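For anyone who wants to plug in their own numbers, here is the 60 E-core estimate as a rough Python sketch. The constants are just the figures quoted above; the function names are mine, and the whole thing assumes performance scales linearly with core count, which is optimistic.

```python
# Figures quoted in the post; nothing here is measured by this script.
E_CORES_12900K = 8
E_SHARE_OF_12900K_MT = 0.25  # 8 E-cores ~ 25% of stock multithreaded performance
E_PERF_AT_1W = 0.45          # E-core performance at 1W relative to its ~6W ceiling


def e_core_only_estimate(n_e_cores: int) -> float:
    """Performance of n E-cores at 1W each, as a fraction of a stock 12900K.

    Assumes linear scaling with core count (a simplification).
    """
    at_stock = n_e_cores / E_CORES_12900K * E_SHARE_OF_12900K_MT
    return at_stock * E_PERF_AT_1W


def uplift_for_parity(fraction_of_12900k: float) -> float:
    """Extra performance per watt needed to reach 1.0x a 12900K."""
    return 1.0 / fraction_of_12900k - 1.0


estimate = e_core_only_estimate(60)
print(f"60 E-cores at 1W each: {estimate:.2f} of a 12900K")
print(f"Perf/W uplift needed for parity: {uplift_for_parity(estimate):.1%}")
```

The uplift comes out a little under 20%, which is where the rough 20% figure comes from.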
A 60 E-core CPU may seem like a crazy idea, but as you can fit 4 E-cores in the space of 1 P-core, you are really only talking about 15 P-cores' worth of die space. And the M1 Ultra is already two CPUs with 8 high-performance cores each, sandwiched together.
The same math for a 32 E-core CPU would see those cores delivering ~45% of a 12900K's performance at 1W/core. Keeping the same 60W total as the 60 E-core case, you would then have 28W to feed 8 P-cores, or 3.5W per core. The article doesn't show scaling figures beyond 22W/core, but going from 22W/core down to 3.5W/core we see performance roughly halve. Even assuming up to a 60% reduction in performance to allow for the upper clock speed ranges, the P-cores would give us 0.75 * 0.4 = 0.3 of a 12900K's performance (0.75 being the P-cores' share of the 12900K's stock multithreaded performance). So 8 P-cores + 32 E-cores would be expected to give 0.3 + 0.45 = 0.75 of a 12900K. Then you would need an extra 33% performance per watt for rough parity with the 12900K.
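The hybrid case can be sketched the same way. The 60W total budget is my reading of the numbers (32W for the E-cores plus 28W for the P-cores), and the 40% retained P-core performance is the pessimistic guess from above, not something taken from the article. The function name and parameters are made up for illustration.

```python
def hybrid_estimate(n_e_cores: int, n_p_cores: int,
                    p_perf_retained: float = 0.40,  # assumption: 60% loss at low power
                    budget_w: float = 60.0):        # assumption: same total as 60 E-cores at 1W
    """Return (E-core part, P-core part, watts per P-core), parts as fractions of a 12900K."""
    e_share, p_share = 0.25, 0.75   # 12900K multithreaded split: 8 E-cores vs. 8 P-cores
    e_perf_at_1w = 0.45             # E-core performance at 1W vs. its ~6W ceiling

    # E-cores: scale the 8-core share linearly by core count, then down to 1W each.
    e_part = n_e_cores / 8 * e_share * e_perf_at_1w
    # P-cores: whatever power is left after giving every E-core 1W.
    p_watts_per_core = (budget_w - n_e_cores * 1.0) / n_p_cores
    p_part = n_p_cores / 8 * p_share * p_perf_retained
    return e_part, p_part, p_watts_per_core


e_part, p_part, p_watts = hybrid_estimate(n_e_cores=32, n_p_cores=8)
total = e_part + p_part
print(f"P-core power: {p_watts:.1f}W per core")
print(f"Total: {total:.2f} of a 12900K, needs {1 / total - 1:.0%} more perf/W for parity")
```

Changing p_perf_retained to 0.5 (the "roughly halves" reading) is an easy way to see how sensitive the result is to that guess.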