Woah, there! I didn't draw any conclusion. I just observed that 8 ACEs is a design choice and, as yet, an unproven one. It might be a golden choice, it might be overkill. The fact AMD have rolled it into subsequent architectures suggests they think it's more likely to be useful than not. I'm not sure what the overhead is versus 2 ACEs, though. If it's utterly trivial to implement, there's no reason not to add it, even if in 5 years everyone drops back to 2 ACEs because those already saturate the ALUs. The future's unknown and it's for the devs to work out how to use async compute. Once they've sussed it, graphics IHVs will build more optimal designs.
Not necessarily. I'd even say likely not. Threading is hardware-independent: in cross-platform titles you don't know how many cores your target hardware has. That'll go for GPU as well as CPU workloads. Threading now revolves around jobs and a scheduler managing them across whatever resources are available.
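To put the jobs-and-scheduler idea in concrete terms, here's a minimal sketch in Python. It isn't how any particular engine does it; `run_jobs` and `worker_count` are names I've made up for illustration, and the standard-library thread pool stands in for a real job scheduler. The point is only that the work is described as jobs, and the worker count is discovered at run time rather than baked into the code.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def run_jobs(jobs, worker_count=None):
    """Run independent jobs on however many workers are available.

    `run_jobs` and `worker_count` are hypothetical names for this sketch.
    """
    # Ask the machine how many cores it has instead of hard-coding a number.
    workers = worker_count or os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # The pool hands each job to whichever worker is free;
        # map() returns the results in submission order.
        return list(pool.map(lambda job: job(), jobs))


# Eight small, independent jobs. The job list never mentions core counts.
jobs = [lambda i=i: i * i for i in range(8)]
results = run_jobs(jobs)
```

The same job list gives the same results with 1 worker or 8; only the scheduling changes. (In CPython the threads here won't give real parallel speed-up for pure-Python work, so treat this as showing the structure, not the performance.)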