I don't think AMD and Sony would be foolish enough to put 8 ACEs in the GPU with no chance of using them (7 for games and 1 for the OS). If they weren't useful, why not ship only 2 ACEs and improve another part of the GPU instead? Async compute is useful when the graphics pipeline doesn't use all the ALUs, or when it stalls because the graphics pipeline is synchronous and some tasks have to wait for another task to complete.
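To put rough numbers on the idea of filling idle ALUs, here is a toy model in Python. It is only a sketch under my own assumptions: the function name `frame_time`, the phase durations and the utilization figures are all made up for illustration, and it ignores cache contention and scheduling overhead completely, so real gains would be smaller.

```python
def frame_time(graphics_phases, compute_work, async_compute):
    """Idealized frame time in ms (hypothetical model, not measured data).

    graphics_phases: list of (duration_ms, alu_utilization) with utilization in 0..1.
    compute_work:    compute work in ms, measured as if it had all ALUs to itself.
    async_compute:   if True, compute runs in the ALU time graphics leaves idle.
    """
    gfx_time = sum(duration for duration, _ in graphics_phases)
    if not async_compute:
        # Serial case: compute runs after graphics has finished.
        return gfx_time + compute_work
    # ALU capacity (in ms of full-GPU work) left idle while graphics runs.
    idle = sum(duration * (1.0 - util) for duration, util in graphics_phases)
    # Whatever does not fit into the idle capacity still runs after graphics.
    leftover = max(0.0, compute_work - idle)
    return gfx_time + leftover


# Made-up frame: a 4 ms phase at 25% ALU use, then a 6 ms phase at 75% ALU use.
phases = [(4.0, 0.25), (6.0, 0.75)]
serial = frame_time(phases, 3.0, async_compute=False)
overlapped = frame_time(phases, 3.0, async_compute=True)
print(serial, overlapped)
```

In this made-up frame the 3 ms of compute fits entirely in the idle ALU time, so it costs nothing extra. With more compute work than idle capacity, only the part that doesn't fit adds to the frame.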
My understanding, from reading devs like sebbbi on B3D and slides from GDC and other conferences, is that the main limit of async compute tasks is that they can thrash the L2 cache used by the graphics tasks, for example when running async compute forces a flush of the L2 cache that the graphics tasks depend on... The 'volatile bit' helps with that, if I understand correctly what Cerny said in the Gamasutra interview:
Maybe without all that overhead they can schedule more compute tasks, and this is the reason for choosing 8 ACEs. Or maybe I didn't understand anything.
The other thing I understand is that compute shaders will replace some vertex and pixel work for graphics in the years to come, because it can be more efficient to do some parts of the graphics with compute (60 to 70% of graphics tasks will use compute). And GPGPU can be useful for non-graphics tasks if they are parallelizable and work on big datasets, at least 64 items at a time if I remember the Ubisoft presentation about compute tasks correctly...
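The "at least 64 items" figure lines up with the 64-wide wavefront that GCN hardware executes in lockstep. A small Python sketch shows why smaller batches waste lanes. The helper name `wavefront_occupancy` is my own, and the model only counts lanes (it says nothing about latency hiding or memory behavior):

```python
import math

WAVEFRONT_SIZE = 64  # work-items per GCN wavefront


def wavefront_occupancy(n_items):
    """Return (wavefronts needed, fraction of lanes doing useful work)."""
    if n_items <= 0:
        raise ValueError("n_items must be positive")
    # A partially filled wavefront still occupies all 64 lanes.
    waves = math.ceil(n_items / WAVEFRONT_SIZE)
    utilization = n_items / (waves * WAVEFRONT_SIZE)
    return waves, utilization


for n in (16, 64, 65, 100):
    waves, util = wavefront_occupancy(n)
    print(f"{n:4d} items -> {waves} wavefront(s), {util:.1%} of lanes busy")
```

So a job with 16 items keeps only a quarter of a wavefront busy, and 65 items need a second wavefront that is almost empty. That's why small or oddly sized datasets are a poor fit for GPGPU.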