As I noted earlier, the bar for HSA's coherent memory is lower than it is for coherence among CPU cores. Coherence between the CPU and other domains is apparently maintained at page granularity, given the use of synchronized page tables.
This is why I didn't see a problem on the coherence front for hUMA.
(edit: On second thought, this could be page granularity, but I may be over-interpreting; the slide alone doesn't give enough data to say.)
The disparity between the 109 GB/s minimum and 204 GB/s peak eSRAM bandwidth figures remains unexplained. Perhaps a transcript or writeup on the eSRAM can further illuminate why they make that particular distinction. Perhaps it is the inverse of the usual way bandwidth is disclosed, where the peak figure simply ignores banking conflicts. Here the lower number may be the baseline, with some kind of bank hit or access combining needed for a sustained rate to reach the peak value.
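For a rough sense of the gap, here is a quick sketch that only divides the two quoted figures. The variable names are mine, and nothing here models how the eSRAM actually works; it just shows that the peak is a bit under twice the minimum rather than a clean 2x.

```python
# The two eSRAM bandwidth figures as quoted (GB/s).
min_bw_gbs = 109.0
peak_bw_gbs = 204.0

# Ratio of peak to minimum; a clean doubling would give exactly 2.0.
ratio = peak_bw_gbs / min_bw_gbs

# How far the quoted peak falls short of a straight doubling of the minimum.
shortfall_gbs = 2 * min_bw_gbs - peak_bw_gbs

print(f"peak/min ratio: {ratio:.2f}")
print(f"shortfall vs. 2x minimum: {shortfall_gbs:.0f} GB/s")
```

Whatever mechanism produces the peak, it apparently isn't a simple doubling of the minimum rate.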
edit: Note the 2.5% fully gated idle power. With a ~100W SoC, that would leave 2-3W at idle, which is why some designers may be tempted to add a secondary processor for networked idle.
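The idle estimate is just the gated fraction times the SoC's power. A minimal sketch, assuming the ~100W figure above (my guess, not a disclosed number) and bracketing it with 80W and 120W to show where the 2-3W range comes from; the function name is hypothetical:

```python
def idle_power_watts(soc_watts, gated_fraction):
    """Idle power if the fully gated SoC draws gated_fraction of its full power."""
    return soc_watts * gated_fraction

GATED_FRACTION = 0.025  # the 2.5% fully gated figure

# Assumed SoC power levels in watts; 100 is the rough guess, 80 and 120 bracket it.
for soc_watts in (80.0, 100.0, 120.0):
    idle = idle_power_watts(soc_watts, GATED_FRACTION)
    print(f"{soc_watts:.0f} W SoC -> {idle:.1f} W idle")
```

If the real SoC power is well outside that bracket, the idle figure scales linearly with it.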