Anyway: "We" are already including SIMD units in IPC (whether or not that is correct in the strict sense of the word), right?
SIMD units are among the execution resources that instructions are issued to, so they are part of IPC in the sense that a pipeline needs something to execute the instructions it issues.
A SIMD instruction is still an instruction, but the FLOP count per instruction is not directly related to IPC. If it were, Intel's Knights Landing core would be considered to have higher IPC than the desktop x86 cores.
The SI portion of SIMD is Single Instruction, and IPC is more concerned with what happens in the instruction stream than with the MD portion. The MD portion can be scaled horizontally within a pipeline's execution stage without disrupting how the pipeline handles code flow, instruction issue, hazards, or stall conditions.
A major motivation for having SIMD at all is that it amortizes the expensive hardware concerned with IPC over more data.
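To put rough numbers on the distinction, here is a minimal sketch. The instruction counts, cycle counts, and lane widths are made up for illustration and do not describe any real core; the function names are hypothetical too. It only shows that widening the MD side multiplies FLOPs per cycle while the IPC figure stays put.

```python
def ipc(instructions, cycles):
    """Instructions retired per cycle; says nothing about data width."""
    return instructions / cycles


def flops_per_cycle(instructions, cycles, lanes, flops_per_lane=1):
    """Throughput: IPC scaled by how much data each instruction covers."""
    return ipc(instructions, cycles) * lanes * flops_per_lane


# Hypothetical workload: 1000 SIMD instructions retired in 1000 cycles.
narrow = flops_per_cycle(1000, 1000, lanes=4)   # assumed 4-wide SIMD
wide = flops_per_cycle(1000, 1000, lanes=16)    # assumed 16-wide SIMD

print("IPC in both cases:", ipc(1000, 1000))
print("FLOPs/cycle, narrow vs wide:", narrow, wide)
```

Same instruction stream, same issue behavior, four times the FLOPs. That is the amortization argument in miniature.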
I doubt that anyone ever counted only the schedulers and dispatcher per CU/SM as indicative of IPC.
I'm trying to find more examples of where AMD used the term IPC for GCN besides Vega. You can find any number of architectural descriptions of IPC for Zen and other CPU cores, although those are superscalar cores that actively work to extract utilization out of one instruction stream.
GCN has for generations defined a ceiling IPC of 1, with any gains found in multithreaded throughput or in measures to avoid stalls that would drive instruction issue below 1. Most of the marketing has been about utilization of the hardware and some token measures for single-threaded performance. That Vega's marketing made such a pointed reference to IPC this time around carries more implications, in part because that's not what GCN has been about.
There are other measurements, such as throughput and utilization of peak, that can capture the sort of performance GCN has targeted without invoking IPC and all that it brings up.
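As a toy illustration of why utilization is the more natural metric here, consider a single issue port shared round-robin by several threads, where each thread issues one instruction and then stalls for a fixed number of cycles. This is an assumed, simplified model with hypothetical names and numbers, not a description of how GCN actually schedules wavefronts.

```python
def issue_utilization(threads, stall_cycles):
    """Fraction of cycles the single issue port is busy.

    Each thread can issue once every (1 + stall_cycles) cycles,
    so extra threads fill the gaps until the port saturates at 1.0.
    """
    return min(1.0, threads / (1 + stall_cycles))


def per_thread_ipc(threads, stall_cycles):
    """Issue rate seen by one thread; can never exceed 1."""
    return issue_utilization(threads, stall_cycles) / threads


# Hypothetical case: every instruction is followed by a 3-cycle stall.
for n in (1, 4, 8):
    print(n, issue_utilization(n, 3), per_thread_ipc(n, 3))
```

In this model, going from one thread to four takes the port from 25% busy to fully busy, while each thread's own issue rate stays at 0.25. Adding threads past that point only lowers the per-thread rate. All of the gain shows up in utilization and none of it in per-thread IPC.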