Note though that AMD also labels the lowest-performing card, the MI6, as an inference accelerator and the most expensive one, the MI25, as a training accelerator. They are well aware of the different target markets. The MI25 is up against two Tesla models in slightly different roles: the P100 (FP64/FP32/FP16, but no accelerated Int8 functions) and the P40 (the highest-FP32 GPU Nvidia offers, but with accelerated Int8 functions for inferencing instead of 2xFP16 accelerated functions).
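To make that hardware split concrete, here is a minimal CUDA sketch (my own illustration with made-up kernel names, not vendor code) contrasting the two instruction paths: packed 2xFP16 math via __hfma2 (full rate on the P100, sm_60) versus the 4-way Int8 dot product __dp4a (the P40's inferencing primitive, sm_61). Compile with something like nvcc -arch=sm_61.

#include <cuda_fp16.h>
#include <cstdio>

// 2xFP16 path: one __hfma2 issues two FP16 fused multiply-adds per call.
__global__ void fp16x2_fma(const __half2* a, const __half2* b, __half2* c, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        c[i] = __hfma2(a[i], b[i], c[i]);    // c += a * b, two lanes at once
}

// Int8 path: __dp4a multiplies four packed int8 pairs and accumulates into
// one 32-bit int in a single instruction.
__global__ void int8_dp4a(const int* a, const int* b, int* acc, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        acc[i] = __dp4a(a[i], b[i], acc[i]); // 4x int8 dot product + add
}

int main()
{
    const int n = 4;
    __half2 *a, *b, *c;
    int *ia, *ib, *ic;
    cudaMallocManaged(&a, n * sizeof(__half2));
    cudaMallocManaged(&b, n * sizeof(__half2));
    cudaMallocManaged(&c, n * sizeof(__half2));
    cudaMallocManaged(&ia, n * sizeof(int));
    cudaMallocManaged(&ib, n * sizeof(int));
    cudaMallocManaged(&ic, n * sizeof(int));
    for (int i = 0; i < n; ++i) {
        a[i]  = __floats2half2_rn(1.0f, 2.0f);
        b[i]  = __floats2half2_rn(3.0f, 4.0f);
        c[i]  = __floats2half2_rn(0.0f, 0.0f);
        ia[i] = 0x01010101;                  // four int8 lanes, each = 1
        ib[i] = 0x02020202;                  // four int8 lanes, each = 2
        ic[i] = 0;
    }
    fp16x2_fma<<<1, n>>>(a, b, c, n);
    int8_dp4a<<<1, n>>>(ia, ib, ic, n);
    cudaDeviceSynchronize();
    // Expect: fp16x2 = 3.0 / 8.0 (1*3, 2*4), dp4a = 8 (four lanes of 1*2)
    printf("fp16x2: %f %f  dp4a: %d\n",
           __low2float(c[0]), __high2float(c[0]), ic[0]);
    return 0;
}

The point of the split: the FP16 path above compiles on the P40 too, but runs at a heavily throttled rate there, while the P100 (sm_60) lacks dp4a entirely, which is exactly the training/inferencing division described.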
You can see the different strategies at work; Nvidia sees the market moving toward more dedicated nodes and splits the GPU requirements between training (P100) and inferencing (P40), but IMO not all HPC/research sites will be doing dedicated inferencing, and AMD's solution fits them better from a hardware perspective (putting aside the software-platform considerations).
Probably a balance of pros and cons; the trend toward more dedicated nodes does work out for some, but probably not for everyone.
That being said, I could imagine that commercial AI applications would rather have dedicated installations for training (longer run times, not pushed to the live environment every hour) and inference (a high-performance live environment for user experience).
Cheers