To elaborate, a design such as the C1 (all IMHO) needs a certain number of transistors available for the control logic (sequencers, arbiters, etc.) while still leaving enough for processing units (ALUs). The point at which there are enough transistors that including this control logic at the expense of ALUs still increases performance would come earlier with a tailored API. This is the case with the Xbox 360. On the other hand, once that point is reached with a more generic API, I would guess that ATI would be chomping at the bit to use this type of design. Furthermore, as transistor budgets rise, the trade-off in die space will tilt in favor of such a design even without a tailored API.
I'd think that IP licensing might be a major hurdle. We still haven't ironed out exactly which parts are Microsoft's and which parts are ATI's. I also think there might be some other agreements between the two that would keep ATI from releasing it before the Xbox 360 is released.