Obviously not; they were internally proud of it, using words like "we are the software company, yet".
I can draw no conclusions from that sentence fragment.
AFAIK, all techniques used for SPEs were highly beneficial for multicore CPU and GPU programming as well.
The DMA list functionality is something that is missed by some when it comes to CPU multicore programming.
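For those unfamiliar with it: an SPE could hand its MFC a list of (address, size) pairs and have all of those scattered regions gathered into local store with a single asynchronous command. Below is a rough, hypothetical C sketch of that pattern, not actual Cell SDK code; on a CPU the equivalent usually ends up written by hand as an explicit copy loop.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical list element: one (source address, size) pair per transfer,
 * loosely mirroring the MFC DMA-list idea of gathering many scattered
 * regions into a contiguous local buffer with one command. */
typedef struct {
    const void *src;   /* where the data lives in "main memory" */
    size_t      size;  /* bytes to pull from that source        */
} dma_list_entry;

/* On the SPE the whole list was one asynchronous command the MFC walked on
 * its own; on a CPU we have to spell it out as a copy loop (or sprinkle
 * prefetch hints) and the core does the moving itself. */
size_t gather_list(void *local_store, size_t capacity,
                   const dma_list_entry *list, size_t count)
{
    uint8_t *dst = local_store;
    size_t used = 0;
    for (size_t i = 0; i < count; ++i) {
        if (used + list[i].size > capacity)
            break;                      /* local store is small and fixed */
        memcpy(dst + used, list[i].src, list[i].size);
        used += list[i].size;
    }
    return used;                        /* bytes actually gathered */
}
```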
Disparate memory spaces are something being moved away from rapidly, and the care required in managing the instruction payload, because it could readily impact data capacity and/or data addressing, has not carried over.
The lack of coherence meant certain behaviors in modern systems, such as false sharing and managing the cacheability of data, were not represented well by Cell.
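False sharing is easy to demonstrate on a coherent multicore and simply has no analogue in the SPE model, where each SPU worked out of its own local store. A minimal sketch, assuming pthreads and a 64-byte cache line on the target CPU:

```c
#include <pthread.h>
#include <stdio.h>

#define ITERS 100000000UL

/* Two counters that will almost certainly share one cache line... */
struct { unsigned long a, b; } packed;

/* ...and two counters forced onto separate lines.  64 bytes is an
 * assumption about the cache-line size of the target CPU. */
struct { _Alignas(64) unsigned long a; _Alignas(64) unsigned long b; } padded;

static void *bump(void *p)
{
    unsigned long *ctr = p;
    for (unsigned long i = 0; i < ITERS; ++i)
        (*ctr)++;
    return NULL;
}

static void run_pair(unsigned long *x, unsigned long *y)
{
    pthread_t t1, t2;
    pthread_create(&t1, NULL, bump, x);
    pthread_create(&t2, NULL, bump, y);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
}

int main(void)
{
    /* Same amount of work either way; the packed pair is typically far
     * slower because the coherence protocol keeps bouncing the shared
     * line between the two writing cores. */
    run_pair(&packed.a, &packed.b);
    run_pair(&padded.a, &padded.b);
    printf("%lu %lu %lu %lu\n", packed.a, packed.b, padded.a, padded.b);
    return 0;
}
```

Timing the two calls separately is left out for brevity, but on a typical coherent multicore the padded pair finishes considerably faster for the same work.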
But it was a good educational move, and somebody must pay for that education.
The additional complexity, poor tools, and lack of a good reference for the exotic design stymied a number of developers and led to conservative early releases that leveraged a single core on the PS3.
The PC and Xbox 360 could provide multicore development opportunities, and did so sooner. Even if the PS3 weren't delayed, and I've seen more blame put on Blu-ray than RSX, the set of early PS3 games that outright avoided the SPEs meant that it took another round of iteration, or developer flameouts, before true multi-core adoption would have taken place for a number of developers.
Having to dedicate optimization and programming effort to a specific model of developer-antagonistic multicore was a source of drag for multiplatform development.
Cell had an impact, but it was not the sole driver of the advancement of multicore development knowledge, nor is it fair to say Xenon was Cell. What made Cell what it was versus Xenon is the part that everyone disposed of going forward.
You are mixing things here. Leveraging existing APIs is clearly a bad move (in an educational sense), but reducing CPU/GPU synchronization is a good one.
What made Cell what it was was a heterogeneous multiprocessing system whose biggest differences were multiple isolated memory spaces, explicit DMA, and incompatible instruction formats.
GCN strives heavily to remove the first two, and AMD's desired end goal is to provide sufficient software tools and abstractions to make the third problem no worse than it is for separate APIs, with the hope that an HSA programming model somewhat transparent to the programmer will make it even less obvious.
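To make that contrast concrete, here is a minimal C sketch of the two styles, assuming nothing beyond standard C: a staged, SPE-like worker that must explicitly copy data into a small private buffer (plain memcpy standing in for the MFC DMA), versus a worker in a coherent, unified address space that simply operates through the pointer it is handed. The chunk size and function names are illustrative, not any SDK's API.

```c
#include <stddef.h>
#include <string.h>

/* SPE-style: the worker owns a small private scratch buffer and the data
 * must be explicitly staged in and out (on Cell this was an MFC DMA;
 * here memcpy stands in for it). */
void scale_staged(float *data, size_t n, float k)
{
    enum { CHUNK = 1024 };              /* pretend local-store capacity */
    float scratch[CHUNK];
    for (size_t off = 0; off < n; off += CHUNK) {
        size_t len = (n - off < CHUNK) ? (n - off) : CHUNK;
        memcpy(scratch, data + off, len * sizeof *scratch);   /* "DMA in"  */
        for (size_t i = 0; i < len; ++i)
            scratch[i] *= k;
        memcpy(data + off, scratch, len * sizeof *scratch);   /* "DMA out" */
    }
}

/* Shared-address-space style: with coherent unified memory the worker can
 * be handed the pointer and operate in place. */
void scale_in_place(float *data, size_t n, float k)
{
    for (size_t i = 0; i < n; ++i)
        data[i] *= k;
}
```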
Reducing synchronization in this case is bringing the GPU closer to what CPUs already do.
And the main problem: it allows people to think in the DX11 way, using things like "scene", "render state", "pipeline", etc., when that is obviously not how modern hardware works.
The hardware doesn't fully match what the APIs express, but that's not to say that those concepts are foreign to it.
It allows for the deployment of a platform that is economical and commercially successful, and one that can be done now as opposed to years in the future. It allows iteration and research now, as opposed to years retreading all the old ground to get to the new.
The legacy APIs are not ideal, but they are at the same time not devoid of reason and practical benefit, and this is not an industry of volunteers that do not require food and shelter.
You can aim for a perfection that never iterates, or you can iterate as much as you can on what you can afford to work on.
We are talking about a purely technical/engineering problem. Let's not mix it with business. An engineer who wants to "think business" probably needs to be fired or promoted to manager; it's not their job.
Engineers cannot neglect practical and material constraints. Making choices to meet goals, and trading off what you want against what it costs or how it can be made acceptable to the market, is part of the job description.