Naughty Dog had similar recommendations (use smart design and basic hardware characteristics; don't optimize code using assembly because it's hard to maintain). It seems they went against their own advice, since one of the interviews mentioned that they used SPU assembly in Uncharted 2. Then again, they also said they had some development "close calls" because they relied too much on paper design instead of trying things out for real until it was a little too late.
That would be Pal Engstad. If you watch the "Mastering the Cell processor" video, he mentions they had more time to attack the hardware during U2's development, so to me it makes a lot of sense that, given the extra time, they wanted to optimize and rewrite code in assembly.
It does not contradict the "no premature optimization" rule at all.