I've never tried a direct comparison to see which could push more triangles but it's kind of uninteresting in a vacuum. In real life PS3's vert unit was quite bad, and was indeed a bottleneck for many games. Standard optimizations were SPU backface culling, constant patching, SPU skinning, and interpolator packing... all more or less aimed at helping along the vertex/tri hardware.
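For anyone unfamiliar with the first of those, here is a rough sketch of the idea behind CPU-side backface culling. It's plain Python for readability, not real SPU code; the function names and the "counter-clockwise means front-facing, y pointing up" convention are my own assumptions. Triangles facing away from the camera are dropped from the index list before the GPU's vertex/triangle hardware has to deal with them.

```python
def signed_area2(a, b, c):
    """Twice the signed area of screen-space triangle a, b, c.

    Positive when the vertices wind counter-clockwise (y axis pointing up).
    """
    return (b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])


def cull_backfaces(vertices, indices):
    """Return a new index list containing only front-facing triangles.

    Assumption for this sketch: counter-clockwise winding is front-facing.
    """
    kept = []
    for i in range(0, len(indices), 3):
        i0, i1, i2 = indices[i:i + 3]
        if signed_area2(vertices[i0], vertices[i1], vertices[i2]) > 0:
            kept.extend((i0, i1, i2))
    return kept


if __name__ == "__main__":
    verts = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
    # Same triangle twice: once counter-clockwise, once clockwise.
    print(cull_backfaces(verts, [0, 1, 2, 0, 2, 1]))
```

On a closed mesh roughly half the triangles face away at any time, which is why trimming them before submission was worth the effort when triangle setup was the bottleneck.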
Thanks for the info!
And yah, this is the kind of thing I was trying to describe having read about. It does appear to have been standard to use Cell significantly for GPU assistance after the first few years.
Everything I have read suggests that triple buffering reduces the negative impact of vsync on framerate, but doesn't completely eliminate it. Wii U's biggest advantage certainly seems to be the memory: both the main DDR3 and the eDRAM are more than twice as plentiful. If DF's analysis is to be believed, the framerate dips are typically caused by fillrate-heavy effects such as particles and post-processing. If my novice-level understanding serves me correctly, this suggests that the eDRAM on the Wii U's GPU is in fact sufficient and not a bottleneck.
The overheads for triple buffering should be very small. I'm not a developer by any stretch of the imagination, but my time with admittedly simple OGL programming seemed to show that there was almost no performance penalty for triple buffering, beyond memory used for additional buffers (didn't test for input latency).
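To put rough numbers on the vsync point, here's a toy model (Python; the function name and the simplifications are mine, and it ignores input latency and CPU-side stalls entirely). With two buffers the GPU has to sit idle until the next vblank before it can start a new frame, so a frame that just misses a refresh costs a whole extra interval. With three buffers it can keep rendering, so throughput is limited only by render time and the refresh rate.

```python
import math


def vsync_fps(render_ms, refresh_hz=60.0, buffers=2):
    """Approximate steady-state framerate with vsync enabled.

    Toy model, assuming a constant per-frame render time.
    """
    period_ms = 1000.0 / refresh_hz
    if buffers == 2:
        # Double buffering: each frame occupies a whole number of refresh
        # intervals, because the GPU waits for the swap before continuing.
        intervals = max(1, math.ceil(render_ms / period_ms))
        return 1000.0 / (intervals * period_ms)
    # Triple buffering: the GPU always has a spare buffer to render into,
    # so it is render-bound, capped at the display refresh rate.
    return min(1000.0 / render_ms, refresh_hz)


if __name__ == "__main__":
    for ms in (15.0, 20.0, 40.0):
        print(ms, vsync_fps(ms, buffers=2), vsync_fps(ms, buffers=3))
```

In this model a 20 ms frame on a 60 Hz display comes out at about 30 fps double-buffered and 50 fps triple-buffered, which matches the "reduces but doesn't eliminate" description: anything slower than the refresh interval still runs below 60.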
It may well be that DF are right, and that fillrate is the cause of the drops. If so, I would expect it to be related to more efficient use of the GPU, perhaps early Z rejection causing obscured fragments/pixels to not be rendered. In a straight contest of fillrate I'd expect the 360 to win ... but it's not like I can say that for sure.
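As a very simplified illustration of what early Z rejection buys you (Python, a single pixel, opaque geometry only; the function name is made up for this sketch and it says nothing about how either console's hardware actually behaves): a fragment only needs shading if it is nearer than what is already in the depth buffer, so draw order changes how much work is done.

```python
def shaded_fragments(depths_in_draw_order):
    """Count fragments that pass a 'less than' depth test at one pixel.

    Fragments that fail the test are rejected before shading in this model.
    """
    nearest = float("inf")  # cleared depth buffer
    shaded = 0
    for depth in depths_in_draw_order:
        if depth < nearest:
            shaded += 1
            nearest = depth
    return shaded


if __name__ == "__main__":
    print(shaded_fragments([0.1, 0.5, 0.9]))  # front to back
    print(shaded_fragments([0.9, 0.5, 0.1]))  # back to front
```

Drawn front to back, only the first of three overlapping fragments gets shaded; drawn back to front, all three do. Alpha-blended particles and post-processing generally can't take advantage of this, which is part of why they're such fillrate hogs.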
I still have a feeling that vertex transformation and triangle setup will favour the WiiU in this game compared to the 360, and I'd bet ten English pounds that without Cell assistance WiiU Bayonetta will fare massively better than the PS3 here.
I hadn't heard that the PS3 was at such a disadvantage in polygon performance. You mentioned the Cell being good at culling; was the Xenon not very good at culling? I know polygons per second are no longer a big benchmark for games, but still, that was a pretty big deficit.
A Xenon core should be almost as fast as an SPU, I reckon. Which is to say quite fast (128-bit vector units and high clock speed). You can predict what you'll need to have in the cache and so should be able to prefetch, test, and write back to memory without random accesses or nasty branching. Cell having twice as many cores means it could do additional vertex work on top of matching whatever Xenon was doing though, once developers had got up to speed with Cell.
The extra headroom in Cell seemed to get used in supporting RSX.
The WiiU CPU should be much worse at vector work than either Cell or Xenon, but on the other hand the GPU probably needs the least assistance of the three (I'd guess).