Maybe he looked at the techreport result, where it failed to reach full speed performance (though still notably more than half speed): http://techreport.com/articles.x/19934/6
It does.
Maybe they used large textures that didn't fit into the caches. Techreport at least thinks it fails to reach its potential due to memory bandwidth and/or the smaller L2 cache compared to GTX580.
(And I have long been saying nvidia got it backwards: full-speed fp16 isn't really needed for GF104, as it has a higher tmu/alu ratio (and less bandwidth per tmu too), but GF100 should have had it. If this wasn't a bug on GF100, it looks like a major oversight to me.)