The 8 vs. 16 was in regard to TMU count, which you indicated you had last night...
I think the real fun bit would be to try to get them to spill the beans on architectural differences between chief scientists. Letting them tell you how the top scientists' leanings differ could give you hints about the next arch (Gauss? -- that's what we called our post-Fermi release :shrug:).
Real details are going to have to wait for actual cards and tests. 1.6x is pretty low, but would certainly bolster the case for 8 bilerps vs. 16 (for example).
I think the other question floating around in my head was to what extent the improvements in efficiency improve overall performance. How much does the second dispatch unit help? How much does the faster context switch help? How much does letting more than one kernel run at the same time help? And so on.
Thanks and good luck. Wish I was there, but we have had an awful release cycle these past couple of days, and I get to do perf work for the next few days, which is where all the fun in a release is anyway.
-Dave