Microsoft has increased precision requirements in D3D I believe - but to suggest that graphics rendering will be "broken" or artefacted by having too much shader precision is a reach in my view.

I believe dkanter raised the issue at one point, and also pointed out that Fermi did not have the option to do the intermediate rounding. I don't think he's a graphics programmer, but nobody has stepped up to say that it's NOT an issue either.
The difference between existing ATI and NVidia cards already entirely undermines this argument. Older versions of DirectX don't have mandated rounding, and shader compilation (i.e. ordering affects precision) further muddies things.
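For anyone who hasn't stared at this stuff before, here's a tiny Python sketch (values picked purely for illustration) of why both intermediate rounding and evaluation order change the answer:

```python
import struct

def round_to_fp32(x):
    # Round a Python float (fp64) to the nearest representable fp32 value.
    return struct.unpack('<f', struct.pack('<f', x))[0]

a = 1.0 + 2.0**-20
b = 1.0 - 2.0**-20
c = -1.0

# Round the intermediate product to fp32 before the add
# (a MUL then an ADD, each with its own rounding step).
two_step = round_to_fp32(round_to_fp32(a * b) + c)

# Keep the product exact and round only the final result
# (like a fused multiply-add: one rounding step).
fused = round_to_fp32(a * b + c)

print(two_step)  # 0.0
print(fused)     # ~ -9.09e-13

# Ordering matters too - floating-point addition isn't associative:
x, y, z = 1e17, -1e17, 1.0
print((x + y) + z)  # 1.0
print(x + (y + z))  # 0.0
```

Same inputs, different results, and neither is "broken" - which is why pinning artefacts on too much or too little shader precision is so hard.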
OK, back to square one: HD5870, with less bandwidth, is outperforming GTX285 and appears to be moderately bandwidth limited. That's the kind of per-GB/s performance target we should be expecting from GF100.

How is it the same? Increasing efficiency doesn't guarantee that you won't be bandwidth limited.
Z rate and, seemingly, fp16 post-processing are the heavy bandwidth users. Z-only rate is 850MHz x 32 ROPs x 4 Z/clock = ~108GP/s in HD5870 versus, say, 600MHz x 48 ROPs x 8 Z/clock = ~230GP/s in GF100. fp16 filtering rate is 850MHz x 80 TMUs x 0.5 = 34GT/s versus 600MHz x 128 TMUs x 0.5 = ~38GT/s. So between the two, perhaps a 50%+ performance gain, assuming they're reasonably equivalent bottlenecks and presuming that NVidia can improve the efficacy of these units.
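If it helps, here's the same back-of-envelope arithmetic as a quick Python sketch - bear in mind the GF100 clock and unit counts are the same guesses as above, not confirmed specs:

```python
# Peak rate in giga-ops per second: clock (MHz) * units * ops per unit per clock.
def rate(clock_mhz, units, ops_per_unit_per_clock):
    return clock_mhz * 1e6 * units * ops_per_unit_per_clock / 1e9

# Z-only fill rate (GP/s): ROPs * Z samples per ROP per clock
z_hd5870 = rate(850, 32, 4)        # 108.8 GP/s
z_gf100  = rate(600, 48, 8)        # 230.4 GP/s (guessed clock/unit count)

# fp16 filtering rate (GT/s): TMUs at half rate for fp16
fp16_hd5870 = rate(850, 80, 0.5)   # 34.0 GT/s
fp16_gf100  = rate(600, 128, 0.5)  # 38.4 GT/s (guessed clock/unit count)

print(f"Z rate:    {z_hd5870:.1f} vs {z_gf100:.1f} GP/s ({z_gf100 / z_hd5870:.2f}x)")
print(f"fp16 rate: {fp16_hd5870:.1f} vs {fp16_gf100:.1f} GT/s ({fp16_gf100 / fp16_hd5870:.2f}x)")
```

That works out to roughly 2.1x on Z and 1.1x on fp16 filtering, which is where the "perhaps 50%+" gut feel comes from, if the two are reasonably equivalent bottlenecks.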
Also, I'm presuming that architectural improvements will only remove bottlenecks from non-bandwidth consuming functions (e.g. increased ALU:TEX), i.e. bandwidth limitations in texturing and fillrate will become more important.
I will happily admit it's now really murky trying to assess how a game is bottlenecked - Crysis is fantastically obscure, for example. So only the IHVs can really see how the mix of units and respective clock rates affect things.
That was completely sarcastic.

Except 2900XT was slow at nearly everything, not just one setting.
You mean like turning on shadows in games where NVidia's patent (PCF in TMUs) meant that competing graphics cards couldn't implement that algorithm in hardware? Until D3D10. Yes, cherry-picking of that sort has been going on for years.

Picking a setting where one architecture has an obvious performance cliff is cherry picking. Using 4xAA isn't cherry picking since performance at that setting typically scales in line with performance at other settings - 0xAA, different resolutions etc. 8xAA is the outlier.
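To make "scales in line" concrete, here's a trivial sketch with made-up frame rates (the numbers are purely hypothetical, not from any review):

```python
# Hypothetical fps for two cards across AA settings - invented numbers,
# only to show how an outlier setting stands out from the general trend.
results = {
    "0xAA": (95.0, 90.0),
    "4xAA": (70.0, 66.0),
    "8xAA": (60.0, 30.0),  # one architecture hits a performance cliff here
}

for setting, (card_a, card_b) in results.items():
    print(f"{setting}: card A is {card_a / card_b:.2f}x card B")

# 0xAA and 4xAA both come out around 1.06x; 8xAA jumps to 2.00x - the outlier.
```

If a setting's relative result sits miles away from every other setting, leaning on it is cherry-picking; if it tracks the rest, it's just another data point.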
Also, given the theoretical Z rate of GT200, the fact it has a performance cliff is not the reviewer's problem - eye-candy is eye-candy and enthusiast-level cards have no excuses. Overall I think it's ignorance/laziness/perceived-as-irrelevant.
There's still a fundamental question for reviewers: are we trying to assess which is the most powerful card (how can we make them weep?), or are we trying to assess which is faster at de facto settings (meaningful to typical readers in typical games)?
http://www.xbitlabs.com/articles/video/display/radeon-hd5800-crossfirex_6.html
Jawed