"And that reason is... simply... best case scenario... or something like up to X1680."
HD 6970 being 50% faster than HD 6870 isn't the best-case scenario... especially not in 3DMark 11.
"Given that it's a new and improved architecture, you'd expect perf/mm² to stay at least at the same level as Barts"
I don't think so. Barts doesn't support DP, Cayman does. Those are additional transistors, which won't be utilized in rendering. Another thing is the dual geometry engine: it also consumes transistors, but it won't impact performance in many games (because the majority of games aren't limited by geometry performance). The 4D thing also seems to be targeted at HPC (better DP:SP ratio), and the functionality transferred from the T-unit to X/Y/Z/W is also very HPC-oriented... it costs transistors, which won't be utilized in 3D. I'd be very surprised if Cayman brought better performance per transistor than Barts (at the same clock, of course), because it appears to me that this GPU is designed to achieve the best HPC performance per transistor, not the best 3D performance per transistor (that was Barts' job). And the difference in this respect seems to be significantly higher than between Cypress and Juniper.
~3.2-3.5TFlops.
Best case would technically be ~1.8-2x the performance of the 6870, depending on exact specs, but as we know, that won't translate into a real-world performance increase, so 1.4-1.6x would be more reasonable in the majority of cases.
What about drivers? These won't be as well optimised for the new architecture, so we can expect some 10-20% better performance during the first 6 months compared to launch.
@OgrEGT: 30-50% faster than HD 6870 seems rather conservative.
Given that it's a new and improved architecture, you'd expect perf/mm² to stay at least at the same level as Barts.
There was one theory posted in the "HD5 AF broken" thread: AMD/ATI use more detailed LOD values by default. By "softening" the LOD by +0.65, the shimmering disappears on Radeons; incidentally, by "sharpening" the LOD by -0.65, the shimmering appears on GeForces.
And you can forget about 1.5 GHz GDDR5 with just 6 GBit/s chips. It will be a bit lower (I guess 1.4 GHz maximum).
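To put numbers on the memory clock argument, here is a minimal sketch. It relies on GDDR5 moving four bits per pin per command-clock cycle, so a 1.5 GHz memory clock already equals the full 6 Gbit/s chip rating. The function name and the headroom printout are illustrative only.

```python
def gddr5_data_rate_mbps(memory_clock_mhz: int) -> int:
    """Per-pin data rate in Mbit/s for a given GDDR5 command clock in MHz.

    GDDR5 transfers 4 bits per pin per command-clock cycle.
    """
    return 4 * memory_clock_mhz


CHIP_RATING_MBPS = 6000  # the "6 GBit/s" chips mentioned in the post

for clock_mhz in (1500, 1400):
    rate = gddr5_data_rate_mbps(clock_mhz)
    headroom = CHIP_RATING_MBPS - rate
    print(f"{clock_mhz} MHz -> {rate} Mbit/s per pin, headroom {headroom} Mbit/s")
```

At 1.5 GHz there is no margin left below the chip rating, which is consistent with the guess that shipping clocks end up somewhat lower.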
6990 (XTX) 775 MHz 3840 SPs 6.0 TFlops (310 W)
6970 (XT) 1025 MHz 1920 SPs 4.0 TFlops (232 W)
6950 (Pro) 875 MHz 1536 SPs 2.7 TFlops (188 W)
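The throughput figures in the list above can be sanity-checked with the usual peak estimate of SPs × 2 × clock, counting one multiply-add per SP per clock as two floating-point operations. This is only a sketch: the specs are the rumored values from the post, and the function name is made up for illustration.

```python
def peak_sp_tflops(stream_processors: int, clock_mhz: float) -> float:
    """Theoretical single-precision peak in TFLOPS.

    Assumes each stream processor issues one multiply-add per clock,
    counted as 2 floating-point operations.
    """
    flops_per_sp_per_clock = 2
    return stream_processors * flops_per_sp_per_clock * clock_mhz * 1e6 / 1e12


# Rumored specs from the post: name -> (stream processors, core clock in MHz)
rumored_specs = {
    "6990 (XTX)": (3840, 775),
    "6970 (XT)": (1920, 1025),
    "6950 (Pro)": (1536, 875),
}

for name, (sps, mhz) in rumored_specs.items():
    print(f"{name}: {peak_sp_tflops(sps, mhz):.2f} TFLOPS")
```

The results land in the teraflops range and roughly match the listed values after rounding.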
"Blending might be 4x faster? Is there much need for blending of fp32 single-/dual-channel pixels though?"
I don't know if there's really much need for that, but imho it makes a lot of sense. Currently dual-channel fp32 and single-channel fp32 blending are performed at the same speed as quad-channel fp32 (well, apart from memory bandwidth requirements), at 1/4 the rate of 8-bit int blending. Clearly, faster quad-channel fp32 blending wouldn't be helpful (there's not enough memory bandwidth even at quarter rate already...), but this means that for 1-channel fp32 blending the hw currently apparently uses only 1 of the 4 blend units of a ROP; the rest are just idling. So by using all of them (just need to feed 4 consecutive pixels to the 4 rgba blend units), single-channel fp32 blending performance should increase by a factor of 4 (well, not quite, it will hit memory bandwidth limits) and dual-channel fp32 blending by a factor of 2, with minimal hardware changes. (nvidia is already doing this for a while now.)
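The factor-of-4 and factor-of-2 figures follow from simple unit counting. A toy model, assuming four blend units per ROP as described in the post and ignoring memory bandwidth entirely (function and constant names are hypothetical, not taken from any hardware documentation):

```python
BLEND_UNITS_PER_ROP = 4  # one unit each for R, G, B, A, per the post


def fp32_pixels_per_clock(channels: int, pack_pixels: bool) -> int:
    """Relative fp32 blend throughput of one ROP, in pixels per clock.

    pack_pixels=False models the current behaviour: one pixel per clock
    regardless of channel count, so unused blend units idle.
    pack_pixels=True models feeding consecutive pixels to the idle units.
    Memory bandwidth limits are deliberately ignored.
    """
    if channels not in (1, 2, 4):
        raise ValueError("channels must be 1, 2 or 4")
    if not pack_pixels:
        return 1
    return BLEND_UNITS_PER_ROP // channels


for channels in (1, 2, 4):
    speedup = fp32_pixels_per_clock(channels, True) // fp32_pixels_per_clock(channels, False)
    print(f"{channels}-channel fp32: {speedup}x")
```

In practice the single-channel case would fall short of 4x once memory bandwidth becomes the limit, as the post notes.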
"It might launch (really) close to the 570?"
With GTX570 rumored to launch on December 7th, it sure would be a nice move to launch HD 69** just one day before that.
"(nvidia is already doing this for a while now.)"
Since Fermi...? I think GT200 didn't support it.
"Since Fermi...? I think GT200 didn't support it."
GT200 worked the same.
But I don't know if fp32 blending is really used anywhere - probably not...