Really?
Look at the DP throughput and do some analysis on slide 70; it seems there's something more to come...
2 DP MUL or ADD, but only 1 DP MAD/FMA per clock? It seems I was right when speculating about VLIW4 = half-rate DP with semi-specialized, symmetrical units, disabled on Radeons only for product segmentation.
It's funny how much work Dave Bauman's site gives him! I suspect he would be willing to review this card himself if he wasn't its product manager!!
For GPGPU, it would mean a considerable lead (>=1.5TFlops).
I don't see why they'd go higher than a 1:4 DP:SP ratio. What good is it except for marketing? I would prefer increasing SP throughput, with DP naturally increasing at the same time.
As for the 1:2 DP MAD/FMA ratio: with 1:2 ADD and 1:2 MUL DP throughput, there's no reason to limit MAD/FMA throughput to 1:4, since a "simple" optimisation gives the 1:2 rate for free.
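To make the ratio arithmetic in this exchange concrete, here is a minimal Python sketch. It assumes one VLIW4 unit issues 4 single-precision ops per clock (my assumption from the "VLIW4" name, not something the slide quote states), and it uses the per-clock DP figures quoted above. The 3.0 TFLOPS input is a made-up number for illustration only, and all function names are hypothetical.

```python
from fractions import Fraction

# Assumption: a VLIW4 unit issues 4 single-precision ops per clock.
SP_OPS_PER_CLOCK = 4


def dp_to_sp_ratio(dp_ops_per_clock, sp_ops_per_clock=SP_OPS_PER_CLOCK):
    """Return the DP:SP instruction-rate ratio as an exact fraction."""
    return Fraction(dp_ops_per_clock, sp_ops_per_clock)


def dp_peak_tflops(sp_peak_tflops, ratio):
    """Scale a single-precision peak by a DP:SP ratio."""
    return sp_peak_tflops * float(ratio)


# Per-clock DP figures as quoted from the slide in this thread.
slide_figures = {"ADD": 2, "MUL": 2, "FMA": 1}
ratios = {op: dp_to_sp_ratio(n) for op, n in slide_figures.items()}

if __name__ == "__main__":
    for op, ratio in ratios.items():
        print(f"DP {op}: {ratio.numerator}:{ratio.denominator} of SP rate")
    # Hypothetical 3.0 TFLOPS SP peak, purely to show the scaling.
    print(dp_peak_tflops(3.0, ratios["ADD"]), "TFLOPS DP at the ADD ratio")
```

Read this way, the quoted figures give 1:2 for ADD and MUL but 1:4 for MAD/FMA, which is exactly the mismatch being argued about.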
What's the difference between concurrent kernel execution on rv870 (it's obvious that AMD is comparing this feature with NV's parallel kernel processing) and execution of multiple compute kernels on rv970? To me it's the same: two kernels, by the number of dispatch processors, in both rv870/940 and rv970.
Maybe you should think about the difference between concurrent and asynchronous.
Besides that, Cayman will do only ADDs at a 1:2 ratio and MUL/MAD/FMA at 1:4. That's most probably just an error in the slide which got carried over from the Cypress presentation (which had the same misleading "2 64 Bit ADD or MUL" in it, although that is true only for ADD).
No mention of a cache hierarchy is odd.
Undecided specs even now is weird.
Judging by this slide, the L2 cache is still read-only.
What's the difference between concurrent kernel execution on rv870 (it's obvious that AMD is comparing this feature with NV's parallel kernel processing)
That model is where kernel execution fills all available SIMDs. The easiest way to think about this is when a kernel is "ending", i.e. as SIMDs finish off their final threads for kernel A, they become available to start work on kernel B.
and execution of multiple compute kernels on rv970? To me it's the same: two kernels, by the number of dispatch processors, in both rv870/940 and rv970.
This model launches multiple (probably only 2 per SIMD) kernels regardless of the occupation of a SIMD by any other kernel, i.e. two compute kernels could both fill all SIMDs. Here kernels A and B can be launched independently.
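For anyone who prefers to see the two models side by side, here is a toy Python sketch. It only compares when kernel B may start on each SIMD; it is not a description of how the hardware scheduler actually works. The "serial" baseline, the function names and the finish times are all my own assumptions for illustration.

```python
def kernel_b_start_times(a_finish_times, model):
    """Earliest time kernel B may start on each SIMD (toy model).

    a_finish_times: per-SIMD time at which kernel A's last thread finishes.
    model: "serial", "tail_overlap" or "independent" (hypothetical labels).
    """
    if model == "serial":
        # Assumed baseline: B waits until every SIMD has drained kernel A.
        barrier = max(a_finish_times)
        return [barrier for _ in a_finish_times]
    if model == "tail_overlap":
        # First model above: a SIMD picks up B as soon as it finishes A.
        return list(a_finish_times)
    if model == "independent":
        # Second model above: B launches regardless of A's occupancy.
        return [0 for _ in a_finish_times]
    raise ValueError(f"unknown model: {model}")


def idle_time(a_finish_times, model):
    """Total time SIMDs sit empty between finishing A and starting B."""
    starts = kernel_b_start_times(a_finish_times, model)
    return sum(s - f for s, f in zip(starts, a_finish_times) if s > f)


if __name__ == "__main__":
    a_finish = [10, 12, 7, 15]  # made-up per-SIMD finish times for kernel A
    for model in ("serial", "tail_overlap", "independent"):
        starts = kernel_b_start_times(a_finish, model)
        print(model, starts, "idle:", idle_time(a_finish, model))
```

In this toy both newer models remove the idle gap at the end of kernel A; the difference is that only the independent model lets B start before A has begun to drain.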
There are none in HQ mode.
So how come the HD5000/6000 series shows noticeable texture shimmering in some games while any GeForce shows little to none, even on the Q setting? Is it a hardware limitation then?
With all this raw power, why can't modern Radeon cards filter as cleanly as possible, providing a smooth, calm image? R520/580 did way better in this area, and your main competitor has been offering superb AF quality without any "compromises" since 2006...
I get the impression things aren't going to change much based on the current (lack of) architectural change. Unless there's some as yet undisclosed magic, it'll probably be similar performance to the 580 with lower die size, power consumption and hopefully cost. The story with geometry doesn't seem to have changed either.
There was one theory posted in the "HD5 AF broken" thread, namely that AMD/ATI uses more detailed LOD values by default: by "softening" the LOD by +0.65, the shimmering disappears on Radeons. Incidentally, when "sharpening" the LOD by -0.65, the shimmering appears on GeForces.
Well, in my opinion, an IHV has no business adjusting the LOD. If the user or an application requests it, fine. Everything else is just inscrutable.
Normally, the LOD should stay at 0, right?
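To show what a bias of +0.65 or -0.65 does to mip selection, here is a rough Python sketch of the textbook isotropic formula (log2 of the texel footprint plus the bias, clamped to the mip chain). It is a simplification and not a claim about what either vendor's hardware or driver actually computes; the function name and the `max_level` default are assumptions.

```python
import math


def mip_level(texels_per_pixel, lod_bias=0.0, max_level=10):
    """Simplified isotropic mip selection with an LOD bias.

    texels_per_pixel: texel footprint of one pixel along its longest axis.
    lod_bias: 0 means no adjustment; positive blurs, negative sharpens.
    max_level: index of the smallest mip level (assumed value).
    """
    base = math.log2(max(texels_per_pixel, 1e-12))  # guard against log2(0)
    return min(max(base + lod_bias, 0.0), float(max_level))


if __name__ == "__main__":
    footprint = 4.0  # made-up example: 4 texels per pixel
    for bias in (0.0, 0.65, -0.65):
        print(f"bias {bias:+.2f} -> mip level {mip_level(footprint, bias):.2f}")
```

A positive bias pushes sampling toward smaller, blurrier mip levels and a negative bias toward larger, sharper ones, which undersample more easily and so tend to shimmer in motion.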