AMD: R9xx Speculation

Some rumored specs from Fuad. If true, how in the world could this card be much faster than the 5870?
Remember GF100 is closer to Juniper than Cypress on many metrics...

Considering GF100 is ~10% faster than Cypress with almost half-everything, ~50% faster Cayman with similar specs is 100% possible.


Also consider Cayman Pro has to be >=20% faster than Barts XT, with Cayman XT ~20% faster than Pro.

That's ~50% faster than Cypress Pro, or ~20% faster than Cypress XT, as a lower bound.
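To make the chain of ratios above explicit, here is a rough sketch. It only compounds the two 20% steps from the post and then backs out what the stated conclusions would imply for Barts XT. All numbers are the post's guesses, not measurements, and the variable names are made up for illustration.

```python
# Relative-performance sketch. Every ratio is a rough guess from the post.
pro_over_barts_xt = 1.20   # Cayman Pro >= 20% faster than Barts XT (lower bound)
xt_over_pro = 1.20         # Cayman XT ~20% faster than Cayman Pro

# The two steps compound multiplicatively.
cayman_xt_over_barts_xt = pro_over_barts_xt * xt_over_pro

# Conclusions stated in the post, used here only to back out implied ratios.
cayman_xt_over_cypress_xt = 1.20
cayman_xt_over_cypress_pro = 1.50

implied_barts_xt_over_cypress_xt = cayman_xt_over_cypress_xt / cayman_xt_over_barts_xt
implied_barts_xt_over_cypress_pro = cayman_xt_over_cypress_pro / cayman_xt_over_barts_xt

print(f"Cayman XT vs Barts XT: {cayman_xt_over_barts_xt:.2f}x")
print(f"Implied Barts XT vs Cypress XT: {implied_barts_xt_over_cypress_xt:.2f}x")
print(f"Implied Barts XT vs Cypress Pro: {implied_barts_xt_over_cypress_pro:.2f}x")
```

In other words, the lower bound only holds if Barts XT sits somewhat below Cypress XT and slightly above Cypress Pro, which is the assumption hiding in the estimate.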
 

GTX 480 is 15-20% faster than HD 5870 now...

20% faster than Cypress XT is not gonna cut it against GTX 580, unfortunately.
 
Yup, it doesn't make sense. Charlie Demerjian went on record saying that 4 Cayman shaders represent roughly 98% of the Cypress 4+1 combination.
1536 SPs is rather close to that 98%, being 96% of Cypress's 1600. If that is the case, Cayman performance will probably depend a lot on the uncore parts of the chip, because shader power will stay the same. I have a hard time coping with that.
???

Cayman = 1536 SPs = 384 4-shader-groups arranged in 24 SIMDs
Cypress = 1600 SPs = 320 5-shader-groups arranged in 20 SIMDs

On the basis of the "4 Cayman shaders ~ 5 Cypress shaders" assumption you quoted, that basically means Cayman theoretically has ~20% more effective shader power (on average and clock-for-clock). About the same relative increase seems to apply to TMUs (80 -> 96) and memory bandwidth (256-bit @ 1.2 GHz -> 256-bit @ ~1.4 GHz).
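The arithmetic behind that estimate can be sketched as follows. This assumes the rumored Cayman figures are right, that one 4-wide Cayman group does about the work of one 5-wide Cypress group, and that both chips use the same memory bus width, so bandwidth scales with memory clock. The dictionary layout and helper name are just for illustration.

```python
# Rumored Cayman specs vs. Cypress, as quoted in this thread (not confirmed).
cayman = {"sps": 1536, "lanes_per_group": 4, "simds": 24, "tmus": 96, "mem_clock_ghz": 1.4}
cypress = {"sps": 1600, "lanes_per_group": 5, "simds": 20, "tmus": 80, "mem_clock_ghz": 1.2}

def groups(chip):
    """Number of shader groups (VLIW units) on the chip."""
    return chip["sps"] // chip["lanes_per_group"]

# Assumption: one Cayman group ~ one Cypress group, so compare group counts.
group_ratio = groups(cayman) / groups(cypress)
tmu_ratio = cayman["tmus"] / cypress["tmus"]
# Assumption: same bus width, so bandwidth scales with memory clock only.
bandwidth_ratio = cayman["mem_clock_ghz"] / cypress["mem_clock_ghz"]

print(groups(cayman), groups(cypress))
print(f"groups: {group_ratio:.2f}x, TMUs: {tmu_ratio:.2f}x, bandwidth: {bandwidth_ratio:.2f}x")
```

Both chips come out at 16 groups per SIMD, so under these assumptions the gain is purely from the four extra SIMDs.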

I'm a bit more concerned about the 32ROPs number - as that might actually turn out to be a bottleneck - especially seeing how massively Barts profited from the increased ROP power. Cayman brings a new architecture, though. So comparing Cayman's 32ROPs to Barts 32ROPs might prove misleading.

If GTX580 really is just ~10-15% faster than GTX480, a chip generally 20-30% faster than Cypress (and ~100% faster in heavy tessellation scenarios) would actually be enough to edge past - especially considering that Cayman will be a lot smaller than GF110 (and hence probably priced more attractively) ...

If they managed to get near to GTX580's performance while not making the chip a whole lot bigger than Cypress ... that'd be a clear win for AMD.

Remember they never really went for the single-GPU performance crown in the first place. If they can take that crown (or at least pull it into very close reach), that's a hell of an achievement given their dual-GPU enthusiast card strategy.
 
GTX 480 is 15-20% faster than HD 5870 now...

20% faster than Cypress XT is not gonna cut it against GTX 580, unfortunately.

It would be nice if Cayman proved to be faster than the GTX 580, but it doesn't really have to be. If it's somewhere between the GTX 480 and 580, with a much smaller die and lower power, as long as the price is right, it will do just fine.

Antilles should be more than enough to take care of the 580.
 
GTX 480 is 15-20% faster than HD 5870 now...
More like ...
... 5-10% in the majority of cases
... 10-20% in some cases (especially when a lot of memory is required) and
... 20%+ @ heavy tessellation tasks

Fixing tessellation performance and adding another 1GB of RAM will do a lot to fix cases (2) and (3) - it's the 5-10% "base" deficit (+ another 10% improvement added by GF110) that Cayman has to tackle.
 
I'm a bit more concerned about the 32ROPs number - as that might actually turn out to be a bottleneck - especially seeing how massively Barts profited from the increased ROP power.
AMD said that with only 16 ROPs and 2 more SIMDs, an alternative version of Barts would have been within 2% of the performance of the version of Barts that was released.

With a bit of luck the ROPs in Cayman are better than Evergreen's.
 
It would be nice if Cayman proved to be faster than the GTX 580, but it doesn't really have to be. If it's somewhere between the GTX 480 and 580, with a much smaller die and lower power, as long as the price is right, it will do just fine.

Antilles should be more than enough to take care of the 580.
Of course, this is the good old tried and proven sweet spot strategy, it is just that we were prepared for something much more dashing this time around.
 
???

Cayman = 1536 SPs = 384 4-shader-groups arranged in 24 SIMDs
Cypress = 1600 SPs = 320 5-shader-groups arranged in 20 SIMDs

Does this type of connection between organization of shaders AND performance still stand? Because Charlie, who is pretty close to ATI/AMD, said:

Back to the NI family, what are they? Well, that part is easy enough, they are a serious re-do of the Evergreen family. The biggest change is in the shaders, they have gone from a 4 simple + 1 complex arrangement to a 4 medium complexity arrangement. This should end up no slower than the old way for simple calculations, the overwhelming majority of the workload, but also be faster for most of the complex operations.
The reason for this can be summed up by saying that the new 'medium' shaders can't do what a complex one can in the same time, but there are more of them, and they can more than make it up in number.

What I mean is: Cypress has a 4+1 configuration. Cayman seems to have just 4 "medium" shaders. Before, you had 320 "powered" shaders (the "1" in the sum). Now you don't have powered shaders, but pretty much equal ones. The way I understand this, your calculations for Cayman don't work anymore, because even IF there are 64 more "medium" groups, the performance of those 384 is EQUAL to that of the 320 "powered" groups, except on the more complex operations, where they are faster. In the end, shader power would stay, on average, the same.
 
AMD doesn't have to worry about the 580 if Nvidia can only squeeze out a few cards at a higher price while AMD can churn out a lot of Caymans at a lower price.

Personally I think Cayman will be faster, smaller, cheaper, cooler. AMD was targeting a full Fermi last year as competition, so if 580 is finally what should have been delivered last year, AMD will have no problems dealing with it.
 
Did it? Dave said that 1280 SPs + 16 ROPs performed only 2% slower than 1120 SPs + 32 ROPs, so the ROPs are pretty under-utilized now.
Seems like a very "special" (low res, no AA) scenario ... just remember what happened to HD5830 when reviewers tested higher res and AA settings :rolleyes:

Plus - if performance difference really was that low, I wonder why they put in 32ROPs and promoted "full Cypress ROP performance" in the first place?

I'm pretty sure the fact that it (purportedly) "only" has 32ROPs won't hold Cayman back in general, but I'm still very interested in the first high-res / high AA settings benchmarks ...
 
Comparing to HD6870, same clocks: +37% more shaders, up to +20% for 4D, +XX% for improved geometry and tess, -YY% for unoptimized drivers = .... I guess it can match the GTX580, but it's going to lose in tessellation benchmarks, thus it won't be the fastest GPU on the planet.
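As a rough sketch of that sum: the +37% follows from the rumored 1536 SPs against the 1120 SPs quoted for Barts XT earlier in the thread. Combining the factors multiplicatively is an assumption, the XX and YY terms are unknown and default to zero here, and the function name is made up for illustration. The result is a theoretical shader-side ceiling, not a performance prediction.

```python
CAYMAN_SPS = 1536     # rumored
BARTS_XT_SPS = 1120   # HD6870 figure quoted in this thread

def cayman_vs_hd6870(vliw4_gain=0.20, geometry_gain=0.0, driver_penalty=0.0):
    """Clock-for-clock ratio vs. HD6870, with all factors multiplied together.

    vliw4_gain:     the "up to +20% for 4D" term
    geometry_gain:  the unknown "+XX%" term (0.0 = unknown/ignored)
    driver_penalty: the unknown "-YY%" term (0.0 = unknown/ignored)
    """
    shader_ratio = CAYMAN_SPS / BARTS_XT_SPS
    return shader_ratio * (1 + vliw4_gain) * (1 + geometry_gain) * (1 - driver_penalty)

print(f"shaders only: {cayman_vs_hd6870(vliw4_gain=0.0):.2f}x")
print(f"with full 4D gain: {cayman_vs_hd6870():.2f}x")
```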
 
Seems like a very "special" (low res, no AA) scenario ... just remember what happened to HD5830 when reviewers tested higher res and AA settings :rolleyes:

Plus - if performance difference really was that low, I wonder why they put in 32ROPs and promoted "full Cypress ROP performance" in the first place?

I'm pretty sure the fact that it (purportedly) "only" has 32ROPs won't hold Cayman back in general, but I'm still very interested in the first high-res / high AA settings benchmarks ...

They said 2% overall. You don't think AMD tests differing scenarios?

Also, RV770 once had the same 16 ROP count compared to RV670, but they did tweaks and RV770 did just fine last I saw.
 
just remember what happened to HD5830 when reviewers tested higher res and AA settings :rolleyes:
That wasn't ROP-related but intra-die bandwidth related. Disabling half of the ROPs in Cypress results in halved bandwidth between the ROPs and the memory controller.

Plus - if performance difference really was that low, I wonder why they put in 32ROPs and promoted "full Cypress ROP performance" in the first place?
Marketing. Read Dave's comments...

I'm pretty sure the fact that it (purportedly) "only" has 32ROPs won't hold Cayman back in general, but I'm still very interested in the first high-res / high AA settings benchmarks ...
RV790 had 16 ROPs + 256bit interface and its percentage performance drop with 8x MSAA was often smaller than on Cypress.
 
That is incorrect:
http://www.computerbase.de/artikel/...te-2/22/#abschnitt_performancerating_mit_aaaf

More like:
10-15% in the majority of cases
5% in some cases
20% in games with tessellation

Different tests, different numbers - and those 10-15% values you posted are averages that already include a significant "DX11/tessellation" bonus hopefully addressed by Cayman; but I really don't mind if we agree to meet in the "generally 10-15% faster" sector ;)

It still means that Cypress +20% would probably be enough to trade blows with GTX580, Cypress +25-30% could win the crown (if ever so slightly).
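A quick sanity check of that claim, as a sketch: it takes the "GTX480 is 10-15% faster than Cypress" range agreed on above and the rumored "GTX580 is ~10-15% faster than GTX480" figure mentioned earlier in the thread, and multiplies them. Both ranges are forum estimates, and the helper name is made up for illustration.

```python
# Estimated ranges from this thread, not benchmark results.
gtx480_over_cypress = (1.10, 1.15)
gtx580_over_gtx480 = (1.10, 1.15)

# GTX580 relative to Cypress: best case for AMD (low) and worst case (high).
gtx580_low = gtx480_over_cypress[0] * gtx580_over_gtx480[0]
gtx580_high = gtx480_over_cypress[1] * gtx580_over_gtx480[1]

def margin(cayman_over_cypress):
    """Return (worst, best) ratio of a hypothetical Cayman to GTX580."""
    return (cayman_over_cypress / gtx580_high, cayman_over_cypress / gtx580_low)

print(f"GTX580 vs Cypress: {gtx580_low:.2f}x to {gtx580_high:.2f}x")
for cayman in (1.20, 1.25, 1.30):
    worst, best = margin(cayman)
    print(f"Cypress +{round((cayman - 1) * 100)}%: {worst:.2f}x to {best:.2f}x of GTX580")
```

Under these assumptions, Cypress +20% lands just short of the low end of the GTX580 range, while +25-30% only pulls ahead if the GTX580 sits toward the lower end of its range.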

I only wanted to point out that - once you take NVIDIA's current tessellation and RAM bonus out of the equation (assuming that Cayman will address these issues) - the remaining performance gap isn't as big as it may seem when looking at some current GTX480 vs. HD5870 score tables.
 
I only wanted to point out that - once you take NVIDIA's current tessellation and RAM bonus out of the equation (assuming that Cayman will address these issues) - the remaining performance gap isn't as big as it may seem when looking at some current GTX480 vs. HD5870 score tables.
Yup, I agree, however add to that a smaller AA hit.
 
@ no-x:

I don't intend to argue with you - as you're most probably right and I'll most probably lose ;)

I just remembered some HD5830 reviews in which performance hits @high resolutions and lots of AA were rather dramatic in direct relation to 32ROP Cypress parts - and the recent Barts reviews seemed to suggest that the number of ROPs contributed a lot to pulling off nearly HD5870 performance with significantly fewer shaders.

If the 32ROPs in Barts are just for marketing and it could do almost as well with half of those - that's pretty nice and bodes well for Cayman (especially since Cayman's ROPs might even be improved over their Evergreen counterparts, as Jawed suggested).

I'll admit I mainly visit / read these forums during the weekends - and thus tend to miss some important input given by Dave and others.

So thanks for catching me up where it's needed ;)
 