AMD: R9xx Speculation

6e3xom.jpg

http://bbs.expreview.com/viewthread.php?tid=37918&from=recommend_f

Only 160 GB/s bandwidth? Only 1250 MHz memory clock?
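For reference, a quick sketch of how those two slide numbers relate. GDDR5 transfers 4 bits per pin per command clock, so 1250 MHz works out to 5 Gbps per pin. The 256-bit bus width is an assumption here (it is not stated in the thread, but it is the width that makes 160 GB/s come out), and the function name is just illustrative.

```python
def gddr5_bandwidth_gb_s(command_clock_mhz: float, bus_width_bits: int) -> float:
    """Peak memory bandwidth in GB/s for a GDDR5 interface."""
    # GDDR5 moves 4 bits per pin per command-clock cycle.
    data_rate_gbps_per_pin = command_clock_mhz * 4 / 1000.0
    # Total bits per second across the bus, converted to bytes.
    return data_rate_gbps_per_pin * bus_width_bits / 8


# Slide figure: 1250 MHz memory clock. ASSUMPTION: 256-bit bus.
print(gddr5_bandwidth_gb_s(1250, 256))  # 160.0
```

So the two figures are at least consistent with each other, provided the bus really is 256 bits wide.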

Looks real but WEIRD at the same time. Basically the same memory bandwidth, 30 SIMDs but probably organized as two clusters, with SIMDs and TMUs decoupled all over again.
 
I think the number of TMUs is more interesting. R7xx/8xx had 4 TMUs per SIMD, but Cayman doesn't.

Well it would fit 24 SIMDs with 80 SPs each… but obviously that contradicts the slide itself. Weird. So either the TMUs are decoupled or it's yet another fake.
 
Hmm, so this means ALU:TEX is 5:1 (in terms of cycles) rather than 4:1 as it has been for years now. So perhaps there's something in those patent applications that I've linked several times :p
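A rough sketch of the arithmetic behind the 24-SIMD and 5:1 figures. It assumes the slide's 30 SIMDs and 96 TMUs are real, that each SIMD still has 16 ALUs, and that Cypress is 20 SIMDs with 80 TMUs; the helper name is made up for illustration.

```python
from fractions import Fraction


def alu_tex_ratio(simds: int, alus_per_simd: int, tmus: int) -> Fraction:
    """ALUs per TMU for the whole chip, as an exact ratio."""
    return Fraction(simds * alus_per_simd, tmus)


# Cypress: 20 SIMDs x 16 ALUs, 80 TMUs (4 TMUs per SIMD).
cypress = alu_tex_ratio(20, 16, 80)
# Rumoured Cayman (ASSUMPTION: slide is genuine): 30 SIMDs x 16 ALUs, 96 TMUs.
cayman = alu_tex_ratio(30, 16, 96)

# If 4 TMUs per SIMD still held, 96 TMUs would imply this many SIMDs.
simds_if_coupled = 96 // 4

print(cypress, cayman, simds_if_coupled)  # 4 5 24
```

96 TMUs only matches 24 SIMDs under the old coupling, so with 30 SIMDs the ratio has to move from 4:1 to 5:1.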

I expect this will be fine for games, 80 TMUs in Cypress seem to be wasted anyway.

Compute applications which depend on L1->ALU bandwidth might be a bit constrained. Though there's always the possibility that TEX->ALU bandwidth could be beefed up. If, as one of the patent applications seems to suggest, ALUs can write to the L1s, then that'll be more interesting...

2 polys per clock is definitely what we want to see.

After Barts revealed that 16 ROPs ~ 32 ROPs as far as performance goes, I think it's reasonable to expect Cayman to be significantly more bandwidth efficient, and for 32 Cayman ROPs to be worth significantly more than 32 Cypress ROPs.

I can't see anything here that looks faked, and I'm cautiously optimistic it'll work out well...


One possible arrangement?:
  • 30 SIMDs - each 16 ALUs with 64 ALU lanes
  • 12 octo-TMUs - totalling 96 TMUs
  • Each set of 10 SIMDs has 4 octo-TMUs
Or?:
  • 30 SIMDs - each 16 ALUs with 64 ALU lanes
  • 12 octo-TMUs - totalling 96 TMUs
  • Each set of 15 SIMDs has 6 octo-TMUs
I dare say the latter accords with 2 polys per clock.
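The two arrangements above are just the even ways of splitting 30 SIMDs and 12 octo-TMUs into identical groups. A small sketch that enumerates them, assuming the slide's totals and that every group gets the same share (the function name is illustrative only):

```python
def even_splits(simds: int, octo_tmus: int) -> dict[int, tuple[int, int]]:
    """Map group count -> (SIMDs per group, octo-TMUs per group)."""
    splits = {}
    for groups in range(1, min(simds, octo_tmus) + 1):
        # Keep only group counts that divide both totals exactly.
        if simds % groups == 0 and octo_tmus % groups == 0:
            splits[groups] = (simds // groups, octo_tmus // groups)
    return splits


# ASSUMPTION: 30 SIMDs and 12 octo-TMUs (96 TMUs), as on the slide.
for groups, (s, t) in even_splits(30, 12).items():
    print(f"{groups} group(s): {s} SIMDs + {t} octo-TMUs each")

# Total ALU lanes at 64 lanes per SIMD.
total_lanes = 30 * 64
```

Besides the trivial single group, the even splits are 2 groups (15 SIMDs + 6 octo-TMUs), 3 groups (10 + 4), and 6 groups (5 + 2).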
 

Makes sense, considering Barts probably has 2x8 SIMD, there would be the same macro-redundancy in Cayman (1 spare SIMD for each bank of 16).

I was also thinking AMD could deviate from Hemlock and use high yield, partly disabled dies (perfect if the die is big as it'll allow for $400-500-650 prices, or something similar).
 
The specs on this slide surprisingly coincide with my early prediction for Cayman's architectural layout here. :D
My interpretation of the patent applications is that an octo-TMU can deliver 4 texturing results, based on 64-bit texels (e.g. fp16 RGBA texels), per clock to an ALU.

So one arrangement we might see for a shader engine, assuming 2 shader engines (quick and dirty photochop from your picture fellix :p):


b3da032.png


This consists of 3 clusters - each containing 5 SIMDs and 2 octo-TMUs. Each octo-TMU can deliver its results to any of the 5 pairs of ALUs aligned with it, delivering 2 quads of results to the respective ALU quads or a single quad of results to one or the other of the pair.
 
It would be 50% more efficient - now 1 of 2 rasterizers is nearly always idling. With 2 setups and 3 rasterizers, only one of them would be idling...
 
Maybe it's another fake... :???: ... maybe not.
Fake-or-not.jpg


· Difference in size between the "3" and the "0".
· Different font for the "30" and the "32".
· GDDR5 @ 5 GHz seems too slow especially since 2Gb @ 6 GHz is available from Hynix (H5GQ2H24MFR-R0C), Samsung (K4G20325FC-HC03), and Elpida (EDW2032BABG-60-F).
· Number of TMUs vs SIMD seems a little strange.
 
That is a lot of SPs! Is it still 4+1? I guess the extra SPs are needed for MLAA with only 32 ROPs. I wonder why they didn't bump up Barts's SP count; it looks like a performance gap will form between Cayman and Barts. If the 2 GB of VRAM is true, on to the bandwidth: I know AMD's recent GPUs are not bandwidth limited. I guess they are designed with GDDR5's limitations in mind, or will bandwidth finally hold back Cayman's massive SP count? I think Cayman will be powerful... $499?
 
Maybe, maybe not. I guess it's a photo of a presentation slide on a projection screen, taken from an angle, so that may very well be the reason for the things you pointed out.

Lol, zeroes don't shrink when they come closer ;)
 