AMD: R9xx Speculation

Can anyone explain what this is and whether it's specific to the 6000 series? The only leaked driver is 8782a, which had to be modded for use with a 5000 series card. Some say it's the same one found on the discs for the 6870. However, it's been said there is another beta that isn't publicly available. Or is that simply a fake?
It's real. One of my friends got the card and the beta drivers with it, and it's real.
 
What benchmark are you referring to? The only one I've seen of Civ 5 is [H] showing up a big SLI failure in that title.

Nvidia released an SLI profile for Civ 5 a day or two after that review. The review I'm referring to is based on single cards.

http://www.techspot.com/review/320-civilization-v-performance/

The Radeon HD 5670 witnessed a 29% increase while the Radeon HD 5870 became 11% quicker. Changing the Tessellation quality did little to improve the Geforce GTX 400 series graphics cards. The GTX 460 for example gained just 5% more performance.
 
replace but with so far?
I don't understand that post. Does that option in Cat A.I. only show up on the 6800 series with the leaked Cat 10.10 (8782a)? I've been lurking in other threads about the leaked driver and so far no one with a 5000 series card has reported such a feature in Cat A.I. Perhaps that's my answer to the question.
 
Maybe Dave ought to chime in here. If Barts were a curry, what curry would Barts be? I know he loves the curries so maybe he's going to respond. Lol drunk posting aha!

Also the official word on what Dave is a product manager on this time around would be nice as well as an explanation as to why ATI hasn't disseminated the fact you can do 3D with their cards. I want to know why most people think 3D is Nvidia only!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! I exclaim much!
 
I don't understand that post. Does that option in Cat A.I. only show up on the 6800 series with the leaked Cat 10.10 (8782a)? I've been lurking in other threads about the leaked driver and so far no one with a 5000 series card has reported such a feature in Cat A.I. Perhaps that's my answer to the question.

You have just answered your own question, yap!
 
Maybe Dave ought to chime in here. If Barts were a curry, what curry would Barts be? I know he loves the curries so maybe he's going to respond. Lol drunk posting aha!

Also the official word on what Dave is a product manager on this time around would be nice as well as an explanation as to why ATI hasn't disseminated the fact you can do 3D with their cards. I want to know why most people think 3D is Nvidia only!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! I exclaim much!

Because, sadly, you need 3rd party drivers to drive the glasses for ATI, and you need to find 3rd party glasses too.

I'm still waiting for Bit Cauldron's glasses to enter the market, as they're made in tight cooperation with AMD and, according to their site, they only work with AMD too.
 
ATI Radeon 4870 = 256mm2 (160 x5D)
ATI Radeon 6870 = 255mm2 (224 x5D)
ATI Radeon 5870 = 334mm2 (320 x5D)

Note-
(64 x5D) more stream processors for the "HD6870" in the same die area as the "HD4870".

Calculation: BartsXT at (384 x5D) will be the same die area as the "HD5870", 334mm2

BartsXT at (464 x5D) stream processors die will be 373mm2
BartsXT at (504 x5D) stream processors die will be 393mm2

CaymanXT should NOT be over 400mm2 die area.

EDIT: I don't understand how it could be up to 300W TDP
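
The 373mm2 and 393mm2 figures in the post above look like a straight-line extrapolation through two of the quoted points, (224 x5D, 255mm2) and (384 x5D, 334mm2). That is a guess at the poster's method, not something the post states, and the 384 x5D point is itself the post's assumption rather than a measured die. A minimal Python sketch under those assumptions (the function and variable names are made up for illustration):

```python
# Sketch of the die-area extrapolation, assuming a straight line through
# two of the quoted points. This is a guess at the method, not a die model.

def line_through(p0, p1):
    """Return f(x) for the straight line through points p0 and p1."""
    (x0, y0), (x1, y1) = p0, p1
    slope = (y1 - y0) / (x1 - x0)
    return lambda x: y0 + slope * (x - x0)

# (x5D units, die area in mm2) as quoted in the post
hd6870 = (224, 255)
barts_384 = (384, 334)  # the post's assumed point, not a measured die

area = line_through(hd6870, barts_384)

for units in (384, 464, 504):
    # Print the extrapolated area for each hypothetical configuration
    print(units, "x5D ->", round(area(units), 2), "mm2")
```

With these inputs the line gives roughly 373.5mm2 and 393.25mm2 for 464 and 504 units, which matches the post's numbers once the fraction is dropped. The slope works out to about 0.49mm2 per x5D unit, so the whole result depends on how much the two anchor points can be trusted.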
 
ATI Radeon 4870 = 256mm2 (160 x5D)
ATI Radeon 6870 = 255mm2 (224 x5D)
ATI Radeon 5870 = 334mm2 (320 x5D)

Note-
(64 x5D) more stream processors for the "HD6870" in the same die area as the "HD4870".

Calculation: BartsXT at (384 x5D) will be the same die area as the "HD5870", 334mm2

BartsXT at (464 x5D) stream processors die will be 373mm2
BartsXT at (504 x5D) stream processors die will be 393mm2

CaymanXT should NOT be over 400mm2 die area.

Except that the ISA(?) already confirmed there are at least two 4D ASICs coming?
 
If they are going for half-rate double precision on multiplications, Cayman could easily be over 400mm².

And Cayman may also more than double the INT8 texture fill-rate over Barts, if they use the patents found by Jawed.
 
And Cayman may also more than double the INT8 texture fill-rate over Barts, if they use the patents found by Jawed.
I don't think they increase the throughput for 8-bit texels at the unit level, and I doubt there are more units per core than in Barts. I think the positive effect might only come with filtering of fp16 format textures.
 
I am thinking of SIMDs with 32 4D-ALUs and a double TMU like specified in the patents.
There are not so many possibilities for going 4D and keeping a wavefront size of 64:
  • SIMDs with 16 4D ALUs and 4 TMUs (30 SIMDs @ 1920SPs)
  • R600-style with decoupled SIMDs and TMUs (3x 5 ALU-SIMD (each 128SPs) and one TU-SIMD (32 TMUs))
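
The unit counts in the options above can be checked with a little arithmetic. A minimal Python sketch, assuming a wavefront is issued over 4 cycles on a SIMD (as on existing 16-wide designs) and 4 lanes per ALU for the speculated 4D layout. The function names are invented for illustration and none of this is confirmed hardware detail:

```python
# Back-of-the-envelope check of the speculated configurations.
# Assumption: one wavefront is issued over 4 cycles per SIMD.
CYCLES_PER_WAVEFRONT = 4

def wavefront_size(alus_per_simd, cycles=CYCLES_PER_WAVEFRONT):
    """Work-items per wavefront for a SIMD of the given ALU width."""
    return alus_per_simd * cycles

def stream_processors(simds, alus_per_simd, lanes_per_alu=4):
    """Total SP count for a chip built from identical SIMDs."""
    return simds * alus_per_simd * lanes_per_alu

# 30 SIMDs of 16 4D ALUs each
print(stream_processors(30, 16), wavefront_size(16))

# 3 x 5 ALU-SIMDs at 128 SPs each (128 SPs = 32 4D ALUs)
print(3 * 5 * 128)

# A 32-wide SIMD at the same 4-cycle cadence
print(wavefront_size(32))
```

Both list options land on 1920 SPs. Under the 4-cycle assumption a 16-wide SIMD gives a wavefront of 64, while a 32-wide SIMD would give 128 unless the issue cadence changes.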
 
AMD has a point with their "good enough" approach to tessellation. Civ5 seems to be the first game that shows a marked difference in the performance hit between Fermi and Evergreen with tessellation enabled. Don't think anyone has accused it of being over-tessellated so it will be an interesting title to watch.
GTX460 is notably faster than HD5870, with tessellation off or on in that Techspot review. The gap increases with tessellation. If this game is indicative of "good tessellation" then AMD doesn't have a point - let alone the fact that the performance of HD5870 is not competitive at the baseline.
 
You know, there is also a HD6850 for less than $200. Wouldn't that be a good competitor to the GTX460? :rolleyes:

Btw., competition means competitive performance for a competitive price, not more power for the same money, the same power for less money, or even more performance for less money. Then it isn't a competition anymore! :LOL:

The 4850 came out 18 months after the 8800GTX. It cost 150 euros at launch while the 8800 GTX cost 550 euros at launch. The 4850 was faster. This is what competition is supposed to bring for the consumer.

The price drops come with a drop in specs, so there is some disappointment that the 12-month cycle added little with respect to specs. So if there are no major changes to the arch, how is the 6870 getting slightly higher 3DMV scores than the 5850 with fewer SPs? The first guess would be the higher clocks. I can see the 6870 excelling in certain 3DMV tests thanks to its high clocks, thus bringing up the scores.

With respect to the specs again, I'm kind of disappointed in the drop in texture units. IIRC that was the bottleneck of the 4800 series, where a couple of games ran mightily slower than on Nvidia because of texturing. Company of Heroes comes to mind. Going from 4800 to 5800 was a major step up in fps!

The drop in the texture units may not be so grave. After all, if I am not mistaken, the texture units belong to the part of the chip that has been equipped with the new architecture.

No surprise that those who are going to be disappointed are those who were disappointed even before anything concrete came out :rolleyes:

I don't even see where official pricing has been listed, much less any real performance #'s or architecture deep dives, so why not wait 3 days before jumping to conclusions?

This looks like official pricing to me. Not exact pricing but a very informative field nonetheless.

 
I am thinking of SIMDs with 32 4D-ALUs and a double TMU like specified in the patents.
I strongly doubt that a doubling of hardware thread size is likely - 16 is preferable to 128, let alone the problems that 128 causes in terms of blocking (size and shape) in compute algorithms. It's unwieldy.

There are not so many possibilities for going 4D and keeping a wavefront size of 64:
  • SIMDs with 16 4D ALUs and 4 TMUs (30 SIMDs @ 1920SPs)
  • R600-style with decoupled SIMDs and TMUs (3x 5 ALU-SIMD (each 128SPs) and one TU-SIMD (32 TMUs))
There's a third option: SIMDs are always in pairs, so a pair consists of two 16-wide VLIW-4 SIMDs. The pair shares an octo-TMU. The TMUs normally operate in dedicated mode for 8-bit texture filtering, where each SIMD has exclusive access to a quad-TMU. In fp16 filtering mode, throughput falls to half as now the octo-TMU serves one SIMD at a time.

This seems pretty much the same as NVidia used in G80...GT200.

I envisage an entire clause of TEX instructions being served by the octo-TMU - this maximises coherency of L1 usage, feeding a stream of results to a single SIMD before the other SIMD gets its turn. fp16 textures consume much more space in cache, so coherency becomes more of a problem.
 
This looks like official pricing to me. Not exact pricing but a very informative field nonetheless.

Yes, we've seen that, but that's nothing conclusive, given that a card can occupy a $100+ range. Quite the difference between $150 and $250 or $200 and $300 in terms of perception if nothing else, I'd say.
 
ATI Radeon 4870 = 256mm2 (160 x5D)
ATI Radeon 6870 = 255mm2 (224 x5D)
ATI Radeon 5870 = 334mm2 (320 x5D)

Note-
(64 x5D) more stream processors for the "HD6870" in the same die area as the "HD4870".

Calculation: BartsXT at (384 x5D) will be the same die area as the "HD5870", 334mm2

BartsXT at (464 x5D) stream processors die will be 373mm2
BartsXT at (504 x5D) stream processors die will be 393mm2

CaymanXT should NOT be over 400mm2 die area.

EDIT: I don't understand how it could be up to 300W TDP

Er unless I'm reading that wrong, did you just add 64 on top of the 5870 and get the 384 x 5D?

Cause HD4870 was 55nm, whereas 6870 is 40nm as is the 5870...
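
To put a rough number on the process-node objection: under ideal scaling, area shrinks with the square of the feature-size ratio. A minimal Python sketch of that idealised rule (the helper name is made up, and real shrinks fall short of ideal because pads, analog blocks and I/O do not scale this way):

```python
# Idealised area scaling between process nodes: area ~ (feature size)^2.
# This is a textbook approximation, not a statement about any real chip.

def ideal_shrink(area_mm2, from_nm, to_nm):
    """Area of the same design after an ideal shrink between two nodes."""
    return area_mm2 * (to_nm / from_nm) ** 2

# The HD4870 die (256mm2 at 55nm) moved to 40nm under ideal scaling
shrunk = ideal_shrink(256, 55, 40)
print(round(shrunk, 1), "mm2")
```

That comes to about 135mm2, so a 160 x5D design at 40nm would take roughly half the area of the 255mm2 HD6870 under this rule. Comparing the 55nm HD4870 directly against the 40nm parts, as the quoted calculation does, mixes two different densities.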
 
I did and I don't see that feature. Is the ability to disable it exclusive to the 6x00 series? It's kinda sad that my 5870 smokes every game I throw at it but AF looks worse than on my old 8800GT.

http://forum.beyond3d.com/showpost.php?p=1484162&postcount=82
Download the "original" station-drivers package (6800 only) and open its C7106344.inf (...\Packages\Drivers\Display\W76A_INF)
So in this INF you will find new options:
HKR,, EQAA_NA, %REG_SZ%, 1
HKR,, MLF_NA, %REG_SZ%, 1
HKR, "UMD",MLF_DEF, %REG_SZ%, 0
HKR,, EnableUlps_NA, %REG_SZ%, 0
HKR, "UMD",EnableUlps_DEF, %REG_SZ%, 0
HKR,, SurfaceFormatReplacements_NA, %REG_SZ%, 0
HKR, "UMD",SurfaceFormatReplacements_DEF, %REG_SZ%, 1
HKR,, TFQ_NA, %REG_SZ%, 0
HKR, "UMD",TFQ_DEF, %REG_SZ%, 1
So start your regedit and add these SZ values (-> UMD!). After you restart the CCC, you will see the new options (or you break your CCC). But that doesn't mean these new options will work.
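
For anyone unsure which of the entries above belong under the UMD subkey: INF AddReg lines follow the pattern root, subkey, value name, flags, value. A minimal Python sketch that splits a few of the quoted lines into those fields. The function name is invented for illustration, it assumes no field contains a comma, and it only reads the text (it does not touch the registry):

```python
# Split INF AddReg lines into their five fields.
# Assumption: no field contains a comma, which holds for the quoted lines.

def parse_addreg(line):
    """Return the fields of one AddReg line as a dict."""
    fields = [f.strip().strip('"') for f in line.split(",")]
    root, subkey, name, flags, value = fields
    return {"root": root, "subkey": subkey, "name": name,
            "flags": flags, "value": value}

# A few lines copied from the INF excerpt in the post
lines = [
    'HKR,, EQAA_NA, %REG_SZ%, 1',
    'HKR,, MLF_NA, %REG_SZ%, 1',
    'HKR, "UMD",MLF_DEF, %REG_SZ%, 0',
    'HKR, "UMD",TFQ_DEF, %REG_SZ%, 1',
]

entries = [parse_addreg(line) for line in lines]

# Names that sit under the UMD subkey rather than the driver key itself
umd_names = [e["name"] for e in entries if e["subkey"] == "UMD"]
print(umd_names)
```

In the excerpt, the `_DEF` entries carry the `"UMD"` subkey and the `_NA` entries have an empty subkey, which fits the "(-> UMD!)" hint.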
 
If they are going for half-rate double precision on multiplications, Cayman could easily be over 400mm².
But they are not doing that. For what reason should they give up maybe 25% computational performance in SP for a possible 50% gain in DP? It's simply not worth it.

And the VLIW4 stuff in the driver has 1/4 for multiplication/FMA and 1/2 for ADDs, basically the same as RV670 through Cypress for that matter.
 