AMD: R9xx Speculation

The 5970 isn't very attractive to some because it will only be faster than the 5870 in games where CrossFire scales well. A faster 5890 could also lead the way to a faster 5990.

The only thing that is unattractive about the 5970 is its price. You could always get two 5850s for considerably less money and have the exact same performance.

Other than that, CrossFire is sure to work in the games that matter, so a dual-GPU card that can get the job done cannot be unattractive. I've seen instant dual-GPU scaling on my 5850 CrossFire system in many games, right from day one of their launch. I mean big titles, like Modern Warfare 2, Metro 2033, DiRT 2, Bad Company 2. Since the performance is there, it doesn't matter if it comes from one GPU or two.

From a manufacturing perspective, it is more profitable to have two smaller GPUs working together and keep your driver team with their fingers nailed to their keyboards to make each game work (a close collaboration with all developers is a must as well) than to have a bigger chip with easier driver development.

Granted, some games are a pain to get working on multi-GPU systems. Take BioShock 2, for example: since its launch, I've never seen it bother my second 5850. But hey, what's the point, since it can already run at 54 fps on my old 3870, and that's in DX10 at 1920×1080, mind you! ;)
 
Doesn't matter; the Ultra was in all essence virtually identical to the 8800 GTX, except that it had new silicon and higher core/memory speeds.
http://techreport.com/articles.x/12379/1
They still refreshed their product line and were able to introduce a more expensive, faster part. Which, if you haven't noticed, is what I suggested ATI should have done with a 5890 part.

With the 8800 Ultra, Nvidia was able to create a brand-new, higher-priced SKU. ATI could have done the same.

Why is it unattractive? Many people I know actually went from a 5870 to a 480 to a 5970. If they have the money to splurge on a top-of-the-range card, then the 5970 is very attractive. A 5890 wouldn't necessarily beat a 480, or even come close to a 5970, and would most probably end up only a few fps ahead of the original 5870, while a 5990 would be a waste if the 5970 is already the top card to beat. If Nvidia had come out with a GX2-type card that beat the 5970, then yes, I could see it being practical to have a 5990. But now? No. I'd rather spend the money on the HD 6xxx.
The 5890 wouldn't have to come close to a 5970; it just has to be a better value than the GTX 480.


The 5870 currently sits at $400
The 5870 2GB sits at $500
The 5970 sits at $700

The 5870 2GB has very limited performance benefits compared to the 5870 1GB. So a 5890 with faster performance could have slotted in at $550-$600.

For those who want the very best, they could have bought two 5890s and had better performance than a single 5970.


As for why the 5970 is unattractive: it costs $100 more than two 5850s, and only $100 less than dual 5870s. CrossFired 5850s are just as fast, and CrossFired 5870s are even faster.


Does it matter? I'd be willing to stake ATI's reputation (snicker) on it and say they have sold more 5870s/5850s than they have 5830s. BTW, I don't work for ATI, and I couldn't think of anything else to stake.

Don't you find that troubling? That from $160 to $280-ish there is no compelling ATI part? Until now the 5830 cost $250 and had performance closer to the $150 5770, but the power usage and heat of the 5850, which was only $50 more in most cases. Even now, at $200, it's a tough sell for consumers.
 
Folks, may I remind you all that this is a speculation thread about the future ATI architecture, R9xx, not a general discussion of recent ATI cards.

These threads are already hard to follow as is; let's not spend pages arguing about current cards.

Triskaine said:
Bullet Points:
- 1920 4D shaders
- Q4 release before Christmas
It's now rumored for a release before the end of the year? I'm still hearing early Q1 2011, for what it's worth. Then again, a few cards may be able to find their way onto shelves for December; that's not a far-fetched hypothesis.

And 1920 SPs would represent a 20% increase over RV870's 1600 SPs (1920 / 1600 = 1.20). Then again, all SPs aren't made equal, so that value in itself isn't very telling.
 
It's now rumored for a release before the end of the year? I'm still hearing early Q1 2011, for what it's worth. Then again, a few cards may be able to find their way onto shelves for December; that's not a far-fetched hypothesis.

And 1920 SPs would represent a 20% increase over RV870's 1600 SPs (1920 / 1600 = 1.20). Then again, all SPs aren't made equal, so that value in itself isn't very telling.

I'm still hearing an Oct/Nov launch.
Also, those 1920 SPs wouldn't be 4D+1D anymore but 4D; if I'm reading it correctly, that would be a big boost to the base performance of Cayman... ehrm... if I got that name right. CH also suggests a 1/2 DP rate (2 fat + 2 thin SPs) instead of Cypress's current 1/5 rate.

(I have no idea if that last sentence made any sense)
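
For a rough sense of what those rumored numbers would mean for raw throughput, here's a back-of-the-envelope sketch in Python. Only the SP counts and DP ratios come from the rumor; the clocks (and the assumption that Cayman keeps Cypress's clock) are placeholders of mine.

```python
# Back-of-the-envelope peak throughput, assuming 2 FLOPs (one MAD) per SP per clock.
# 850 MHz is the HD 5870's real core clock; reusing it for Cayman is purely an
# assumption for illustration.

def peak_gflops(sps, mhz, dp_ratio):
    sp = sps * 2 * mhz / 1000.0    # single-precision GFLOPS
    return sp, sp * dp_ratio       # double precision at the given ratio

cypress_sp, cypress_dp = peak_gflops(1600, 850, 1.0 / 5.0)  # shipping Cypress
cayman_sp, cayman_dp = peak_gflops(1920, 850, 1.0 / 2.0)    # rumored Cayman

print(f"Cypress:        {cypress_sp:.0f} SP / {cypress_dp:.0f} DP GFLOPS")
print(f"Rumored Cayman: {cayman_sp:.0f} SP / {cayman_dp:.0f} DP GFLOPS")
# Same clock assumed: ~20% more SP throughput, but ~3x the DP throughput.
```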
 
Maybe he's suggesting nothing more than that one lane out of each pair can be used to add the results of both lanes in the pair, which is how things are now (ignoring the dot product, which adds across all four lanes).
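
A toy model of that reading, just to make the dataflow concrete (this illustrates the pairing described above, not the actual hardware):

```python
# Four SP lanes grouped into two pairs, (x, y) and (z, w). Each lane does an
# independent multiply; one lane of each pair can then add both products.
def paired_issue(a, b):
    prods = [ai * bi for ai, bi in zip(a, b)]               # 4 independent MULs
    pair_adds = [prods[0] + prods[1], prods[2] + prods[3]]  # one add per pair
    return prods, pair_adds

# A dot product, by contrast, sums across all four lanes at once.
prods, pair_adds = paired_issue([1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0])
print(pair_adds, sum(prods))  # [17.0, 53.0] 70.0
```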
 
Yes, and how many of those 11M chips are 5830 and up, and how many are 5770 and under?

According to Steam stats, among DX11 systems the 5800 series shows up at 5.20% and the 5700 series at 4.92%. That's a fairly close distribution between the two.
 
So now S.I. has the rumored shader changes (going from 4D+1D to 4D somehow) which were assumed to be part of N.I.? That's a pretty fundamental change for a part which was supposed to be a mere refresh with mostly setup-related changes...
Oh, and as for the supposed half-rate DP, I'm not sure it makes sense yet; after all, AMD still disables even 1/5-rate DP in all but the fastest chips, and apparently Nvidia doesn't think DP is important for consumer graphics cards either.
 
So now S.I. has the rumored shader changes (going from 4D+1D to 4D somehow) which were assumed to be part of N.I.? That's a pretty fundamental change for a part which was supposed to be a mere refresh with mostly setup-related changes...
Start here:

http://forum.beyond3d.com/showthread.php?p=1416950#post1416950

There are quite a few posts there. There's a lot of "new functionality" in Cypress's ALUs; it looks like functionality that's almost, but not quite, enough to ditch T. Well, that's my interpretation anyway.

Another possibility is XYZT :p
 
According to Steam stats, among DX11 systems the 5800 series shows up at 5.20% and the 5700 series at 4.92%. That's a fairly close distribution between the two.

Steam is a gaming platform, so its stats are heavily biased towards high-end hardware.
 
Near 16 million, according to today's earnings conference call.

That could have been much higher had they had enough supply of 40nm wafers from TSMC and decent yields early on.

OT: A part of me is hoping that they are going to release 28nm Southern Islands GPUs this year and that the rumours to the contrary are just misinformation!
 
Well yea, there are three possibilities:

From
R G B A T

To
R G B AT
R G B T
RT GT BT AT

Well the "2 thin, 2 fat" suggests something different entirely even.
Still those ides were previously discussed for N.I, not S.I. unless I'm confusing something...
 
While we're at it, I'll wager that in the BD refresh (if not at 22nm SOI) they are planning to take a couple (somewhere between 1 and 4) of SIMD engines and put them into a Bulldozer module. The SIMDs will have their private L1 data and texture caches (just like BD cores have theirs), and the L2 caches of the CPU and SIMD cores will be unified.

You probably envisioned a unified L3 shared across the HT link, but unified L2? That's something Intel couldn't do even with its Conroe cache improvements. To unify the L2 caches they would need to simulate or integrate a GPU core inside one of Bulldozer's "dual CMT cores", and that was never the intention for the first Fusion generation (announced for 2008), and won't be for the foreseeable future (the next six years). Also, CPU-like L1 datapaths are nowhere mentioned, while the RV770/870 -> Southern Islands (R9xx) L1/L2 texture caches are doing a pretty damn good job.


Bulldozer's been a work in progress for yonks, longer than the next GPU.


Yep, that's a good example of AMD's poor decision-making: they ditched years of development on the first Bulldozer design, which couldn't do FMA as effectively as Intel's Sandy Bridge design (well, the shrink that actually supports FMA), and postponed the Bulldozer release for a second/third "improved" design. And now some wise guy, citing bubble-talk from IDF Fall 2008 on an Intel forum, claims there's no FMA in SB, and nothing until the tick (the Sandy Bridge shrink) at 22nm, according to the wiki.

Anyway, they could have released a working BD as it is today, just with an old-school coding scheme that's less effective than the current one. But AMD's design will still be more register-hungry. (And all of that ties way back to the original NetBurst cache organization.) So probably no bad impact on AMD's approach after all; it would have been good enough even if it had been released two years earlier than it will be.


RV770's development began in 2005.

By their own words, as I read somewhere, RV770's development never really "began"; it was done in parallel, once they figured out that neither the node (90nm/80nm) nor the timeframe was on their side when NV released G80, while they were troubleshooting their R500/R600 design and needed to get it out as the HD 2900 XT desperately quickly. So RV770 development began somewhere in late 2006 as a spun-off R600 refurbishing design.
(But if we really regard RV770 as the original R500 design, then its design began far earlier than 2005, somewhere around the release of the X800 Pro in spring 2004 or even earlier :) ... Just as Cypress is a really polished, bug-free product based on those 5+ years of redesigning.)

rpg.314 said:
Evergreen's development began in 2006
IMHO, Evergreen is an R600 descendant with 4th-generation tessellation (now part of the DX API) and other small improvements (compression schemes, larger texture buffers, threading) over the previous RV770. Maybe Northern Islands began R&D somewhere in 2007-08, because, as they said, we'll only see smaller improvements in SI. So integration now takes small steps over two years.


But hasn't AMD junked at least two designs since K8?

They never junk, they reiterate :) They put their "failures" into a drawer for later recycling. So the original K8 idea that couldn't be placed effectively on silicon at the time has now come out as the brand-new Bulldozer idea.

And yes, they ditched the "original" K9 & K10 and redesigned the existing K8 to meet 2006 standards, with full-width SSE and larger caches, to become what is known today as K10/K10.5 (Stars), according to some media.


Back in the day, AMD had also delayed the Hammer architecture for quite a while.

They could have delayed it even a year more if they wished. When K8 arrived, it was really far ahead of its time in price and availability, and customers were satisfied with the old K7/P4 until Intel's Core 2 took the prime role 6-9 months after its release.

The point... well, if they had come out with Hammer even two years earlier (impossible because of their 180nm process at the time; sky-high cost to the end customer), they wouldn't have penetrated the market much more than they did with the ultra-successful K7 product line. For Hammer they had many more ideas but not enough silicon to waste, and thankfully they made the right decision to just bolt 64-bit registers and an IMC onto the old K7, which was more than good enough against the poor, underperforming P4 (D/E) but was showing its age by Core 2's arrival.
 
That's because it isn't and never will be.

Why not?
Let's say I use Jacket with my MATLAB programs (MathWorks is already working on OpenCL support of their own); I do need solid DP performance, yet I don't have the money for a professional Tesla card.

With more applications taking advantage of the GPU, that might be a factor; much smaller than gaming performance, but a factor (one more name that comes to mind is Pixel Bender).
 
Why not?
Let's say I use Jacket with my MATLAB programs (MathWorks is already working on OpenCL support of their own); I do need solid DP performance, yet I don't have the money for a professional Tesla card.

With more applications taking advantage of the GPU, that might be a factor; much smaller than gaming performance, but a factor (one more name that comes to mind is Pixel Bender).

Most MATLAB users are students who run it on laptops. How many MATLAB users are free to both buy and configure their systems as they please, and of those who do (I do), how many would choose to configure them with a top-end video card (I wouldn't)? And how do those numbers compare to the number of people buying graphics cards to play games?

It makes sense to optimize your product to fit your market: it allows higher performance and lower power draw at lower cost, benefiting customers while still allowing healthier margins and greater market flexibility.
 
Well yea, there are three possibilities:

From
R G B A T

To
R G B AT
Which I equate with XYZT (just making sure we're on the same page).

R G B T
Which would leave register-file bandwidth on the table, so it is unlikely (since there's no MAD in the T lane).

RT GT BT AT
Which is either like my earlier proposal or perhaps along the lines of what is done in SSE or LRBni, etc.

The question arises whether the performance hit for accurately computed transcendentals is relevant when rearranging the ALU:

http://www.bigncomputing.org/Big_N_Computing/Big_N_Computing/Entries/2010/6/21_Give_Me_a_SINE.html

since the native implementation isn't necessarily useful for non-graphics work.

Accuracy problems might only be solved by using both higher-precision LUTs and higher-precision MUL/ADD. Then the question of double-precision results still has to be answered.
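
To make the LUT-precision point concrete, here's a toy sketch: a small sine LUT with linear interpolation versus an accurate libm sine. It's purely illustrative; real hardware uses different (and undisclosed) table sizes and interpolation schemes.

```python
import math

# A 64-entry lookup table for sin on [0, 2*pi), very loosely mimicking the
# LUT-plus-interpolation style of a "fast" hardware transcendental unit.
N = 64
TABLE = [math.sin(2 * math.pi * i / N) for i in range(N)]

def fast_sin(x):
    """Linear interpolation into the LUT: cheap, but low accuracy."""
    t = (x % (2 * math.pi)) / (2 * math.pi) * N
    i = int(t)
    frac = t - i
    return TABLE[i] + frac * (TABLE[(i + 1) % N] - TABLE[i])

# Compare against libm over [0, 2*pi).
worst = max(abs(fast_sin(k / 1000.0) - math.sin(k / 1000.0))
            for k in range(6283))
print(f"worst absolute error: {worst:.1e}")  # roughly 1e-3 with a 64-entry table
```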

I'm not sure of the status of double-precision transcendentals in OpenCL on ATI. AMD staff have said the OpenCL specification is quite picky about accuracy for official double-precision support, which is why it's taking a while to implement.
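
On that note, whether a given card officially exposes double precision at all can be queried at runtime. A minimal sketch using the pyopencl package (checking for the standard cl_khr_fp64 extension, plus AMD's older cl_amd_fp64 variant):

```python
import pyopencl as cl  # assumes pyopencl and an OpenCL runtime are installed

# Double precision is an optional OpenCL extension; a device advertises
# cl_khr_fp64 (or AMD's vendor-specific cl_amd_fp64) only if it supports it.
for platform in cl.get_platforms():
    for device in platform.get_devices():
        exts = device.extensions
        has_dp = "cl_khr_fp64" in exts or "cl_amd_fp64" in exts
        print(f"{device.name}: double precision {'yes' if has_dp else 'no'}")
```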
 