AMD: R7xx Speculation

Status
Not open for further replies.
So far it looks promising.

RV770 core
45nm (maybe)
480 streams (96 pipelines)
32 TMUs
16 ROPs
1050MHz GPU core
Fixed AA problem
GDDR5 256bit (maybe) 2200MHz x2 = 4400MHz effective, ~141GB/s bandwidth.
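For anyone who wants to check the bandwidth figure in the list above, here is a minimal sketch of the arithmetic. The function name is made up for illustration, and the inputs are just the rumoured numbers (256-bit bus, 4400 MT/s effective), not confirmed specs.

```python
def memory_bandwidth_gb_s(bus_width_bits: int, effective_rate_mt_s: float) -> float:
    """Peak memory bandwidth in GB/s, taking 1 GB as 10**9 bytes."""
    bytes_per_transfer = bus_width_bits / 8          # a 256-bit bus moves 32 bytes per transfer
    return bytes_per_transfer * effective_rate_mt_s * 1e6 / 1e9


# Rumoured RV770 memory setup (assumption, taken from the spec list above).
rumoured = memory_bandwidth_gb_s(bus_width_bits=256, effective_rate_mt_s=4400)
print(f"{rumoured:.1f} GB/s")  # prints "140.8 GB/s"
```

So the "~141GB" in the rumour is simply 32 bytes x 4400 MT/s, rounded up.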

If a single RV770 beats the GF9800GTX we may have another R300.

Gah! :runaway:

Multiple discussions have proved that there is no inherent 'problem' which needs to be fixed. R600 series chips perform as expected in AA. Hopefully, R700 chips will have higher specifications in this area so they can match up a bit better with the G90 family.

I personally doubt they will be 45nm initially. That seems too soon to me for such a leap, though I suppose the relatively rapid move to 55nm could mean I'm wrong in this regard.
 
I guess I still have difficulty believing that ATI actually intended R600 to take more of a performance hit on AA than R580 did.

Yet it still outperforms R580 in AA regardless of the % performance hit, which is obviously how it was designed.

Shader resolve does have a small performance hit compared to fixed function resolve but ATI have obviously decided this is the way forward. The R600 family AA performance is relatively unimpressive because the chips are underspecified in other areas, especially when compared to G80 et al.

Don't consider R6XX AA performance as 'broken', consider it as under-specified for the market.
 

Which is a subtle difference that the customer doesn't care about, as both "broken" and "under-specced" give the same low performance, less than was required for a competitive product.

ATI's engineers giving us a "working version of the wrong thing" isn't really that much different from them giving us a "broken version of the right thing" when it comes down to the crippling lack of AA performance - especially when compared to their previous generations and competitor products.

Once again, ATI looked too far forwards into the time when devs would be doing all the AA themselves in shaders, and got caught out when the devs were running a couple of years behind that idea - and devs didn't care because Nvidia didn't have the same failing/foresight.
 
Which is a subtle difference that the customer doesn't care about, as both "broken" and "under-specced" give the same low performance, less than was required for a competitive product.
But it's just not true that AA is slow. There are benchmarks out there which show that usually the larger performance hit comes from enabling AF (especially HQAF at higher levels), not AA: http://www.computerbase.de/artikel/..._3870_rv670/5/#abschnitt_aa_und_af_skalierung. It's mostly just readers' perception that the hit comes from enabling AA, probably because traditionally AA had a larger performance hit than AF and the two are almost always enabled together...
Guess why the AF hit is quite large...
So can we PLEASE PLEASE PLEASE put this stupid "but AA is broken" theme, which comes up in half of the threads discussing r6xx architecture, to rest once and for all? The poor horse has been beaten to death quite badly already.
 

Fablemark is a stencil-fillrate-limited synthetic benchmark and tries to mimic the ancient Doom3 engine in a way:

http://www.computerbase.de/artikel/...0_rv670/10/#abschnitt_theoretische_benchmarks

While such a case isn't all that relevant anymore for today's scenarios, it does expose at least one weakness and it's not at all filtering related.

Further down, STALKER, which uses MRTs afaik (not the most advanced engine around anymore either heh), seems to prove your point; it's just unfortunate that AA cannot be enabled in that application:

http://www.computerbase.de/artikel/...ti_radeon_hd_3870_rv670/21/#abschnitt_stalker


I wouldn't be in the least surprised if the future RV770 is stronger both in terms of Z/Pixel- as well as Texel-Fillrates compared to its predecessor.
 
Judging by the DX10 applications benchmarks i'd say that you're wrong here.
Show me such a benchmark that isn't TEX/ROP limited.

There is a resource utilization problem which makes 6x0's 320 SPs slower than 8x's 128 SPs most of the time in real-world applications, even after correcting for clocks.
You've corrected-out the vastly higher texel and zixel rates of NVidia's architecture, have you?

SPs is the area of a chip which interests me the most because this is the place where we still don't have nearly enough power for the DX10 applications.
Well I've just argued that ATI will be delivering a 2TFLOP (AFR-based?) board to compete with GT200Ultra, which appears to be a 1TFLOP board. I don't think that'll be enough to tame the toughest D3D10 applications, for what it's worth, but every step in that direction is useful. ATI's competitiveness is seriously hampered elsewhere...

Though it is a good point that RV770's improvements in other areas could bring more than a 50% performance improvement overall -- in the general sense of all applications combined.
Bandwidth, texel and ALU rates are all ~100% higher according to the rumour. If the zixel rate is ~3x higher, then I think ATI's in business.

But, if the multi-chip R780 is relying upon AFR, then GT200U looks like it'll have the high-end to itself.

But do you really want more performance in something like FEAR, or would you rather get more performance in something like Crysis? If it's the latter then imho SP power is more important than anything else, and a 50% increase in that power, even coupled with (doubtful btw) 1050 MHz core clocks, isn't that impressive -- you still won't get playable framerates on a single RV770 at Very High settings.
RV770, according to the rumour, has 50% more ALU capability without any increase in clocks. I don't see any point in quibbling over the actual clock - whereas I think a unit count increase (ALUs, TUs and Zs per RBE) is guaranteed.

If what you're trying to say is that the rumour is wildly wrong and RV770 is actually only ~50% faster than RV670, then fair enough. Clearly that puts R780 in a much tighter position against GT200U. I was describing R780 competitiveness in terms of the rumour, not something else.
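As a rough sanity check on the "2TFLOP board" figure, here is a sketch of the usual peak-rate arithmetic. It assumes the rumoured 480 stream processors at 1050MHz, the conventional count of 2 FLOPs per stream processor per clock (one multiply-add), and two chips per R780 board. The helper name is hypothetical, and peak numbers say nothing about real utilization.

```python
def peak_gflops(stream_processors: int, core_mhz: float,
                flops_per_sp_per_clock: int = 2) -> float:
    """Theoretical peak single-precision GFLOPS (MAD counted as 2 FLOPs)."""
    return stream_processors * flops_per_sp_per_clock * core_mhz / 1000.0


# Rumoured RV770 figures (assumptions from the rumour, not confirmed specs).
rv770_gflops = peak_gflops(stream_processors=480, core_mhz=1050)
r780_gflops = 2 * rv770_gflops  # assumes R780 = 2 x RV770 on one board

print(rv770_gflops)  # 1008.0
print(r780_gflops)   # 2016.0
```

On those assumptions a single chip lands at roughly 1 TFLOP and the dual-chip board at roughly 2 TFLOP, which is where the numbers in this post come from.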

Jawed
 
Which is a subtle difference that the customer doesn't care about, as both "broken" and "under-specced" give the same low performance, less than was required for a competitive product.

ATI's engineers giving us a "working version of the wrong thing" isn't really that much different from them giving us a "broken version of the right thing" when it comes down to the crippling lack of AA performance - especially when compared to their previous generations and competitor products.

Once again, ATI looked too far forwards into the time when devs would be doing all the AA themselves in shaders, and got caught out when the devs were running a couple of years behind that idea - and devs didn't care because Nvidia didn't have the same failing/foresight.

You're missing the point. It's not shader-based resolve that's causing poor (by comparison) performance. It's the number of functional units themselves. Even if it had had HW-based resolve, it still would've underperformed.
 

And you're missing the point that saying "it was designed to be that way" isn't really very different/useful for the end user. It's just splitting hairs between "broken at design" or "broken at implementation".
 

I have to disagree.

The chips could be considered broken if the situation was similar to the Phenom TLB problem where the chips don't perform as expected/claimed due to a hardware problem. As it is, they perform as you would expect from what we know of the design.

A poor design decision leading to lower performance than your competitor just doesn't mean something is broken regardless of the viewpoint of the end user who, of course, always has the option of buying the better-performing competitor!
 
The way I see it, this coming generation (May/June) R780 (2xRV770) will compete directly against GT200Ultra.

The real question then is whether R780 is architected to compete more effectively than R680 is doing in competing with 8800Ultra.

Jawed

Well, since GT200Ultra (2Tflop) will most likely be a dual-chip solution just like R780, unless Nvidia either magically goes to 45nm or tries to fab a chip the size of a quarter, the match will be more like CF vs. SLI, right? ;)
 
And you're missing the point that saying "it was designed to be that way" isn't really very different/useful for the end user. It's just splitting hairs between "broken at design" or "broken at implementation".

What are we talking about again? Semantics? It's still wrong to say: AA IS FIXED!! The correct formulation would be: AA is IMPROVED. Because there is no friggin thing broken that needs fixing. WTF are we arguing about? What has the end-user got to do with the ever-present "shader resolve = broken AA on R600" party line? Does he even care? :-?
 
yes, for the end user it's not about fixing faulty AA hardware. but the end user cares about fixing the performance.

Please tell me you're yanking my chain. You're a meanie for doing that, BTW :p

THERE IS NO FAULTY AA HW TO BE FIXED. There's nothing that doesn't work according to what ATi specced it to do. They simply underpowered it: too few RBEs/too skimpy, only 2 samples per clock etc. But there's nothing broken... unless we're really clinging to wording here, and then it should be written "broken", as in figuratively speaking.

And you're still a meanie :D
 
Multiple discussions have proved that there is no inherent 'problem' which needs to be fixed. R600 series chips perform as expected in AA. Hopefully, R700 chips will have higher specifications in this area so they can match up a bit better with the G90 family.

What I meant by the AA problem with the R6xx generation is in comparison to R580: the AA penalty on R6xx is greater than on R580.

No AA http://www23.tomshardware.com/graphics_2007.html?modelx=33&model1=859&model2=722&chart=275
4xAA http://www23.tomshardware.com/graphics_2007.html?modelx=33&model1=859&model2=722&chart=277
 
Well I've just argued that ATI will be delivering a 2TFLOP (AFR-based?) board to compete with
Any alternatives? All the X2/GX2 cards will be AFR cards imo.
I don't think that AMD will try to develop something better for just 1 card in the line-up.
2-chip AFR gives good scores and works more or less OK in most applications (although from my X2 experience I tend to think that it works OK in "previous DX version" games, but fails in newer DX10 games and older pre-DX9 games -- and if it's the driver in the latter case, it might be some kind of nasty incompatibility in the DX10 case.)

GT200Ultra, which appears to be a 1TFLOP board.
I've no idea what the hell is GT200, but G100 is hardly a 1TF board.
I think that G92GX2 will be close to 1TF.
But then again i don't think that G92 will compete with RV770.
 
Any alternatives? All the X2/GX2 cards will be AFR cards imo.
http://www.hardforum.com/showpost.php?p=1031906993&postcount=28

It does fly in the face of all the "program for AFR" noise that AMD is making (there's a presentation on this subject at GDC this week). Now, you could rationalise that as 2xR780 (i.e. two cards, 4 chips) would still be dependent upon AFR...

I don't think that AMD will try to develop something better for just 1 card in the line-up.
Multi-chip is their strategy going forwards - every iteration of their highest board will be dependent upon this scheme. It'll be interesting to see if they do the same at the low end.

I've no idea what the hell is GT200, but G100 is hardly a 1TF board.
http://forum.beyond3d.com/showthread.php?t=46800

This is the board that has long been rumoured to be "almost 1 TFLOP" (and was supposed to launch in November). The rumours are now shifting it gradually higher. This slide is a few months old, but it indicates quite clearly ~1 TFLOP single precision, with 1/8th that double precision:

http://forum.beyond3d.com/showpost.php?p=1129054&postcount=51

Jawed
 
A 4870 with 480SP at 850MHz with GDDR5 2XXXMHz (if true) should provide double the frame rate of a 3870 regardless of what condition the drivers are in, right?
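A quick sketch of what the ALU side of that comparison looks like on paper. The 3870 figures used here (320 SPs at 775MHz) are an assumption about the shipping card rather than something stated in this post, and the 4870 figures are the rumoured ones. The memory clock is left out because the rumour doesn't pin it down.

```python
def alu_rate(stream_processors: int, core_mhz: float) -> float:
    """Relative ALU throughput: SP count times core clock (arbitrary units)."""
    return stream_processors * core_mhz


hd3870 = alu_rate(320, 775)   # assumed HD 3870 configuration
hd4870 = alu_rate(480, 850)   # rumoured 4870 configuration from the post above

alu_ratio = hd4870 / hd3870
print(f"{alu_ratio:.2f}x")  # about 1.65x
```

So on raw ALU rate alone this configuration works out to roughly 1.65x rather than 2x; doubling the frame rate would have to come from the other units (texturing, Z, bandwidth) scaling by more than that.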
 
What I'm more interested in is how ATI will put two RV770s on a single die instead of 2 chips on a single PCB. I know it will happen, but what type of render mode will they use? Most likely AFR. If so, on a single die it should give a 50% boost. That puts it at a 100% increase over a single RV670.
 