AMD: R9xx Speculation

no-X · Sep 28, 2010

UniversalTruth said:
Of course we will complain. What do you want? Just to sit, "enjoy" the second life of some chips which are morally old and need to be replaced by something new. 5770 is nothing interesting and performance wise it sucks. Especially in DX11.

Sorry, but GF1xx isn't morally old? Or its delayed launch makes it fresher? nVidia still has no counterpart for HD5770 and you blame ATi?

Given the size of Juniper's die, ATi can reduce its price quite significantly. Juniper also has the full tessellation performance of the HD5870, and that's really not bad. Especially in DX11 games, Juniper offers better performance than GF106.

Robert Varga · Sep 28, 2010

no-X said:
Sorry, but GF1xx isn't morally old? Or its delayed launch makes it fresher? nVidia still has no counterpart for HD5770 and you blame ATi?

Given the size of Juniper's die, ATi can reduce its price quite significantly. Juniper also has the full tessellation performance of the HD5870, and that's really not bad. Especially in DX11 games, Juniper offers better performance than GF106.

Juniper is the sexiest one from the Evergreen family for me. Looking at the sales, I'm not alone...

PSU-failure · Sep 28, 2010

OlegSH said:
It's quite elegant considering slide below:

It's not "elegant", it's:

- efficient (internal vertex data loop, via L1/L2?)
- brutal (16 tessellation units)

It's almost certain Evergreen has been engineered to reduce complexity, so NI has a medium to high chance to speed up in this area.

About this, R600 was very elegant, much more than R700 and Evergreen with its ring bus and programming model (not "graphics" oriented at all as gpusa shows).

I think what Evergreen lacks is a general-purpose R/W L1, allowing the same efficiency as GF100 without the absurd waste.

liolio · Sep 28, 2010

Robert Varga said:
Juniper is the sexiest one from the Evergreen family for me. Looking at the sales, I'm not alone...

And for me the sexiest GPU out of ATI's production in recent years was the RV740.

OlegSH · Sep 28, 2010

PSU-failure said:
About this, R600 was very elegant, much more than R700 and Evergreen with its ring bus and programming model (not "graphics" oriented at all as gpusa shows)

Maybe I've missed something, but what's elegant about the R600 programming model? It doesn't even support the very reduced CS4 preset.

Dave Baumann · Sep 28, 2010

PSU-failure said:
I think what Evergreen lacks is a general-purpose R/W L1, allowing the same efficiency as GF100 without the absurd waste.

R/W aside, what do you think EV does differently in tessellation than GF100 does, according to that slide?

CarstenS · Sep 28, 2010

AlexV said:
When no amplification takes place, you get jack on a GeForce (not for any technical reason, mind you). And they don't exactly stay on SM, but rather take a round-trip via the L2 most of the time.

Sorry?

edit: I mean, what do you mean by "I'd get jack on a GeForce"? You're right about non-amplified stuff not staying in the SM. What I meant but unfortunately did not write was: it stays inside the SM mostly without touching global memory, with no need to go back to the general scheduler.

caveman-jim · Sep 28, 2010

CarstenS said:
Sorry?

I think he said that when no amplification is processed, nothing else happens either (stalls), and the relevant contents of the SM get flushed to L2 and re-read?

AlexV · Sep 28, 2010

CarstenS said:
Sorry?

edit: I mean, what do you mean by "I'd get jack on a GeForce"? You're right about non-amplified stuff not staying in the SM. What I meant but unfortunately did not write was: it stays inside the SM mostly without touching global memory, with no need to go back to the general scheduler.

I mean that DP isn't the only thing that gets trimmed on consumer cards.

CarstenS · Sep 28, 2010

So, non-amplified geometry also takes unnecessary twists and turns through the chip's innards?

PSU-failure · Sep 29, 2010

Dave Baumann said:
R/W aside, what do you think EV does differently in tessellation than GF100 does, according to that slide?

From this slide, data flow *could* be different (feed-forward? I was under the impression it was a given), but it's some marketing slide after all.

OlegSH said:
Maybe I've missed something, but what's elegant about the R600 programming model? It doesn't even support the very reduced CS4 preset.

What I find elegant is the way everything is considered "data", plus the arch in itself.

Shader model is clearly irrelevant there as it's more of an implementation detail, it's mostly a conceptual thing. I'd not be surprised if we were to see the resurgence of some of R600's characteristics some day.

ShaidarHaran · Sep 29, 2010

PSU-failure said:
I'd not be surprised if we were to see the resurgence of some of R600's characteristics some day.

On the surface this statement "feels wrong", but given the cyclical nature of things it's certainly possible. After all, some of the architectural concepts from P4 are now being revisited in Sandy Bridge (trace cache -> micro-op cache).

trinibwoy · Sep 29, 2010

PSU-failure said:
- efficient (internal vertex data loop, via L1/L2?)
- brutal (16 tessellation units)

It's certainly overkill for games but in targeted tessellation workloads the measured advantage over Cypress is over 6x and that's not even a fully enabled GF100. If you look at throughput per clock, the advantage goes up to 7.5x.

Are the 15 polymorph engines and 4 GPC's of the GTX 480 taken together 6x larger than the front-end in Cypress? The efficiency question is hard to answer unless you know the relative costs.

racca · Sep 29, 2010

no-X said:
That wouldn't be nice. XT was linked with xx7x-models for generations... Why should they change it?

That's not true. If it were, there would have to be quite a few exceptions; in fact, almost as many as there are XTs called XX70.

Juniper (LE) 5670 (640SP version)
Redwood PRO 5570
RV790 XT 4890
RV740 N/A 4770
RV710 N/A 4350/4550
RV620 PRO 3470
All X2 parts are codenamed RXX0 instead of RVXX0

There's some truth to your claim here:
They are NOT linked to each other. But on the other hand, there was indeed no XT chip marketed as XX50/XX30.

racca · Sep 29, 2010

MarkoIt said:
mm.. that leak doesn't sound right. 960 sp of Bart XT can beat 1440sp of Cypress PRO?

Without saying the specs are true,
if you actually look at it, 240TP@850MHz vs 288TP@725MHz, it's almost a tie in raw power. Plus some tweaks here and there, increased ROP/rasterizer etc., it could be true.
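
To put rough numbers on the "almost a tie": a quick sketch, assuming units × clock is a fair stand-in for raw power (it ignores ROPs, bandwidth and any per-unit changes). `raw_throughput` is just a made-up helper name, and the figures are the leaked ones quoted above, not confirmed specs.

```python
# Crude "raw power" proxy: execution units multiplied by core clock.
# Unit counts and clocks are the rumoured figures quoted in the post,
# not confirmed specifications.

def raw_throughput(units: int, clock_mhz: int) -> int:
    """Return units * clock (in unit-MHz), ignoring every other factor."""
    return units * clock_mhz

barts_xt = raw_throughput(240, 850)      # rumoured Barts XT: 240TP @ 850MHz
cypress_pro = raw_throughput(288, 725)   # Cypress PRO as quoted: 288TP @ 725MHz

ratio = barts_xt / cypress_pro
print(f"Barts XT:    {barts_xt}")
print(f"Cypress PRO: {cypress_pro}")
print(f"ratio:       {ratio:.3f}")
```

On this proxy alone the rumoured part lands a couple of percent behind, so it would need the other tweaks to actually come out ahead.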

caveman-jim said:
Interestingly, that's the same mistake BSN made, is it not?

I call it BS or FUD. Pure smoke. And I think, for our health's sake, we should quit smoking.

LordEC911 said:
Turks... 1H '11.

Ooooow, that's not good. 1H'11? We might as well be expecting a 28nm version of Cayman (~200sqmm) by then. A GloFo part if we are really lucky (out of curiosity about how well it would do).

LordEC911 · Sep 29, 2010

racca said:
Ooooow, that's not good. 1H'11? We might as well be expecting a 28nm version of Cayman (~200sqmm) by then. A GloFo part if we are really lucky (out of curiosity about how well it would do).

Just haven't heard anything overly concrete but I would still expect it Q1.
The door isn't completely closed on 28nm Turks/Caicos, unless they are already in mass production and just need to build up massive stock for OEMs.

no-X · Sep 29, 2010

racca said:
That's not true. If it were, there would have to be quite a few exceptions; in fact, almost as many as there are XTs called XX70.

Juniper (LE) 5670 (640SP version)
Redwood PRO 5570
RV790 XT 4890
RV740 N/A 4770
RV710 N/A 4350/4550
RV620 PRO 3470
All X2 parts are codenamed RXX0 instead of RVXX0

There's some truth to your claim here:
They are NOT linked to each other. But on the other hand, there was indeed no XT chip marketed as XX50/XX30.

I'm talking about high-end / midrange. Your examples are mainly low-end products, where a single GPU is used for SKUs of several product lines. It's not logical to apply principles valid in the low-end segment to the high-end.

PRO / XT
HD3850 / HD3870
HD4850 / HD4870
HD5850 / HD5870

There's not a single reason to expect that HD6850 and HD6870 will be based on different GPUs.

Arnold Beckenbauer · Sep 29, 2010

Kaotik said:
The 3D games part of "HD3D" is already supported on all HD-series Radeons AFAIK?
(Since Cat 10.3)

Transcoding works on any HD-series card, it uses stream processors, not UVD

Nope.

HD2900...
HD2000/3000(HD4200?) with UVD: you can use their UVD chip for decoding, CPU encodes the video.
HD4000: SPs encode the video, and if you want, you can let UVD decode the video.

fellix · Sep 29, 2010

SPs are only being used for intermediate image processing (like deinterlacing, re-sampling, denoise), not the encoding phase?!

Kaotik · Sep 29, 2010

Arnold Beckenbauer said:
Nope.
HD2900...
HD2000/3000(HD4200?) with UVD: you can use their UVD chip for decoding, CPU encodes the video.
HD4000: SPs encode the video, and if you want, you can let UVD decode the video.

If that's true, how on earth did they conjure up a transcoder which is leaps and bounds faster than any other encoder/transcoder I've ever used? (I have an HD3800-series card.)
Or are you suggesting that the majority of transcoding time goes into decoding rather than encoding?
