AMD: R9xx Speculation

I'm not sure why people are disappointed; ever since the Taiwanese did the dirty on them by cancelling 32nm, I think expectations should have been lowered on "pure speed".

I bought a HD 5850 a while back given that; can't say I feel the need to upgrade, 28nm will likely be the next time.
 
No, it isn't. RAM heavy stuff and post-process is in the per-pixel category, where the 6970 is faster than the GTX 580 here. I don't think you understood my post. The 6970's disadvantage comes from workload that is unrelated to pixel count.

Shadow maps and reflection maps are geometry limited because the former has very simple pixels (40 GPix/s is 0.025 ns/pix as opposed to the 2 ns/pix extracted from the data), while the latter has simple pixels and reduced resolution (e.g. 6x512x512 for a cube map).

Bandwidth alone could give the GTX 580 a 2-3% advantage over the 6970, but not more than that. Games are generally under 30% BW limited.
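
For reference, here is a minimal sketch (Python, mine, not anything from the thread) of the fill-rate arithmetic behind the shadow-map point above; the ~40 GPix/s figure is the one assumed in the post, not an official spec:

```python
# Back-of-the-envelope check: simple (e.g. depth-only) pixels at ~40 GPix/s
# cost ~0.025 ns each, versus the ~2 ns per screen pixel extracted from the
# Dirt 2 frame-time data, so the pixel-fill part of shadow map rendering is
# a rounding error next to the main scene.

fill_rate_pix_per_s = 40e9            # assumed simple-pixel fill rate, ~40 GPix/s
ns_per_shadow_pixel = 1e9 / fill_rate_pix_per_s
ns_per_screen_pixel = 2.0             # Dirt 2 per-pixel cost quoted in the thread

print(f"shadow-map pixel: {ns_per_shadow_pixel:.3f} ns")   # ~0.025 ns
print(f"screen pixel:     {ns_per_screen_pixel:.3f} ns")   # ~2 ns
print(f"ratio:            ~{ns_per_screen_pixel / ns_per_shadow_pixel:.0f}x")
```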

On-chip cache bandwidth and sizes are completely different for both cards. It could be that the main difference is there.
The 2 ns/pix (just a theoretical average anyway) could mean that the delay is caused by moving the data or by a different stage in the pipeline.
You can't know if they are geometry limited from just fps and resolution.
 
This is quite a telling picture from computerbase.de:
[attached image: 2.jpg]


We'll see how drivers are going to help for smaller resolutions and settings.

Yeaaaaah! It is telling!

In DX11 games at 2560x1600 with 8xAA/16xAF you get at most 25-30 frames. In Starcraft 2 you get 15 frames.
 
No, it isn't. RAM heavy stuff and post-process is in the per-pixel category, where the 6970 is faster than the GTX 580 here. I don't think you understood my post. The 6970's disadvantage comes from workload that is unrelated to pixel count.
So, that's actually worse than the other way around, isn't it? I'd think it would be easier to just throw more SIMDs at the problem, whereas here, apart from major reworking, only clock speed increases would help, no?
 
I think people are being a little too harsh on this architecture.

1. It's a new internal architecture revision, so driver maturity won't be as easy to achieve at launch, whereas GF110 was effectively a stone's throw from GF100. You can expect some significant improvements.

2. It comes with something completely new in PowerTune. No other graphics card has it, and the benefits throughout their range are obvious. You can save power and reduce noise, or you can extend the limits and overclock within even safer boundaries. If you buy a non-reference 6970 you can overclock without having to worry about corner cases and with even more consistent temperatures. It is also transparent, unlike temperature throttling, which can vary with ambient temperature and the game.

3. The price/performance looks more than adequate. It is hard to call either card overpriced. Maybe people are too used to the idea that an AMD card ought to blow nVidia out of the water on price/performance?

4. The overall quality of the cards is higher: you can get a nice $300 6950 with a vapour chamber. It has much better acoustic properties, and you get 2GB of RAM, so mod lovers out there can use the really, really good texture packs and gloat that consoles only have 1/16th of your total video RAM.

5. Dave still saves! :p
 
With regard to the new architecture, AMD explicitly states on their slides that the VLIW4 cores simplify scheduling and register management. Plus, they've had ~3 weeks over their original launch plan to fine-tune the drivers.

And honestly: as a company you try everything to make sure your product really shines at launch, because that's the impression that sticks. If they were only at, say, 85% with the drivers, any sane person would have delayed the launch once more, especially if those percentage points would make the difference between a wash against the targeted competition (the 570) and a 10-15% performance advantage in independent tests.
 
...just like the 40% more expensive GTX 580 offers


and if you don't ignore the rest, you'll notice that in half of the tested games the framerate is between 35-60 FPS at these settings...

This resolution and these settings are totally irrelevant for 95% of gamers.

Stalker CoP and CS, Crysis and Warhead, Metro, LP2, Starcraft 2... you can't play maxed out with these cards at that resolution.

Other games are mostly console ports.
 
I think people are being a little too harsh on this architecture.

1. It's a new internal architecture revision, so driver maturity won't be as easy to achieve at launch, whereas GF110 was effectively a stone's throw from GF100. You can expect some significant improvements.

2. It comes with something completely new in PowerTune. No other graphics card has it, and the benefits throughout their range are obvious. You can save power and reduce noise, or you can extend the limits and overclock within even safer boundaries. If you buy a non-reference 6970 you can overclock without having to worry about corner cases and with even more consistent temperatures. It is also transparent, unlike temperature throttling, which can vary with ambient temperature and the game.

3. The price/performance looks more than adequate. It is hard to call either card overpriced. Maybe people are too used to the idea that an AMD card ought to blow nVidia out of the water on price/performance?

4. The overall quality of the cards is higher: you can get a nice $300 6950 with a vapour chamber. It has much better acoustic properties, and you get 2GB of RAM, so mod lovers out there can use the really, really good texture packs and gloat that consoles only have 1/16th of your total video RAM.

5. Dave still saves! :p

2x 6950 for the price of one GTX 580?
They're in the mail, I'll likely get them tomorrow.
It was just too good a deal to pass up.
Replacing my aging 6850x2 setup.
 
Yes I think any driver improvement from ATi will be mirrored by Nvidia at this point.

I think there has been a serious targeting problem at ATi. They seem to have made the 6970 to compete with the GTX 480, but in doing so they assumed Nvidia would stagnate; it's almost as if they started believing their own hype (Charlie's, to be specific) that Nvidia were done for and their big die was never going to produce the goods. The result is that ATi's top single-chip card just about matches the performance of the mid-high-range single-chip card from Nvidia, leaving the highest end unchallenged until a dual-chip card appears at some point in Q1.

Their strategy also assumed that Nvidia would never be able to get a dual-chip solution going, but with the GTX 570 and the 6970 having similar power consumption, the dual-chip strategy isn't one you can rule out just yet, especially since a GTX 570 X2 would come with 2x1.25GB of RAM vs 2x2GB. The ATi chip could conceivably consume more power and have to sacrifice more performance to stay within the 300W limit, though PowerTune will make that very easy.

I think with GF110 Nvidia have surprised ATi. Because Cayman was already so far into the design and even manufacturing stage when performance indications of GF110 started to leak, they had to pump the clocks up really high to stay competitive, and they still fell short of the GTX 580.
 
2x 6950 for the price of one GTX 580?
They're in the mail, I'll likely get them tomorrow.
It was just too good a deal to pass up.
Replacing my aging 6850x2 setup.

Where are you getting $250 per 6950?
Microcenter has 580 GTX for $499 right now.
 
On-chip cache bandwidth and sizes are completely different for both cards. It could be that the main difference is there.
The 2 ns/pix (just a theoretical average anyway) could mean that the delay is caused by moving the data or by a different stage in the pipeline.
You're still not getting it. This is not a "delay", nor is it theoretical. It is the measured processing time for the workload of Dirt 2. Take any resolution as a base case. If I want to render a million more pixels, it takes 2.04 ms longer on the 6970, but 2.36 ms longer on the GTX 580. Vice versa for a million fewer. The 6970 is faster at rendering screen pixels in Dirt 2.

I provided the numbers so that you can check for yourself. For example, at 1920x1200, the 6970 will take, on average, 8.1+1.920x1.200x2.04=12.8 ms, while the GTX 580 will take 5.0+1.920x1.200x2.36=10.4 ms. Those figures correspond to 78 and 96 fps respectively, which are within 1 fps of Guru3D's results. It's the 8.1 vs 5.0 that is hurting Cayman - the resolution-related part (2.04 vs 2.36) is an advantage.
You can't know if they are geometry limited from just fps and resolution.
What other part of the workload is independent of resolution? I mentioned shadow maps and reflection maps as examples, but the fillrate part of shadow maps can be done 100x faster than screen pixels, so it's negligible, and reflection map pixels are either similar to screen pixels or simpler, so they won't be any slower on the 6970 than the GTX 580.

It's not the CPU (I looked at multi-GPU results from other articles), so it has to be geometry: geometry during shadow map rendering, during reflection map rendering, and during main scene rendering.
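
To make the model in this post concrete, here is a minimal sketch (mine, not the poster's actual spreadsheet) of the linear frame-time decomposition being described: total frame time = a resolution-independent term plus a per-pixel cost times the number of screen pixels, using the Dirt 2 constants quoted above:

```python
# Linear frame-time model: frame_time = fixed_ms + per_mpix_ms * megapixels.
# Constants are the Dirt 2 figures quoted in the post (ms, and ms per Mpixel).

def frame_time_ms(fixed_ms, per_mpix_ms, width, height):
    """Predicted frame time at a given resolution."""
    mpix = width * height / 1e6
    return fixed_ms + per_mpix_ms * mpix

def fps(frame_ms):
    return 1000.0 / frame_ms

cards = {
    "HD 6970": (8.1, 2.04),   # larger fixed cost, smaller per-pixel cost
    "GTX 580": (5.0, 2.36),   # smaller fixed cost, larger per-pixel cost
}

for name, (fixed, per_mpix) in cards.items():
    t = frame_time_ms(fixed, per_mpix, 1920, 1200)
    print(f"{name}: {t:.1f} ms -> {fps(t):.0f} fps at 1920x1200")

# ~12.8 ms / 78 fps for the 6970 and ~10.4 ms / 96 fps for the GTX 580:
# the resolution-independent term is what hurts Cayman, not the per-pixel part.
```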
 
It's not the CPU (I looked at multi-GPU results from other articles), so it has to be geometry: geometry during shadow map rendering, during reflection map rendering, and during main scene rendering.
You can be front-end limited without being geometry bound; Dirt 2 is one such title, in fact (we looked at this quite a lot). Like I say, the dual geometry engines are doing their stuff - look at any triangle or vertex test.
 
I think there has been a serious targeting problem at ATi. They seem to have made the 6970 to compete with the GTX 480, but in doing so they assumed Nvidia would stagnate,

AMD likely designed with APUs in mind; their Fusion concept needs power efficiency, so Nvidia isn't the only concern, and if they do it right, Nvidia disappears from the market.

Crossfire scaling is off the charts, btw.
With this design they've made Crossfire an enthusiast thing and also more commonplace.
 
You can be front-end limited without being geometry bound; Dirt 2 is one such title, in fact (we looked at this quite a lot). Like I say, the dual geometry engines are doing their stuff - look at any triangle or vertex test.
So, hmm, what is the problem? Ever since Cypress the chips just don't scale well to the high end. I know you said geometry throughput isn't really a problem (and I guess you were right then :)) but still, there is something in the front end which prevents scaling. You could see that with the small difference more SIMDs made (5850 vs 5870 at the same clock was roughly a 2% difference in actual games), with Barts (still almost the same performance as Cypress despite being cut down quite a bit), and now with Cayman (which has twice the geometry throughput, and it still didn't make a difference, except in games using tessellation).
(Speaking of SIMD scaling, I haven't seen results at the same clock for the HD 6950 vs HD 6970 yet, but from the posted results every indication points to SIMD scaling being even worse than Cypress; my extrapolation puts it at about a 1% difference in actual games for 22 vs 24 SIMDs...)
So what's needed to address that front-end limit? Was it something that was cut when Cayman migrated to 40nm? (OK, I don't really expect an answer here...) More dispatch processors? More internal bandwidth? A revised cache architecture?
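
For what it's worth, a rough Amdahl-style sketch of what that SIMD scaling implies, assuming the frame splits into a part that scales with SIMD count and a fixed front-end remainder. The SIMD counts (18/20 for Cypress, 22/24 for Cayman) are spec-sheet values and the ~2% Cypress gain is the figure mentioned above; the rest is just illustrative arithmetic, not the poster's actual extrapolation:

```python
# Amdahl's law: speedup = 1 / ((1 - f) + f / unit_ratio), where f is the
# fraction of frame time that scales with SIMD count.

def implied_simd_fraction(simds_small, simds_big, observed_speedup):
    """Solve Amdahl's law for f, given an observed speedup from adding SIMDs."""
    unit_ratio = simds_big / simds_small
    return (1.0 - 1.0 / observed_speedup) / (1.0 - 1.0 / unit_ratio)

def expected_speedup(f, simds_small, simds_big):
    unit_ratio = simds_big / simds_small
    return 1.0 / ((1.0 - f) + f / unit_ratio)

# ~2% gain for 18 -> 20 SIMDs (5850 -> 5870 at equal clocks) implies that only
# ~20% of the frame time scales with SIMD count.
f = implied_simd_fraction(18, 20, 1.02)
print(f"implied SIMD-limited fraction on Cypress: {f:.0%}")

# The same fraction would predict ~1.7% for 22 -> 24 SIMDs (6950 -> 6970);
# an extrapolated ~1% would mean Cayman's SIMD scaling is even worse.
print(f"predicted gain for 22 -> 24 SIMDs: {expected_speedup(f, 22, 24) - 1.0:.1%}")
```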
 
I'm not sure why people are disappointed; ever since the Taiwanese did the dirty on them by cancelling 32nm, I think expectations should have been lowered on "pure speed".

I bought a HD 5850 a while back given that; can't say I feel the need to upgrade, 28nm will likely be the next time.

I agree, not sure what all the complaining is about... the cards are fast, reasonably priced, come with some great features like driving three monitors from a single card and a new AA mode, and are pretty much a wash in performance on most game titles (if you want to argue that 101 fps vs 103 fps is a reason to choose one over the other, I think you need your head examined).
 
So, that's actually worse than the other way around, isn't it? I'd think it would be easier to just throw more SIMDs at the problem, whereas here, apart from major reworking, only clock speed increases would help, no?
From that point of view, yes.

However, the fact that Cayman and Barts have the same per-frame time tells you that it could easily be a driver issue. Your graph from the DXSDK sample shows Cayman performing slower than Cypress at high tessellation levels. My numbers show parity between Cayman and Barts for Dirt 2.

It could be drivers or it could be a hardware bug. Either way, I don't think ATI is hitting architectural limitations.
 