Sir Eric Demers on AMD R600

Status
Not open for further replies.
Couple of comments:

1) I did let the weasels review the docs; but I also let various other engineering teams review too. I wrote all the answers and just let people review and comment. I made all the changes. At the end, I believe everything I wrote, and I think it's a reasonable set of answers.

2) I do believe that there's a lot that can be done in driver improvements. I do expect that, at least in non-ROP-limited cases (where R580 and R600 are only 10-20% apart), we should be at least 1.5x, and up to above 2.0x, over R580. There are certainly apps where we are around 2.0x already, so this isn't just wishful thinking.

3) In terms of filtering quality and performance, generally, it should be higher than our previous generation in terms of quality (if nothing else, subpixel improvements will help), and should have similar performance degradation to R5xx -- perhaps a little worse, as the quality is generally better. However, GTX does have the advantage of significantly more texture filters, though they don't always have the BW to supply that many consumers.

Thanks
 
Well, NV had basically a 3.5 year trauma over filtering. In the first 1/2 of the play (NV3x) they got beat up for crap filtering performance. In the second 1/2 of the play (NV4x) they got beat up for crap filtering quality. When I look at G80 I can nearly hear some NV engineer saying to himself "Let them find some @#@% different area to complain about now!" :LOL:
 
It's very cool that sireric (and ATI by extension) is so forthcoming. I had no idea he was lead on R600 and still finds time to hang out in the forums with us plebs :)

But some of those answers are really puzzling to me. The implication is that R600's design and hardware attributes are fantastic and the drivers are poo-pooing all over performance. Aren't the two inextricably linked? A lot of what is being said seems to have no basis in benchmark results we've seen so far.

The following answer was especially curious as a response to "Does it make their architecture more balanced for that target market, or just less future-proof?"

"I’m not sure about what the competition used or the restrictions in their architecture that could force sub-optimal solution."

Where has R6xx demonstrated that its ALU:TEX ratios are more optimal than the competition's? Or am I misreading what is being said here?
 
I'll leave the video-process bit out yet...still not working, so I'll make no judgements until you guys can pull your head out of the sand and start being honest with consumers.:devilish:
We've validated that it does work under Cat 7.5 and the latest version of the Cyberlink player; others have it working as well. If you are still having issues then I suggest that you report all the hardware and software details through the driver feedback form on the site.
 
As Geo suggests, you do need to separate an aim for stability first from performance later.



This has, actually, always been the point of view. For the most part, video quality comes from post-processing, and the more processing you throw at it, the more quality you can extract. Given the differences between the render capabilities of R600 and the rest of the line, the expectation is that R600 would set the benchmark right off, and it was a question of seeing how close the others would get.

Fair enough. Don't get me wrong, I'm not really hot for jumping on the "let's kick ATi till the cows come home" bandwagon, and I certainly appreciate your efforts. It's only that there seem to be a lot of strings to tie up with the R600 (or maybe I'm delusional or something).
 
Where has R6xx demonstrated that its ALU:TEX ratios are more optimal than the competition's? Or am I misreading what is being said here?
I think you're misunderstanding slightly. I think the argument is similar to the one they used with R580--they've shot for an architecture that looks decent on both today's apps and the kinds of shaders you'll see in the next six months. If they're right (and they probably are, considering they were right about R580), R600 will probably look better in games that come out in Q3/Q4 than it does with current apps (and perhaps a big improvement once you see D3D10 apps).
 
The following answer was especially curious as a response to "Does it make their architecture more balanced for that target market, or just less future-proof?"

"I’m not sure about what the competition used or the restrictions in their architecture that could force sub-optimal solution."

Where has R6xx demonstrated that its ALU:TEX ratios are more optimal than the competition's? Or am I misreading what is being said here?


It's inherently a question about the scalability factors of the G8x vs R6xx family architectures. The two key phrases in the part of Eric's answer that you quoted, as I read it, were "not sure" and "could". As in, he doesn't know the scalability limitations built into the G8x arch well enough to know how those might have impacted NV's decisions on ALU:TEX ratios in the down-market derivatives. Maybe they didn't, but it's reasonable to think they probably did to some degree.

Edit: I think that was actually polite engineer-speak for "How would I know? Go ask Eric Lindholm." :cool:
 
It's very cool that sireric (and ATI by extension) is so forthcoming. I had no idea he was lead on R600 and still finds time to hang out in the forums with us plebs :)
;-)
But some of those answers are really puzzling to me. The implication is that R600's design and hardware attributes are fantastic and the drivers are poo-pooing all over performance. Aren't the two inextricably linked? A lot of what is being said seems to have no basis in benchmark results we've seen so far.

I'm sorry if that's exactly the impression I give. It's not exactly what I meant. The driver development has been tremendous and I applaud all the efforts that the driver team has been making. But having a BRAND new architecture, with the need for a new DX driver (our DX10 is from scratch), an updated DX9, a new OGL, and all this with a new OS requiring support alongside the old one, is just too much right now. It's going to take time to get the best performance out of this chip -- both in terms of coding all the elements, and also because it's a new arch and the teams need to learn its ins and outs. Having said that, the board is priced right. It's certainly very competitive with (and in my view better than) the competition at this price point. But its power isn't as low as one would hope, and the performance of this arch on current games is less than it will be on future games.

The following answer was especially curious as a response to "Does it make their architecture more balanced for that target market, or just less future-proof?"

"I’m not sure about what the competition used or the restrictions in their architecture that could force sub-optimal solution."

Where has R6xx demonstrated that its ALU:TEX ratios are more optimal than the competition's? Or am I misreading what is being said here?

All I meant is that I don't completely understand the decisions made by the competition. Nor do I really care quite that much, though it's intriguing. It's also clear that our ratio is more in line with future applications than past ones. A lower ratio does work well with a lot of older apps and even quite a few current apps. But what we've been seeing is that applications are moving more and more towards larger ratios. In that sense we are more forward-looking. But it's a tough edge; it's costly to be too forward-looking.
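To make the ALU:TEX trade-off concrete, here is a hypothetical back-of-the-envelope model (my own illustration, not an AMD tool): given a shader's ALU and texture instruction counts, it estimates which unit type bounds throughput for two assumed hardware ALU:TEX ratios. All unit rates and instruction counts are made-up numbers for illustration.

```python
# Illustrative bottleneck estimate: which unit limits a shader, assuming
# the hardware issues hw_alu_per_clk ALU ops and hw_tex_per_clk fetches
# per clock. Numbers are hypothetical, not real R600/G80 rates.

def bound_unit(alu_ops, tex_ops, hw_alu_per_clk, hw_tex_per_clk):
    """Return the limiting unit type and the cycles it needs."""
    alu_cycles = alu_ops / hw_alu_per_clk
    tex_cycles = tex_ops / hw_tex_per_clk
    return ("ALU", alu_cycles) if alu_cycles >= tex_cycles else ("TEX", tex_cycles)

# An older-style shader (8 ALU ops : 4 fetches) vs a newer one (24 : 4),
# run on a high-ratio (4:1) and a low-ratio (2:1) hypothetical chip.
for alu, tex in [(8, 4), (24, 4)]:
    high = bound_unit(alu, tex, hw_alu_per_clk=4, hw_tex_per_clk=1)
    low = bound_unit(alu, tex, hw_alu_per_clk=2, hw_tex_per_clk=1)
    print(f"shader {alu}:{tex} -> 4:1 hw: {high}, 2:1 hw: {low}")
```

The toy numbers show the trade: on the old-style shader both chips finish in the same 4 cycles (the high-ratio chip's extra ALUs sit idle), but on the high-ratio shader the 4:1 chip finishes in 6 cycles versus 12 for the 2:1 chip. That is the sense in which a higher ratio is "forward looking" but pays nothing on fetch-heavy apps.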
 
Actually, the scalability answer I wanted to hear Eric expound on a bit more was this one:

We want to give the developer a similar experience to what they have on the high end as well, which sometimes leads to non-linear reductions in various elements

Which suggests that even if you had infinite scalability flexibility (and of course neither company does nor ever will) there are still instances where you'd want non-linear reductions. I'd like to hear more about why that would be, and maybe an example.
 
Oh, and the comment on hating me -- Andy came by the other day to remind me that he hates me. So, it's true too.
 
Wasn't the general consensus that they were wrong -- as in, three times the shading power only led to a small increase in frame rate?
At the time of release, yes. However, go back and test an X1800 vs an X1900 now, on current apps, and I'm sure you'll see the gap widen.
 
Wasn't the general consensus that they were wrong -- as in, three times the shading power only led to a small increase in frame rate?
I think that was true at the time of launch but wasn't nearly as true later on; I'm fairly sure that R580 simply kicks the crap out of R520 in games like STALKER.
 
Wasn't the general consensus that they were wrong -- as in, three times the shading power only led to a small increase in frame rate?

General consensus? No. Any consensus I'm not a part of is not general, by definition. My definition, of course. :smile:

The numbers context was a big part of the disagreement. Is the appropriate context the "3x", which the naysayers liked to harp on, or the roughly 15% more die size, which the children of light and truth* liked to point out?



*This is a subtle hint of my own position on the matter, just in case you missed it. :yep2:
 
Actually, the scalability answer I wanted to hear Eric expound on a bit more was this one:

We want to give the developer a similar experience to what they have on the high end as well, which sometimes leads to non-linear reductions in various elements

Which suggests that even if you had infinite scalability flexibility (and of course neither company does nor ever will) there are still instances where you'd want non-linear reductions. I'd like to hear more about why that would be, and maybe an example.

We have internal tools that model various apps and apply theoretical versions of our architecture to them, to see how the performance changes. This allows us to tailor the numbers of units to hit performance targets.

As we reduce elements, bottlenecks can shift, which forces you to restrain certain reductions but lets you increase others. For example, when the number of bits on the memory interface drops, not only does the total BW drop, but so does the number of channels. Consequently, if you had the number of ROPs or texture units in line with the BW, but they were depending on the channels to avoid collisions, you might discover that the new ratio is less than you would expect from the BW reduction alone. That happened on R600 -> RV630 for us -- texture units were allowed to drop less than ROPs. It helps to keep the shaders busy as well. Gives a better bang for your buck at that level.

It's really not a science, regrettably -- lots of trial-and-error work to find the best configs.
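The channel-collision effect described above can be sketched with a toy model (entirely my own illustration, not AMD's internal tool; the collision penalty is an invented assumption): halving the bus halves both raw bandwidth and the channel count, and if the same number of requesters now contend for fewer channels, effective bandwidth drops by more than the raw 2x.

```python
# Toy derivative-sizing model. Assumption: each requester beyond one per
# channel costs a fixed fraction of throughput to collisions. The 5%
# penalty and all bandwidth figures are illustrative, not measured.

def effective_bw(raw_gb_s, channels, requesters, collision_penalty=0.05):
    """Effective bandwidth after a crude channel-collision model."""
    oversubscription = max(0, requesters - channels)
    return raw_gb_s * max(0.0, 1.0 - collision_penalty * oversubscription)

big = effective_bw(raw_gb_s=100.0, channels=8, requesters=8)   # balanced chip
small = effective_bw(raw_gb_s=50.0, channels=4, requesters=8)  # half the bus, same requesters
print(big, small, small / big)
```

In this sketch the cut-down chip retains only 40% of the big chip's effective bandwidth despite keeping 50% of the raw bandwidth, which is why you would also trim the requesters (e.g. ROPs) rather than scale everything linearly with the bus width.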
 