Prohardver! Interview with Eric Demers

BrynS

Regular
Behind the scenes with ATI -- Eric Demers Interview

Prohardver! interview snippet said:

Prohardver!:
Why do the top ATI GPU codenames always end with “20” instead of “00”, as in R420, R520? Is the “00” ending reserved for something special?

Eric Demers: You know, I've been confused by our part numbers and have ceased to really try to understand the numbering scheme. Yes, the first number indicates architecture, but even that can be partially wrong. It's a number that we sometimes try to make mean something (i.e. engineering wise), sometimes meant to mean something else (i.e. marketing wise). I would not attach too much to those numbers. Even within engineering, we use codenames, since numbers change and aren't always meaningful. And yes, sometimes they will end in '20' and sometimes in '00' but it's more random than most things – I don't remember why we picked R520, for example. Perhaps it's done to confuse the enemy :) We need cheat sheets to remember them all :)
Thanks to The Tech Report for the heads-up.
 
Eric Demers said:
In some of the protein folding and signal processing fields, we see 2x to 7x increases in performance relative to the fastest single core CPUs.
I think the Folding GPU client could probably be a quite nice "dynamic branching" benchmark; maybe Dave could get a beta to use in R580 review ;)?
 
Sorry for the theft!

Sorry for using Beyond3D's photo without crediting it. The source is credited in the Hungarian version, but it was accidentally removed along with the Hungarian caption. It's corrected now, sorry!

best regards,

rudi
PROHARDVER team
 
ED: No, the early production problems had nothing to do with any specific part of the design. It wasn't the MC or graphics or any other place. It was all over the place. There was a design flaw in a circuit that did not show up in any of the checks that we do in the process of producing ASICs. It was internal to a non-ATI design. Once we found the problem, it was trivial to fix, but it delayed our products many months.
Well that seems like slightly different information than before.

Jawed
 
Jawed said:
Well that seems like slightly different information than before.
Not as far as I understand: It was internal to a non-ATI design = "third-party library" (as I remember the earlier explanation).
 
The older suggestion was that RV515 (and Xenos - though prolly not material to the comparison) didn't suffer from this issue and that this was because it doesn't have the ring-bus memory controller.

Jawed
 
There was quite a lot in that interview, actually. I started a thread in Tech on one piece (missed this was here, oh well --wandering mod feel free to do your job).

He also promised that AAA would be officially supported for previous products (rather than tweakers-only), which is the first time I've seen that.

On filtering, he said,

There are many, many HDR formats. These include 10b, FP16, RGBe, etc… Probably over 2 dozen formats. All of these formats have advantages and disadvantages. FP16 (common 64b format) is great for dynamic range, but loses out as it doesn't have much more precision and usually comes at a hefty performance loss. 10b gives the equivalent precision, but at 2x the speed.


Is that statement consensus, or contentious? :smile:
 
geo said:
There was quite a lot in that interview, actually. I started a thread in Tech on one piece (missed this was here, oh well --wandering mod feel free to do your job).

He also promised that AAA would be officially supported for previous products (rather than tweakers-only), which is the first time I've seen that.

On filtering, he said,



Is that statement consensus, or contentious? :smile:

Precision is not range.
 
rudi said:
Sorry for using Beyond3D's photo without crediting it. The source is credited in the Hungarian version, but it was accidentally removed along with the Hungarian caption. It's corrected now, sorry!
Fair enough, no problem.

Thanks.
 
geo said:
On filtering, he said,



Is that statement consensus, or contentious? :smile:

Well, from a mantissa precision standpoint (at least normalized to 0.5 to 1.0), it's an equivalent precision between 10b and FP16. So if an ISV just wants more precision (i.e. less quantas in blacks, for example), then 10b is a good alternative, since it's 4x more levels, at no BW increase. Almost something we could force in the control panel (At least if destination alpha isn't required).
 
sireric said:
Well, from a mantissa precision standpoint (at least normalized to 0.5 to 1.0), it's an equivalent precision between 10b and FP16.
s10e5 has twice as many values between 0.5 and 1.0. And towards the darker colors FP16 is much better.

So if an ISV just wants more precision (i.e. less quantas in blacks, for example), then 10b is a good alternative, since it's 4x more levels, at no BW increase. Almost something we could force in the control panel (At least if destination alpha isn't required).
This is something I'd very much like to see in the drivers. Replacing 8:8:8:8 with 10:10:10:2 in the framebuffer is something that should have happened 4 years ago. Destination alpha is rarely used (devs that pick ARGB over XRGB when they don't need it should be tortured with RGB332 for life), and the reduction in color banding is well worth it.
Why almost? Microsoft?
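
For readers following the 8:8:8:8 versus 10:10:10:2 point: both layouts fit in one 32-bit word, so the swap costs no extra bandwidth and simply moves six bits from alpha to color. A minimal Python sketch of that trade follows. The helper names and the bit order (alpha in the top bits) are assumptions made for illustration; real APIs define their own channel order.

```python
def quantize(x, bits):
    """Map a normalized value in [0, 1] to an unsigned integer code."""
    clamped = max(0.0, min(1.0, x))
    return round(clamped * ((1 << bits) - 1))

def pack_8888(r, g, b, a):
    """Pack four channels at 8 bits each into one 32-bit word."""
    return (quantize(a, 8) << 24) | (quantize(r, 8) << 16) \
        | (quantize(g, 8) << 8) | quantize(b, 8)

def pack_1010102(r, g, b, a):
    """Pack RGB at 10 bits each plus a 2-bit alpha into one 32-bit word."""
    return (quantize(a, 2) << 30) | (quantize(r, 10) << 20) \
        | (quantize(g, 10) << 10) | quantize(b, 10)

# Levels per color channel: 256 versus 1024, i.e. 4x more with 10 bits.
levels_8 = 1 << 8
levels_10 = 1 << 10

# Alpha is the cost: 256 levels shrink to 4.
alpha_levels_8888 = 1 << 8
alpha_levels_1010102 = 1 << 2
```

Both packers return values below 2**32, which is the "no BW increase" part of the argument; the 2-bit alpha is why the swap only works when destination alpha isn't needed.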
 
Xmas said:
s10e5 has twice as many values between 0.5 and 1.0. And towards the darker colors FP16 is much better.


This is something I'd very much like to see in the drivers. Replacing 8:8:8:8 with 10:10:10:2 in the framebuffer is something that should have happened 4 years ago. Destination alpha is rarely used (devs that pick ARGB over XRGB when they don't need it should be tortured with RGB332 for life), and the reduction in color banding is well worth it.
Why almost? Microsoft?

Between 0.5 and 1.0, off by 1 bit, though same total number of bits. Yes, it loses to the negative exponents near 0, but it's 4x more than 8888. And it's got the same profile as 8888. It's over 2x faster than FP16 (which usually doesn't compress as much), and can be even faster in some cases for filtering. That's why it's an interesting compromise. That's all.
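
The counts traded in this exchange can be checked by brute force. The sketch below enumerates every FP16 (s10e5) bit pattern with Python's `struct` half-precision format and compares it with 8-bit and 10-bit unsigned normalized codes. It assumes the fixed-point formats map code k to k / (2^bits - 1); the function names are made up for illustration.

```python
import struct

def fp16_values():
    """Return every finite value an IEEE half float (s10e5) can encode."""
    values = []
    for bits in range(1 << 16):
        value = struct.unpack("<e", struct.pack("<H", bits))[0]
        # Skip NaN (not equal to itself) and the two infinities.
        if value == value and abs(value) != float("inf"):
            values.append(value)
    return values

def unorm_values(bits):
    """Return every value of an unsigned normalized format with `bits` bits."""
    top = (1 << bits) - 1
    return [k / top for k in range(top + 1)]

def count_in(values, lo, hi):
    """Count values v with lo <= v < hi."""
    return sum(1 for v in values if lo <= v < hi)

fp16 = fp16_values()
unorm10 = unorm_values(10)
unorm8 = unorm_values(8)

# Distinct values in the bright half of the range, [0.5, 1.0).
bright_fp16 = count_in(fp16, 0.5, 1.0)        # 1024
bright_unorm10 = count_in(unorm10, 0.5, 1.0)  # 511
bright_unorm8 = count_in(unorm8, 0.5, 1.0)    # 127

# Positive FP16 values below the first non-zero 10-bit step (1/1023).
first_step = unorm10[1]
dark_fp16 = sum(1 for v in fp16 if 0.0 < v < first_step)
```

So in the top half of the range FP16 has about twice as many steps as 10b (the "off by 1 bit" above) and 10b has about four times as many as 8b, while `dark_fp16` shows how many FP16 values sit below the darkest non-zero 10-bit level.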

Edit: Why "almost"? I'm guessing, though I have no proof, that the driver support for that could be a nightmare. I'm sure there's lots of exceptions to forcing that mode, which would make it hard to determine when it can and cannot use it. In the past, on things like that, that's been the case. We shall see.
 