The People Behind DirectX 10: Part 3 - NVIDIA's Tony Tamasi

Hanners

ExtremeTech have rounded off their three-part series talking to DirectX 10's big players (after chatting with Microsoft and ATI) by quizzing NVIDIA's Tony Tamasi. Nothing 'new' per se, but you might find some interesting tidbits in there. The answer to the question of using unified shaders was pretty inevitable, of course...

Frankly speaking, however, the graphics industry has gotten very good at extracting a lot of performance from the current vertex/pixel shader architectures, so the competition for anything new architecturally is a highly evolved and efficient architecture. The first rule of any new GPU is to be better at the previous API. Any trade-off which might move you away from that goal has to be evaluated carefully. If you look at some of the existing GPU architectures you can see some pretty big differences in terms of architectural efficiencies, and that is with architectures that, at least at the highest level, would present themselves as "non-unified" architectures. Even within that environment, you can see huge (almost 2x) differences in terms of performance delivered per unit of area or power.

If you assume that the competitors will be bound by the same laws of physics and economics, that alone would put one competitor in a dramatically better position. Would consumers be willing, for example, to pay twice as much for a graphics card that delivered the same performance as another, just because a particular graphics card was unified, or to pay the same price, but run at ½ the performance just because it offered some new architectural block diagram? I doubt it.

The full question and answer session is here.
 
Tony answers well, IMO. Definitely worth a read, even for pure red bloods...

Cheers, Hanners, for the linkage :D
 
Not really that much FUD at all, methinks. But the obligatory "we expect to be the leader with DX10 just like we were with SM3.0" is there ;)
 
Oh I liked that one:
For developers to be able to more easily develop a "single" DX10 code path with a uniform feature set will mean they can invest more in their game than in implementing special case behavior for a particular IHV's [independent hardware vendor] implementation; say a choice to not include vertex texturing, for example.

A purely coincidental example :D
 
Hanners said:
ExtremeTech have rounded off their three-part series talking to DirectX 10's big players (after chatting with Microsoft and ATI) by quizzing NVIDIA's Tony Tamasi. Nothing 'new' per se, but you might find some interesting tidbits in there. The answer to the question of using unified shaders was pretty inevitable, of course...

Yeah, nothing there to challenge the conventional wisdom that they are on an "of course we're going to unify... just not this time" path.

And this one,
Nvidia has been working on DX10 for many years, and as we were with Shader Model 3, expect to be the leader with DX10.

Which is the coy way of saying, "Yep, like you've heard, we expect to be out first."
 
They seem convinced that the unified approach isn't the most effective solution today. I wonder if they'll go the smaller die / lower power route of G70, or if G80 will be around the same size as R600.
 
I'm going to predict that R600 will be no more than 15% bigger (mm^2) than G80 (compared to R580, which was 80% bigger than G71). Now, understand, that prediction does not foreclose R600 being significantly smaller than G80 -- that's a limit on the top end of the range I'm predicting, while not addressing the bottom end of the range at all. Rough numbers below.

Why? Because we already have NV down, officially, for 500M+ transistors, and the timing demands it can't be on anything smaller than 80nm.

There is a minority voice out there insisting that R600 is 65nm. Haven't bought that mule yet tho.
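
For reference, a back-of-envelope on the prediction above, using the commonly cited 90nm die sizes (both approximate, and the G80 area is purely a placeholder guess):

g71_mm2 = 196.0    # GeForce 7900 GTX die, commonly cited ~196 mm^2
r580_mm2 = 352.0   # Radeon X1900 die, commonly cited ~352 mm^2
print(r580_mm2 / g71_mm2 - 1)   # ~0.80, the "80% bigger" figure above

# The prediction caps R600 at +15% over G80's still-unknown area;
# a hypothetical 400 mm^2 G80 is used here purely as a placeholder.
g80_mm2_guess = 400.0
print(g80_mm2_guess * 1.15)     # ~460 mm^2 ceiling for R600 on this bet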
 
With G80 set to be more than 500M transistors, the smaller die/lower power route in comparison with R600 would have to mean that R600 was 700M+ transistors - to get the effect we currently see with G71 versus R580.

On the other hand, R580's architecture already has a pile of stuff needed to build a (unified) D3D10 GPU, such as out-of-order threading, decoupled texturing and AA+HDR - so it's down to how much growth is required from this "high base". And since I like to argue that Xenos is a 64-pipe USA in 232M transistors (without ROPs - which could be argued to be about 70M transistors for a 16-ROP design - but who knows? - and including southbridge functionality), I see it as fairly unlikely that R600 will be bigger than G80.
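
Quick sums on both arguments, using the commonly cited transistor counts (G71 ~278M, R580 ~384M; treat both as approximate):

g71_mt = 278.0     # G71, millions of transistors (commonly cited)
r580_mt = 384.0    # R580, millions of transistors (commonly cited)
g80_mt = 500.0     # G80, the stated "500M+"

# Replicating the G71-vs-R580 gap at G80's transistor count:
print(g80_mt * r580_mt / g71_mt)   # ~690M, hence the "700M+" above

# Xenos-as-base argument: parent die plus a guessed 16-ROP block
xenos_mt = 232.0   # Xenos shader core, per the post
rops_mt = 70.0     # pure guess, as noted above
print(xenos_mt + rops_mt)          # ~302M, well under R580's ~384M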

G71 has FP16 texture filtering and ...?

G71->G80 looks like a much steeper mountain to me.

Jawed
 
Don't forget the added functionality for the GS, HDR+AA etc. We know nothing about the respective implementations or the number of "pipes" and ROPs, so it's hard to predict anything. If we only knew more about the features of the two chips it'd be a bit easier to guess.
 
How much of R580's girth is really attributable to those things we're sure to see in G80? I would think a large part of that is the memory controller and the small batch logic - two things that may not receive as much emphasis in G80's design.

Given G80's 500M+ count, is it possible that ATi would require less than 116M transistors (the gap between R580's ~384M and G80's 500M+) to move from DX9 discrete R580 -> DX10 unified R600? I would think there still remains a fair chunk of logic to implement a unified architecture even using R580 as a base.

Jawed if I'm reading you right, you're saying that a unified DX9 part from ATi would've been smaller than R580?
 
trinibwoy said:
I would think there still remains a fair chunk of logic to implement a unified architecture even using R580 as a base.
Except they're not using R580 as a base, so it's fairly moot. Jawed's suggestion that unified DX9 would be smaller for ATI for the same shader power as R580 is absolutely right. I'd expect R600 to be the same size as R580, give or take, given 80nm and what R600 is.

G80 is something else entirely, as we're finding out. If it's anything less than 400mm^2 then I'll be surprised.
 
trinibwoy said:
Given G80's 500M+ count, is it possible that ATi would require less than 116M transistors to move from DX9 discrete R580 -> DX10 unified R600? I would think there still remains a fair chunk of logic to implement a unified architecture even using R580 as a base.

Who is going for what clocks? Don't forget that G71 and R580 are clocked the same, yet have a very large size difference. Don't you think it is reasonable to assume some of those extra transistors in R580 might have gone to making that possible? And that factor will matter less when the chips are closer in size.
 
Rys said:
G80 is something else entirely, as we're finding out. If it's anything less than 400mm^2 then I'll be surprised.

Whee!

I'm tempted to say "quoted for permanence", but that won't work with you! :p

Not that you'd ever do something so base, of course.

I wonder what kind of clocks they can get out of that? Surely not higher than 650MHz?
 
_xxx_ said:
Don't forget the added functionality for the GS, HDR+AA etc. We know nothing about the respectable implementations or the number of "pipes" and ROP's, thus it's hard to predict anything. If we only knew more about the features of the two chips it'd be a bit easier to guess.
I was listing architectural functionality in existing GPUs that is a known feature of D3D10 (but not of D3D9), which is why I included HDR+AA but not GS.

Obviously sometimes it's a matter of concepts rather than specific function blocks - e.g. simply by decoupling texturing, the remaining pipeline (in an NVidia GPU) can become much shorter (I guess), which could lead directly to smaller batches. So NVidia could conceivably produce a GPU that has better dynamic branching granularity without going the whole hog of implementing out-of-order threading.

Jawed
 
What do we really know about G80? AFAICR absolutely nothing so far, just maybes. Was there any real info yet?

EDIT: also, we kinda "know" that R600 will be unified, but what else besides that?
 
_xxx_ said:
What do we really know about G80? AFAICR absolutely nothing so far, just maybes. Was there any real info yet?

EDIT: also, we kinda "know" that R600 will be unified, but what else besides that?

We know G80 is 500M+ transistors and that it has HDR+AA. Those aren't maybes; they've been stated.

We know that R600 leverages C1 tech in significant fashion (stated directly). We know that the ring-bus is coming across (stated indirectly, but solidly enough in my book).

We know they are both DX10 parts, and that means there will have to be GS functionality.
 
That's the thing, we don't really know much more about R600 than G80. The assumption seems to be that it's just some Xenos/R580 mish-mash but there could be some new cool stuff in there too.
 