The People Behind DirectX 10: Part 3 - NVIDIA's Tony Tamasi

Hanners

ExtremeTech have rounded off their three-part series talking to DirectX 10's big players (after chatting with Microsoft and ATI) by quizzing NVIDIA's Tony Tamasi. Nothing 'new' per se, but you might find some interesting tidbits in there. The answer to the question of using unified shaders was pretty inevitable, of course...

Frankly speaking, however, the graphics industry has gotten very good at extracting a lot of performance from the current vertex/pixel shader architectures, so the competition for anything new architecturally is a highly evolved and efficient architecture. The first rule of any new GPU is to be better at the previous API. Any trade-off which might move you away from that goal has to be evaluated carefully. If you look at some of the existing GPU architectures you can see some pretty big differences in terms of architectural efficiencies, and that is with architectures that, at least at the highest level, would present themselves as "non-unified" architectures. Even within that environment, you can see huge (almost 2x) differences in terms of performance delivered per unit of area or power.

If you assume that the competitors will be bound by the same laws of physics and economics, that alone would put one competitor in a dramatically better position. Would consumers be willing, for example, to pay twice as much for a graphics card that delivered the same performance as another, just because a particular graphics card was unified, or to pay the same price, but run at ½ the performance just because it offered some new architectural block diagram? I doubt it.

The full question and answer session is here.
 
Tony answers well, IMO. Definitely worth a read, even for pure red bloods...

Cheers, Hanners, for the linkage :D
 
Not really that much FUD at all, methinks. But the obligatory "we expect to be the leader with DX10 just like we were with SM3.0" is there ;)
 
Oh I liked that one:
For developers to be able to more easily develop a "single" DX10 code path with a uniform feature set will mean they can invest more in their game than in implementing special case behavior for a particular IHV's [independent hardware vendor] implementation; say a choice to not include vertex texturing, for example.

A purely coincidental example :D
 
Hanners said:
ExtremeTech have rounded off their three-part series talking to DirectX 10's big players (after chatting with Microsoft and ATI) by quizzing NVIDIA's Tony Tamasi. Nothing 'new' per se, but you might find some interesting tidbits in there. The answer to the question of using unified shaders was pretty inevitable, of course...

Yeah, nothing there to challenge the conventional wisdom that they are on an "of course we're going to unify... just not this time" path.

And this one,
Nvidia has been working on DX10 for many years, and as we were with Shader Model 3, expect to be the leader with DX10.

Which is the coy way of saying, "Yep, like you've heard, we expect to be out first."
 
They seem convinced that the unified approach isn't the most effective solution today. I wonder if they'll go the smaller die / lower power route of G70, or if G80 will be around the same size as R600.
 
I'm going to predict that R600 will be no more than 15% bigger (mm^2) than G80 (compared to R580, which was 80% bigger than G71). Now, understand, that prediction does not foreclose R600 being significantly smaller than G80 -- that's a limit on the top end of the range I'm predicting, while not addressing the bottom end of the range at all. Rough numbers below.

Why? Because we already have NV down, officially, for 500M+ transistors, and the timing demands it can't be on anything smaller than 80nm.

There is a minority voice out there insisting that R600 is 65nm. Haven't bought that mule yet tho.
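
For reference, a back-of-envelope on the prediction above, using the commonly cited 90nm die sizes (both approximate, and the G80 area is purely a placeholder guess):

g71_mm2 = 196.0    # GeForce 7900 GTX die, commonly cited ~196 mm^2
r580_mm2 = 352.0   # Radeon X1900 die, commonly cited ~352 mm^2
print(r580_mm2 / g71_mm2 - 1)   # ~0.80, the "80% bigger" figure above

# The prediction caps R600 at +15% over G80's still-unknown area;
# a hypothetical 400 mm^2 G80 is used here purely as a placeholder.
g80_mm2_guess = 400.0
print(g80_mm2_guess * 1.15)     # ~460 mm^2 ceiling for R600 on this bet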
 
With G80 set to be more than 500M transistors, the smaller die/lower power route in comparison with R600 would have to mean that R600 was 700M+ transistors - to get the effect we currently see with G71 versus R580.

On the other hand, R580's architecture already has a pile of stuff needed to build a (unified) D3D10 GPU, such as out-of-order threading, decoupled texturing and AA+HDR - so it's down to how much growth is required from this "high base". And since I like to argue that Xenos is a 64-pipe USA in 232M transistors (without ROPs - which could be argued to be about 70M transistors for a 16-ROP design - but who knows? - and including southbridge functionality), I see it as fairly unlikely that R600 will be bigger than G80.
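
Quick sums on both arguments, using the commonly cited transistor counts (G71 ~278M, R580 ~384M; treat both as approximate):

g71_mt = 278.0     # G71, millions of transistors (commonly cited)
r580_mt = 384.0    # R580, millions of transistors (commonly cited)
g80_mt = 500.0     # G80, the stated "500M+"

# Replicating the G71-vs-R580 gap at G80's transistor count:
print(g80_mt * r580_mt / g71_mt)   # ~690M, hence the "700M+" above

# Xenos-as-base argument: parent die plus a guessed 16-ROP block
xenos_mt = 232.0   # Xenos shader core, per the post
rops_mt = 70.0     # pure guess, as noted above
print(xenos_mt + rops_mt)          # ~302M, well under R580's ~384M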

G71 has FP16 texture filtering and ...?

G71->G80 looks like a much steeper mountain to me.

Jawed
 
Don't forget the added functionality for the GS, HDR+AA etc. We know nothing about the respective implementations or the number of "pipes" and ROPs, so it's hard to predict anything. If we only knew more about the features of the two chips it'd be a bit easier to guess.
 
How much of R580's girth is really attributable to those things we're sure to see in G80? I would think a large part of that is the memory controller and the small batch logic - two things that may not receive as much emphasis in G80's design.

Given G80's 500M+ count, is it possible that ATi would require less than 116M transistors (the gap between R580's ~384M and G80's 500M+) to move from DX9 discrete R580 -> DX10 unified R600? I would think there still remains a fair chunk of logic to implement a unified architecture even using R580 as a base.

Jawed if I'm reading you right, you're saying that a unified DX9 part from ATi would've been smaller than R580?
 
trinibwoy said:
I would think there still remains a fair chunk of logic to implement a unified architecture even using R580 as a base.
Except they're not using R580 as a base, so it's fairly moot. Jawed's suggestion that unified DX9 would be smaller for ATI for the same shader power as R580 is absolutely right. I'd expect R600 to be the same size as R580, give or take, given 80nm and what R600 is.

G80 is something else entirely, as we're finding out. If it's anything less than 400mm^2 then I'll be surprised.
 
trinibwoy said:
Given G80's 500M+ count, is it possible that ATi would require less than 116M transistors to move from DX9 discrete R580 -> DX10 unified R600? I would think there still remains a fair chunk of logic to implement a unified architecture even using R580 as a base.

Who is going for what clocks? Don't forget that G71 and R580 are clocked the same, yet have a very large size difference. Don't you think it is reasonable to assume some of those extra transistors in R580 might have gone to making that possible? And that factor will matter less when the chips are closer in size.
 
Rys said:
G80 is something else entirely, as we're finding out. If it's anything less than 400mm^2 then I'll be surprised.

Whee!

I'm tempted to say "quoted for permanence", but that won't work with you! :p

Not that you'd ever do something so base, of course.

I wonder what kind of clocks they can get out of that? Surely not higher than 650MHz?
 
_xxx_ said:
Don't forget the added functionality for the GS, HDR+AA etc. We know nothing about the respectable implementations or the number of "pipes" and ROP's, thus it's hard to predict anything. If we only knew more about the features of the two chips it'd be a bit easier to guess.
I was listing architectural functionality in existing GPUs that is a known feature of D3D10 (but not of D3D9), which is why I included HDR+AA but not GS.

Obviously sometimes it's a matter of concepts rather than specific function blocks - e.g. simply by decoupling texturing, the remaining pipeline (in an NVidia GPU) can become much shorter (I guess), which could lead directly to smaller batches. So NVidia could conceivably produce a GPU that has better dynamic branching granularity without going the whole hog of implementing out-of-order threading.

Jawed
 
What do we really know about G80? AFAICR absolutely nothing so far, just maybes. Was there any real info yet?

EDIT: also, we kinda "know" that R600 will be unified, but what else besides that?
 
_xxx_ said:
What do we really know about G80? AFAICR absolutely nothing so far, just maybes. Was there any real info yet?

EDIT: also, we kinda "know" that R600 will be unified, but what else besides that?

We know G80 is 500M+ transistors and that it has HDR+AA. Those aren't maybes; they've been stated.

We know that R600 leverages C1 tech in significant fashion (stated directly). We know that the ring-bus is coming across (stated indirectly, but solidly enough in my book).

We know they are both DX10 parts, and that means there will have to be GS functionality.
 
That's the thing, we don't really know much more about R600 than G80. The assumption seems to be that it's just some Xenos/R580 mish-mash but there could be some new cool stuff in there too.
 