The People Behind DirectX 10: Part 3 - NVIDIA's Tony Tamasi

geo said:
500M+ came from Mike Hara, VP Investor Relations for NV.

HDR+AA came, officially, from AEG, on these boards. "Officially" in the sense that two of our AEG focus group members said that they'd been authorized by AEG to make that public.

Edit: That's my standard for "knows", btw. Official direct, or official by acknowledged authorization. Everything else is some varying flavor of "suspect". ;)


Oh, that's good enough for me as well :)
 
Jawed said:
And, with the apparently fine-grained AA sample grid that's coming to G80:

http://www.beyond3d.com/forum/showpost.php?p=789268&postcount=49

that would work really nicely.

I now think that if you count the sign bit, you get 6 so it's effectively a 64x64 grid for MSAA. I'm kinda surprised nobody else has expressed glee at that.

That thread started out 1.5 levels above my ability to comprehend it, and it's stayed there.

This post, on the other hand, I'll be as gleeful as you like about. :D :D :D

Tho is that enough for "knows"?
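
For anyone else who wants in on the glee, here's the arithmetic as I read it - assuming 5 fractional bits plus a sign bit of sample position per axis, which is my own inference rather than anything official:

```python
# My reading of the "sign bit" remark above: 5 fractional bits plus a sign
# bit of sub-pixel sample position per axis gives 2^6 = 64 positions,
# hence an effective 64x64 sample grid. (Inference, not official.)
bits_per_axis = 5 + 1            # fractional bits + sign bit
positions = 2 ** bits_per_axis
print(f"{positions}x{positions} grid")   # -> 64x64 grid
```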
 
Sunrise said:
I second the thought that NV will try (no matter what) to still focus a lot of their effort on OpenGL.

As far as I'm aware, the requirements of DX and OGL aren't really contradictory. The programming API is different, but they basically require the same HW for many things.

Their OGL focus is mainly on the software side of things, methinks.
 
Jawed said:
I now think that if you count the sign bit, you get 6 so it's effectively a 64x64 grid for MSAA. I'm kinda surprised nobody else has expressed glee at that.

Heh, the numbers may look good but if current 16xAA is anything to go by I'm guessing any glee will come from watching 300x zoomed in screenshots of jaggies. :smile: What I'd be excited about isn't really the number of samples but the quality of them - if they're able to make 4xAA look as good or better than today's 6x/8x that'd be cool.
 
_xxx_ said:
As far as I'm aware, the requirements of DX and OGL aren't really contradictory. The programming API is different, but they basically require the same HW for many things.

Their OGL focus is mainly on the software side of things, methinks.
That's largely the case, indeed. And it's certainly not contradictory by any means, but I would still not underestimate driver development re: OpenGL, especially WRT the reasons already addressed.

Looking at their past and present HW, you are absolutely right, of course. Speaking strictly from a margin standpoint, there probably isn't anything that could give them more of an advantage than using identical ASICs, and since they've been very successful with this, they'll certainly continue with that strategy.
 
trinibwoy said:
Heh, the numbers may look good but if current 16xAA is anything to go by I'm guessing any glee will come from watching 300x zoomed in screenshots of jaggies. :smile: What I'd be excited about isn't really the number of samples but the quality of them - if they're able to make 4xAA look as good or better than today's 6x/8x that'd be cool.
That's why sparse sampling, which maximises the effective edge resolution, is so important, and why ATI's ability (with Super AA) to support sparse 12xAA is funky.

http://www.beyond3d.com/reviews/ati/crossfire/index.php?p=16

Whether it's possible to see the benefit of more than 12xMSAA at typical enthusiast gamer resolutions (1280 or 1600) is arguable - I dunno. But for professional graphics stuff, I guess the more the merrier.
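
A quick toy sketch of what "effective edge resolution" buys you - the sample positions below are made up for illustration, not ATI's actual patterns:

```python
# Sketch of "effective edge resolution": for a near-horizontal edge, the
# number of distinct intensity steps a pixel can show is roughly the number
# of distinct sample rows it covers (and columns, for near-vertical edges).
# The patterns here are illustrative only, not any vendor's real layout.

def edge_resolution(pattern):
    rows = {y for _, y in pattern}
    cols = {x for x, _ in pattern}
    return len(rows), len(cols)

ordered_4x = [(1, 1), (3, 1), (1, 3), (3, 3)]   # 2x2 ordered grid on a 4x4 subgrid
sparse_4x  = [(0, 2), (1, 0), (2, 3), (3, 1)]   # sparse: every row and column unique

print("ordered 4x:", edge_resolution(ordered_4x))  # (2, 2)
print("sparse  4x:", edge_resolution(sparse_4x))   # (4, 4)
```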

Jawed
 
Maybe Dave could get co-worker Richard Huddy to comment on Tony Tamasi's remark about unified shaders still being too early. Huddy mentioned...

Mr. Huddy said that the Xbox 360 game console, which sports the ATI-developed Xenos graphics core with a unified shader architecture and 48 shader processors, loses 20% to 25% performance in pixel-shader-limited games when its graphics chip is configured as non-unified, e.g. 16 processors work strictly on vertex shaders, whereas 32 are assigned to pixel shaders.
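
As a rough back-of-envelope model (my own toy assumptions, not Huddy's methodology), you can see where numbers of that order come from when a fixed 16/32 split meets pixel-heavy frames:

```python
# Toy throughput model for a unified vs. a fixed-split shader array.
# All numbers here are illustrative assumptions, not measured data.

def relative_perf(pixel_fraction, total_units=48, vs_units=16, ps_units=32):
    """Fixed-split performance relative to a unified array, for a frame whose
    shading work is `pixel_fraction` pixel work and the rest vertex work."""
    vertex_fraction = 1.0 - pixel_fraction
    # Unified: all units chew through the combined workload.
    unified_time = 1.0 / total_units
    # Fixed split: the more heavily loaded pool sets the frame time,
    # while the other pool sits partly idle.
    split_time = max(vertex_fraction / vs_units, pixel_fraction / ps_units)
    return unified_time / split_time

for p in (0.70, 0.80, 0.85, 0.90):
    loss = (1.0 - relative_perf(p)) * 100.0
    print(f"pixel work {p:.0%}: fixed split loses ~{loss:.0f}%")
```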
 
I suppose it's worth following up that NVidia may go one step funkier by varying the sparse sampling pattern across the screen. There was a patent for this recently, wasn't there?

On a single card with 4xMSAA per pixel and a 64 EER sparse grid you could get loads of different sparse sampling patterns - erm, it's the eight queens problem, but with 4 queens on a 64x64 board. Completely nutty number of solutions.

http://www.math.utah.edu/~alfeld/queens/queens.html

I don't know how you'd define the good ones, except by saying that each quadrant of the grid should have one sample.
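
Here's a little sketch of one way to generate such patterns, under my own reading of "good" (one sample per quadrant, queen-style no shared rows, columns or diagonals) - nothing from the patent itself:

```python
import random

# Sketch: generate a 4-sample sparse AA pattern on a 64x64 subpixel grid,
# treating samples like non-attacking queens (no shared row, column or
# diagonal) and requiring one sample per quadrant. Purely illustrative.

GRID = 64
HALF = GRID // 2

def non_attacking(samples):
    for i, (x1, y1) in enumerate(samples):
        for x2, y2 in samples[i + 1:]:
            if x1 == x2 or y1 == y2 or abs(x1 - x2) == abs(y1 - y2):
                return False
    return True

def random_pattern(rng=random):
    """Pick one sample in each quadrant, retrying until the set is queen-like sparse."""
    quadrants = [(0, 0), (HALF, 0), (0, HALF), (HALF, HALF)]
    while True:
        samples = [(rng.randrange(qx, qx + HALF), rng.randrange(qy, qy + HALF))
                   for qx, qy in quadrants]
        if non_attacking(samples):
            return samples

# A different pattern per pixel (or per tile) could then vary across the screen.
print(random_pattern())
```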

Jawed
 
Nvidia 80nm G80 (GeForce 8800) 600MHz-700MHz?

So if the 80nm G80 has around 560 million transistors, which is 2x that of the 90nm G71 (278 million, with clock speeds up to 700MHz)! What kind of clock speed will we get with the 80nm G80?

130nm Nv40 400MHz (6800GTX) 222 million transistors
110nm Nv48 430MHz (6800GTX512) 222 million transistors
110nm G70 Tape out around 550MHz (Nv47 7800GTX/512) 302 million transistors
90nm G71 Tape out around 650MHz (Nv49 7900GTX/7950GX2) 278 million transistors
80nm G80 tape out around speed? (Nv50 8800GTX) 560 million transistors
65nm G81 (8900/GX2) Spring Refresh

I believe the 80nm G80, at 500+ million transistors, should be clocked the same as the 90nm G71's stock 650MHz.

And don't forget about Nvidia's spring refresh in April 2007, the 65nm G81 (8900)... can you say GX2 Quad SLI and higher clock speeds up to 700MHz-800MHz with 65nm?
 
ToxicTaZ said:
So if the 80nm G80 has around 560 million transistors, which is 2x that of the 90nm G71 (278 million, taped out @700MHz max clock speeds)! What kind of clock speed will we get with the 80nm G80?

110nm G70 maxed out around 600MHz (Nv48 7800GTX/512) 325 million transistors
90nm G71 maxed out around 700MHz (Nv47 7900GTX/7950GX2) 278 million transistors
80nm G80 tape out speed? (Nv50 8800GTX) 560 million transistors

I believe the 80nm G80, at 500+ million transistors, should be clocked the same as the 90nm G71's stock 650MHz.

And don't forget about Nvidia's spring refresh in April 2007, the 65nm G81 (8900)... can you say GX2 Quad SLI and higher clock speeds up to 800MHz!


There are a few minor inaccuracies in there:

- G70 maxed out officially at 550MHz, not 600, had 302 million transistors, not 325, and it was the original NV47, not NV48 (see below).
- G71 maxed out officially at 650MHz, not 700, and it was the original NV49, not NV47 (BTW, NV48 was the codename for a rare version of one of the GeForce 6800 variants).
- G80... no comment :)
 
If they can get Quad SLI drivers working at optimal performance, then with a G80 GX2 you could sit back and not upgrade again until DX11 comes out. :p
 
Tony Tamasi said:
Of course, DX10 has some great features for image quality, both in terms of API-visible functionality like geometry shaders, as well as more consistent and specified behavior for things like texture filtering, antialiasing, and transparency that should also benefit first-generation DX10 games.
What is meant by "more consistent and specified behavior"?

Tony Tamasi said:
Some features that developers have continued to express strong interest in would be generalized tessellation, and more-sophisticated methods for solving transparency, among others.
What would be "generalized tessellation"?
 
One explanation could be programmable tessellation within the GPUs, in order to achieve in the long run (as programmability rises over time) true adaptive on-chip tessellation.

AFAIK developers are truly asking for it, trouble being that there obviously wasn't enough HW space for it for the entire D3D10 (future updates included).

If you look at the quite old DX-Next article here you may find a section about tessellation in there. It is my understanding that there must have been some hints in early drafts at somewhat programmable tessellation, which then turned into something fixed-function, then an optional unit, and then vanished altogether.
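
To make "adaptive" concrete, here's a toy sketch of the sort of decision a programmable tessellator could make on-chip - picking a tessellation factor from the projected edge length. This is purely my illustration, not anything from the DX-Next drafts or any real API:

```python
# Toy illustration of adaptive tessellation: decide how many segments to
# split a triangle edge into based on its projected (screen-space) length.
# Purely a sketch of the idea, not any real API.

def tess_factor(edge_len_pixels, pixels_per_segment=8.0, max_factor=64):
    """More segments for long on-screen edges, fewer for short ones."""
    factor = max(1, round(edge_len_pixels / pixels_per_segment))
    return min(factor, max_factor)

# A distant (short) edge barely gets split; a close-up (long) edge gets
# split finely, so geometric detail goes where it is actually visible.
for length in (4, 32, 256, 2000):
    print(f"edge of {length:4d} px -> tessellation factor {tess_factor(length)}")
```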
 
Ailuros said:
AFAIK developers are truly asking for it, trouble being that there obviously wasn't enough HW space for it for the entire D3D10 (future updates included).
Can you point me to links that stated or suggested:

1) the hw space cost; and
2) that this will never be in any future D3D10 versions?
 
I get the feeling nvidia will be trying for nearer a 350mm² die area than 400mm². I think their mindset now is multiple simple blocks rather than single complicated blocks. Lower power, lower cost, less risk. You still get speed when you lump 2 to 4 together, though. I think the non-unified shaders sum this approach up as well... just do what you have to to meet the spec, but nothing too fancy. Low risk, high reward(?)

On the other hand, ATI's approach is a lot more exciting. It will be far more interesting to see how well they pull off the efficiency of the unified shaders. I'm not too keen on the 3:1 shader ratio of their current cards (I still think it is a waste for the current time frame), so it will be fascinating to see how the new approach works in practice. A possible 4:1 ratio and unified shaders... death or glory time!
 
dizietsma said:
I get the feeling nvidia will be trying for nearer a 350mm² die area than 400mm². I think their mindset now is multiple simple blocks rather than single complicated blocks.
I'm not so sure. I think the decision to do what was little more than a die shrink between the 7800 and the 7900 was a decision borne out of performance just as much as cost: they figured that just adding more pipelines wouldn't increase performance enough to justify the cost.

Since by the time the G80 comes out we will have (yet again) faster memory available, and since the advancements required by DX10 will require a fair amount of transistors, as well as other inevitable enhancements (such as FP16 MSAA), I'm really not expecting nVidia to skimp on die area.

After all, they do have the spectre hanging over their head of ATI's unified architecture. With such a dramatic disparity in design culture this time around (something I don't think we've seen since the GeForce 256/GeForce2 vs. Rage Fury MAXX vs. Voodoo5), there is an inherent danger that one company's design decision will come out on top in a dramatic fashion. If nVidia understands the danger inherent in such a disparity, even if they believe they are correct on the path they are on, I fully expect them to pull out all the stops on this next architecture. Allowing ATI to significantly outperform them could cost them billions.

Note that ATI is in the same boat, by my estimation.
 
Following simple logic, nV had MUCH more time to develop, (re)design and test their new architecture. While ATI needed resources for Xenos development and for resolving the problems with R520, nV has had almost an extra year in which they could invest most of their resources into theoretical development. That's why I think they'll have the overall significantly faster and cheaper-to-produce chip this time around, simply because they had more time for tweaking and optimizing the design.
 