NVIDIA's Goldman Sachs Webcast, Summary/Transcript (info G71, G80, RSX, etc.)

Jawed said:
As I've said, a load of devs are already finding out what a unified architecture can do, with Xenos.

No they're not, because there is nothing new to discover. You will have to write a vertex shader and a geometry shader and a pixel shader in HLSL, exactly like we do now. Except for the new integer and bitwise instruction support, there are no new instructions to use in any shader. Currently every matrix/vector/math instruction can be and is used in either the vertex or the pixel shader, and the same goes for texture lookups, function calls, conditional statements, etc. From a higher, API way of organizing things, yeah, sure, they are all "shaders".

In Direct3D 10, all shader stages offer the same base functionality, which is implemented by the Shader Model 4.0 Common Shader Core. In addition to the base, each of the three shader stages (vertex, geometry, and pixel) offers some unique functionality available only to that stage, such as the ability to generate new primitives from the geometry shader stage or to discard a specific pixel in the pixel shader stage.
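To make that concrete, here is a rough SM 4.0 sketch (the resource and function names are placeholders I just made up): the very same helper compiles for a vertex, geometry or pixel shader today.

```hlsl
// Hypothetical SM 4.0 helper: nothing in here is stage-specific.
Texture2D    gTex;   // placeholder names, not from any real project
SamplerState gSamp;

float4 ShadeCommon(float3 n, float3 l, float2 uv)
{
    float  d = saturate(dot(normalize(n), l));   // plain ALU math, usable in any stage
    float4 c = gTex.SampleLevel(gSamp, uv, 0);   // explicit-LOD fetch works in VS, GS and PS
    return c * d;                                // (implicit-LOD Sample() is PS-only)
}
```

Call it from a vertex or a pixel shader and the compiler doesn't care; only the handful of stage-specific bits (SV_Position output, pixel discard, primitive emission) differ.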

What does that unify that isn't already, currently unified?

As for the next big thing being The Big Unification... Unification for whom? Not for the developers, because HLSL, GLSL and Cg already do 99% of that. Maybe the hardware? Well, there I don't know. What I know is that the work of a vertex shader is and always will be different, in amount and type, from the work of a pixel shader, even if what they both end up doing is just math and texture lookups.

In my opinion, the G80 will unify vertex and geometry shaders, but not the pixel shaders. Maybe with each unit running at different speeds, like for example a single vertex/geometry shader @ 2GHz and 32 pixel shaders at 700MHz? Is this brute force? Not elegant? Maybe, but it is the best way of getting the maximum out of each stage: with specialized units.
 
trinibwoy said:
That I will certainly give you - decoupled texturing doesn't depend on unification. But since the patent also allows for coupled texture-units I think it's more of a fleshing out of the unified architecture that somebody at Nvidia is thinking about. You don't think they went through all that trouble just to describe decoupled texture units and throw it away afterwards?
No, the way I see it is:
  • G80 will use decoupled texture units for each of GS, VS and PS (decoupling is required to stand any chance of having decently small batches, hence decent branching performance). We could get into a discussion about batching/threading sizes in GS/VS versus PS...
  • G90 will share texture units across the GS, VS and PS
  • G100 (for the sake of argument - the unified architecture) will be fully unified with a "single" multi-threaded processing unit associated with a decoupled texturing unit
When these patents are written up, the authors try to cover all possible embodiments in the description (and the claims are written in such a way as to ensure that they don't block off any such embodiments). So whatever's on their minds will make it in there, regardless of whether they'll ever implement it that way.

Right now, I expect NVidia is simulating its own unified architecture. I don't for a minute deny they have designs for it - my only point is that nothing indicates they've started early enough to make G80 unified. And the way I see it, top-to-bottom in first-gen D3D10, in real-world FLOPs per Watt, NVidia will trail ATI significantly.

---

I'm still intrigued to hear alternative definitions for a unified architecture...

Jawed
 
Sigma said:
No they're not, because there is nothing new to discover. You will have to write a vertex shader and a geometry shader and a pixel shader in HLSL, exactly like we do now. Except for the new integer and bitwise instruction support, there are no new instructions to use in any shader. Currently every matrix/vector/math instruction can be and is used in either the vertex or the pixel shader, and the same goes for texture lookups, function calls, conditional statements, etc. From a higher, API way of organizing things, yeah, sure, they are all "shaders".
I was talking about performance :LOL:

What does that unify that isn't already, currently unified?
Well I get the feeling that since before SM2 arrived, the desire for logical unification has been strong in certain quarters - regardless of hardware unification. The idea of a GPU evolving in programmability towards a CPU has been around for ages.

As for the next big thing being The Big Unification... Unification for whom? Not for the developers, because HLSL, GLSL and Cg already do 99% of that. Maybe the hardware? Well, there I don't know. What I know is that the work of a vertex shader is and always will be different, in amount and type, from the work of a pixel shader, even if what they both end up doing is just math and texture lookups.
Those amounts are going to change, though, aren't they? Particularly as the GPU is able to take on a whole class of work that's historically stayed on the CPU. Similar to how GPUs took on T&L.

In my opinion, the G80 will unify vertex and geometry shaders, but not the pixel shaders.
Now that's an interesting idea that I like the sound of. Partially unified. Hmm... That actually has a ring of credibility to it.

Maybe with each unit running at different speeds, like for example a single vertex/geometry shader @ 2GHz and 32 pixel shaders at 700MHz? Is this brute force? Not elegant? Maybe, but it is the best way of getting the maximum out of each stage: with specialized units.
Except for the rendering passes that perform zero pixel shading, when you'd have a great chunk of the die doing nothing...

Jawed
 
Jawed said:
Except for the rendering passes that perform zero pixel shading, when you'd have a great chunk of the die doing nothing...

Jawed

Here's an idea out of left field. . .how about expanding the idea of variable clocks, by domain? Notebook parts have had variable clocks for some time, and NV has domain clocking. . .

Edit: Oh. Err. Gee, I thot I might have heard it before. . .like, *right* before. :LOL: Too much multi-tasking!
 
Jawed said:
Except for the rendering passes that perform zero pixel shading, when you'd have a great chunk of the die doing nothing...
As I said before: decouple geometry and pixel processing with a big enough FIFO and both parts don't have to work on the same rendering pass anymore.
 
Jawed said:
I was talking about performance :LOL:

Oh, then you assume that unification equals performance?
Developers will always look forward to the next-gen architecture (unified or not), because it will be faster than the last one.

Jawed said:
Those amounts are going to change, though, aren't they? Particularly as the GPU is able to take on a whole class of work that's historically stayed on the CPU. Similar to how GPUs took on T&L.

Are they? A single fullscreen quad (4 vertices) can produce 1024*768 pixels. For now, and for a very long time to come, there will be far more pixels to shade than vertices to calculate, for sure. Even if it isn't graphics work, say GPGPU, it still usually uses more pixels/textures than vertices.
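A back-of-the-envelope sketch of that asymmetry (an SM 4.0 example with made-up names, just for illustration): four vertex shader invocations feed roughly three quarters of a million pixel shader invocations at 1024*768.

```hlsl
// Hypothetical full-screen quad pass.
struct VSOut { float4 pos : SV_Position; float2 uv : TEXCOORD0; };

VSOut QuadVS(float2 posNDC : POSITION, float2 uv : TEXCOORD0)
{
    VSOut o;
    o.pos = float4(posNDC, 0.0f, 1.0f);   // runs 4 times: once per quad vertex
    o.uv  = uv;
    return o;
}

Texture2D    gSrc;    // placeholder resource names
SamplerState gSamp;

float4 QuadPS(VSOut i) : SV_Target
{
    return gSrc.Sample(gSamp, i.uv);      // runs 1024*768 = 786,432 times
}
```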

Jawed said:
Except for the rendering passes that perform zero pixel shading, when you'd have a great chunk of the die doing nothing...

Can you name an example of something that does not produce pixels?
Remember that you only need 1 or 3 vertices processed before you can start to rasterize and shade the pixels they produce... Even with a small buffer between the two pipeline stages, it is very, very difficult to have the pixel shader sit idle waiting for the vertex shader.

In the current DX10 API, the vertex shader must output to the geometry shader, and then the data can either go to the pixel shader or be streamed out to memory. The only chance that the pixel shader can be completely idle is if the output of the geometry shader never reaches it, like for example... (I can't think of anything.. ;) ) And look at Xmas's last post...
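For reference, a minimal pass-through geometry shader (a made-up SM 4.0 sketch) sits exactly at that point in the pipeline: whatever it appends either goes on to the rasterizer and pixel shader, or is captured to memory via stream output.

```hlsl
// Hypothetical pass-through GS, SM 4.0.
struct GSIn { float4 pos : SV_Position; };

[maxvertexcount(3)]
void PassThroughGS(triangle GSIn input[3], inout TriangleStream<GSIn> stream)
{
    // Re-emit the incoming triangle unchanged; downstream it either hits the
    // rasterizer (and thus the pixel shader) or is captured by stream output.
    for (int i = 0; i < 3; ++i)
        stream.Append(input[i]);
    stream.RestartStrip();
}
```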
 
Jawed said:
When these patents are written-up, the authors try to cover all possible embodiments in the description (and the claims are written in such a way as to ensure that they don't block off any such embodiments). So whatever's on their minds will make it in there, regardless of whether they'll ever implement that way.

Right now, I expect NVidia is simulating its own unified architecture. I don't for a minute deny they have designs for it - my only point is that nothing indicates they've started early enough to make G80 unified. And the way I see it, top-to-bottom in first-gen D3D10, in realworld FLOPs per Watt, NVidia will trail ATI significantly.

Fair enough, that gels with what I'm thinking. Note I never claimed unified for G80, maybe that's where all your resistance was directed. It will be quite interesting to see how your predictions for DX10 performance are fleshed out.
 
Xmas said:
As I said before: decouple geometry and pixel processing with a big enough FIFO and both parts don't have to work on the same rendering pass anymore.
Would that be an acceptable compromise in a unified architecture? Would the extra memory bandwidth consumed (and latency incurred) be a win?

Would it simply be preferable on a non-unified architecture?

Jawed
 
Sigma said:
Can you name an example of something that does not produce pixels?
Stencil shadow volumes. Purely Z/stencil as I understand it.
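Roughly, such a pass boils down to something like this (a made-up SM 4.0 sketch, not from any real engine): a position-only vertex shader with no pixel shader bound, writing only Z/stencil.

```hlsl
// Hypothetical depth/stencil-only pass: no pixel shader is bound, so no
// colour is shaded at all; the pass only updates Z/stencil.
cbuffer PerObject           // placeholder constant buffer
{
    float4x4 gWorldViewProj;
};

float4 DepthOnlyVS(float3 posL : POSITION) : SV_Position
{
    return mul(float4(posL, 1.0f), gWorldViewProj);
}
```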

As to Xmas's point, it seems to me to be a trade-off of memory-bandwidth/latency versus unified scheduling. Under what circumstances is the former a win?

Jawed
 
Joe DeFuria said:
Yes, I think the impression is that they have shifted / softened on the hardware unification issue. Though it doesn't seem like the change of heart would have been in time for '06 release of parts.

The general consensus has been pointing to G80 not being unified (while still supporting D3D10, of course), but nVidia's next major architecture would be unified. That might be G100 though...

"G100" is one notch too much IMHO.
 
There weren't and aren't any problems with 90nm. Those with a tad wider horizon and a tad more objectivity could maybe figure out the real reasons, but it's irrelevant at this point.

Both IHVs are doing more than well in terms of sales lately and I'd figure that both are concentrating mostly (as anyone would expect) on their coming D3D10 architectures.

As for the supposed FLOPs/Watt ratings, that's the wildest shot in the air right now and laughable at best. One would be lucky to know a few tidbits about each architecture, let alone details on relative power consumption.

The best kneejerk of them all is one real professional trying to explain that a non-unified shader core would not necessarily signify any doom as of yet, while ironically the company he works for does have a very sophisticated, feature-rich USC on display right now.
 
geo said:
Who's to say G90 isn't to G80 what G70 is to GF6?

After the SM3.0 bulk I would speculate for D3D10:

1. One generation with separate pixel and geometry units.
2. One generation with a USC.

Now put any codename behind (1) and (2); I'm just borrowing numbers here.
 
Jawed said:
Would that be an acceptable compromise in a unified architecture? Would the extra memory bandwidth consumed (and latency incurred) be a win?
I don't see a reason to do that in a unified IMR. When there's no reason one type of shader would stall/starve the other, why would you decouple them?

A TBDR of course always writes some scene data to memory, whether unified or not, but that's for a different purpose, and usually a big win.

Would it simply be preferable on a non-unified architecture?
It would mean a certain bandwidth cost in exchange for better shader utilization, in cases where you have some render passes stressing the geometry side and some stressing the pixel side.

Note I'm not advocating non-unified architectures here, just saying that there might be some headroom left for them. I don't doubt unified will be the future.
Performance/Watt is an interesting one since it could be more about how much unnecessary work you save instead of whether all units are running at peak load.


Ailuros said:
The best kneejerk of them all is one real professional trying to explain that a non-unified shader core would not necessarily signify any doom as of yet, while ironically the company he works for does have a very sophisticated, feature-rich USC on display right now.
:LOL:
 
Ailuros said:
"G100" is one notch too much IMHO.

I'm just throwing out arbitrary numbers. If what nVidia said is true, that the next gen architecture will last two years, then I expect that over the next two years we'll have G80 and G90 based chips, both based on basically the same architecture.

Basically, I do expect nVidia's "next architecture" after G80 to be of the unified variety, but going from G80 to G90 doesn't have to represent a genuine "new architecture." More like G80 in 4Q '06, G90 in 4Q '07... and then unified (G100?) sometime mid to late '08.
 
_xxx_ said:
I dunno 'bout current numbers, but it's not like these are carved in stone forever. And I see next to no ATI chips in high-end notebooks right now. Let's see how it works out this and next year.
I'm currently buying a high-end laptop as a gift for my brother, and Dell is offering only the 256MB GeForce 7800 with their 17" model. Selecting it causes a 1-month delay in ship time. :devilish:

epic
 
epicstruggle said:
I'm currently buying a high-end laptop as a gift for my brother, and Dell is offering only the 256MB GeForce 7800 with their 17" model. Selecting it causes a 1-month delay in ship time. :devilish:

epic

That's because very few people buy these, so they don't have them in stock immediately, as opposed to the cheaper models, I guess.
 
Joe DeFuria said:
I'm just throwing out arbitrary numbers. If what nVidia said is true, that the next gen architecture will last two years, then I expect that over the next two years we'll have G80 and G90 based chips, both based on basically the same architecture.

Basically, I do expect nVidia's "next architecture" after G80 to be of the unified variety, but going from G80 to G90 doesn't have to represent a genuine "new architecture." More like G80 in 4Q '06, G90 in 4Q '07... and then unified (G100?) sometime mid to late '08.

Someone also saw an experimental deferred renderer layout from NV. And yes you're supposed to take that as a joke, because I can't take it seriously myself.

What you call G90 sounds more like a refresh to me; if it were up to me I'd name it G85 or something like that. I'm talking about real generations obviously ;)

***edit: SM3.0 will also last them 3 years (2004-2007), but that's mostly for the low end.
 