NV40: 6x2/12x1/8x2/16x1? Meh. Summary of what I believe

Chalnoth said:
FX16 doesn't really even exist. FX12 was a term coined by nVidia to describe their 12-bit integer format. Nobody has a 16-bit integer format that is called FX16.

While we are nitpicking, FX12 is NOT an integer format...
 
Xmas said:
DemoCoder said:
Wrong, GLSLANG supports FX16 integers.
Wrong. GLslang supports 16-bit signed integers, but that doesn't mean it supports FX16.

Well, that's a typo, I meant to say 16-bit integers. But anyway, integers are integers. Give me an integer and I'll give you fixed-point math. Give me hardware with built-in barrel shifting, and I'll give it to you for next to no penalty. Even with a penalty, in the rare cases where you might want to treat them as FX16, it's still worth it compared to letting the units lie idle.
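A minimal C sketch of the point, assuming a hypothetical s7.8 layout for an "FX16" value (the fx16 type, the helper names, and the 8-bit fraction split are illustrative, not from any spec): add/subtract are ordinary integer instructions, and multiply only needs one extra shift, which a barrel shifter makes essentially free.

Code:
#include <stdint.h>

/* Hypothetical "FX16": a signed 16-bit value read as s7.8 fixed point.
   The layout is assumed purely for illustration. */
typedef int16_t fx16;
#define FX16_FRAC_BITS 8

static inline fx16  fx16_from_float(float f) { return (fx16)(f * (1 << FX16_FRAC_BITS)); }
static inline float fx16_to_float(fx16 x)    { return (float)x / (1 << FX16_FRAC_BITS); }

/* Add/sub are plain integer instructions -- no extra work at all. */
static inline fx16 fx16_add(fx16 a, fx16 b) { return (fx16)(a + b); }

/* Multiply: widen, integer-multiply, then shift the surplus fraction bits
   back out -- the shift is where built-in barrel shifting helps. */
static inline fx16 fx16_mul(fx16 a, fx16 b)
{
    return (fx16)(((int32_t)a * (int32_t)b) >> FX16_FRAC_BITS);
}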
 
DaveBaumann said:
Sorry - how is adopting a specification that wasn't in either core API's current release and subsequently not supported in future APIs the "right idea early"? Surely the right idea would be to adopt the correct format when the APIs were adopting it as well?

So you believe IHVs should design their hardware *ONLY* to what's in the spec, and not add any new features that aren't already standardized, but instead wait 12 months to 2 years for it to be hashed out in committee? Perhaps you think we should get rid of the OpenGL extension mechanism that allows vendors to expose features in their HW that Microsoft won't let them, and that the only features that can be exposed in OpenGL are those that ARB adds to the Core?

Hey, why even have R&D? Let other IHVs and Microsoft do it for you, then just implement their spec, right?

I mean, come on Dave, you can't expect the development teams working today, who are designing for 18 months from now, to wait for Microsoft or ARB to tell them what to do, and you can't expect features designed for 18 months from now to be agreed upon by everyone, especially since it gives away what the IHVs are working on.

I've been sitting on standards committees for years, and it's simply not how it works. What you do is lobby to get your vision and your features supported, and then once the spec reaches a working draft, you start trying to do two things:

1) alter your design to track the working draft
2) expose features which did not make it into the working draft in a nice and friendly way

But this is late in the game, and sometimes it is not always possible to make radical changes, so you end up non-compliant.

If we had to live in a world with "design by committee", we'd be f*cked.

This idea that you don't add any features that aren't in current or future specs is a naive fairy tale.
 
DemoCoder said:
So you believe IHVs should design their hardware *ONLY* to what's in the spec, and not add any new features that aren't already standardized, but instead wait 12 months to 2 years for it to be hashed out in committee?
There ain't nothing wrong with adding new features, but they really should have worried about having their card be capable of performing the existing standards before they started tripping the cinematografantastic. :rolleyes:

C'mon DemoCoder, they screwed up....quit trying to justify it. :p
 
Demirug said:
Chalnoth said:
Now that's about the most ludicrous thing I've ever heard. It's certainly not a floating-point format.
FX12 is a fixed point format.
The difference is superficial as far as I am concerned. It only affects how you interpret numbers, not how (most) operations are actually performed.
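To make that concrete, here is a rough C illustration (the 10-fraction-bit interpretation is assumed purely for the example): the very same integer add gives the right answer whichever way you choose to read the bits; only the final conversion for printing differs.

Code:
#include <stdint.h>
#include <stdio.h>

/* A value stored in an int16_t, read either as a plain integer or as fixed
   point with 10 assumed fraction bits. The add instruction is identical;
   only how we print the result changes. */
int main(void)
{
    int16_t a = 0x0400;               /* integer 1024, or fixed point 1.0 */
    int16_t b = 0x0200;               /* integer  512, or fixed point 0.5 */
    int16_t sum = (int16_t)(a + b);   /* the same integer add either way  */

    printf("read as integer:     %d\n", sum);           /* prints 1536 */
    printf("read as fixed point: %g\n", sum / 1024.0);  /* prints 1.5  */
    return 0;
}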
 
digitalwanderer said:
DemoCoder said:
So you believe IHVs should design their hardware *ONLY* to what's in the spec, and not add any new features that aren't already standardized, but instead wait 12 months to 2 years for it to be hashed out in committee?
There ain't nothing wrong with adding new features, but they really should have worried about having their card be capable of performing the existing standards before they started tripping the cinematografantastic. :rolleyes:

C'mon DemoCoder, they screwed up....quit trying to justify it. :p


Hey, if you've read my posts since the NV30 was released, I have more than criticised them for the mistakes they made and I was probably the most disappointed of anyone on this board. But I like to distinguish between *bad implementations* and *bad ideas*, and I prefer to criticize the idea instead of engaging in mindless corporate shilling or bashing.

Many people on this board seem to practice guilt-by-association: since NV30 had problems, or since Derek Perez is evil, every architectural idea coming from NVidia must be equally bad. To me, integer units are a great idea. I also like NVidia's gradient instructions, pack/unpack, and predicates. Whether or not they are usable in the NV30 is not relevant, since I believe they are fine ideas that should be part of the standard and other IHVs should adopt them.

NVidia created a pipeline with DDX/DDY, integer (but a bad implementation, limited to FX12 only), pack/unpack, and predicates before there was a spec. NVidia was able to get predicates and DDX/DDY into 2.0 extended and 3.0. Pack/unpack didn't make it.

I give them credit for that. I give ATI credit for adding centroid sampling when it wasn't in the spec, then successfully lobbying to have it included. Even if their first implementation was busted, I still would have given them credit. Uber buffers? Another good idea whose time has come.
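For anyone unfamiliar with the pack/unpack instructions mentioned above, here is a rough C sketch of the kind of operation they perform; the 2x16-bit layout and the function names are just an illustration of the idea, not NV30's actual instruction set or encodings.

Code:
#include <stdint.h>

/* Pack two 16-bit values into one 32-bit register-sized word and unpack
   them again. Shader pack/unpack instructions do this sort of shuffling in
   a single operation; the exact formats vary. */
static inline uint32_t pack_2x16(uint16_t lo, uint16_t hi)
{
    return (uint32_t)lo | ((uint32_t)hi << 16);
}

static inline void unpack_2x16(uint32_t packed, uint16_t *lo, uint16_t *hi)
{
    *lo = (uint16_t)(packed & 0xFFFFu);
    *hi = (uint16_t)(packed >> 16);
}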
 
DemoCoder said:
digitalwanderer said:
DemoCoder said:
So you believe IHVs should design their hardware *ONLY* to what's in the spec, and not add any new features that aren't already standardized, but instead wait 12 months to 2 years for it to be hashed out in committee?
There ain't nothing wrong with adding new features, but they really should have worried about having their card be capable of performing the existing standards before they started tripping the cinematografantastic. :rolleyes:

C'mon DemoCoder, they screwed up....quit trying to justify it. :p


Hey, if you've read my posts since the NV30 was released, I have more than criticised them for the mistakes they made and I was probably the most disappointed of anyone on this board. But I like to distinguish between *bad implementations* and *bad ideas*, and I prefer to criticize the idea instead of engaging in mindless corporate shilling or bashing.

Many people on this board seem to practice guilt-by-association: since NV30 had problems, or since Derek Perez is evil, every architectural idea coming from NVidia must be equally bad. To me, integer units are a great idea. I also like NVidia's gradient instructions, pack/unpack, and predicates. Whether or not they are usable in the NV30 is not relevant, since I believe they are fine ideas that should be part of the standard and other IHVs should adopt them.

NVidia created a pipeline with DDX/DDY, integer (but a bad implementation, limited to FX12 only), pack/unpack, and predicates before there was a spec. NVidia was able to get predicates and DDX/DDY into 2.0 extended and 3.0. Pack/unpack didn't make it.

I give them credit for that. I give ATI credit for adding centroid sampling when it wasn't in the spec, then successfully lobbying to have it included. Even if their first implementation was busted, I still would have given them credit. Uber buffers? Another good idea whose time has come.
Sorry, I thought you were trying to defend their 16/32 design aspect.

My bad, and my apologies.
 
I mean, come on Dave, you can't expect the development teams working today, who are designing for 18 months from now, to wait for Microsoft or ARB to tell them what to do, and you can't expect features designed for 18 months from now to be agreed upon by everyone, especially since it gives away what the IHVs are working on.

Who said "wait for MS to tell them what to do"? We know how it works – ATI and NVIDIA are the two main bodies pushing MS as to what should be in and what shouldn't. It was evident quite early on that integer wasn't part of the PS2.0 spec, as the only dispute is over FP16/FP32. I think it behoves the IHVs to support the current APIs to the best of their abilities, to give the end users the best support possible. As for removing the FX support – in relative terms it evidently was fairly trivial, since they removed it for their refresh parts, a larger change than they had done with other refreshes.
 
Irrelevant. You think ATI and NVidia are going to divulge to each other what they are working on for the NV50/60 and R500/600 and agree now, 2 years out, what features should be in the standard? That's not how it works, Dave. Companies do not want to divulge their future designs early in standards working groups. These chips are being designed, and the major architectural features "locked in", over a year before the first drafts of the next round of DirectX are even published.

Moreover, DirectX isn't the only relevant standard. ARB can't be bullied around by just 2 vendors the way ATI and NVidia can influence the Microsoft monopoly.

You need to put yourself in the shoes of an engineer today who is designing a chip due to hit the market in 2006. Is it your assertion that he should a) restrict his feature set to ONLY that which exists in DirectX9, or b) have weekly conference calls with competing IHVs over the next year to hammer out which features they are both developing in private?

There will always be proprietary features which are not exposed in the standards because hardware design mostly drives the standards, not the other way around.

(DX8 ps1.0, 1.1-1.3, and 1.4 are the best examples of this: specs which exist purely to "map" IHV-specific extensions and features. Each revision was pretty much one IHV.)

-DC
Sitting member on 3 W3C, 2 IETF, 2 OMA, and 2 JCP working groups.
 
DemoCoder said:
(DX8 ps1.0, 1.1-1.3, and 1.4 are the best examples of this: specs which exist purely to "map" IHV-specific extensions and features. Each revision was pretty much one IHV.)

The past isn't even an indicator of the present, let alone the future. Specifically, it wasn't that way with DX9, so why do you think it'll be that way in the future? Current trends would suggest there will be only one standard, regardless of IHV-specific features.
 
DemoCoder said:
Irrelevant. You think ATI and NVidia are going to divulge to each other what they are working on for the NV50/60 and R500/600 and agree now, 2 years out, what features should be in the standard? That's not how it works, Dave. Companies do not want to divulge their future designs early in standards working groups. These chips are being designed, and the major architectural features "locked in", over a year before the first drafts of the next round of DirectX are even published.
Was this the case when DX9.0 was drafted? Recalling the post below, it seems that in the case of DX9.0 certain issues were settled long enough beforehand that changes could possibly have been made. Or is that reading too much into it?
sireric said:
Dave H said:
Very interesting point about from-scratch vs. evolutionary designs. Getting back to the original issue: do you think the decision to base DX9 around FP24 was sealed (or was at least evident) early enough for Nvidia to have redesigned the NV3x fragment pipeline accordingly without taking a hit to their release schedule? (And of course the NV30 was realistically planned for fall '02 before TSMC's process problems.) Obviously a great deal of a GPU design has to wait on the details of the API specs, but isn't the pipeline precision too fundamental to the overall design? Or is it?

The FP24 availability was at least 1.5 years before NV30 "hit" the market. I'm sure MS would have been very reasonable to inform them earlier, or even work with them on some compromise. I have no idea what actually "happened". Not sure how that fits in with their schedule, but I think it's safe to say that they had time, if they wanted. I don't believe in the TSMC problems. I agree that low-k was not available, but the 130nm process was clean by late spring '02, as far as I know. The thing that people fail to realize is that MS doesn't come up with an API "out of the blue". It's an iterative process with the IHVs, and all of us can contribute to it, though it's controlled, at the end, by MS.

Or, since you might not be able to speak to Nvidia's design process: was R3x0 already an FP24 design at the point MS made the decision? If they'd gone another way--requiring FP32 as the default precision, say--do you think it would have caused a significant hit to R3x0's release schedule? Or if they'd done something like included a fully fledged int datatype, would it have been worth ATI's while to redesign to incorporate it?
No, it wasn't. I don't think FP32 would have made things much harder, but it would have cost us more in terms of die cost.

...and apparently the answer is no. Which brings to mind the question of why Nvidia stuck with a 4x2 for NV30 and NV35, if not because they didn't have the transistor budget to do an 8x1. Two ideas spring to mind. First, that they were so enamored of the fact that they could share functionality between an FP32 PS 2.0 ALU and two texture-coordinate calculators that they went with an nx2 architecture. Second, that they planned to stick with a 128-bit wide DRAM bus after all; that NV35 is not "what NV30 was supposed to have been", but rather the quickest way to retrofit improved performance (particularly for MSAA) onto the NV30 core; and that if NV35's design seems a little starved for computational resources compared to its impressive bandwidth capabilities (particularly w.r.t. PS 2.0 workloads), that's because it was just the best they could do on short notice.

I can't speculate too much, but I agree with some of your posts. At the end, the GF4 was a 4x? arch (was it x1 or x2 -- I don't remember). A natural evolution of that architecture would be a 4x2 still. A radical change there might be more than their architecture can handle.
 
Well, I agree that it is an iterative process, but depending on schedules there are cut-off points, and very early in design there is less collaboration. One of the standards groups I work with deals with mobile phones and carriers: companies like Nokia, Ericsson, and Samsung. They have very long production cycles too. Stuff that I worked on 2 years ago is just making it to market today. At a certain point in the process, no further changes can really be made unless they are very minor. If you have two vendors, say Nokia and Ericsson, and you want to get a new standard into the newest crop of phones, but they are already in feature freeze, what ends up happening is that the two biggest vendors agree to pare down the spec until it is a common denominator that fits both devices. DX8 was definitely such a compromise.

Identical experience in the software market, even though software is more flexible. With respect to application servers, say between IBM, BEA, and Oracle, all three will have some new features they want pushed, but shipping schedules can only be delayed so much, and at a certain point a settlement is made on something "less than desired", and the rest of the spec improvements are "delayed until the next revision".

We know NVidia is toying around with the idea of a "programmable primitive processor", and perhaps ATI is too. Do you think they're gonna wait until Microsoft releases a working draft spec to IHVs before they begin real implementation? No, real implementation will begin anyway. This exploration will give engineers a good idea of what's possible and what's not, and this R&D data will be the input into a specification at MS. However, if the spec takes too long to complete, NVidia or ATI will come upon their next cycle and will have to make a choice: ship without the new feature, or leave it in and release an extension. It's also possible that a spec is hammered out close to tape-out, but the changes are too radical to make it into the product.

ATI has features in the R300 which went beyond DirectX9.0. Clearly, the process of developing the R300 wasn't "let's design DX9 on paper with MS first, then implement the HW later".

3dLabs designed and shipped hardware that became the basis for their OpenGL2.0 proposals, which will become the basis for their next hardware, which will end up fully compliant with the spec.
 
You think ATI and NVidia are going to divulge to each other what they are working on for the NV50/60 and R500/600 and agree now, 2 years out, what features should be in the standard? That's not how it works, Dave. Companies do not want to divulge their future designs early in standards working groups. These chips are being designed, and the major architectural features "locked in", over a year before the first drafts of the next round of DirectX are even published.

There's a vast difference between designs and what's actually supported. And, yes, given what we know already of DX Next, do you think this hasn't already been shaped by NV50 and R500? MS haven't pulled these specs out of the air. Also, given the time span between the directions that MS are requesting for DX Next and its expected release, it's pretty well known what's expected next fairly early on in the design cycle of the next parts.

The competition is too tight to be wasting large transistor budgets on features that won't be supported in the API that your product is going to see 95% of its usage on – the difference between NV30 and R300 has probably put paid to that. I think you will see a number of optional elements in DX Next, but for the most part anything that isn't in the API and would require a reasonable portion of your die will not be in – hitting the standard with as much performance as possible will be the order of the day. You may even start to see some tougher decisions about what's supported in comparison to offering speed.

DirectX Next is going to be a very important point to hit correctly, certainly for NVIDIA, because a wider range of developers are going to be picking it up and a number more titles are going to be developed with the Xbox 2 as their primary target. IMO, this is probably one of the reasons why MS have been so early in dishing out the DX Next directions, so that there isn't a disparity in support. The last thing they want is another DX9 scenario on their hands, as the disparity in support hasn't reflected well on them to a certain extent.

Moreover, DirectX isn't the only relevant standard. ARB can't be bullied around by just 2 vendors the way ATI and NVidia can influence the Microsoft monopoly.

ATI and NVIDIA will be the primary steerers. 3DLabs still have a powerful voice on the ARB, but their hardware output is somewhat less. The likes of Apple, Dell, and potentially SGI these days are there to see that OpenGL advances their business in the correct fashion, and for them, seeing a consensus of support for the parts they are buying does that – the extension mechanism raises support issues for them, and they would rather there was support in the core API across both vendors, giving them the freedom to pick (witness the work Apple did in heading up one of the shader steering groups for the ARB).

Of course, OpenGL is very much secondary in terms of use to DX as far as ATI and NVIDIA's primary targets are concerned.

-DC
Sitting member on 3 W3C, 2 IETF, 2 OMA, and 2 JCP working groups.

Irrelevant. :)
 
ATI has features in the R300 which went beyond DirectX9.0. Clearly, the process of developing the R300 wasn't "let's design DX9 on paper with MS first, then implement the HW later".

And I'll wager that does not require a significant portion of their die – it may even be a useful side effect of their fundamental shader design. We are not talking about large areas of die, in comparison to the FX units in NV30.

3dLabs designed and shipped hardware that became the basis for their OpenGL2.0 proposals, which will become the basis for their next hardware, which will end up fully compliant with the spec.

And they have less than 1% market share. They are not designing for the consumer space, where DX performance is of paramount importance.
 
Given 3dLabs' HW output, they've had a disproportionate effect on OpenGL 2.0 for two main reasons. One, they're neither ATI nor Nvidia, so it is easier for those two to "agree" to use an independent third party's proposals (neither is at a disadvantage), whereas one vendor "rubber stamping" a competitor's proposal (which may already be implemented) puts the endorser at a disadvantage. Microsoft has played this game well on the W3C committees by withdrawing support for standards once they found out that the authors of the working draft had beta code almost ready to ship.

And the biggest reason: 3dLabs put forth one hell of a nice, clean design.
 
DemoCoder said:
So you believe IHVs should design their hardware *ONLY* to what's in the spec, and not add any new features that aren't already standardized, but instead wait 12 months to 2 years for it to be hashed out in committee? Perhaps you think we should get rid of the OpenGL extension mechanism that allows vendors to expose features in their HW that Microsoft won't let them, and that the only features that can be exposed in OpenGL are those that ARB adds to the Core?

Hey, why even have R&D? Let other IHVs and Microsoft do it for you, then just implement their spec, right?

I mean, come on Dave, you can't expect the development teams working today, who are designing for 18 months from now, to wait for Microsoft or ARB to tell them what to do, and you can't expect features designed for 18 months from now to be agreed upon by everyone, especially since it gives away what the IHVs are working on.

I've been sitting on standards committees for years, and it's simply not how it works. What you do is lobby to get your vision and your features supported, and then once the spec reaches a working draft, you start trying to do two things:

1) alter your design to track the working draft
2) expose features which did not make it into the working draft in a nice and friendly way

But this is late in the game, and sometimes it is not always possible to make radical changes, so you end up non-compliant.

If we had to live in a world with "design by committee", we'd be f*cked.

This idea that you don't add any features that aren't in current or future specs is a naive fairy tale.

Bingo, exactly what I was saying, only to get sarcastic responses. NVIDIA were the undisputed industry leader at the time they designed NV3x; they had been since the TNT2 days. I am gonna shut up now 'cause I don't see this conversation going anywhere... u bunch of fanboys u...
 
Personally, I'm intrigued by the way the R420 is going to do VS texture lookups compared to the NV40.
AFAIK, the NV40 has dedicated VS lookup units, but it can send pixels through those VS pipelines. On the other hand, the R420 shares a few lookup units between VS/PS, I believe (?). I believe this is the same thing NVIDIA originally tried to do with the NV30, but they failed to implement it, and it got canned eventually (they'd have had VS3.0 compliance had they managed it, really).

R420's approach is nearer to a "true" ILDP approach, although in the short term, the NV40's approach might be better because it allows NVIDIA to have more arithmetic power too.

Not much is known about the R420's VS anyway, besides that it seems to come from the R400... :s


Uttar

P.S.: Keep up discussing, and maybe this thread will have the most views on the page soon ;) Right now, there are enough views for ALL employees of ATI, NVIDIA and most AIBs to have seen this thread :LOL: - not that they did, but whatever :)
 
DemoCoder said:
And the biggest reason: 3dLabs put forth one hell of a nice, clean design.

I think if you ask someone from 3Dlabs you'll find that they are quite happy that ATI saw that as well and lent weight behind the proposals.
 