Carmack's comments on NV30 vs R300, DOOM developments

Is that right, Joe? I could have sworn it was the other way around, i.e. the R200 does have an FF TCL unit and it was one of the things they cut with the RV250.

I don't believe that's correct. To my understanding the RV-250's vertex unit is actually closer to the R-300's in design than to the R-200's. They didn't cut anything out of the R-200's design wrt vertex processing / T&L... they totally revamped it.

I'll do a bit of searching to refresh my memory though...
 
antlers4 said:
I'm really beginning to see why the R300 was such a successful product. They took the bold step of eliminating the fixed-functionality on all their previous chip generations and running all their operations with FP24. This clean, forward-looking design allowed them to achieve excellent performance on a mature process--their whole transistor budget was devoted to getting their FP24 path running fast enough to support everything. It's remarkable that their drivers were as good as they were at launch, considering how big a break this was with previous designs.
You do remember how DX9 drivers weren't available for a couple of months after launch, right? Notice that the NV30's drivers are also quite good in the fixed-function case.

The final result is a big chip that has to be clocked at 500 MHz to beat/exceed the R300, and so runs hot and power hungry even at .13 micron.
Due to excessive clock speed goals and the lack of low-k dielectrics.

I wonder if in the NV35 (or the NV31 or NV34) they will follow ATIs lead and run everything through the FP path.
Texture filtering is not going through the FP path on any processor. I find it rather doubtful that nVidia kept most of the integer pixel shading path (it appears that they just kept the integer register combiner path...).
 
Chalnoth said:
You do remember how DX9 drivers weren't available for a couple of months after launch, right? Notice that the NV30's drivers are also quite good in the fixed-function case.
Is that the drivers, or is it that it's using the old fixed-function engine from the GF3/4 for fixed-function work, rather than using only PS/VS?
 
Joe DeFuria said:
According to Carmack, nVidia has stated that performance enhancements are coming....so why should developers worry about NV30 extensions at all?

Which raises the additional question: why is CARMACK supporting NV30 extensions at all? The only reasonable explanation IMO, is that he is not confident that nVidia ARB2 path performance will be increased to the point where it is competitive. At least, not by the time Doom is released....
Maybe it's because he started out with NV30 paths <shrug>.
 
Maybe it's because he started out with NV30 paths <shrug>.

He also "started out" with vendor specific vertex paths, but dropped them in favor of the ARB path. That just makes sense as you don't have to support so many different paths as long as performance doesn't severly suffer because of it. So certainly, carmack would drop the NV30 path altogether if he felt it would offer no significant performance or quality increase over the ARB2 path.

To be clear, this is not to say that nVidia's ARB2 path can't come up to speed. However, if Carmack was confident that would be the case, I see no valid reason for still supporting the NV30 path.
 
Chalnoth said:
You do remember how DX9 drivers weren't available for a couple of months after launch, right? Notice that the NV30's drivers are also quite good in the fixed-function case.

Since I was talking about emulating existing functionality, and since DX9 wasn't released for months after the R300 was released, the availability of DX9 drivers is hardly relevant here, is it? :D ATI drivers were plenty buggy initially, of course, but they worked fine in many games and cleaned up quickly.

My point is that NVidia took a more conservative design route (logical design, not process design, of course), so it's hardly surprising that the NV30's fixed-function drivers are in good shape.

Chalnoth said:
The final result is a big chip that has to be clocked at 500 MHz to beat/exceed the R300, and so runs hot and power hungry even at .13 micron.

Due to excessive clock speed goals and the lack of low-k dielectrics.

But why does it need 125M transistors and excessive clock speeds to keep up? I argue that it is because NVidia didn't make the bold choice that ATI made to run everything through their FP pipe. NVidia made this decision because they didn't believe that they could, or needed to, make their FP pipe run fast enough. And if you don't try, you don't succeed.

ATI kind of "cheated" by maxing out at FP24, and this may end up hurting them in the CGI market, but I believe it won't hurt the gamer on any game released before 2007.
 
Doomtrooper said:
ATI kind of "cheated" by maxing out at FP24

What would be the difference in the output of an image comparing all three, FP16, FP24 and FP32?? Rendering errors?? Less vivid... can anyone explain this part?

If the calculations lose too much precision, you might see some banding, pixelization or visible polygon boundaries. The ATI DX9 car demo of normal mapping shows the kind of thing that happens when you lose precision (though I think the effect is subtle even there where they are comparing with fixed point). If you try Humus' Mandelbrot demo and zoom in too far, you might see another result of loss of precision.
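
To put rough numbers on it, here's a small C sketch (my own back-of-the-envelope comparison, not anything from ATI or nVidia) of the relative precision near 1.0 for the three formats; the 16-bit mantissa for ATI's FP24 is the commonly quoted figure, so treat that as an assumption:

```c
#include <math.h>
#include <stdio.h>

/* One unit in the last place just above 1.0 for a float with
 * 'mantissa_bits' fraction bits: 2^-mantissa_bits. */
static double ulp_near_one(int mantissa_bits)
{
    return ldexp(1.0, -mantissa_bits);
}

int main(void)
{
    /* FP16 and FP32 widths are the standard ones; 16 fraction bits for
     * ATI's FP24 is the commonly quoted figure (assumption, not spec). */
    struct { const char *name; int mant; } fmt[] = {
        { "FP16 (s10e5)",  10 },
        { "FP24 (s16e7?)", 16 },
        { "FP32 (s23e8)",  23 },
    };
    double lsb8 = 1.0 / 255.0;   /* one step of an 8-bit framebuffer channel */

    for (int i = 0; i < 3; ++i) {
        double ulp = ulp_near_one(fmt[i].mant);
        printf("%-14s ulp ~ %.2e  (~%.0f ulps per 8-bit colour step)\n",
               fmt[i].name, ulp, lsb8 / ulp);
    }
    return 0;
}
```

The point being that FP16 only has a handful of representable steps per displayable colour step, so a few dependent operations can already push errors into visible banding, while FP24 and FP32 have a lot more headroom.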
 
Thanks antlers... so ideally FP32 should be used all the time for any high-end rendering scene... yet for games it might be overkill (Carmack appears to think that way).

According to John Carmack, ATI not offering FP16 is an oversight, and he appears to think that FP24 is overkill?
 
Doomtrooper said:
According to John Carmack, ATI not offering FP16 is an oversight, and he appears to think that FP24 is overkill?

I don't think he thinks FP24 is overkill as long as the performance is there (and it is). I think his major problem with the R300 is limited shader length.
 
Doomtrooper said:
Thanks antlers... so ideally FP32 should be used all the time for any high-end rendering scene... yet for games it might be overkill (Carmack appears to think that way).

I'm not sure how many iterations would be required to show a difference between 24FP and 32FP, but it should be quite a bit more than required to show a difference between 16FP and 32FP. In either case, for games both 16FP and 24FP seem quite sufficient for color processing.
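
As a crude illustration of the iteration point, here's a toy model in C (my own sketch, not how real shader hardware rounds): quantize a running computation to n mantissa bits after every operation and count how many passes it takes to drift a visible amount (one 8-bit colour step) from a double-precision reference:

```c
#include <math.h>
#include <stdio.h>

/* Crude model of an n-fraction-bit float: round to n mantissa fraction bits
 * after every operation.  Ignores exponent range, denormals and the exact
 * rounding of real shader hardware, so the numbers are qualitative only. */
static double quantize(double x, int fraction_bits)
{
    if (x == 0.0) return 0.0;
    int e;
    double m = frexp(x, &e);                       /* x = m * 2^e, 0.5 <= |m| < 1 */
    double scale = ldexp(1.0, fraction_bits + 1);  /* 1 implicit bit + n fraction bits */
    return ldexp(round(m * scale) / scale, e);
}

/* Accumulate a small per-pass contribution and report how many passes it
 * takes before the reduced-precision sum is off by more than 1/255 (one
 * 8-bit colour step) from the double-precision reference. */
static long passes_until_visible(int fraction_bits)
{
    double ref = 0.0, lo = 0.0;
    for (long i = 1; i <= 200000; ++i) {
        ref += 0.0005;
        lo = quantize(lo + 0.0005, fraction_bits);
        if (fabs(ref - lo) > 1.0 / 255.0) return i;
    }
    return -1;   /* never drifted visibly within the budget */
}

int main(void)
{
    printf("FP16-ish (10 bits): %ld passes\n", passes_until_visible(10));
    printf("FP24-ish (16 bits): %ld passes\n", passes_until_visible(16));
    printf("FP32-ish (23 bits): %ld passes\n", passes_until_visible(23));
    return 0;
}
```

In this toy model each extra step of precision buys orders of magnitude more operations before the result drifts visibly, which matches the intuition above.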

According to John Carmack, ATI not offering FP16 is an oversight, and he appears to think that FP24 is overkill?

No, I don't think he said "oversight" at all. I think he did imply "overkill" for games, but since it is competing with another part running at 16FP that should not matter AFAICS.

It must feel good to have that choice vindicated (ap, se, ogg?)
 
Joe DeFuria said:
He also "started out" with vendor specific vertex paths, but dropped them in favor of the ARB path. That just makes sense as you don't have to support so many different paths as long as performance doesn't severly suffer because of it. So certainly, carmack would drop the NV30 path altogether if he felt it would offer no significant performance or quality increase over the ARB2 path.

To be clear, this is not to say that nVidia's ARB2 path can't come up to speed. However, if Carmack was confident that would be the case, I see no valid reason for still supporting the NV30 path.
You shouldn't fault a man for leaving options open for a demanding game, while also remembering NVIDIA's driver history.
 
Joe DeFuria said:
According to Carmack, nVidia has stated that performance enhancements are coming....so why should developers worry about NV30 extensions at all?

Which raises the additional question: why is CARMACK supporting NV30 extensions at all? The only reasonable explanation IMO, is that he is not confident that nVidia ARB2 path performance will be increased to the point where it is competitive. At least, not by the time Doom is released....

One problem with ARB_fragment_program on NVidia is that while you can specify "ARB_precision_hint_fastest" to instruct the driver, that hint covers the whole program, and if you absolutely need precision in some parts (like normalizing vectors, probably) you are kind of screwed. NVidia's extension allows you to explicitly specify which operations you want performed with maximum accuracy (see the sketch after the spec quote below).

Like the spec says:

More explicit control of precision, such as provided in "C" with data types such as "short", "int", "float", "double", may also be a desirable feature, but this level of detail is left to a separate extension.

...
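
For what it's worth, here is roughly what the whole-program hint looks like in ARB_fragment_program assembly, with the GL calls to load it. This is a sketch only: it assumes the extension entry points are available, and the light direction sitting in program.env[0] is just for illustration.

```c
#include <string.h>
#define GL_GLEXT_PROTOTYPES 1
#include <GL/gl.h>
#include <GL/glext.h>   /* GL_FRAGMENT_PROGRAM_ARB, glProgramStringARB, ...
                           (on Windows you'd fetch these entry points with
                           wglGetProcAddress instead) */

static const char fp_src[] =
    "!!ARBfp1.0\n"
    /* The hint applies to the WHOLE program -- there is no way to ask for
     * full precision on just the normalization below. */
    "OPTION ARB_precision_hint_fastest;\n"
    "PARAM lightdir = program.env[0];\n"
    "TEMP n;\n"
    "DP3 n.w, fragment.texcoord[0], fragment.texcoord[0];\n"
    "RSQ n.w, n.w;\n"
    "MUL n.xyz, fragment.texcoord[0], n.w;\n"    /* normalize interpolated vector */
    "DP3_SAT result.color, n, lightdir;\n"       /* N.L against the light direction */
    "END\n";

void load_fragment_program(GLuint *prog)
{
    glGenProgramsARB(1, prog);
    glBindProgramARB(GL_FRAGMENT_PROGRAM_ARB, *prog);
    glProgramStringARB(GL_FRAGMENT_PROGRAM_ARB, GL_PROGRAM_FORMAT_ASCII_ARB,
                       (GLsizei)strlen(fp_src), fp_src);
    /* NV_fragment_program, by contrast, picks precision per instruction via
     * opcode suffixes (MULR / MULH / MULX for fp32 / fp16 / fixed point). */
}
```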
 
Joe DeFuria said:
Maybe it's because he started out with NV30 paths <shrug>.

He also "started out" with vendor specific vertex paths, but dropped them in favor of the ARB path. That just makes sense as you don't have to support so many different paths as long as performance doesn't severly suffer because of it. So certainly, carmack would drop the NV30 path altogether if he felt it would offer no significant performance or quality increase over the ARB2 path.

To be clear, this is not to say that nVidia's ARB2 path can't come up to speed. However, if Carmack was confident that would be the case, I see no valid reason for still supporting the NV30 path.

You're really over-reaching and drawing conclusions that aren't necessarily there. Could it possibly be that all the paths are still there, he's just not actively supporting them anymore? I mean, really, he talked about the R200 path also. Does that mean that he has reservations that the 9700 can't deliver on the ARB2 path?
 
We are back to comparing proprietary vendor-specific extensions to ARB extensions... of course a proprietary extension will be better, as it's optimized for one manufacturer. Now if we start getting three or four major players in the graphics industry, we have three or four companies wanting proprietary extensions, which I assume wouldn't be ideal for the programmers.

Why can't Nvidia and ATI optimize the ARB extensions to suit their needs, instead of proprietary garbage?
 
Why can't Nvidia and ATI optimize the ARB extensions to suit their needs, instead of proprietary garbage?

From JC:


Reasonable arguments can be made for and against the OpenGL or Direct-X style of API evolution. With vendor extensions, you get immediate access to new functionality, but then there is often a period of squabbling about exact feature support from different vendors before an industry standard settles down. With central planning, you can have "phasing problems" between hardware and software releases, and there is a real danger of bad decisions hampering the entire industry, but enforced commonality does make life easier for developers.
 
Of course there is going to be squabbling; an IHV that wants to expose more potential in their hardware than another is not good for the PC 'gaming' industry (the results are Neverwinter Nights and Dronez)... that's the whole idea behind a governing body vs. one voice.

Everybody gets input... not just one. All I'm saying is, what would OpenGL look like right now if 3Dlabs and ATI and Matrox and VIA and PowerVR started to demand support for their own proprietary extensions to expose the optimal performance on each and every piece of hardware?
 
You're really over-reaching and drawing conclusions that aren't necessarily there. Could it possibly be that all the paths are still there, he's just not actively supporting them anymore? I mean, really, he talked about the R200 path also. Does that mean that he has reservations that the 9700 can't deliver on the ARB2 path?

If I am "over-reaching" about my "conclusion" (that Carmack isn't confident that NVidia's ARB2 path will be up to snuff by the time Doom3 is released), then please offer some other tangible, alternative reason why he's supporting the NV30 path.

Disagree with my opinion if you must, but don't tell me I'm over-reaching when your only response amounts to "that might not be the case" and offers no compelling alternative reason for supporting the path.

Could it possibly be that all the paths are still there, he's just not actively supporting them anymore?

:?:

Carmack said:

I fretted for a while over whether I should leave in support for the older APIs for broader driver compatibility, but the final decision was that we are going to require a modern driver for the game to run in the advanced modes. Older drivers can still fall back to either the ARB or NV10 paths.

There's no uncertainty that Carmack is:
1) Dropping support for those old vendor specific "vertex program"s.
2) He did CONSIDER leaving "support" in, but ultimately decided against it.
3) Modern drivers will be REQUIRED to run the vertex program paths. That is, unless you want to fall back to NV10 / ARB path, your driver had better support the "modern" vertex path.
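
To be concrete about what that fallback amounts to, here's the general shape of it in C. This is my sketch, not id's code: the extension names are the standard ones, and whether you'd actually prefer the NV30 path over ARB2 on NV30 hardware is exactly the performance question being argued in this thread.

```c
#include <string.h>
#include <GL/gl.h>

typedef enum {
    PATH_ARB,    /* lowest common denominator, more passes              */
    PATH_NV10,   /* register combiners                                  */
    PATH_R200,   /* ATI_fragment_shader                                 */
    PATH_ARB2,   /* ARB_fragment_program (R300 and up, NV30 and up)     */
    PATH_NV30    /* NV_fragment_program, per-instruction precision      */
} render_path;

static int has_ext(const char *exts, const char *name)
{
    return exts && strstr(exts, name) != NULL;   /* good enough for a sketch */
}

/* Prefer the most capable path the driver exposes, fall back otherwise.
 * A real engine would also let the user override this from the console. */
render_path choose_render_path(void)
{
    const char *exts = (const char *)glGetString(GL_EXTENSIONS);

    if (has_ext(exts, "GL_NV_fragment_program"))   return PATH_NV30;
    if (has_ext(exts, "GL_ARB_fragment_program"))  return PATH_ARB2;
    if (has_ext(exts, "GL_ATI_fragment_shader"))   return PATH_R200;
    if (has_ext(exts, "GL_NV_register_combiners")) return PATH_NV10;
    return PATH_ARB;
}
```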

Seems to me you're the one doing some "far reaching" speculation here.

I mean, really, he talked about the R200 path also. Does that mean that he has reservations that the 9700 can't deliver on the ARB2 path?

Did you read the same .plan that I did?

Obviously the R200 path is NEEDED for the R-200 series. Of course he's not going to drop that path.

And obviously, we already know that ATI delivered on the ARB2 path. The fact that the ARB2 path is his renderer of choice OVER the R-200 path for the R-300 tells us this.
 