Carmack's comments on NV30 vs R300, DOOM developments

Actually, there was more than just one person who made that claim. The one thing I've learned since Chalnoth arrived here (I take nothing away from his 3D knowledge) is:

1) You will never win an argument with him... sort of like having a pet rock and debating with it.
2) Chalnoth will use whatever excuse he needs to put his point of view/preference (NVIDIA) in the best light.

The thread where he contradicts himself is here:

http://www.beyond3d.com/forum/viewtopic.php?t=3908&postdays=0&postorder=asc&start=60
 
JD said:
The best part of the whole .plan file is that JC is pushing ARB extensions, which is a big win for the many devs disgruntled with the current vendor-specific extension mess. Also interesting is that he finds the GFFX noisy. Also, no mention of Cg at all.

Agreed.
We don't need Cg if we write code for OpenGL or DirectX; it only makes sense if you write for both APIs at the same time (and for NVIDIA only, because no OpenGL plugin for ARB2 is available at the moment...)

Thomas
 
I think it's pretty obvious that John Carmack is ridiculously biased. He's been demoing Doom3 with the R9700, yet he has an optimized code path for NV30 and none for R300? Everybody seems to be in love with Carmack because he "makes such great engines" (not my words), yet he doesn't even seem interested in optimizing across the board. Just totally ridiculous. Not that I was planning on giving id any of my money anyway for some eye candy with one-dimensional gameplay. :rolleyes:
 
OpenGL guy said:
Is that so? How many operations can you do with such a 16-bit float before the difference affects the LSB of an 8-bit integer format? Not too many.
Keep in mind that most internal calculations are done at higher precision than the storage formats. The NV30 appears to carry at least 12 bits of mantissa precision internally for its 16-bit calculations. While this extra precision is probably not stored each time around, it will significantly reduce the errors from what you've stated.

But I suppose you are right: there will be some errors. The thing is, the floating-point format ensures that they will be much less noticeable. The human visual system is much more highly attuned to intensity differences at very low brightness levels, and for each halving of the value, the floating-point format effectively gains a bit of "buffer zone" before the errors become apparent. For example, the number 1.1E-03 becomes 0.0011 when sent to the framebuffer; if that second digit had been in error, it wouldn't make any difference in the end, since the DAC can only deal with a granularity of 1/1024.

I still think 16-bit floats are enough for pretty much any color-only data (at least until 12-bit DACs are available...).
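To put rough numbers on that "buffer zone" argument, here's a minimal sketch in Python (assuming a standard fp16 layout with a 10-bit stored mantissa, and using NumPy's float16 as a stand-in for the hardware format): the fp16 rounding error scales down with the value, while the 8-bit and 10-bit output steps stay fixed.

import numpy as np

# Minimal sketch: fp16 quantization error shrinks with the value, while an
# 8-bit integer step (1/255) and a 10-bit DAC step (~1/1024) stay fixed.
for x in [1.0, 0.5, 0.1, 0.01, 0.0011]:
    err = abs(float(np.float16(x)) - x)   # round-to-nearest fp16 error
    print(f"x={x:<7} fp16 err={err:.1e}  8-bit LSB={1/255:.1e}  10-bit LSB={1/1024:.1e}")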

32-bit floats may be useful for things like normal maps. Whether or not these are used is up to JC, but it seems apparent to me that he is very concerned about offering an equivalent experience on every video card.
Carmack may not need 32-bit floats, but he might need more than 16-bit.
So? For such cases, he will use 32-bit floats on the NV30.
 
Nagorak said:
I think it's pretty obvious that John Carmack is ridiculously biased. He's been demoing Doom3 with the R9700, yet he has an optimized code path for NV30 and none for R300?
ATI has no proprietary extension exposing the R300's fragment shading featureset (at least, none that's currently on the OpenGL extension registry). John Carmack cannot write to an extension that does not exist.
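To illustrate, a minimal sketch: the extension string below is a made-up excerpt of what glGetString(GL_EXTENSIONS) might report on an R300-era driver, and "GL_ATI_fragment_program" is a deliberately fictitious name standing in for the R300-specific extension that doesn't exist.

# An engine can only target extensions the driver actually advertises.
reported = "GL_ARB_fragment_program GL_ARB_vertex_program GL_ATI_fragment_shader"
extensions = set(reported.split())

print("GL_ARB_fragment_program" in extensions)  # True: the ARB2 path is possible
print("GL_ATI_fragment_program" in extensions)  # False: no R300-specific fragment
                                                # extension exists to write a path for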
 
T2k said:
I wanna use 4xAA + 8xAF at at least 1280... so, what do you think NOW about a neck-and-neck situation? :p :D

C'mon: switch that on on the NV30 and the race is finished... :devilish:
Don't expect to be able to run at that resolution/FSAA in DOOM3 with any video card this year.
 
Oops, yes it was... do you want me to find the other thread regarding the 9700 with 24-bit...?

OpenGL guy got the answers from you anyway; that's what I wanted to hear.
 
Althornin said:
When it looked like the GFFX would be faster than the 9700 with 32-bit floating point (per component), you launched a tirade on these boards about how 24-bit floating point (per component) was insufficient.
No, I absolutely did not. You're imagining things.
 
Doomtrooper said:
Oops, yes it was... do you want me to find the other thread regarding the 9700 with 24-bit...?

OpenGL guy got the answers from you anyway; that's what I wanted to hear.
Be my guest. I've always said that it should be enough for pretty much any calculation.
 
Um, I have not read any of the evidence presented apart from this thread. However, I would like to resolve this little war so that we can continue to enjoy what has been a very interesting discussion.

So far in this thread, Chalnoth appears to have simply spoken his mind about what he sees as an acceptable level of precision at present. Apparently, Doomtrooper, from your previous experience you feel that he is biased against what appears to be your card of preference. Regardless, if you have nothing specific to argue against, I suggest you simply ignore the posts you find offensive and enjoy the rest of the conversation.

Can't we all just get along?
:?
 
Just a guy trying to make a difference... j/k ;)

I actually found this thread to be one of the most interesting posted recently, so I wanted to defuse the escalation.
I understand trying to get to the truth, but it can be a pain in the arse when it drags every thread down to a slinging match.
Chalnoth may well be wrong, but he used clear arguments. It would be more constructive to attack those than to attack him. All just my opinion, of course...
In any case, my post had the opposite of the desired effect; I have diverted the thread even further, so this will be the last thing I say on the matter. My apologies.

Peace
 
I also think 16-bit FP will be enough. I mean, even offline renderers don't use higher precision than 16-bit FP, right?

...and the NV30 will be faster in D3 than the R300... gee, what a shocker:

- running at lower precision
- using NVIDIA extensions
- clocked 175 MHz higher

It would be a shame if it weren't.
 
I think it's a tad too early to start debating which card will be faster in D3, but what did spark my interest is the info regarding the R300 running at 24 bits vs. the NV30 running at 12/16 bits by default.

Would I be correct in assuming this means that in every current game/benchmark test the NV30 is at an advantage in terms of speed, as it's only running at 16 bits, whereas the R300 is running at 24 bits for higher quality? If that's the case, then it should be mentioned in every comparison review.
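For a rough sense of scale, here's a minimal sketch, assuming the commonly cited layouts (fp16 with a 10-bit mantissa, ATI's fp24 with a 16-bit mantissa, fp32 with a 23-bit mantissa); the relative precision of each format is about 2^-(mantissa bits):

# Rough relative step size for each per-component format being debated.
for name, mantissa_bits in [("fp16", 10), ("fp24", 16), ("fp32", 23)]:
    print(f"{name}: ~2^-{mantissa_bits} = {2.0 ** -mantissa_bits:.1e}")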
 
I think it's pretty obvious that John Carmack is ridiculously biased. He's been demoing Doom3 with the R9700, yet he has an optimized code path for NV30 and none for R300?

Carmack has always approached 3D engines with a standard set of "levels" of compliance and specialization. It's a recurring theme. What he usually provides is the following:
A- a "standard" path usually designed to be the most commonly used
B- a few smaller, specialized engines for problem children, or optimizations for products that would benefit greatly from them.

As far back as Quake, there was default OpenGL as the "standard" path, with mini-renderers provided for 3dfx and PowerVR.

It seems to me that with Doom3, the "standard" path has been split into ARB/ARB2 for different levels of compliance, with special R200 and NV30 paths/renderers slapped on as well.

You obviously don't want to write a custom path/renderer for every single video card out there, but in this case, as he has stated, the NV30 running the ARB2 path (i.e. the path with the highest level of compliance/features) is nearly twice as slow as an R300. So what to do?

As for making a special R300 path/renderer, what would the payoff be? The real question is: what would a custom R300 path offer over the ARB2 path, precision- or performance-wise? Sure, you might gain a smallish performance improvement, but I'm not so sure it would warrant a completely new path.
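As a sketch of that tiered selection (hypothetical code, not id's; the fallback order is my assumption based on the defaults Carmack describes):

# Hypothetical back-end selection: try the most specialized renderer the
# driver exposes, then fall back toward the generic ARB path.
def pick_render_path(extensions):
    if "GL_NV_fragment_program" in extensions:
        return "NV30"  # vendor path: lower default precision, faster
    if "GL_ARB_fragment_program" in extensions:
        return "ARB2"  # floating-point fragment shaders
    if "GL_ATI_fragment_shader" in extensions:
        return "R200"  # fixed-point, almost always single pass
    return "ARB"       # baseline path

# An R300 driver advertises ARB_fragment_program (plus the older ATI
# extension), so it lands on ARB2 with no card-specific path needed:
print(pick_render_path({"GL_ARB_fragment_program", "GL_ATI_fragment_shader"}))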
 
Just noticed this...
The most significant functionality in the new crop of cards is the truly
flexible fragment programming, as exposed with ARB_fragment_program. Moving
from the "switches and dials" style of discrete functional graphics
programming to generally flexible programming with indirection and high
precision is what is going to enable the next major step in graphics engines.

It is going to require fairly deep, non-backwards-compatible modifications to
an engine to take real advantage of the new features, but working with
ARB_fragment_program is really a lot of fun, so I have added a few little
tweaks to the current codebase on the ARB2 path:

Um... the ARB2 path is the default for the R300, and it runs 2x faster there than on the NV30. Which means that, as things stand now, almost no one but R300 and R350 owners will benefit from whatever cool little *tweaks* and other toying around he does.

It just made me realize that if you read the fine print, the R300 currently is not just a little, but a LOT better for Doom III than the NV30.
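For anyone who hasn't seen one, here's a minimal, hypothetical example of the ARBfp1.0 assembly that the "flexible fragment programming" in the excerpt refers to, held in a Python string for illustration (a real engine would upload the text via glProgramStringARB):

# The program text is plain ARBfp1.0 assembly: explicit temporaries and
# instructions instead of fixed-function "switches and dials".
ARB_FP_EXAMPLE = """!!ARBfp1.0
# fetch a texel and modulate it by the interpolated vertex color
TEMP texel;
TEX texel, fragment.texcoord[0], texture[0], 2D;
MUL result.color, texel, fragment.color;
END
"""
print(ARB_FP_EXAMPLE)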
 
The truth, AFAICS, is that Chalnoth has indeed consistently recognized 24-bit per component as sufficient for fragment processing, and has challenged the necessity of 32-bit per component. I don't recall a change in this when the NV30's 128-bit support was announced, but I do remember, and have verified, his position from when the R300's 96-bit support was established. Confident that it is known I'm not afraid to criticize Chalnoth, I'll take this opportunity to be lazy about posting a link and ask you to take my word for it, or search for "component" along with his name yourself. :LOL:

Two things might be confusing the issue:

1) He initially framed his mention of the R300's 24-bit per component capability as a tradeoff required by the R300's 0.15 micron process, amid a long string of other criticisms of the R300 (the power connector, and his statements of "disappointment" based on ATI's "without limits" phrasing).

2) He has tended to advocate 32-bit FP values being used for vertex processing.

On another note, if Doom3 has game-controlled anisotropic filtering, and with relatively efficient MSAA implementations available, why are we still assuming the GF FX and R300 can't run it with aniso and AA? How hard would it be for him to apply aniso only to the surfaces where it is reasonable? Worst case, MSAA should not be a performance issue... it's as if people are stuck on the belief that "latest id game = crushes every card out there" regardless of how powerful the cards we have now are. Have people just decided to forget the leap in performance beyond Carmack's initial targets?

We should separate the performance concern into "what we've seen in the screenshots" and "the maximum image quality tweaks he just referred to in his .plan". The thing is, those maximum tweaks seem to me likely to be performance killers even without AA and perhaps even without anisotropic filtering.
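A minimal sketch of that per-surface idea (the surface categories and the 8x cap are my assumptions; in OpenGL the chosen degree would be set per texture with glTexParameterf via the GL_EXT_texture_filter_anisotropic extension):

MAX_ANISO = 8.0  # assumed hardware cap

def aniso_for_surface(kind):
    # Request high anisotropy only for surfaces typically viewed at
    # glancing angles; leave the rest at plain trilinear (1.0).
    glancing = {"floor", "ceiling", "terrain"}
    return MAX_ANISO if kind in glancing else 1.0

for kind in ("floor", "wall_detail", "terrain", "sky"):
    print(f"{kind}: {aniso_for_surface(kind)}x")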
 
Ahh... SORRY,

but where does Carmack say that the NV30 path uses FP at all?

I read his comments completely differently. To me it looks like the NV30 path is nothing more than the NV30-specific fixed-function path. Only in this mode was the NV30 faster than the R300 in ShaderMark.

edit:

comparing his statements about the different cards:

NV30 (full featured, single pass),

R200 (full featured, almost always single pass interaction rendering),

ARB2 (floating point fragment shaders, minor quality improvements, always single pass).

So why does he say the same things about the NV30 and R200, but specifically mention "floating point fragment shaders" only for ARB2? Why not the same emphasis for the NV30? Just the usual "full featured", which suggests fixed point only.
 