Another ATI vs NVIDIA review from gamersdepot

Reverend said:
Well, I think a developer can be entirely objective while still having to consider making money. I don't think the influence of money makes it impossible to be objective, either in one's work or in public comments. I'm no game developer, but someone like DeanoC (who is one) can probably back me up on this opinion of mine.

I think providing the standard ARB2 path as well as a specific NV path is both objective and considerate. Carmack has said that he couldn't provide a specific ATI path because no ATI-specific paths are available. I take that to mean he would have done it (i.e. provided a specific ATI path) if it were possible.

Personally, I rather marvel at JC's ability to express what he has expressed in almost all his .plan updates (and interviews): to-the-point stuff, some details of his work, all without ever really giving the impression that he gives undeserved care to a particular IHV and/or their products. The fact that he has spoken at length about what he has been doing wrt NVIDIA's GFFX range, rather than ATI's DX9 range, doesn't mean he isn't being objective (some people think he's paying more attention to NVIDIA/GFFX, but they forget that this is simply what needs to be done). He's just telling us things many, many developers don't tell the public, namely describing the shortcomings of a particular type of product.

Folks rarely talk about things they have no problems with (ATI's DX9 offerings wrt DOOM3), but it is probably important to talk about the things you do have problems with (GFFX wrt DOOM3), as well as the fixes you intend to make.

When it comes to being purely objective as well as pushing the envelope, I think hardware reviewers have a part to play too. If a game ships with different rendering paths (be they IHV-specific or different pixel shader versions), a hardware reviewer can choose not to test a video card with anything less than the "ultimate" configuration. If a game ships with both ps_2_0 and ps_1_4 paths, and also comes with IHV-specific extensions as well as a standard path, then a reviewer would be "pushing the envelope" by testing only the ps_2_0 and standard paths, instead of "dumbing down" to ps_1_4 and non-standard paths.

The point is that if a developer has previously stated his wish for exciting new technologies, he shouldn't be faulted for also providing options in his game later (while still implementing those exciting new technologies) so that it sells better. Objectivity can still be practiced while providing these options in order to sell more copies of the game; it is just common sense. "Pushing the envelope" then falls on the shoulders of hardware reviewers and how they choose to perform tests and conduct reviews.

I can understand what you mean, however (i.e. Carmack said he wanted higher precision, so he should stick by what he said and not care if his game runs lousy on certain video cards), but we do not live in a perfect world where everyone has the same video card, nor in a world where a developer doesn't give a damn. What Valve has done wrt HL2 is another example: Valve is "pushing the envelope" while also being considerate (and complaining about being considerate, by stating how much additional time was spent "caring for" GFFX owners), in addition to being, well, objective.

We want objectivity practiced with common sense, not objectivity to the point of being inconsiderate.

As for your edited additions, I have always maintained that I appreciate the 9700 more than any GFFX card (in fact, I'm so impressed by it that I have become frustrated by its lack of FP32; I want it to be my primary research card because of its speed, but I have to go back and forth between it and a GFFX for my precision research... very frustrating when you have only one machine!). Carmack has not, IMO, clouded the issue; on the contrary, he has stated the problems he has been encountering with GFFX cards.


This has nothing to do with consideration for FX owners or Radeon owners. Where did I question what JC did to improve FX performance? You brought up potential sales, not me. So are you now agreeing your statement had no relevance? And if it does have any, it invalidates your contention.

JC basically said that the NV30 was on par with the R300. It is not. The R300 is competitive with NV30 performance even when the NV30 is using FX12 fixed point (as opposed to the R300 using FP). The R300's PS 2.0 shader performance is far better. These are gaming cards, so features with real-time performance are what matter. By what criteria is the NV30 on par with the R300?

Although I have read and appreciated many of DeanoC's comments, developers are not inherently altruistic. They are people who act in their own best interest, like everyone else. And yes, some have biases as well.
 
I suggest you re-read JC's .plan updates and interviews (at this site and elsewhere). His .plan updates are his thoughts on his work in progress. Anyone can read all of his .plan updates and interviews.

JC has implicitly said that the NV3x is worse in performance than ATI's DX9 offerings. If that is bias, then he is "biased" toward stating the truth.

I am sorry if my posts seem to wander, but I seem to have lost track of where the question of JC's objectivity ends and the question of his work to improve performance on FX cards begins. The two seem mixed together to me.

Summary: You say JC lacks objectivity. I say no such thing, based on the way I read his .plan updates and interviews. You may read them differently. Let's agree to disagree. Sorry if I meandered a bit.
 
A different interpretation...

Doomtrooper said:
John Carmack said:
I am using an NV30 in my primary work system now, largely so I can test more of the rendering paths on one system, and because I feel Nvidia still has somewhat better driver quality (ATI continues to improve, though). For a typical consumer, I don't think the decision is at all clear cut at the moment.
So what I read from that is that he is saying there is no clear-cut winner between a 9700 and a 5800...
fwiw, I thought the underlined portion of Carmack's quote referred to driver quality, not the overall quality of the hardware or which would be better for doom 3.
 
DaveBaumann said:
Also, where are you getting the improved FP16 performance from? According to these tests FP16 is still in line.

http://www.beyond3d.com/forum/viewtopic.php?t=6430

Actually, FP16 performance is still slightly below FX12 performance on the NV35, but as pointed out that could be due to lower latency for integer arithmetic. In any case, it's certainly a marked difference from the NV30.

Incidentally, that thread notes similar figures on MDolenc's (DX) fillrate test, with FP16 on the NV35 in line (per clock) with the NV30. No, I have no theory as to why that would be the case, besides the old "better drivers in OpenGL than DX", which is wearing a bit thin as an explanation by this point.

Prod me some when I have a little time, to remind me to mail John and ask about these performance figures and his statement a little more...

*prod*

...

*prod*

...

*prod* *prod* *prod*

/shakes stick and mumbles
 
Dave H said:
*prod* *prod* *prod*

/shakes stick and mumbles

Lemme help:

poke.gif
 
Hyp-X said:
The difference is:
- the R200 path uses normalization cubemaps for vector normalization, and uses a low order approximation to specular power.
- the R300 path uses arithmetic vector normalization (dp3/rsq/mul), and arithmetic specular power calculation (log/mul/exp).

The R200 path is likely bandwidth limited (the cubemap lookups can be stressful because of the non-standard read pattern).
The R300 path is likely shader computation limited.
The fact that the difference is small only shows how well balanced the R300 really is.

...

What is not clear is whether the NV30 path uses cubemaps, and what kinds of approximations are used. I wouldn't be surprised if using the FP unit for the rsq in normalization were faster than using cubemaps (if everything else is using the FX combiners).
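(For readers who don't speak shader assembly: the arithmetic Hyp-X describes boils down to something like the rough C sketch below. The function names are mine, purely for illustration; this is not code from Doom3 or any driver.)

```c
#include <math.h>

typedef struct { float x, y, z; } vec3;

/* Arithmetic vector normalization -- the dp3/rsq/mul sequence:
   dot the vector with itself (dp3), take the reciprocal square
   root (rsq), and scale (mul). This is what the R300 path does
   in the shader instead of a normalization cubemap lookup. */
static vec3 normalize_arith(vec3 v)
{
    float len_sq  = v.x * v.x + v.y * v.y + v.z * v.z;        /* dp3 */
    float inv_len = 1.0f / sqrtf(len_sq);                     /* rsq */
    vec3 r = { v.x * inv_len, v.y * inv_len, v.z * inv_len }; /* mul */
    return r;
}

/* Arithmetic specular power -- the log/mul/exp sequence:
   n_dot_h^power computed as exp2(power * log2(n_dot_h)),
   rather than the low-order approximation the R200 path uses. */
static float specular_power(float n_dot_h, float power)
{
    return exp2f(power * log2f(n_dot_h)); /* log, mul, exp */
}
```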

I had a response to this post started that I never quite finished, trying to figure out if it was more likely the NV3x path would use math or cubemap lookups to renormalize. (It seems clear from Carmack's Jan 17 .plan that, at that point, only the ARB2 path did this; but obviously that could have easily changed since.)

In that vein, I found it extremely interesting that one of the optimizations made by HL2's "mixed" DX9 mode--essentially their version of an NV3x path (although obviously without recourse to FX12 ops)--is that, unlike the full DX9 path, it often uses lookups into precalculated textures in place of doing the math with PS 2.0. While it was not specifically mentioned by name, the use of specular lighting and bump mapping has got to mean that one of the prime instances of this is vector renormalization.

In other words, it's worth noting that in Valve's version of the NV3x path, they have apparently found it's a performance benefit to use a cubemap lookup instead of math to renormalize.
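(As a concrete illustration of the lookup-table alternative: a normalization cubemap is just a precomputed texture in which a fetch with any direction vector returns that vector normalized, packed into the [0, 1] RGB range. Below is a minimal C sketch of how one face of such a map could be filled -- my own toy construction under standard cubemap conventions, not Valve's or id's actual code.)

```c
#include <math.h>
#include <stdint.h>

#define FACE_SIZE 64 /* assumed face resolution, for illustration only */

/* Fill the +X face of a normalization cubemap. For this face,
   texel coordinates (s, t) in [-1, 1] map to the direction
   (1, -t, -s) in cube space; we normalize that direction and
   pack each component into a byte, so that a single texture
   fetch later stands in for the dp3/rsq/mul arithmetic. */
static void fill_positive_x_face(uint8_t rgb[FACE_SIZE][FACE_SIZE][3])
{
    for (int y = 0; y < FACE_SIZE; ++y) {
        for (int x = 0; x < FACE_SIZE; ++x) {
            /* Map texel centers to [-1, 1]. */
            float s = 2.0f * (x + 0.5f) / FACE_SIZE - 1.0f;
            float t = 2.0f * (y + 0.5f) / FACE_SIZE - 1.0f;

            float dx = 1.0f, dy = -t, dz = -s;
            float inv_len = 1.0f / sqrtf(dx * dx + dy * dy + dz * dz);

            /* Pack [-1, 1] into [0, 255]. */
            rgb[y][x][0] = (uint8_t)((dx * inv_len * 0.5f + 0.5f) * 255.0f);
            rgb[y][x][1] = (uint8_t)((dy * inv_len * 0.5f + 0.5f) * 255.0f);
            rgb[y][x][2] = (uint8_t)((dz * inv_len * 0.5f + 0.5f) * 255.0f);
        }
    }
}
```

The trade is exactly the one under discussion: the shader swaps a few arithmetic instructions for one texture fetch, which only wins if the shader is computation bound rather than bandwidth bound.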

On the other hand, it's worth noting that the only NV3x card tested that sees a big gain from the "mixed" path is the 5900; the 5600 actually manages to lose performance, and this is on top of other optimizations like more liberal use of the _pp hint and rewritten shaders. Of course the 5900 is the most likely to reap performance gains from a shader-for-bandwidth trade, because it has ridiculous loads of bandwidth to spare (particularly with no AA). The 5800 wasn't tested, of course, but I'd expect the opposite from it, seeing as how it's by far the most bandwidth-starved card ever, uh, I mean, of the NV3x family.

It's not really clear, of course, that Doom3 would see a similar performance trade-off from this decision. You made the argument that the NV3x should have the FP32 unit free to do normalization analytically, because most of its shader calculations will be using the FX12 units instead. I'm not sure that's entirely true when you consider that the FP32 unit shares functionality with the texture coordinate generators, and thus won't actually be free very often in a shader that's heavy on texture sampling, as I think the Doom3 lighting algorithm is. Then again, you might expect it to be free more often running Doom3's shaders than running the PS 2.0 shaders in HL2, which don't make use of the FX12 units at all.

So where does that leave us? Unsure. On the one hand, we have a specific example of a developer, faced with the very same decision, essentially telling us that the way to optimize for NV3x is to use cubemaps, not math. On the other, the card that shows by far the greatest benefit from this optimization is the one that's supposed to be running the ARB2 path instead. Maybe the notion that the NV35 is supposed to have identical performance on the NV3x and ARB2 paths can tell us the answer: that would presumably indicate the NV3x path normalizes with math after all, since if it used cubemaps, the NV35 (which apparently benefits significantly from them) ought to run the NV3x path noticeably faster than ARB2.

Blah blah blah. I guess the best we can do is: it probably uses math, but who knows. Incidentally, it's worth pointing out that on the off chance the NV3x path does use cubemap lookups like the R200 path, it should actually end up looking worse than the R200 path run on the R3x0, right? (Same features enabled, but using primarily FX12 and FP16 instead of FP24.)
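(For a rough sense of that precision gap, here is a toy C sketch of quantizing a shader value to FX12 fixed point versus FP16's mantissa. The [-2, 2) range with 10 fractional bits is the commonly cited description of NV3x's FX12, taken here as an assumption; the FP16 function keeps 11 significant bits and deliberately ignores half-precision's exponent-range limits and denormals.)

```c
#include <math.h>

/* FX12: assumed 12-bit fixed point, range [-2, 2) in steps
   of 1/1024. The absolute precision is uniform everywhere. */
static float quantize_fx12(float v)
{
    float q = roundf(v * 1024.0f) / 1024.0f;
    if (q < -2.0f)             q = -2.0f;
    if (q > 2047.0f / 1024.0f) q = 2047.0f / 1024.0f;
    return q;
}

/* FP16 mantissa rounding: keep ~11 significant bits. Unlike
   FX12, the absolute step size shrinks for small values, which
   is why FP16 survives longer calculation chains than FX12
   while still falling short of FP24/FP32. (Exponent range and
   denormals are ignored in this sketch.) */
static float quantize_fp16(float v)
{
    if (v == 0.0f) return 0.0f;
    int e;
    float m = frexpf(v, &e);           /* v = m * 2^e, |m| in [0.5, 1) */
    m = roundf(m * 2048.0f) / 2048.0f; /* round to 11 significant bits */
    return ldexpf(m, e);
}
```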
 
I consider Doom3 to be a DX8.1 "zixel bound" game, not a DX9 engine, and I think it's becoming more and more clear that NVidia was designing the NV30 to be an uber DX8.1 stencil/z filling card for Doom3, and not a DX9 FP monster.

Is it possible that the floating-point support of the NV3x was added at a later stage, and that the original design they started with was an evolution of PS 1.4 (more registers, longer shaders, predication, etc.) rather than a redesign (FP everything)? The R300 seems floating point to its core, as if every part of the design was predicated on the assumption of FP everywhere. That's why there don't seem to be any restrictions on what FP can be used for (cube maps, MRTs, etc.).

With the NV3x, it seems like they started out with the idea of building a "PS 1.5", then discovered halfway through that PS 2.0 was progressing much faster and that floating point was becoming more than just a checklist feature--it was becoming the MAJOR feature. They then tacked on FP32 afterwards, but the fact that there are serious limitations on its use suggests to me that the chip wasn't designed from the very beginning with FP in the pixel pipeline.


If they had been serious about FP from day 1, I think they would have dumped the FX12 part of the pipeline and just made FP32 run as fast as possible.

The zixel tradeoff also seems to suggest a DX8.1 Doom3 stencil-fill monster design.
 
Reverend said:
I suggest you re-read JC's .plan updates and interviews (at this site and elsewhere). His .plan updates are his thoughts on his work in progress. Anyone can read all of his .plan updates and interviews.

JC has implicitly said that the NV3x is worse in performance than ATI's DX9 offerings. If that is bias, then he is "biased" toward stating the truth.

I am sorry if my posts seem to wander, but I seem to have lost track of where the question of JC's objectivity ends and the question of his work to improve performance on FX cards begins. The two seem mixed together to me.

Summary: You say JC lacks objectivity. I say no such thing, based on the way I read his .plan updates and interviews. You may read them differently. Let's agree to disagree. Sorry if I meandered a bit.

Don't assume I have not read JC's .plan(s) and interviews just because I don't agree with you. I haven't read them all, but I have read some, and one of them actually contained that quote. Although he has said that he has had to do more work on the NV30 path, he has also said things like "FP16 is enough". The "FP16 is enough" comment is strange when you are actually using FX12. Didn't he only recently acknowledge that the NV30 path actually uses FX12?

When the evidence consistently shows how impressive a chip the R300 is compared to the NV30, you only look foolish denying the obvious. I believe someone who works at this site said something similar about Kyle. The fact is, JC's statement in question is clearly wrong, and he knew better.
 