AnandTech - GeForceFX 5800 Ultra..... Canceled

RoOoBo · Feb 25, 2003

DaveBaumann said:
Sabastian said:

EDIT: Also note test one (the test that nvidia was upset that futuremark used single texturing on the sky) is the only strong point of the GeforceFX5800??!!!

Trilinear filtering...

So that second TMU can also be useful with single texturing.

OpenGL guy · Feb 25, 2003

DaveBaumann said:
Sabastian said:

EDIT: Also note test one (the test that nvidia was upset that futuremark used single texturing on the sky) is the only strong point of the GeforceFX5800??!!!

Trilinear filtering...

Ahem... CPU limited. The 9700 Pro system was reported as a 1.3 GHz P4 vs. 2.5 GHz P4.

Mintmaster · Feb 25, 2003

Those are the scores from the old drivers, aren't they? A GFFX 5800 Ultra gets over 5000 points, so a 400 MHz version should be able to get over 4000 points.
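A quick sketch of the arithmetic behind that estimate. It assumes the score scales linearly with core clock and takes 500 MHz as the 5800 Ultra's core clock; neither assumption is stated in the post, memory clock is ignored, and `scaled_score` is only an illustrative helper.

```python
def scaled_score(reference_score, reference_mhz, target_mhz):
    """Scale a benchmark score linearly with core clock (a rough assumption)."""
    return reference_score * target_mhz / reference_mhz


# "Over 5000 points" at an assumed 500 MHz core, scaled down to 400 MHz.
estimate = scaled_score(5000, 500, 400)
print(estimate)  # 4000.0
```

Since the Ultra figure is "over 5000", the same scaling gives "over 4000" for the 400 MHz part, which is where the number in the post comes from.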

Sabastian · Feb 25, 2003

Mintmaster said:
Those are the scores from the old drivers, aren't they? A GFFX 5800 Ultra gets over 5000 points, so a 400 MHz version should be able to get over 4000 points.

Aren't the newer drivers highly suspect? I mean they use FP16, which is not DX9 spec; the minimum is FP24 AFAIK. Not to mention all the details that the drivers you are talking about leave out of the benchmark, particularly in GT1 IIRC.

Althornin · Feb 25, 2003

Mintmaster said:
Those are the scores from the old drivers, aren't they? A GFFX 5800 Ultra gets over 5000 points, so a 400 MHz version should be able to get over 4000 points.

Until the FX renders everything, I don't think scores should be considered. As such, the new drivers are a no-go for comparison right now, even though I expect perf to remain the same and rendering to be correct later.

Dave H · Feb 25, 2003

Sabastian said:
Aren't the newer drivers highly suspect? I mean they use FP16, which is not DX9 spec; the minimum is FP24 AFAIK. Not to mention all the details that the drivers you are talking about leave out of the benchmark, particularly in GT1 IIRC.

The greatest speed gains are in GT2 and GT3, which don't use PS 2.0 (i.e. floating point) at all. There are large gains in GT4 as well, to be sure, but considering how small a fraction of it uses FP shaders I doubt FP16 vs. FP32 has very much (if anything) to do with the performance gains.

Althornin said:
Until the FX renders everything, I don't think scores should be considered. As such, the new drivers are a no-go for comparison right now, even though I expect perf to remain the same and rendering to be correct later.

In general I would agree, but as GFfx isn't available to consumers yet I'm not really bothered by some slight rendering problems which as you say probably don't significantly change performance anyways. The point of GFfx benchmarks right now is to get a sense of what the architecture is capable of and to predict how it will match up to e.g. 9700 Pro. Using those criteria, it seems silly to use the old scores (which obviously have almost nothing to do with what actual performance will be) just because the new scores are not yet on compliant drivers. OTOH, the new scores should only be used on a provisional basis; they are obviously not "official" scores.

Still, I think there's a difference between a situation where, say, leaked beta drivers for an established GPU give a 5% performance boost but introduce rendering errors which may be responsible for the entire speedup, and the current situation, where all drivers are beta, and the new version gives a 100% speedup, only a small portion of which might be attributable to the rendering errors.

Sabastian · Feb 25, 2003

Thanks for the input, Dave. Also there is the matter of slower memory on top of the lower-clocked core. Just what is a more accurate score for the non-Ultra anyhow? I can't find any results with the drivers in question. I think Dave B has the non-Ultra version of the card, so maybe he will shed some light on the matter. Again, thanks for the heads up with regards to the drivers in question.

Nagorak · Feb 26, 2003

Sabastian said:
Yeah but I think he has a real soft spot for nvidia. I think he will forgive and forget rather quickly. Hard to say how disappointed he really is. Considering how undeniably better ATi has become compared to nvidia, how many negative things could he say before he starts to look like any other starry-eyed nvidia fan? Like you say, it will be interesting to read what he has to say with regards to nvidia's next offering.

I don't understand why these reviewers don't just call it down the middle. It's a piece of hardware, not a frickin' religion or something.

Mintmaster · Feb 26, 2003

Dave H, I agree with your point in that we are interested in what the architecture is capable of. That's why I mentioned the new drivers.

However, Sabastian does have a point, although it's not related to FP16 vs. FP32. These gains are suspect because they look like a case of hand-tuned optimization and instruction scheduling by the NVidia driver guys, as we haven't seen any other shaders improving in performance. A case in point is the digit-life review, where all their shaders run slow on NV30 even with the new drivers.

Obviously NVidia would go to some lengths to get these shaders optimized, and even so NV30 does not perform as well as R300 in shading, even with 50%+ higher clock rate and more transistors.

The point is I don't see how compiler technology is going to improve the situation enough, and even if it does, I think what we are seeing with 3DMark2k3 is an upper limit. If you write an application with any shaders more complex than PS 1.1, chances are it will run like crap on NV30 unless you are a major developer with games that get benchmarked on major review sites.

TheMightyPuck · Feb 26, 2003

Still, for better or worse, we should wait until the GF:FX makes it to market (if ever) before we categorize it absolutely. Damn, look at me, I almost feel sorry for the little chip.

Dave H · Feb 26, 2003

Mintmaster said:
These gains are suspect because they look like a case of hand tuned optimization and instruction scheduling by the NVidia driver guys, as we haven't seen any other shaders improving in performance. A case in point is the digit-life review, where all their shaders run slow on NV30 even with the new drivers.

This is a very good point. Although, just to play devil's advocate for a moment, it's worth pointing out that while the Digit Life review--with new drivers--showed GFfx losing to 9700 by a factor of ~2 on the Rightmark shader tests, back a few weeks ago Brent posted on this forum some Shadermark results--with the old drivers--that showed GFfx losing by a factor of ~3. (edited for Tagrineth) Is it the new drivers, or the difference in tests? I dunno.

Still, there are explicit comments from Nvidia engineers to the effect that the new drivers hand-optimize for the 3DMark03 shaders, if not by actually optimally scheduling the shaders in the drivers themselves, at the least by deciding case-by-case whether to run a PS 1.4 shader in NV30's general-purpose PS 1.4-2.0+ shader pipeline or to break it up into several passes through the legacy PS 1.0-1.3 path.

Now, the first thing to be said is that these statements are obviously part of Nvidia's campaign to discredit 3DMark03. They're probably true, but just as clearly Nvidia (and ATI) do the same sort of thing for previous 3DMark benchmarks, and for any games heavily used by benchmarking sites; the only difference is that they don't proclaim it so publicly and in a way that suggests the tests themselves are invalid.

The second thing to note is, golly, it sure doesn't say good things for NV30's real shader pipeline if emulating a PS 1.4 shader with 2 or 3 PS 1.1 passes is faster in many cases.

Now, with those two things out of the way I think it's obvious that normal games (i.e. outside of the 4 or 5 somehow anointed as benchmarks) are not going to get special case treatment in the drivers. On the other hand, it's also obvious that developers are going to optimize their games for Nvidia GPUs. If it's "just" a matter of trying the same effect in both PS 1.4 and multipass PS 1.1 configurations, and putting in a special path for Nvidia cards to do whichever is faster, most developers are going to do it (especially as they will have help from Nvidia developer relations). So in some sense, it's not really that unrepresentative if the special optimizations made in the drivers are likely to be done by developers for most games.

If it's a matter of explicitly scheduling the shader in a way different from what the general-purpose compiler in the driver would do, then of course developers aren't going to have the ability to do that on their own. I find it difficult to believe that NV30's shader pipeline is so difficult a compiler target that future drivers won't be able to at least approach the performance of hand-optimized code. Still, I'd much rather wait for drivers that optimize in the general case than see hand-tuned results right now.

In case I sound inconsistent: the difference between the two cases is that in the first the developer has the ability to do the optimization themselves and likely will, but in the second they can't (unless perhaps if they use the proprietary NV OpenGL extensions).
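For the first case, the developer-side choice between a single-pass PS 1.4 path and a multipass PS 1.1 path could look roughly like the sketch below. Everything in it is hypothetical: `pick_render_path`, the path names, and the frame times are made-up placeholders for illustration, not measurements from any card or calls from any real API.

```python
def pick_render_path(candidates, measure_ms):
    """Time every candidate path and return the fastest one plus all timings."""
    timings = {name: measure_ms(name) for name in candidates}
    best = min(timings, key=timings.get)
    return best, timings


# Placeholder frame times in milliseconds (invented, not measured).
fake_timings = {"ps14_single_pass": 9.0, "ps11_multipass": 6.5}

# In a real engine, measure_ms would render a few frames with the given path
# and time them; here it just looks up the placeholder numbers.
best, timings = pick_render_path(fake_timings.keys(), fake_timings.get)
print(best)  # ps11_multipass
```

A game would run something like this once per GPU (or ship the result as a per-vendor default), which is the kind of optimization a developer can do without any help from the driver.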

antlers · Feb 26, 2003

Dave H said:
On the other hand, it's also obvious that developers are going to optimize their games for Nvidia GPUs.

I think developers may be reluctant to optimize for NVidia's DX9 GPUs if NVidia doesn't manage to sell a few of them first.

Mintmaster · Feb 26, 2003

Dave H said:
Mintmaster said:

These gains are suspect because they look like a case of hand tuned optimization and instruction scheduling by the NVidia driver guys, as we haven't seen any other shaders improving in performance. A case in point is the digit-life review, where all their shaders run slow on NV30 even with the new drivers.

This is a very good point. Although, just to play devil's advocate for a moment, it's worth pointing out that while the Digit Life review--with new drivers--showed GFfx outperformed by 9700 by a factor of ~2:1 on the Rightmark shader tests, back a few weeks ago Brent posted on this forum some Shadermark results--with the old drivers--that showed GFfx outperformed ~3:1 by the 9700. Is it the new drivers, or the difference in tests? I dunno.

I think it's kind of funny that you see losing by a factor of 2 rather than 3 as sort of a positive thing.

On the other hand, Ilfirin wrote some long shader tests recently and GFFX was losing by a factor of over 7. I would definitely attribute your observation to variation in tests rather than improvement in drivers.

Personally, I think dependent textures are the major culprit in NV30's crappy shader performance. The latency problem mentioned by DaveB also fits in with that theory, as you have to buffer out the latency of the texture fetch before you can proceed with executing the shader. This is probably the issue with Xabre's pixel shaders as well.

This is in one of the ATI optimization docs:

Dependent texture reads are quite expensive. On RADEON 8500/9000 a two-phase shader is much more expensive than a single phase shader...(I cut some stuff out here)

RADEON 9500/9700 has significantly optimized dependent texture read implementation for performance and efficiency. As well the number of levels of dependency has been increased to four in 2.0 pixel shader model. The best performance on RADEON 9500/9700 can be achieved when not exceeding two dependent texture reads. While three or four levels of dependency will provide sufficient performance, it will not be as good as with only one or two levels. Keep in mind that if arithmetic instructions are used to compute texture coordinates before the first texture fetch, they will also be counted as a level of dependency.
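To make "levels of dependency" concrete, here is a toy counter. It is a simplified model of my own, not ATI's actual rule set: a shader is a list of hypothetical `(op, dest, sources)` tuples, a texture fetch with raw interpolated coordinates is level 0, and a fetch whose coordinate was written by any earlier instruction (ALU or texture) sits one level deeper than that coordinate.

```python
MAX_PS20_LEVELS = 4  # limit quoted in the ATI doc above


def dependency_levels(instructions, inputs):
    """Return the deepest dependent-texture-read level in a toy shader."""
    level = {r: 0 for r in inputs}        # level at which each register is ready
    derived = {r: False for r in inputs}  # True once written by an instruction
    deepest = 0
    for op, dest, sources in instructions:
        if op == "tex":
            coord = sources[0]
            # A fetch with a computed coordinate starts a new level.
            new_level = level[coord] + 1 if derived[coord] else level[coord]
        else:
            # ALU results live at the level of their deepest source.
            new_level = max(level[s] for s in sources)
        level[dest] = new_level
        derived[dest] = True
        deepest = max(deepest, new_level)
    return deepest


shader = [
    ("tex", "r0", ["t0"]),        # plain fetch, level 0
    ("alu", "r1", ["r0", "t1"]),  # perturb coordinates with the fetched value
    ("tex", "r2", ["r1"]),        # dependent fetch, level 1
    ("tex", "r3", ["r2"]),        # fetch from a fetched value, level 2
]
print(dependency_levels(shader, ["t0", "t1"]))  # 2
```

In this model, arithmetic on the coordinates before the first fetch also produces level 1, matching the last sentence of the quote, and the example shader stays within the two levels the doc recommends.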

It looks like ATI did so well with R300 because it learned many lessons about general dependent texturing from R200. NVidia's dependent texturing in GF3/GF4 was quite fixed and predictable, and JC described it as just an extension of the register combiners.

NVidia also mentioned somewhere that it has no limit on levels of dependency in NV30, although I don't see why such a feature is useful, as very few shaders go beyond even one level. It may be, then, that they did not optimize dependent textures at all to allow such a capability, and NV30 pretty much just stalls until the texture fetch is complete.

Tagrineth · Feb 26, 2003

Mintmaster said:
Dave H said:

This is a very good point. Although, just to play devil's advocate for a moment, it's worth pointing out that while the Digit Life review--with new drivers--showed GFfx outperformed by 9700 by a factor of ~2:1 on the Rightmark shader tests, back a few weeks ago Brent posted on this forum some Shadermark results--with the old drivers--that showed GFfx outperformed ~3:1 by the 9700. Is it the new drivers, or the difference in tests? I dunno.

I think it's kind of funny that you see losing by a factor of 2 rather than 3 as sort of a positive thing.

He said the FX outperformed the 9700 2:1 in RightMark3D, but that with Shadermark the FX was outperformed by the 9700 3:1.

PARALLEL SENTENCE STRUCTURE, PEOPLE! PLEASE!

mboeller · Feb 26, 2003

NO;

DaveH said :

>showed GFfx outperformed by 9700 by a factor of ~2:1 on the Rightmark shader tests<

Yes, I know he has a weird phrase structure here, but it's nevertheless readable after reading it 3~5 times.

Tagrineth · Feb 26, 2003

My comment on parallel sentence structure stands...

Tahir2 · Feb 26, 2003

Are we optimising sentences as well now? 8)
BTW I agree that part of the post was kinda hard to follow.
