NV40: Surprise, disappointment, or just what you expected?

The Baron said:
Newsflash: Some PS3.0 tests on NV4x hardware show that it runs at less than a quarter of the speed it should. Driver immaturity? We'll see.
What benchmarks? Where?

It seems pretty ludicrous right now that running the same shader under PS 3.0 would perform any differently than under PS 2.0, if that shader can also be executed on PS 2.0. Once you use the added features of PS 3.0 (properly, of course), performance can only increase.
 
Malfunction said:
Are all the Open GL games I play this way too?

What, the one or two pure OpenGL games a year that are currently being released? Don't do much new gaming, then? :p
 
Randell said:
What, the one or two pure OpenGL games a year that are currently being released? Don't do much new gaming, then? :p
Now that GLSL is finally available, I really hope that more developers return to OpenGL.

Anyway, remember that id, perhaps the largest developer of OpenGL game engines, hasn't released a new engine in quite a while. Once DOOM3 comes out, I'm sure you can expect a number of games licensing that engine.
 
Please. Supersampling may sharpen textures, but it's more likely to blur the screen and text of games.

I'll take multisampling over supersampling in just about every case, except for older titles and compatibility's sake. Any game like WC3, EverQuest, or Command and Conquer, any game with real text, is going to show a blur from supersampling.

Turn on AF; it'll make a huge difference in improving multisampling. And ATI's multisampling is rotated grid, NVIDIA's is rotated grid, so they'd have practically the same blur...
 
Newsflash: Some PS3.0 tests on NV4x hardware show that it runs at less than a quarter of the speed it should.

What PS3.0 tests are you talking about, specifically? Links would be helpful too.

It is my understanding that PS 3.0 will primarily help performance, and it is pretty obvious that the NV40 just blazes right through the PS 2.0 tests with super high speed. There have been some pictures floating around the net (of Far Cry I believe) showing image quality using 2.0 vs image quality making use of PS 3.0, and the 3.0 pictures look absolutely amazing in comparison!

On another interesting note, notice that the 6800 Ultra seemed to be defaulting to NV3x settings with Far Cry, using the raw drivers.

Things can really only go up from here.
 
ChrisRay said:
Please. Supersampling may sharpen textures, but it's more likely to blur the screen and text of games.
Supersampling doesn't blur anything. You're thinking of oversampling (i.e. Quincunx).

Turn on AF; it'll make a huge difference in improving multisampling. And ATI's multisampling is rotated grid, NVIDIA's is rotated grid, so they'd have practically the same blur...
Anisotropic filtering improves texture clarity, but does little to arrest texture aliasing. While many don't notice it, there is always some texture aliasing associated with taking bilinear samples. Supersampling cures most of this aliasing quite nicely in the near field, while also improving texture clarity.

Granted, anisotropic filtering + multisampling really is excellent for performance, but it would be nice to be able to do, say, 4x supersampling plus some other degree of multisampling to really get rid of aliasing. I don't think we're there yet, though, unless you're playing older games.
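Chalnoth's point about supersampling arresting texture aliasing can be illustrated with a toy 1-D sketch (the pattern, resolution, and sample positions here are all made up for illustration, not taken from any real hardware): averaging several sub-pixel texture samples lands far closer to a densely sampled reference than a single center sample per pixel does.

```python
import math

# Toy 1-D model of texture aliasing: a high-frequency binary pattern
# sampled once per pixel versus averaged over a 4x supersampling grid,
# each compared against a densely sampled reference.

def texture(x):
    # High-frequency pattern well above the pixel Nyquist rate.
    return 1.0 if math.sin(200.0 * x) > 0 else 0.0

def reference(px, width, n=256):
    # Densely averaged "ground truth" for one pixel.
    return sum(texture((px + (i + 0.5) / n) / width) for i in range(n)) / n

def shade(px, width, samples):
    # Average the texture at the given sub-pixel sample positions.
    return sum(texture((px + s) / width) for s in samples) / len(samples)

width = 32
center = [0.5]                         # one sample at the pixel center
grid4 = [0.125, 0.375, 0.625, 0.875]   # 4x supersampling positions

err1 = sum(abs(shade(p, width, center) - reference(p, width)) for p in range(width)) / width
err4 = sum(abs(shade(p, width, grid4) - reference(p, width)) for p in range(width)) / width
print(err1, err4)  # the supersampled result sits much closer to the reference
```

Multisampling would shade each pixel once (like `center` here), which is why it leaves texture aliasing untouched: only supersampling takes extra texture samples.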
 
jimmyjames123 said:
There have been some pictures floating around the net (of Far Cry I believe) showing image quality using 2.0 vs image quality making use of PS 3.0, and the 3.0 pictures look absolutely amazing in comparison!
Well, since Far Cry appears to be doing more with PS 3.0, it seems pretty certain that it will run slower in that mode. But I can pretty much guarantee you that if they added a PS 2.0 fallback that had equivalent quality (which should be possible for nearly any PS 3.0 shader, though possibly only through multipass), the PS 3.0 version would be faster.
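The equivalence described above can be sketched with Python stand-ins (a toy model, not real shader code): a PS 3.0 style dynamic branch and a flattened PS 2.0 style fallback produce the same image, but the fallback executes both paths for every pixel.

```python
# Toy model: dynamic branch vs. flattened compare-select fallback.
# The "shader" paths and pixel data are invented for illustration.

def cheap(x, counter):
    counter[0] += 1
    return x * 0.5

def expensive(x, counter):
    counter[1] += 1
    return sum(x * 0.1 * i for i in range(16))  # stand-in for a long path

def ps30_style(x, in_shadow, counter):
    # Dynamic branch: only one path runs for each pixel.
    return cheap(x, counter) if in_shadow else expensive(x, counter)

def ps20_style(x, in_shadow, counter):
    # Flattened: both paths run, then a compare selects the result.
    a = cheap(x, counter)
    b = expensive(x, counter)
    return a if in_shadow else b

pixels = [(float(i), i % 4 == 0) for i in range(100)]
c30, c20 = [0, 0], [0, 0]
out30 = [ps30_style(x, s, c30) for x, s in pixels]
out20 = [ps20_style(x, s, c20) for x, s in pixels]
print(out30 == out20, sum(c30), sum(c20))  # True 100 200
```

Same output either way; the flattened version just does strictly more work per pixel, which is why an equivalent-quality PS 2.0 fallback should run slower than the PS 3.0 path.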
 
Chalnoth said:
ChrisRay said:
Please. Supersampling may sharpen textures, but it's more likely to blur the screen and text of games.
Supersampling doesn't blur anything. You're thinking of oversampling (i.e. Quincunx).

Turn on AF; it'll make a huge difference in improving multisampling. And ATI's multisampling is rotated grid, NVIDIA's is rotated grid, so they'd have practically the same blur...
Anisotropic filtering improves texture clarity, but does little to arrest texture aliasing. While many don't notice it, there is always some texture aliasing associated with taking bilinear samples. Supersampling cures most of this aliasing quite nicely in the near field, while also improving texture clarity.

Granted, anisotropic filtering + multisampling really is excellent for performance, but it would be nice to be able to do, say, 4x supersampling plus some other degree of multisampling to really get rid of aliasing. I don't think we're there yet, though, unless you're playing older games.


Supersampling DOES blur text, though. You do have a GeForce4 Ti, right? Just apply 4x supersampling and load Star Wars Galaxies, EverQuest, etc. You will notice a blur. It's not a Gaussian blur like Quincunx, though; it sharpens textures but blurs text.
 
jimmyjames123 said:
There have been some pictures floating around the net (of Far Cry I believe) showing image quality using 2.0 vs image quality making use of PS 3.0, and the 3.0 pictures look absolutely amazing in comparison!

That's because they are "no PS" vs PS 3.0 pics. PS2.0 vs PS3.0 will give very similar output.

I know that my 9700 Pro looks much closer (ie near identical) to the PS3.0 pic than the default (unlabelled) pic that we are supposed to believe is what we would get without a NV40.
 
One issue here is a conceptual shorthand that has gone slightly awry:

First, AFAICS in some of this discussion, "PS 3.0" means "using the hardware's branching functionality", and "PS 2.0" means "not using the hardware's branching functionality".

I doubt the NV4x runs "PS 3.0" at a quarter of the speed it "should" run. I'm assuming this refers to some observations of branching penalties. However, those branching penalties should be there, because that is how the NV4x managed to implement "PS 3.0". It is less than ideal, and it does limit the opportunities for benefit, but with devrel and driver efforts it could simply avoid branching where doing so would be faster.

A PS 3.0 implementation with smaller branching penalties would clearly be better than NV40's PS 3.0 branch penalties, but it doesn't make sense to compare it to something that isn't out there. :-? Even if you would...

NV40's "PS 3.0" implementation sometimes performs better than "PS 2.0" (we aren't sure how much or how often) and sometimes worse (quite significantly, which is what has apparently been shown). This might look bad, but fortunately the NV40 can do both (it can hardly help it, given the superset/subset relationship between them). So, with work being done towards picking between them, only the first case has any reason to manifest once that work has succeeded. And, looking at both the limitations and the useful shader lengths for the NV40's performance, this seems a reasonably likely eventuality, and in fairly short order.

The only negative about "PS 3.0" is the marketing usage that seems to be cropping up, which pretends PS 3.0 is the only thing that can do things that PS 2.0 can also do in real time. But this really has nothing to do with evaluating the NV40's hardware virtues regarding its PS 3.0 implementation, so we should probably try to avoid mixing them up.
 
Yes, there are going to be times to use PS3.0 and times not to use PS3.0, but it's a checkbox feature. It can't run any of the PowerVR PS3.0 demos because of PS3.0 limitations imposed as of DX9.0c.

Its PS3.0 support seems like it might be rather similar to NV3x's PS2.0 support--luckily, PS3.0 doesn't really matter at all at this point in time (and probably never will).
 
The Baron said:
http://www.hardware.fr/articles/491/page6.html

9 cycles instead of what should be 2. Last paragraph. Find someone who knows French (that's what I did). Yes, it's a branching penalty.

OK, so exactly why should it be 2?

Other than that, if the test is really what they quote in the article, then it's meaningless. You need to know how the branch penalty is handled...

Can other ops overlap with the overhead? Does it assume the branch is taken/not taken? Even if it's a static 9-cycle latency, if you coded something like a raytracer, where in PS2.0 you would have to unroll the entire loop, early exit from the unrolled loop is potentially a huge win.

How much of a win PS3.0 will be in the card's lifetime remains to be seen.

As a developer it's very nice to have 3.0 now. I'll be picking up an NVidia card this time around, purely because ATI don't support 3.0, but I'm hardly the common case.
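The early-exit argument above can be put into a back-of-envelope cost model. Only the 9-cycle branch figure comes from the article under discussion; every other number below is an assumption for illustration.

```python
# Back-of-envelope model: unrolled PS 2.0 loop vs. PS 3.0 loop with early exit.
BODY = 20        # cycles per loop iteration (assumed)
BRANCH = 9       # cycles charged per dynamic branch (the quoted figure)
MAX_ITERS = 64   # unroll depth the PS 2.0 version always pays for (assumed)

def ps20_cycles(iters_needed):
    # Unrolled loop: every pixel executes all iterations regardless.
    return MAX_ITERS * BODY

def ps30_cycles(iters_needed):
    # Real loop with an early-exit test each iteration.
    return iters_needed * (BODY + BRANCH)

# Assumed workload: most pixels converge quickly, a few need the full loop.
pixels = [4] * 80 + [16] * 15 + [64] * 5
avg20 = sum(map(ps20_cycles, pixels)) / len(pixels)
avg30 = sum(map(ps30_cycles, pixels)) / len(pixels)
print(avg20, avg30)  # early exit wins despite the per-branch overhead
```

Under these assumptions the early-exit version averages a fraction of the unrolled cost, even paying the branch penalty on every iteration; the win shrinks as more pixels need the full loop.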
 
ChrisRay said:
Supersampling DOES blur text, though. You do have a GeForce4 Ti, right? Just apply 4x supersampling and load Star Wars Galaxies, EverQuest, etc. You will notice a blur. It's not a Gaussian blur like Quincunx, though; it sharpens textures but blurs text.
Actually, my GeForce4 Ti died. Regardless, it doesn't support 4x supersampling anyway, so it doesn't matter.

I did have a GeForce DDR for some time, though, and I guarantee you that it did not blur text with FSAA.
 
erp we dont know if ati supports ps 3.0 until they actually announce the card

all we're hearing now is rumors

its dumb to automatically assume nvidia is better before giving ati a chance to respond

sure, i might be wrong, and nvidia could dominate this round, but one of the many voices in my head :LOL: tells me that ati isnt going to roll over and die
 
Snarfy said:
erp we dont know if ati supports ps 3.0 until they actually announce the card

all we're hearing now is rumors

its dumb to automatically assume nvidia is better before giving ati a chance to respond

sure, i might be wrong, and nvidia could dominate this round, but one of the many voices in my head :LOL: tells me that ati isnt going to roll over and die

FWIW I have very good reason to believe (read close to absolute certainty) that ATI will not support PS3.0 in R420.
 
The Baron said:
http://www.hardware.fr/articles/491/page6.html

9 cycles instead of what should be 2. Last paragraph. Find someone who knows French (that's what I did). Yes, it's a branching penalty.
That website claims it should cost two cycles.

But was the branch implemented as a dynamic or static branch? Static branching should have little to no performance hit. If static branching does indeed have an ~9 cycle overhead, then that is very disappointing, and compilers would pretty much always be better-off unrolling static branches. If it's for dynamic branching, however, a 9 cycle overhead is not bad at all, and may possibly be masked for longer shaders.

On the other hand, dynamic branching in the pixel shader pretty much has to have a significant performance hit. If you read nVidia's .pdf's released at GDC, you'll note that they expect ~2 cycle overhead for dynamic branching in the vertex shader. This may be where the article got that number, but that figure definitely won't apply to the pixel shader.

Regardless, branching performance need never be a detriment to overall performance. Branching in the pixel shader isn't the only way to get the desired effect: compares are also available. Static branching should help by reducing state changes, while dynamic branching should help by reducing the number of instructions executed. Neither should ever be a detriment to performance, because whenever one type of branch or the other could lower performance, other methods can be used to get the same effect.
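The claim that pixel-shader dynamic branching "pretty much has to have a significant performance hit" comes down to pixels being shaded in groups. A toy Python model (group size and cycle costs are illustrative assumptions, not NV40 figures) shows the mechanism:

```python
# Toy model: pixels are shaded in small groups (e.g. 2x2 quads), and when
# pixels within a group take different sides of a branch, the hardware
# must run both sides for the whole group.
GROUP_IF_COST = 10   # cycles for the if-side of the branch (assumed)
GROUP_ELSE_COST = 2  # cycles for the else-side (assumed)

def group_cycles(takes):
    # `takes` flags which pixels in the group take the if-side.
    cost = 0
    if any(takes):               # at least one pixel needs the if-side
        cost += GROUP_IF_COST
    if not all(takes):           # at least one pixel needs the else-side
        cost += GROUP_ELSE_COST
    return cost

coherent = [True, True, True, True]     # whole quad agrees: one side runs
divergent = [True, False, False, True]  # mixed quad: both sides must run
print(group_cycles(coherent), group_cycles(divergent))  # 10 vs 12
```

Vertex shaders process vertices independently, so they dodge this divergence cost, which is one reason a vertex-shader overhead estimate wouldn't carry over to the pixel shader.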
 
Hey, who can read French? Not me. Somebody translate the damn thing already. I'm just really not excited about PS3.0--if the 6800 Ultra is the only thing that supports it, say hello to the PS1.4 bin. If it doesn't support it well, it's DOA. If it supports it very well, that would be interesting, but I still don't think it will catch on until there's a broader market base. Hell, look at PS2.0, and NV and ATI have been supporting that since August 2002. Heh.
 