nV40 info w/ benchmarks

Well, if that's true and Half-Life 2 is running at full DX9 precision, it looks like they got about 35 fps on the 9800 Pro. Not impressive to me in the least.

Now those Unreal benchmarks are. Of course, I'd have to see the actual image quality that comes out of it before I even think of doing it.

Edit: sorry if it was messing up the board.
 
I mentioned this a while back to Uttar, and if those specifications are correct then it looks like the pipeline is merely two NV35 pipelines with some fixes to the PS in order to increase the number of instructions and also fix the FP32 temp usage. PS2.0 Extended wasn't that far off PS3.0; other than instruction lengths, one of the main differences will be the conditionals. It'll be interesting to see if NV have just taken the conditional write masks and exposed them as full conditionals, in which case the PS3.0 performance will drop off dependent on the number of potential outcomes.
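
To put rough numbers on that last point, here is a toy cost model in Python. It is only a sketch under stated assumptions: the function names and the per-outcome instruction counts are made up for illustration, and nothing here describes how NV40 actually schedules shaders. It just contrasts write-mask style execution, where every possible outcome is computed and the unwanted results are masked off, with a real branch that only runs the path taken.

```python
# Toy cost model (illustrative assumption, not a description of real hardware).
# Each possible outcome of a conditional is represented by its instruction count.

def predicated_cost(path_lengths):
    """Write-mask style: every path is executed, unwanted results are masked."""
    return sum(path_lengths)

def branching_cost(path_lengths, taken):
    """True dynamic branch: only the path at index `taken` is executed."""
    return path_lengths[taken]

# Hypothetical instruction counts for four possible outcomes.
paths = [10, 12, 8, 15]

masked = predicated_cost(paths)       # pays for all four paths
branched = branching_cost(paths, 2)   # pays for the third path only
```

With four outcomes of 10, 12, 8 and 15 instructions, the masked version pays for all 45 every time, while a true branch pays only for the one path it takes, so the gap grows with the number of outcomes.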
 
DaveBaumann said:
if those specifications are correct then it looks like the pipeline is merely two NV35 pipelines
Do you mean NV35 with twice the shader pipelines (pixel and/or vertex?) or do you mean 2x full NV35? The die would be too big for 2x full NV35, or what do you think?

DaveBaumann said:
it'll be interesting to see if NV have just taken the conditional write masks and exposed them as full conditionals, in which case the PS3.0 performance will drop off dependent on the number of potential outcomes.
Isn't that what ATI was supposed to be doing, too?
 
madshi said:
Do you mean NV35 with twice the shader pipelines (pixel and/or vertex?) or do you mean 2x full NV35? The die would be too big for 2x full NV35, or what do you think?

No, just the PS pipes. It looks like they are doubling up on the number of quad pipelines NV30/35 has - from those specifications at least.

DaveBaumann said:
it'll be interesting to see if NV have just taken the conditional write masks and exposed them as full conditionals, in which case the PS3.0 performance will drop off dependent on the number of potential outcomes.
Isn't that what ATI was supposed to be doing, too?

ATI don't have conditional write masks in the PS.
 
DaveBaumann said:
ATI don't have conditional write masks in the PS.
Oh, okay.

But ATI was also supposed to use a cheap (and eventually slow) way to achieve Shader 3.0, or am I wrong? When I heard that, I thought "NVidia might have an advantage in Shader 3.0 speed then". But now what you said sounds like *both* IHVs would implement a cheap (and eventually slow) method to implement Shader 3.0. What do you think?

Thanks!
 
Nice catch, nelg.

Comparing numbers (blindly, considering we don't know what maps were used to obtain these numbers) with the 5950U in AT's Fall 2003 High-End Shootout, I see some impressive and some just good improvements. UT2K3 numbers seem to jump 30-50%, though this is one bench where the choice of map is key, so I can't be sure that's a legit comparison. Halo numbers seem to be doubled, an impressive achievement.

I'm not gonna touch HL2 numbers with a ten foot pole, though. :)

Dave, is it possible to double the pixel pipelines (4x2 to 8x2, if I understand you) with only an additional 50M transistors?

Edit: OK, curiosity is killing me. Using xbit's illicit numbers, the NV40 is either twice as fast or merely 25% faster than a 128MB 9800P.

I'm really hoping for improved AA.
 
madshi said:
But ATI was also supposed to use a cheap (and eventually slow) way to achieve Shader 3.0, or am I wrong? When I heard that, I thought "NVidia might have an advantage in Shader 3.0 speed then". But now what you said sounds like *both* IHVs would implement a cheap (and eventually slow) method to implement Shader 3.0. What do you think?

Well, if this is how PS3.0 is implemented in NV40, then it would only be slow (beyond the cost of executing that many instructions) if conditionals are used, and I'd assume that NV would recommend to developers that they not be used in heavy load situations.

I'm curious about the VS support though. The texture lookup in the VS is probably the more interesting element of the shader 3.0 specification.
 
Re: Nice catch, nelg.

Pete said:
Dave, is it possible to double the pixel pipelines (4x2 to 8x2, if I understand you) with only an additional 50M transistors?

Well, it's probably best not to put too much stock in the quoted transistor numbers. Bear in mind, though, that there is a lot of logic that you won't need to double up on.
 
Wasn't Half-Life 2 proposed as system/CPU limited for the 9800 (non-SE) family at 1024x768 with no AA and AF? Wouldn't 95 fps be primarily a factor of a high speed system (perhaps an Athlon 64?), even if full DX 9 settings were used and the result indicates problems with floating point shading are no longer hobbling performance dramatically?

As for UT2003, I find those numbers pretty believable for some flyby maps at a decent amount of quality (as in, not like the Quake 3, etc, numbers put out before the NV30 launch). However, was the source for all numbers nVidia or some 3rd party? I just ran a quick Benchmark.exe at 1600x1200, without closing any programs on a 3.0GHz/250FSB/200MEM system and 432/398 versus their 466/366, Vsync set to app pref:
No AA, No AF - 127 fps average (152 fps asbestos, 102 fps antalus).
4xAA 16xAF (Control panel Quality) - 73 fps average (85 fps asbestos, 61 fps antalus).
Plenty of opportunities for assumptions to distort things here without knowing the system and maps used across the board, especially with flyby dependence on system performance. For example, were the 9800 Pro numbers pulled from the same map, settings, and same system, or pulled from some other non-disclosed source (and therefore completely useless in the context given, instead of simply suspect)?

The included HALO benchmark mode drops from 33.56 fps at 1024x768 to 26.33 fps at 1600x1200 (vsync off, all settings maximum, no programs closed)...this again makes me wonder what system and settings they use, because the system seems to have a large impact.
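
The figures in this post can be sanity-checked with a few lines of Python. This is only a sketch: the helper name pct_change is hypothetical, and the inputs are just the numbers quoted above.

```python
def pct_change(before, after):
    """Percentage change going from `before` to `after`."""
    return (after - before) / before * 100.0

# UT2003 flyby averages from the two maps (asbestos, antalus).
ut_no_aa = (152 + 102) / 2   # No AA, No AF
ut_aa_af = (85 + 61) / 2     # 4xAA 16xAF

# Cost of enabling 4xAA 16xAF on the flyby average.
ut_drop = pct_change(ut_no_aa, ut_aa_af)

# Halo: 1024x768 -> 1600x1200.
halo_drop = pct_change(33.56, 26.33)

# How many more pixels 1600x1200 pushes compared with 1024x768.
pixel_ratio = (1600 * 1200) / (1024 * 768)
```

The 1600x1200 mode pushes about 2.44 times as many pixels as 1024x768, yet the Halo result drops by only about 21.5%, which would be consistent with the system rather than the card setting the limit there.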
 
Are folk placing much faith in these data? If IHVs only get the chips in February then it's a leak that has to be sourced directly to NVidia demoing it to someone who writes articles online.

I just thought the wraps would stay on for a few more weeks at a minimum.

So do folk think this data is genuine?
 
Minor suggestion, after wondering why I had to keep scrolling right:

[ url = long url ] short word or phrase [ /url ] (dropping spaces)
 
Dave: I doubt they'd use conditional write masks like that, because then they'd need 65K *static* instructions supported if they want to reach PS3.0 compliance. Certainly not an option, unless you manage to convince MS to accept only 2K "dynamic" instructions...

Unless they'd implement looping, but not conditionals. And that seems like a completely illogical design choice too, IMO.
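
A rough sketch of why the static instruction count balloons without real flow control, in Python. The function names and the 32-instruction loop body are assumptions picked for illustration, not figures from any spec sheet. The idea is simply that a loop the hardware cannot actually iterate has to be flattened out, so the program stored on the chip becomes as long as the program executed.

```python
def dynamic_length(body_len, trips):
    """Instructions actually executed by a loop of `trips` iterations."""
    return body_len * trips

def static_length(body_len, trips, has_loops):
    """Instruction slots the program occupies.

    With real looping the body is stored once; without it the loop
    must be fully unrolled, one copy of the body per iteration.
    """
    return body_len if has_loops else body_len * trips

# Hypothetical shader: a 32-instruction body run 2048 times.
executed = dynamic_length(32, 2048)
stored_with_loops = static_length(32, 2048, has_loops=True)
stored_unrolled = static_length(32, 2048, has_loops=False)
```

In this example the unrolled version needs 65536 stored instructions to match what the looping version does with 32.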


Uttar
 