NVIDIA shader compiler details


jsea

At the end of the nv_perf3x.dll file that ships as part of NVShaderPerf,
there is a block of text that looks like the command-line help for
a shader compiler used internally at NVIDIA.

The full text starts at offset 0x121E90 (easy to find with a hex editor)
and runs to almost 900 lines. It includes a lot of detail about
hardware capabilities and even mentions NV4x.
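
If anyone wants to pull the text out without squinting at a hex editor, a quick Python script along these lines should do it (the offset is the one above; treating everything printable from there to the end of the file as the help text is just my assumption):

Code:
OFFSET = 0x121E90  # where the help text starts, per the hex editor

with open("nv_perf3x.dll", "rb") as f:   # point this at your local copy
    f.seek(OFFSET)
    data = f.read()

# keep tabs/newlines and printable ASCII, drop everything else
text = "".join(chr(b) for b in data if b in (9, 10, 13) or 32 <= b <= 126)

with open("shader_compiler_help.txt", "w") as out:
    out.write(text)

print("wrote", len(text), "characters")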

Most of it is very technical, but there are quite a few interesting
bits, especially related to hand-tuned shaders. Below are some notable
quotes, featuring "hand tuned shaders", "3DMark" and "UT"!

First, all the data on any line whose initial keyword ends in INFO is
data that will be used in shader matching. That is to say this hand
tuned shader will be used if the data on the INFO lines matches the same
data associated with an incoming hand tuned shader.

SUBAPPINFO: 0xXXXXXXXX
This line identifies the sub-application. Not every application has
a sub-application. Things like the levels in UT are each a different
sub-app, or each of the tests in 3dmark03 or 3dmark01 are different
subapps.

CHIPINFO: identifier [+|- identifier] [+|- identifier] ...
This line is REQUIRED and identifies chips on which the pixel shader
can be used. The chips available are: ALLCHIPS, NV30, NV31, NV33,
NV34, NV35, NV36, NV3X, NV40, NV41, NV42, NV4X. Examples:

TEXTURESTAGEANISOCAP: (stage,cap) [, (stage,cap)] [, (stage,cap)] .
This line can cap the aniso for a given texture stage. The stage values
are hex numbers from 0x00 to 0x0f inclusive. The aniso cap values are a
decimal number from 1 to 16 inclusive. You cannot have the same stage
show up in more than one (stage,cap) pair in a given shader. Since
different hardware supports different sets of aniso levels on textures,
the value given here will be rounded down to the closest supported
hardware value when it is used. e.g. on Rankine only 1X, 2X, 4X, 8X are
supported. Curie supports all those plus 6X, 10X, 12X, 16X.
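
The way I read the INFO-line part, the driver fingerprints an incoming shader and swaps in a hand-tuned replacement only when every INFO field matches. Purely my guess at the logic (ignoring the +/- syntax on CHIPINFO), not anything taken from the DLL itself:

Code:
# toy guess: a hand-tuned profile applies only if all of its INFO fields
# match the data gathered for the incoming shader
def profile_matches(profile_info, incoming_info):
    for key, value in profile_info.items():
        if key == "CHIPINFO":
            # ALLCHIPS reads like a wildcard for any chip
            if "ALLCHIPS" not in value and incoming_info["CHIP"] not in value:
                return False
        elif incoming_info.get(key) != value:
            return False
    return True

profile  = {"SUBAPPINFO": 0x00000002, "CHIPINFO": {"NV35", "NV36"}}
incoming = {"SUBAPPINFO": 0x00000002, "CHIP": "NV35"}
print(profile_matches(profile, incoming))  # True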

Maybe NV3x is "Rankine" and NV4x is "Curie"?
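
And if that guess is right, the "rounded down to the closest supported hardware value" rule for TEXTURESTAGEANISOCAP would come out to something like this (the supported sets are straight from the quoted text; the rest is mine):

Code:
# aniso levels supported per family, as listed in the quoted text
SUPPORTED = {
    "Rankine": [1, 2, 4, 8],                 # NV3x, if the guess holds
    "Curie":   [1, 2, 4, 6, 8, 10, 12, 16],  # NV4x, if the guess holds
}

def effective_aniso(cap, family):
    # round the requested cap DOWN to the nearest level the hardware supports
    return max(level for level in SUPPORTED[family] if level <= cap)

print(effective_aniso(16, "Rankine"))  # 8  -- a 16x cap on Rankine lands on 8x
print(effective_aniso(6,  "Rankine"))  # 4
print(effective_aniso(6,  "Curie"))    # 6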
 
Wow, cool find! Thanks for sharing.

I don't understand all of it, but it looks like the NV40 will have 16xAF... but 6x, 10x, and 12xAF too? :|

(Funny about the 3DMark & UT stuff too. :LOL: )
 
So when you select 16x AF in the drivers, you're fairly likely to get something less in reality? (depending on whether there are hand-tuned shaders or not?)

Nonono really ATI is totally in the same league of BS as NV :rolleyes:
 
arrrse said:
So when you select 16x AF in the drivers, you're fairly likely to get something less in reality? (depending on whether there are hand-tuned shaders or not?)

Nonono really ATI is totally in the same league of BS as NV :rolleyes:

Actually, yes. Ever heard of angle-dependent AF on ATI products?

btw, anyone interested in copy-pasting the actual text in the .dll to a txt file, hosting it, and posting a link so we could read all of it? [The hex editor I use (xvi32) doesn't do too well at reading text :) ]
 
Mendel said:
arrrse said:
So when you select 16x AF in the drivers, you're fairly likely to get something less in reality? (depending on whether there are hand-tuned shaders or not?)

Nonono really ATI is totally in the same league of BS as NV :rolleyes:

Actually, yes. Ever heard of angle-dependent AF on ATI products?
Actually, no. ATI doesn't lie about it.
 
It's also built into the design of the algorithm implemented in hardware on the chip, rather than the drivers rounding down.
 
Althornin said:
Actually, no. ATI doesn't lie about it.

High-level anisotropic filtering, or the lack thereof, is still apparent in both cases. NVIDIA "lies" in software; ATI limits angles through hardware compromises.

Effectively I select 16x AF and get something completely different. Therefore I call it Python filtering :D
 
I guess what I find saddest about the last year and NVIDIA's NV3x malarkey is that no one even bothers to be indignant or get upset about further flat-out proof of NVIDIA bald-facedly lying about the performance of their cards in non-benchmark apps... not to mention lowering quality without giving users the *option* to turn the chea^h^h^h^hoptimizations off!
 
Ich, I'd say that people have been upset about it; they just voiced their dissent with their wallets. That's why ATI's market share grew significantly while NVIDIA's fell.
 
Ichneumon said:
I guess what I find saddest about the last year and NVIDIA's NV3x malarkey is that no one even bothers to be indignant or get upset about further flat-out proof of NVIDIA bald-facedly lying about the performance of their cards in non-benchmark apps... not to mention lowering quality without giving users the *option* to turn the chea^h^h^h^hoptimizations off!

It's getting hard to keep up the same level and intensity of righteous indignation at NVIDIA's tactics when they just keep topping themselves and denying it in spite of the proof. ;)

I'll try though... BAD NVIDIA, BAD!!!!
 
Did NVIDIA say they weren't going to do app detection anymore? I thought they flat out said they would continue to do it, with the caveat that it would not lower IQ. Suppose NVIDIA detected that a game consists of only 90-degree angles, so the driver drops the requested maximum anisotropy down to the minimum level required for those angles. There might be some parts of the game that will look bad because of bad angles, but I fail to see how this would be any different from an app requesting a level of anisotropy from the hardware and the hardware failing to satisfy that request at certain angles. You've only pushed the decision from the silicon into the driver.

It would be nice if there were zero app detection and developers always got what they asked for, but unfortunately, it doesn't always work. One of the biggest uses of app detection in drivers is to fix bugs: a game was developed against one card's "buggy driver", and your driver has to detect that application and provide a bug-for-bug compatibility workaround instead of adhering to the spec.
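
To make that concrete, here's a made-up illustration of what a driver-side detection table amounts to. The names and numbers are invented, but the mechanism is the same whether the payload is a compatibility workaround or an anisotropy clamp:

Code:
# invented example of per-app profiles keyed on the executable name
APP_PROFILES = {
    "someoldgame.exe":   {"emulate_old_zbias_behavior": True},  # bug-for-bug compat
    "somebenchmark.exe": {"max_aniso": 4},                      # IQ/performance tradeoff
}

def effective_aniso(exe_name, requested_aniso):
    profile = APP_PROFILES.get(exe_name.lower(), {})
    return min(requested_aniso, profile.get("max_aniso", requested_aniso))

print(effective_aniso("somebenchmark.exe", 16))  # 4, despite the app asking for 16
print(effective_aniso("unknown.exe", 16))        # 16, no profile -> untouched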
 
DC, that's a bullshit excuse, and you know it. "Moving the limitation from the silicon into the driver?" Give me a break. First of all, that isn't NVIDIA's call to make. That's not a bug fix; that's a performance optimization that comes with an IQ tradeoff. Second, even if NVIDIA puts the optimization in, there had better be a way to turn it off. If the hardware is capable of doing something and the software and the IHV decide that, for performance reasons, I actually don't want to do that (never mind that I've requested it), that's wrong. Hardware should be judged primarily on its own merits, meaning maximum image quality, not on when it is most comparable to its competitors' hardware.

I'm working on ripping the strings now.
 
I think it's been replaced. I downloaded it earlier today, Firefox screwed me over (didn't save it, just opened it), downloaded it again, and lo and behold, no strings.

Somebody want to send me the string-laden one?
 
The Baron said:
I think it's been replaced. I downloaded it earlier today, Firefox screwed me over (didn't save it, just opened it), downloaded it again, and lo and behold, no strings.

Somebody want to send me the string-laden one?
I got the text bits, I could toss it up somewhere if you'd like.
 