NVIDIA shader compiler details

The Baron said:
DC, that's a bullshit excuse, and you know it. "Moving the limitation from the silicon into the driver?" Give me a break. First of all, that isn't NVIDIA's call to make in the first place. That's not a bug fix, that's a performance optimization that comes with an IQ tradeoff. Second, even if NVIDIA put the optimization in, there had better be a way to turn it off. If the hardware is capable of doing something and the software and IHV decide that, for performance reasons, I actually don't want to do that (forget that I've requested it), that's wrong. Hardware should be judged primarily by its own merits, meaning maximum image quality--not when it is the most comparable to its competitors' hardware.

I disagree. If your hardware can run at superior quality or precision to other hardware, I think you're justified in doing apples-to-apples comparisons. Imagine if ATI HW could run at FP24 and FP32 (yes, it's a silly example). Would you suggest that shaders which request full precision (no _pp hint) always run at FP32, or would ATI be justified in using the minimal FP mode needed to beat NVidia on IQ and performance?

Or take Doom3, which has separate paths for PS1.4 and PS2.0 HW, but which look identical. If your hardware supports 1.4 and 2.0, but runs 1.4 path faster, wouldn't your driver be justified in reporting "1.4" capability only?

I agree that the user/developer should still have control, perhaps via a checkbox in the control panel: "Allow driver performance optimizations (yes/no)". I don't agree with running max HW features all the time.

I also think NVidia is definitely guilty of some "fishy" behavior in their drivers, but on the other hand, people go too far in claiming that all of their performance enhancements in their driver/compiler amount to "cheats". Some are legitimate, some are questionable (IQ trade-off), and some are frankly hardcore cheats (clip planes). It's a grey picture, not black and white.
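For anyone unfamiliar with the hint being argued over: in DX9 HLSL, full precision is the default, and partial precision is requested through the half type, which the compiler turns into _pp-hinted instructions. Here's a minimal sketch (illustrative only, not from this thread; the sampler and entry-point names are made up):

```hlsl
sampler baseMap : register(s0);

// Full precision: no hint; the DX9 spec minimum for ps_2_0 full precision is FP24.
float4 ps_full(float2 uv : TEXCOORD0) : COLOR
{
    float4 c = tex2D(baseMap, uv);
    return c * c.a;                  // compiles to a plain mul (no _pp)
}

// Partial precision: 'half' types let the compiler emit _pp-hinted
// instructions, which NV3x hardware may run at FP16 instead of FP32.
half4 ps_partial(float2 uv : TEXCOORD0) : COLOR
{
    half4 c = tex2D(baseMap, uv);
    return c * c.a;                  // may compile to mul_pp
}
```

Each entry point can be compiled separately, e.g. fxc /T ps_2_0 /E ps_full or /E ps_partial, to see whether _pp modifiers show up in the output.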
 
It's a pity that the current build doesn't support NV40 analysis; otherwise we could test how much faster the NV40 is in shader-sensitive situations. :cry:
 
Or take Doom3, which has separate paths for PS1.4 and PS2.0 HW, but which look identical
Not true as far as I know; it's just that the margin of improved quality from the 2.0 path is relatively small due to the content being used.
And no, it should be entirely up to the consumer to decide whether to run the faster 1.4 path or the slower 2.0 path; that's why they spent so much money on the card.

Sounds like the contents of nvperfstrategy.h would be interesting.

(btw, swizzle has at least 3 meanings: apparently an altered form of swindle; a cocktail (swizzle stick); and whatever mathematical function NV is using it for, as I've seen it used)
 
No, it is black and white. Does an optimized driver produce exactly the same image as the non-optimized driver? Not appear to, like people claim with brilinear, but 100%, pixel for pixel, exactly the same. Mathematically equivalent, as they say. If yes, okay, leave it. If no, either remove it or make it an option. I won't even get into 3DMark.

That's it right there. Hardware is different. Enthusiasts (I'm not talking about 5200s here) know (or should know) what that $400 or $500 piece of silicon can do before they buy it; if we just worry about apples-to-apples comparisons, we'll constantly reduce IQ until we're running everything in wireframe mode.

But, anyway, it is simple. If it changes IQ by deviating from what the application requests, it should be an option and never forced. If it does not, go ahead and force it.
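One way that pixel-for-pixel test could actually be run (a rough sketch under assumed conditions, not something proposed in the thread): capture the same frame twice, once with and once without the driver optimization, bind both as textures, and draw a full-screen quad through a diff shader; any non-black result means the two renders are not mathematically equivalent. The texture and entry-point names below are made up.

```hlsl
// Illustrative diff shader: assumes the same frame has been captured twice,
// once with and once without the driver optimization, and bound as textures.
sampler frameOptimized : register(s0);
sampler frameReference : register(s1);

float4 main(float2 uv : TEXCOORD0) : COLOR
{
    float4 a = tex2D(frameOptimized, uv);
    float4 b = tex2D(frameReference, uv);
    // Any non-black output pixel marks a spot where the "optimized" image
    // is not pixel-for-pixel identical to the reference.
    return abs(a - b);
}
```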
 
DemoCoder said:
The Baron said:
DC, that's a bullshit excuse, and you know it. "Moving the limitation from the silicon into the driver?" Give me a break. First of all, that isn't NVIDIA's call to make in the first place. That's not a bug fix, that's a performance optimization that comes with an IQ tradeoff. Second, even if NVIDIA put the optimization in, there had better be a way to turn it off. If the hardware is capable of doing something and the software and IHV decide that, for performance reasons, I actually don't want to do that (forget that I've requested it), that's wrong. Hardware should be judged primarily by its own merits, meaning maximum image quality--not when it is the most comparable to its competitors' hardware.
I disagree. If your hardware can run at superior quality or precision to other hardware, I think you're justified in doing apples-to-apples comparisons. Imagine if ATI HW could run at FP24 and FP32 (yes, it's a silly example). Would you suggest that shaders which request full precision (no _pp hint) always run at FP32, or would ATI be justified in using the minimal FP mode needed to beat NVidia on IQ and performance?
What does any of this have to do with brilinear, which is the main point of the post you are replying to?
Or take Doom3, which has separate paths for PS1.4 and PS2.0 HW, but which look identical. If your hardware supports 1.4 and 2.0, but runs 1.4 path faster, wouldn't your driver be justified in reporting "1.4" capability only?
When did Carmack ever state they looked identical? In fact, he stated the opposite, but that the differences weren't large.
I agree that the user/developer should still have control, perhaps via a checkbox in the control panel: "Allow driver performance optimizations (yes/no)". I don't agree with running max HW features all the time.
Looks like NVIDIA agrees with you :LOL:

-FUDie
 
DemoCoder said:
I don't agree with running max HW features all the time.

But that's up to the user, isn't it? It's not up to Nvidia.

And you see, that's the whole problem. Nvidia isn't lowering quality as a service to the user; they are lowering quality to mislead consumers about the performance of their hardware.
 
arrrse said:
(btw, swizzle has at least 3 meanings: apparently an altered form of swindle; a cocktail (swizzle stick); and whatever mathematical function NV is using it for, I've seen it used)

Thanks Arrrse, "swizzle stick" was the one on the tip of my tongue that I couldn't remember.

Nick Spolec said:
DemoCoder said:
I don't agree with running max HW features all the time.

But that's up to the user, isn't it? It's not up to Nvidia.

And you see, that's the whole problem. Nvidia isn't lowering quality as a service to the user; they are lowering quality to mislead consumers about the performance of their hardware.

I'm with Baron and everyone else who feels it should be up to the end-user to decide. If the hardware is CAPABLE of doing it right, it should be an option for the user. (BTW-Hi Nick! :) )
 
Nick Spolec said:
DemoCoder said:
I don't agree with running max HW features all the time.

But that's up to the user, isn't it? It's not up to Nvidia.

I absolutely agree there. In no way can nvidia call this a benefit for the user if there is no way to disable it. Leave it on by default or whatever, but stick a checkbox in the driver control panel and everyone would be happy. Benchmark runners can add another set of bar graphs for "nvidia - no optimizations", we can have slews of articles talking about the IQ differences between app-profile optimizations and "straight" hardware, and everyone can be happy.

Dammit, I had almost convinced myself to buy an nvidia next (seemingly more complete feature set and, more importantly, fairly solid linux drivers), but seeing more stuff like this makes my skin crawl.

As for swizzle.. the term, I think, comes from swizzle stick, which is used to shake things around -- it's used in shaders to "mix around" components of a vector.
 
Whether or not this is "cheating", it does mean the world is turned upside down.

The game (UT, 3dmark) is the platform, and the driver is the app. Instead of the driver being a stable base against which different applications are coded, the applications are treated as a stable base against which driver updates are developed.

You would think that if there were app specific optimizations for UT, nVidia and Epic would get together and actually put them in UT. But I guess it is easier to roll changes into the next release of the graphics driver than to have UT put out a patch.
 
VVukicevic said:
As for swizzle.. the term, I think, comes from swizzle stick, which is used to shake things around -- it's used in shaders to "mix around" components of a vector.
I just remembered my crazy Uncle Shorty used to collect 'em and keep them in his rumpus room/bar/basement back when I was a kid in the late 60s, but I couldn't connect up the name 'til someone mentioned "swizzle stick". :oops:

Now I'm crashing and flashing back to some family gatherings in his basement from 3 decades ago; I'll probably have some weird dreams tonight involving a 6' inflatable "Schlitz" bottle that was the pride of my uncle's rumpus room/bar/basement. :rolleyes:
 
antlers said:
You would think that if there were app specific optimizations for UT, nVidia and Epic would get together and actually put them in UT. But I guess it is easier to roll changes into the next release of the graphics driver than to have UT put out a patch.

Yeah, but really, how fair would it be for Futuremark to release a "Special NVidia-optimized version of 3DMark"? ;)
 
My view of a valid point buried in DC's post

DemoCoder said:
The Baron said:
DC, that's a bullshit excuse, and you know it. "Moving the limitation from the silicon into the driver?" Give me a break. First of all, that isn't NVIDIA's call to make in the first place. That's not a bug fix, that's a performance optimization that comes with an IQ tradeoff. Second, even if NVIDIA put the optimization in, there had better be a way to turn it off. If the hardware is capable of doing something and the software and IHV decide that, for performance reasons, I actually don't want to do that (forget that I've requested it), that's wrong. Hardware should be judged primarily by its own merits, meaning maximum image quality--not when it is the most comparable to its competitors' hardware.

I disagree. If your hardware can run at superior quality or precision to other hardware, I think you're justified in doing apples-to-apples comparisons. Imagine if ATI HW could run at FP24 and FP32 (yes, it's a silly example). Would you suggest that shaders which request full precision (no _pp hint) always run at FP32, or would ATI be justified in using the minimal FP mode needed to beat NVidia on IQ and performance?

But that example doesn't remotely relate to what you were discussing with "moving the limitation from the hardware to the driver". That kind of faulty analogy is necessary to pretend that it does, and I can't imagine how you can be thinking rationally and conclude that it is somehow better coming from you than from any random person you decide to label a f@nboy. :-?

SPCMW condensed discussion of the obvious said:
Back to the anisotropic discussion, and the obvious departure points you overlook:

Given the latitude of what defines anisotropic filtering, the issue with nVidia's current cheats is not the mechanics of what they do...lowering filtering on selected texture stages can clearly offer benefit without image quality loss, or be desirable for a user to have control over, just like slider positions with less than maximum aniso and AA settings.
These cheats are not the same as going to a minimum spec precision when it is allowed, or calling trilinear trilinear and brilinear something else...they are (quite obviously, if you're not hell-bent on defining your position not by logic but by a negation of the f@nboy opinion you view disagreement as, and end up resembling) a policy of (1) behaving differently when image quality is being examined in a test app in contrast to (2) when fps is being measured, in order to (3) deceive a consumer specifically about what they get simultaneously for both when they are playing a game on your hardware.

How you can expect to be taken seriously when you ignore obvious things, like how nVidia's current/past uses of brilinear, shader substitutions, and texture stage reductions (as well as clip planes) fit this while your precision example doesn't resemble these at all, I do not know.

Using fp16 when the _pp hint is specified, or perfectly using _pp when it will not introduce errors (as nVidia developer relations can actually do by working with developers if they focus on equivalent shaders, but their "compiler" fails to do consistently, almost as if (3) is its primary function), does not fit this label of "cheating", though equating cheating by using fp16 with using fp16 when it is specified is an illogical condemnation of nVidia that ignores such distinctions. How you don't see the resemblance to your illogical analogy I don't know, and so I have concluded you don't care as long as you don't disagree with the conclusion.

Even when not ignoring the distinction, it is clear that it is possible that at some point, with nVidia's (user-controlled) application detection options and an actual change in nVidia's (apparently justified in many cases, unfortunately) perception of users' willingness to excuse misrepresentations, or of their not knowing better, things such as brilinear and texture stage optimizations can be beneficial and not cheating. What I tend to think is that for you this is an issue of focusing on the possibility of these things being used in another way, the problem being your ignoring the simple observation that they are not being used in that other way at present and (in my view, typically) going overboard in your defense of that possibility by working to outright dismiss people's reasons for judging an IHV in ways you've decided not to bother to do.

Hopefully, with the bad-analogy air cleared, I'm right in thinking that your hypothetical angle-dependent aniso discussion supports the former possibility, because it is a valid point that gets lost in the illogical comparisons you present (and from rumors, it might actually be an issue for the NV40). To address it: it wouldn't be a cheat simply if it gained performance by a shortcut that manifested angle-dependent issues; it would be a cheat if it was misrepresented. Pretty damned simple issue...just like brilinear and substituting fp16 (or fx12) without prioritizing compliance with what is being asked for. Don't forget (1), (2), and (3), please.

What I think is the valid point

Recognizing that, what does present itself as an issue (still simple) is that of whether, for example, an aniso test app showed no angle dependence with the same settings that actually resulted in angle dependence in a game/benchmark.
If there is no such deception, even if the hardware could do better and there simply was no setting that exposed the "full" aniso, it would be an issue of denying users image quality they could otherwise have, but one consumers could be directly apprised of by available tools. It wouldn't be good to have hardware functionality denied for fps boosting, and some people could view that as cheating (because of the various meanings of that word, and that users are being denied/"cheated of" something the hardware is capable of), but as you propose it clearly isn't the same thing.

I hope we stay clear on the valid point about the hypothetical angle-dependent NV40 aniso, why it might be cheating, why it might not, and why people might still have reason not to like it in the latter case...distinguished from the "BS" parallel that ignores so many distinct issues.

Now on to the separate examples, which hopefully further support my take on your reasons for your viewpoint.

Or take Doom3, which has separate paths for PS1.4 and PS2.0 HW, but which look identical. If your hardware supports 1.4 and 2.0, but runs 1.4 path faster, wouldn't your driver be justified in reporting "1.4" capability only?

Well, no, because you take control away from the developer and facilitate misrepresentation of PS 2.0 performance. :-?
But, as I think you were trying to refer to (because of most of the NV3x line, and how some people have argued, illogically I agree, that nothing less than "PS 2.0" should be used for them regardless of what is needed for the game, or their performance), Doom 3 would be perfectly justified in treating them as PS 1.4, and it would also accurately be able to report that it was doing so when asked. Even with image quality differences between them.

I agree that the user/developer should still have control, perhaps via a checkbox in the control panel: "Allow driver performance optimizations (yes/no)". I don't agree with running max HW features all the time.

Here is a problem: how does Baron's opposition to your statements require running max HW features all the time? Why'd you bring it up?

I also think NVidia is definitely guilty of some "fishy" behavior in their drivers, but on the other hand, people go too far in claiming that all of their performance enhancements in their driver/compiler amount to "cheats". Some are legitimate, some are questionable (IQ trade-off), and some are frankly hardcore cheats (clip planes). It's a grey picture, not black and white.

Then why do you ignore inconvenient shades of grey, and conclude in black and white? You had a valid point in my view, and I think you talked right past it on a crusade of tilting at fanboys.
 
VVukicevic said:
Yeah, but really, how fair would it be for Futuremark to release a "Special NVidia-optimized version of 3DMark"? ;)

The Dig bites his tongue HARD out of respect for the Worm and exits the thread for the night without a snappy retort.
 
A very recent reviewers guide highlights NVIDIA's optimisation policy, and again it reiterates that an optimisation shouldn't just benefit a benchmark. Their own policy is at odds with the comments in their own drivers (3dmark).
 
DaveBaumann said:
A very recent reviewers guide highlights NVIDIA's optimisation policy, and again it reiterates that an optimisation shouldn't just benefit a benchmark. Their own policy is at odds with the comments in their own drivers (3dmark).

That would depend on when the comments (which look like a compiler help file) have been added.
 
siconik said:
DaveBaumann said:
A very recent reviewers guide highlights NVIDIA's optimisation policy, and again it reiterates that an optimisation shouldn't just benefit a benchmark. Their own policy is at odds with the comments in their own drivers (3dmark).

That would depend on when the comments (which look like a compiler help file) have been added.

Why would the timeframe be important?
 
AlphaWolf said:
siconik said:
DaveBaumann said:
A very recent reviewers guide highlights NVIDIA's optimisation policy, and again it reiterates that an optimisation shouldn't just benefit a benchmark. Their own policy is at odds with the comments in their own drivers (3dmark).

That would depend on when the comments (which look like a compiler help file) have been added.

Why would the timeframe be important?

Because a compiler help file written 2 years ago does not mean that Nvidia is planning to renege on policy outlined in a "very recent reviewers guide".
 
'Swizzle' is the term used for the remapping of input components - so instead of r0.xyzw it can be read as r0.yzxw.
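To put that same remapping in HLSL terms, here's a minimal sketch (illustrative only, not from the thread; the entry point and variable names are made up):

```hlsl
// A swizzle just rewires which components of a register are read,
// e.g. r0.xyzw read back as r0.yzxw; on most hardware it costs nothing extra.
float4 main(float4 v : TEXCOORD0) : COLOR
{
    float4 rotated = v.yzxw;   // components read in y, z, x, w order
    float4 alpha   = v.wwww;   // one component replicated across all four
    return rotated * alpha;
}
```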
 