New dynamic branching demo

radar1200gs said:
On ATi hardware, when you use dynamic branching without the optimisations/bypasses, you get a low framerate. When you bypass the branching with what Humus wrote, it speeds up. Conclusion: something is wrong with ATi's dynamic branching at the driver or hardware level.

On nVidia, when you use dynamic branching without the optimizations, it is faster than when the optimization is enabled. In other words, the hardware is already efficient at dynamic branching.

Dynamic branching is part of SM3.0 and is therefore not supported by current ATI chips, so I can't quite see how "something is wrong with ATi's dynamic branching at the driver or hardware level". :?

The whole point of this thread is to discuss Humus' demo which allows cards that don't support dynamic branching (such as R3XX, R4XX, NV3X etc.) to access some of its benefits.
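For what it's worth, the trick is usually described like this: a cheap first pass uses the alpha test to kill pixels a light can't reach and writes a stencil mask for the rest, and the expensive lighting pass is then drawn with a stencil test so that early stencil rejection culls the unlit pixels before they are ever shaded. A minimal C++/D3D9 sketch of that idea (my own reconstruction, not Humus' actual code; the two draw helpers are hypothetical):

```cpp
#include <d3d9.h>

// Hypothetical helpers standing in for the demo's real draw calls.
void DrawCheapRangePass(IDirect3DDevice9* dev);
void DrawExpensiveLightingPass(IDirect3DDevice9* dev);

void StencilMaskedLighting(IDirect3DDevice9* dev)
{
    // Pass 1: a cheap shader outputs alpha = 0 for out-of-range pixels;
    // the alpha test kills them, so only in-range pixels get stencil = 1.
    dev->SetRenderState(D3DRS_STENCILENABLE, TRUE);
    dev->SetRenderState(D3DRS_STENCILFUNC, D3DCMP_ALWAYS);
    dev->SetRenderState(D3DRS_STENCILPASS, D3DSTENCILOP_REPLACE);
    dev->SetRenderState(D3DRS_STENCILREF, 1);
    dev->SetRenderState(D3DRS_ALPHATESTENABLE, TRUE);
    dev->SetRenderState(D3DRS_ALPHAFUNC, D3DCMP_GREATER);
    dev->SetRenderState(D3DRS_ALPHAREF, 0);
    dev->SetRenderState(D3DRS_COLORWRITEENABLE, 0); // stencil only
    DrawCheapRangePass(dev);

    // Pass 2: the expensive lighting shader runs only where stencil == 1;
    // early stencil rejection culls everything else before shading.
    dev->SetRenderState(D3DRS_STENCILFUNC, D3DCMP_EQUAL);
    dev->SetRenderState(D3DRS_STENCILPASS, D3DSTENCILOP_KEEP);
    dev->SetRenderState(D3DRS_ALPHATESTENABLE, FALSE);
    dev->SetRenderState(D3DRS_COLORWRITEENABLE, 0x0000000F);
    DrawExpensiveLightingPass(dev);
}
```

The whole win depends on the hardware actually performing the stencil reject before the pixel shader runs, which is exactly what the rest of this thread ends up arguing about.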
 
Evildeus said:
If I were paranoid, I would say there's a correlation between the release of the demo and the SM3.0 test @ Anand :LOL:

More seriously, I hope you can do something, zeno ;). I would really like to see the difference between this tech and PS3.0.

Well, this "dynamic branching" demo is about nothing else than stealing the show to nvidia like Humus already stated because it isn't dynamic branching at all. It's just a method which does a different job but nevertheless gives the same result as dynamic branching for some rare lighting situations without the flexbility of what PS3.0 has to offer. But contrary to SM3.0, "Humus new technique" will not be used in any upcoming game.

Unfortunately for Humus, nvidia is currently stealing the show from his fake dynamic branching demo with the upcoming FarCry 1.2 patch (previewed at anandtech and techreport), which includes major performance increases with the help of the SM3.0 path in a real-world game, not in a limited tech demo with some random lights swirling around in a small room.

and yes, you've guessed it, I only registered to post this... but you can expect more :)

Humus, I must really say that I am disappointed in you and the way you're spreading misinformation around different forums. You really should know better.
 
He is using a programming trick to achieve, in this specific shader, results similar to dynamic branching.

But this does not show that dynamic branching "is just a marketing thing".

It is impossible to do effective dynamic branching in most shader programs with SM2.0!
 
samker said:
and yes, you've guessed it, I only registered to post this... but you can expect more :)

Please, feel free to discuss the technical merits / drawbacks of various solutions.
 
"nVidia can consider themselves owned
tongue.gif
"


sounds more like a joke to me, especially with the smilie there. :rolleyes:
 
Mordenkainen said:
Sorry, but I really don't agree with this justification. Say ATI pays Valve to make all nvidia cards run in fixed-function mode. I'm sure nVidia fans would be extremely happy with that.

I'm not really trying to justify anything. And your analogy is poor at best :) Far Cry does support the highest shader model available on ATI cards - 2.0. Running a 2.0 card at 1.1 is in no way analogous to this situation.

And didn't nVidia fans complain that 3DMark03 had ATI optimisations (PS 1.4) so much that Futuremark had to patch it to include a PS 1.1 fallback?

Similar situation, wrong application. I don't think you can equate a benchmark to these extra Far Cry features. Please realize that these features are in a patch and are not what you 'paid for', so to speak. If you thought Far Cry was awesome before and were willing to buy it, why would you change your mind now? The game hasn't changed.
 
Alstrong said:
"nVidia can consider themselves owned
tongue.gif
"


sounds more like a joke to me, especially with the smilie there. :rolleyes:
You may be right, but it seems to me that the point of the demo is to try to demonstrate the non-advantage of PS_3.0 branching (and hence NV40), and it seems that many people (even new ones) are displeased with this approach and, even more, disagree with the advantages of this particular technique.

Unfortunately, discussion of this technique is a bit difficult as our posting times are quite incompatible.
 
Sorry, but I really don't agree with this justification. Say ATI pays Valve to make all nvidia cards run in fixed-function mode. I'm sure nVidia fans would be extremely happy with that.
You could always force the SM 2.0 path if that happens. NV40 can do everything any version of SM 2.0 can do without code changes.
 
It's weird, my custom test shows that NV40 does have early stencil rejection, even if alpha test is enabled. :?:
 
991060 said:
It's weird, my custom test shows that NV40 does have early stencil rejection, even if alpha test is enabled. :?:
What is weird about that? Stencil test and alpha test can only kill pixels, and both happen before anything is written to memory, so the order of those tests is irrelevant (though they affect the operation of hierZ). Now, if you were talking about stencil operation...
 
Well, I think what he meant is: it's weird because the NV40 does have early stencil rejection like the R420, but in Humus' demo the R420 takes advantage of it whereas the NV40 sees its performance decrease.
 
Evildeus said:
It's weird because the NV40 does have early stencil rejection like the R420

They may both have early z rejection, but the question is where in the pipeline they are both rejecting.
 
DaveBaumann said:
Evildeus said:
It's weird because the NV40 does have early stencil rejection like the R420

They may both have early z rejection, but the question is where in the pipeline they are both rejecting.
Well, maybe. So it's not even interesting to compare the two, now that we seem to know that Nv and Ati are not doing it the same way, as the benchmarks tend to show.

Now I'm still interested to see PS3.0 doing the same thing, and to see the answers to the drawbacks pointed out over here.
 
DaveBaumann said:
Evildeus said:
It's weird because the NV40 does have early stencil rejection like the R420

They may both have early z rejection, but the question is where in the pipeline they are both rejecting.
This is how I tested:
First I disabled stencil and ran the fillrate tester using 2 instructions; the result was just as expected, about 3200 MP/s.
Then I set the render state so that all pixels were rejected; the result was much higher, around 22000 MP/s. (Z writing was disabled, and there was no color output because of the rejection.)
Then I enabled alpha test; the result was about the same as in the 2nd situation.

The above data clearly shows that stencil rejection happens before the pixel shading unit, otherwise I wouldn't get such a high fillrate.
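
If anyone wants to reproduce it, the state setup would look something like this (a rough C++/D3D9 sketch with assumed names, not the exact test code):

```cpp
#include <d3d9.h>

// Rough sketch of the rejection test described above. All pixels fail the
// stencil test and Z writes are off, so a fillrate far above the
// shader-limited baseline means pixels were culled before the pixel
// shading unit ever ran.
void SetupRejectAllState(IDirect3DDevice9* dev, bool withAlphaTest)
{
    dev->SetRenderState(D3DRS_STENCILENABLE, TRUE);
    dev->SetRenderState(D3DRS_STENCILFUNC, D3DCMP_NEVER); // reject every pixel
    dev->SetRenderState(D3DRS_ZWRITEENABLE, FALSE);       // no Z writes

    // Step 3 of the test: enabling alpha test should not change the result
    // if stencil rejection still happens early.
    dev->SetRenderState(D3DRS_ALPHATESTENABLE, withAlphaTest ? TRUE : FALSE);
    dev->SetRenderState(D3DRS_ALPHAFUNC, D3DCMP_GREATER);
    dev->SetRenderState(D3DRS_ALPHAREF, 0);
    // Then draw the fillrate tester's quad with the 2-instruction shader.
}
```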

edit: spelling.
 
And when developers talked about the benefits of SM3.0 they always mentioned branching, and its main purpose always seemed to be helping with exactly this per-pixel lighting problem. Now it's possible without SM3.0, albeit using a trick, so where is the problem?
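
For comparison, the SM3.0 route expresses that skip directly in the shader. A hypothetical ps_3_0 fragment (written as an HLSL source string the way you would feed it to D3DXCompileShader; purely illustrative, not any game's actual shader):

```cpp
// Hypothetical ps_3_0 shader showing the dynamic branch that the stencil
// trick emulates: out-of-range pixels skip the expensive lighting math.
const char* kBranchingPS =
    "float lightRange;                                            \n"
    "float4 main(float3 lightVec : TEXCOORD0) : COLOR             \n"
    "{                                                            \n"
    "    // Dynamic branch: bail out when the pixel is beyond the \n"
    "    // light's range, before the expensive lighting math.    \n"
    "    if (dot(lightVec, lightVec) > lightRange * lightRange)   \n"
    "        return 0;                                            \n"
    "    /* ...expensive per-pixel lighting here... */            \n"
    "    return float4(1, 1, 1, 1);                               \n"
    "}                                                            \n";
```

Whether that branch actually saves time depends on how the hardware schedules pixels, which is exactly what the benchmarks in this thread are probing.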
 
Evildeus said:
You may be right, but it seems to me that the point of the demo is to try to demonstrate the non-advantage of PS_3.0 branching (and hence NV40), and it seems that many people (even new ones) are displeased with this approach and, even more, disagree with the advantages of this particular technique.

Unfortunately, discussion of this technique is a bit difficult as our posting times are quite incompatible.


Fair enough :)
 