far cry ps3 and stuff

OpenGL guy said:
Except that writing stencil is the same as writing depth if your depth and stencil buffers are combined (as most platforms use). If you have separate depth and stencil buffers, then this is a bigger win.

Well, that's true. Also, depth never needs to be updated anyway because of that first depth-fill pass, so unless the hardware is stupid and still writes depths that are equal to the stored value, the depth mask isn't going to matter. Disabling color writes is a win, however. Disabling depth writes is more of a symbolic thing you do just because you can.
 
 
Humus said:
pocketmoon66 said:
Nice demo of soft shadows (penumbras) with PS3.0. Basically, do a small number of shadow sampler tests. If all are 0, you're out of shadow; if all are 1, you're in shadow; otherwise you're in the penumbra, so do 64 samples to get your shadow level.

Works well because the expensive sampling is only done if your initial small sample set is not wholly in or wholly out of shadow. 2x faster than the equivalent PS2 shader (not shown).
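The early-out logic described in the quoted post can be sketched as a CPU-side C function. This is illustrative only: the names `shadow_amount` and `expensive_estimate` are not from the post, and a real implementation would live in a ps3.0 shader using dynamic branching over shadow-map taps.

```c
#include <stddef.h>

/* Stand-in for the 64-sample loop: average the dense comparison
 * results (0 = lit, 1 = shadowed). Hypothetical helper name. */
static double expensive_estimate(const int *samples, size_t n) {
    size_t hits = 0;
    for (size_t i = 0; i < n; i++) hits += samples[i];
    return (double)hits / (double)n;
}

/* Cheap test first: if the small sample set agrees, skip the
 * expensive path entirely; otherwise we are in the penumbra. */
double shadow_amount(const int *cheap, size_t n_cheap,
                     const int *dense, size_t n_dense) {
    size_t hits = 0;
    for (size_t i = 0; i < n_cheap; i++) hits += cheap[i];
    if (hits == 0)       return 0.0;   /* all lit: out of shadow   */
    if (hits == n_cheap) return 1.0;   /* all shadowed: in shadow  */
    return expensive_estimate(dense, n_dense);  /* penumbra: pay up */
}
```

The win comes from the same place in both formulations: for most pixels the cheap samples agree, so the 64-tap path is skipped.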

Unfortunately for nVidia, I have developed a technique at work that works as a drop-in replacement for dynamic branching in the most important situations where you'd otherwise need or prefer ps3.0. You'll probably have to resort to very esoteric effects to really require ps3.0. The effect you mention sounds like a perfect example where this technique would work as well as (or maybe even better than) dynamic branching, depending on how high the cost of dynamic branches is.
Another typical example is the "early-out" kind of optimization where you, for instance, detect that the pixel is backfacing the light or is outside the light radius and just return zero instead of going through all the lighting computations. I have implemented this for a typical "Humus-demo" scene with four lights. With early-out enabled I get 45fps; without it, 14fps. That's more than a 3x performance improvement. If you prefer to dwell in the darkness, the difference is 136fps vs. 25fps. 8) I'm not even sure nVidia will be able to match that performance increase with ps3.0. :devilish:
Cool, can you describe how the technique works?
 
pocketmoon66 said:
quick brain dump from the Nvidia dev conference in London on Monday/Tuesday (nothing under NDA...)

Both VS3 and PS3 paths have been added to Far Cry.

In general, gains from using geometry instancing were usually between 0 and 10%, though up to 40% in the worst (best!) case.

Devs wanted to use PS3 to loop through a dynamic number of lights in a single pass, but PS3 has limitations (in the spec) that prevent this from being implemented. So they use static branching for up to 4 lights per pass. This created more instructions than ps2 could handle, hence the need for ps3.

Gains are good - in-game fps went from 20 to 30 in the demo.

HDR - did not require PS3, but requires 16-bit blending and filtering. Devs replaced 'hacks' (glare/flare etc.) with HDR. No AA when using HDR. On a 6800U it's playable (40-50fps) @ 1024x768.

Very Nice effect :)
I wonder why *suddenly* HDR is only good to implement or use if you can do FP blending?? I have seen several comments from people in the forums at Nvnews that reflect this theory. I personally can't see any visual difference between the year-old ATi HDR demos and this new version with "FP blending"... Funny how it was pointless and stupid a year ago, but now it's the best feature ever, but only if you have "FP blending"...

There is no need to support it this way. Why not support HDR across all platforms that support it?

I find the performance gains interesting as well... I wonder about 3Dc support and its effect on performance?
 
Hellbinder said:
I wonder why *suddenly* HDR is only good to implement or use if you can do FP blending?? I have seen several comments from people in the forums at Nvnews that reflect this theory. I personally can't see any visual difference between the year-old ATi HDR demos and this new version with "FP blending"... Funny how it was pointless and stupid a year ago, but now it's the best feature ever, but only if you have "FP blending"...

There is no need to support it this way. Why not support HDR across all platforms that support it?

I find the performance gains interesting as well... I wonder about 3Dc support and its effect on performance?

Because the ATI HDR demos are only using HDR in shaders. Now if you want to do full HDR with transparencies, you will have to do multi-pass rendering, and it may require up to as many passes as there are triangles.

And I'm sure devs will point out that having more than a couple of passes is bad, let alone 100,000 passes.
 
Lezmaka said:
I'm not a programmer, so is HDR easier to do if you use FP blending vs using shaders?

They will still use shaders for stuff, but developers won't need to emulate in shaders functionality that is DX5/OpenGL 1.0 technology, which alpha blends are. The only reason FP blends currently aren't supported is that they take up far more transistors than int blends.
 
bloodbob said:
Hellbinder said:
I wonder why *suddenly* HDR is only good to implement or use if you can do FP blending?? I have seen several comments from people in the forums at Nvnews that reflect this theory. I personally can't see any visual difference between the year-old ATi HDR demos and this new version with "FP blending"... Funny how it was pointless and stupid a year ago, but now it's the best feature ever, but only if you have "FP blending"...

There is no need to support it this way. Why not support HDR across all platforms that support it?

I find the performance gains interesting as well... I wonder about 3Dc support and its effect on performance?

Because the ATI HDR demos are only using HDR in shaders. Now if you want to do full HDR with transparencies, you will have to do multi-pass rendering, and it may require up to as many passes as there are triangles.

And I'm sure devs will point out that having more than a couple of passes is bad, let alone 100,000 passes.
Um... Aren't the Far Cry devs simply replacing some of the effects with HDR shaders? When you are talking about HDR "effects", you are automatically talking about shaders, correct? Or am I off on this?

At any rate, they are talking about a few "effects" in the game being done with HDR, which would seem to indicate shaders.

(Actually, HL2 is using a TON of HDR effects and they work just fine... in fact BETTER on ATi hardware.)
 
Hellbinder said:
Um... Aren't the Far Cry devs simply replacing some of the effects with HDR shaders? When you are talking about HDR "effects", you are automatically talking about shaders, correct? Or am I off on this?

At any rate, they are talking about a few "effects" in the game being done with HDR, which would seem to indicate shaders.

(Actually, HL2 is using a TON of HDR effects and they work just fine... in fact BETTER on ATi hardware.)

Okay, you can do HDR in the shaders, but that's only part of the rendering. To do HDR through the other parts of the pipeline with a float format (yes, you could do it with I16, but that's not really HDR, more high precision), you have to emulate stuff with shaders, which can totally kill performance. HL2, I believe, only uses integer texture formats, and I think the "high" dynamic range they use is only 0.0-2.0, which I wouldn't really call a high dynamic range.
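The clamping point being argued here can be seen in a few lines of C. This is a simplified additive-blend model, not any particular chip's blend path: an 8-bit unorm render target saturates at 1.0, while a float (FP16-style) target keeps values above 1.0 for later tone mapping.

```c
/* Simplified additive blend into an 8-bit unorm target:
 * fixed-point storage saturates at 1.0, discarding the
 * above-white energy that makes HDR worth doing. */
float blend_unorm8(float dst, float src) {
    float sum = dst + src;
    return sum > 1.0f ? 1.0f : sum;
}

/* The same blend into a float target keeps the full range,
 * which is why FP blending matters for HDR with transparency. */
float blend_float(float dst, float src) {
    return dst + src;
}
```

Without hardware FP blending, getting the second behavior means ping-ponging render targets and doing the add in a shader, which is the multi-pass cost bloodbob is describing.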
 
Humus said:
pat777 said:
Ok, can you respond to the question in my previous post? Couldn't nVIDIA just use this technique as well to assist dynamic branching?

It doesn't assist dynamic branching; it just solves the same problem that dynamic branching does in many situations, making it more or less unnecessary in those situations.
I still don't see the contradiction. It solves it in many situations, but not all. What happens in the other situations? What if devs use branching in the other situations?
 
Humus said:
Sure. For the case of "early-out" with lighting:

1) Render a depth-only pass; include ambient in this pass if you want it. You'll probably want a depth-only pass anyway, so this is free.

For each light {
2) Draw a "tagging" pass that tags the pixels that need to be lit by this light. The shader outputs an alpha > 0 if the pixel is supposed to be lit. The alpha test kills fragments that are unlit. The surviving fragments set stencil to 1. No depth or color writes are needed, so this pass is very cheap.

3) Draw your lighting as usual. Stencil test kills all unlit fragments early, saving an assload of shading power. For all surviving fragments, set stencil back to zero.
}
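The per-pixel effect of passes 2 and 3 can be sketched as a CPU-side simulation in C. The names and the array-of-pixels framebuffer are illustrative only; in the real technique this state lives in the stencil buffer and the work is done by the alpha test, stencil ops, and the lighting shader.

```c
#include <stddef.h>

/* One pixel of a toy framebuffer: a stencil bit plus a single
 * accumulated-lighting channel (one channel for brevity). */
typedef struct {
    int   stencil;   /* 0 or 1 */
    float color;
} Pixel;

/* Pass 2, "tagging": lit_alpha[i] > 0 means the cheap shader decided
 * this pixel is affected by the light; the alpha test kills the rest,
 * and survivors set stencil to 1. No color or depth writes. */
void tag_pass(Pixel *px, size_t n, const float *lit_alpha) {
    for (size_t i = 0; i < n; i++)
        if (lit_alpha[i] > 0.0f) px[i].stencil = 1;
}

/* Pass 3, lighting: the stencil test rejects untagged pixels before
 * the expensive shader runs; survivors accumulate light and reset
 * stencil to 0, ready for the next light in the loop. */
void light_pass(Pixel *px, size_t n, float light_value) {
    for (size_t i = 0; i < n; i++) {
        if (px[i].stencil != 1) continue;  /* early stencil reject  */
        px[i].color  += light_value;       /* "expensive" lighting  */
        px[i].stencil = 0;                 /* restore for next light */
    }
}
```

The saving comes from the stencil reject happening before the lighting shader runs, so unlit pixels pay only for the cheap tagging pass.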
It seems to me that this is one example where dynamic branching could dramatically reduce the number of passes. It seems like you could do the above algorithm in one pass with PS 3.0 (well, two if you include the initial z-pass... but I suppose it really does depend upon what you were using to eliminate lights... what were you thinking of, specifically?). You may also need to clear the stencil buffer without clearing the z-buffer with the above algorithm.

In particular, if the "whether or not this pixel is lit" calculation can be done on the vertex side, dynamic branching on the NV40 is almost sure to be faster (as coherent branches are much faster than ones whose values are dependent upon pixel shader calculations), but may still be faster even if the calculations must be done in the pixel side for optimal quality.

Edit:
I'm sure there are many clever ways of improving the performance of these more complex scenarios on PS 2.x video cards, but there are even more ways of doing so on PS 3.x hardware, and there are going to be fewer limitations with the techniques you use.
 
LeStoffer said:
True. But LeGreg, you really should be asking TSMC and not Humus. ;)

(hint: 0.09-micron process)

If that's true, R500 should be a pretty mature chip when it eventually shows up.
 