Another soft shadows demo

Mendel said:
Now it's only that the light/shadows being fully dynamic isn't used for anything, as the light just follows the same path... How about making the light follow the camera or something like that? :devilish:

Well, you can modify the variable "dynamicLight" as you like and recompile if you want another path, rolling colors, or whatever.
 
Humus said:
Yes, I fixed it this morning after reading Chalnoth's post. Didn't have time to mention it here though.

Indeed it does work on the 6800s now and, as promised, I thank you for yet another interesting demo. It's always nice to have technologies and ideas presented like this instead of reading about what can or should be done.

SoftShadows 2 seems to be quite demanding, bringing the 6800 Ultra to its knees with FPS dips into the low 40s. I also noticed that this demo does not have the same shadowing fidelity as the previous version (SoftShadows I). In version 2 I can see clear jumps in the shadow gradient whereas SoftShadows I was smooth enough to be considered perfect. Perhaps the glossy environment helped give a smooth appearance. On the other hand, perhaps the fix to make it work on the 6800 brought with it some unwanted side effects. I have not seen it rendered on any hardware other than the 6800, so I am not sure how it is supposed to look. Could someone with an X800 post a screen capture?
 
xGL said:
For those interested, here's a screenshot of the demo on XGI Volari :

Nice to see it working on the Volari. The banding in the shadows is visible in this screen shot so I will assume this is the expected output until I see something to the contrary (this looks like it does on the 6800).

I also want to add that I first thought it was the greater distancing between the objects and cast shadows that may have caused the banding, the greater distancing emphasising the calculation error, but I can see the banding in shadows very near the objects casting them. It clearly looks like 4 discrete steps in intensity (or so, that is a rough impression).
 
SoftShadows 2 seems to be quite demanding, bringing the 6800 Ultra to its knees with FPS dips into the low 40s.

Well, that is interesting. What resolution and AA/AF settings do you use?
On my puny Radeon 9600Pro I get about 30 fps in 1024x768, no AA/AF, with 6xAA/16x performance AF it's still 27 fps.
In 640x480 it's even over 70 fps.
40 fps for a 6800Ultra doesn't seem much to me. Then again, perhaps you were using 1600x1200 with 8xAA/16xAF or something, that would be more like it :)
 
Huh, it seems like the FSAA/AF performance hit for this demo was quite high. I didn't test any other settings, but with 4x/16x, I get 16 fps. With no AA/AF, I get about 27 fps (this is at 1280x960, windowed/maximized).
 
Scali said:
Well, that is interesting. What resolution and AA/AF settings do you use?
On my puny Radeon 9600Pro I get about 30 fps in 1024x768, no AA/AF, with 6xAA/16x performance AF it's still 27 fps.
In 640x480 it's even over 70 fps.
40 fps for a 6800Ultra doesn't seem much to me. Then again, perhaps you were using 1600x1200 with 8xAA/16xAF or something, that would be more like it :)

I should have been more clear and I apologise for making that statement without at least hinting at the settings used. The settings used were as follows:

Full screen
1024x768x32
No AA
No AF
Quality mode selected in Forceware (but with optimizations disabled for both anisotropic and trilinear mipmap filtering)

Forceware 61.77 & DirectX 9.0c (Windows XP SP2)

I believe the max FPS I can hit under these conditions is ~130 and the minimum is ~42. Naturally, these depend on the viewing angle and the minimum stated is a very extreme case with a fair average at around 60 FPS give or take 15.

In testing it again for verifying everything for this post I discovered that disabling "Use Clip Plane" gives a huge boost. I am now in the 80-90 FPS region using the settings mentioned.

Still, there is a large discrepancy between this demo and the original SoftShadow demo, in which I have over 300 FPS at the same settings. IMO the first demo looks better with richer colors, greater spectrum of intensity, better contrasting, and finer shadows. I am just somewhat stunned that something like the softshadows 2 demo, although quite good looking, can make a beast like 6800 Ultra scream for mercy.
 
wireframe said:
I should have been more clear and I apologise for making that statement without at least hinting at the settings used. The settings used were as follows:

Full screen
1024x768x32
No AA
No AF
Quality mode selected in Forceware (but with optimizations disabled for both anisotropic and trilinear mipmap filtering)

Pretty much the same settings as I used on my Radeon 9600Pro then.
I also have everything to highest quality, and I use Windows XP SP2 and DX9.0c too. And Cat 4.9 drivers (latest).

I believe the max FPS I can hit under these conditions is ~130 and the minimum is ~42. Naturally, these depend on the viewing angle and the minimum stated is a very extreme case with a fair average at around 60 FPS give or take 15.

In my case the fps are pretty much constant, although it's faster if you go completely into one corner, so you can't see the rest of the scene. Generally somewhere between 30 and 40 anyway.

In testing it again for verifying everything for this post I discovered that disabling "Use Clip Plane" gives a huge boost. I am now in the 80-90 FPS region using the settings mentioned.

Well, that's interesting. Sounds like NVIDIA did something wrong, either in their driver or in their hardware. On my Radeon it is the other way around: it's faster with the clip planes enabled. That is as it should be: the whole idea of user clip planes is to clip away geometry that you don't want rendered, so you should render fewer pixels, which should of course be faster. It's no use if clipping geometry actually makes it slower. The difference is not that large in my case though, about 5-10 fps.
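To make the "clip away geometry" idea concrete, here is a minimal sketch of clipping a convex polygon against a single plane (one Sutherland-Hodgman step). It is only an illustration of the principle, not what any driver or this demo actually does; the function names and the (nx, ny, nz, d) plane layout are assumptions made for the example.

```python
# Hypothetical illustration: clip a convex polygon against one user clip plane.
# A plane is (nx, ny, nz, d); the kept side is where nx*x + ny*y + nz*z + d >= 0.

def signed_distance(plane, p):
    nx, ny, nz, d = plane
    return nx * p[0] + ny * p[1] + nz * p[2] + d

def clip_polygon(polygon, plane):
    """Return the part of a convex polygon on the non-negative side of the plane."""
    out = []
    for i, cur in enumerate(polygon):
        prev = polygon[i - 1]  # i - 1 wraps to the last vertex when i == 0
        dc = signed_distance(plane, cur)
        dp = signed_distance(plane, prev)
        if (dc >= 0) != (dp >= 0):
            # The edge crosses the plane: emit the intersection point.
            t = dp / (dp - dc)
            out.append(tuple(a + t * (b - a) for a, b in zip(prev, cur)))
        if dc >= 0:
            out.append(cur)
    return out

# A 2x2 square clipped by the plane x <= 1 keeps only its left half,
# so roughly half as many pixels are left to rasterize.
square = [(0, 0, 0), (2, 0, 0), (2, 2, 0), (0, 2, 0)]
print(clip_polygon(square, (-1, 0, 0, 1)))
```

When clipping is done on the geometry like this, less area reaches the rasterizer, which is why enabling the clip plane would normally be expected to help rather than hurt.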

Still, there is a large discrepancy between this demo and the original SoftShadow demo, in which I have over 300 FPS at the same settings. IMO the first demo looks better with richer colors, greater spectrum of intensity, better contrasting, and finer shadows. I am just somewhat stunned that something like the softshadows 2 demo, although quite good looking, can make a beast like 6800 Ultra scream for mercy.

Well, obviously precomputed shadows will be faster than these dynamic ones. But indeed, it is rather strange that the 6800Ultra gets such low numbers here... On the bright side, the 6800Ultra has some hardware features to accelerate shadowmaps, which could be used.

I wonder how X800s or other fast Radeons perform here, anyone have any numbers on that?
 
wireframe said:
I also want to add that I first thought it was the greater distancing between the objects and cast shadows that may have caused the banding, the greater distancing emphasising the calculation error, but I can see the banding in shadows very near the objects casting them. It clearly looks like 4 discrete steps in intensity (or so, that is a rough impression).

Heh, well, it should actually be 12 different steps in intensity, since there are potentially 11 samples.
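The count works out as follows, assuming each of the samples is a plain lit/shadowed test and all samples are weighted equally (an assumption for this sketch; the demo's shader may weight them differently). The helper name is made up for the example.

```python
from fractions import Fraction

def shadow_levels(num_samples):
    """All possible lit fractions when each sample is either lit or shadowed."""
    return [Fraction(lit, num_samples) for lit in range(num_samples + 1)]

levels = shadow_levels(11)
print(len(levels))           # 0 lit samples up to 11 lit samples
print(levels[0], levels[-1])  # fully shadowed and fully lit
```

With 11 samples the number of lit samples ranges from 0 to 11, which gives 12 possible intensities. How many of them are actually visible in a given penumbra depends on the scene.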
 
wireframe said:
In testing it again for verifying everything for this post I discovered that disabling "Use Clip Plane" gives a huge boost. I am now in the 80-90 FPS region using the settings mentioned.
Hrm, might be the drivers. With the 66.31 drivers, I get a small performance boost from using clip planes, as you'd expect.
 
Using an X800XT I get around 205 fps average without AA or AF, and 140 with 6xAA and 16xAF. This is with it windowed, just starting the demo and not moving or changing anything.
 
Well, my brother just bought a GeForce 6800LE, so I thought I'd give this demo a try myself.
Indeed, it runs under 30 fps on that card with user clip planes turned on. That is literally slower than my 9600Pro. Quite funny, since the 6800LE gets twice the 3DMark03 score (6600 vs 3300).
Turning off user clip planes improved performance, but it's still below what I would expect based on the performance difference in games and other stuff. Perhaps you should use more halfs instead of floats, Humus :)
I think that's the main reason why the GeForces are so slow, apart from the obviously broken user clip plane implementation...

Speaking of broken implementations, I also ran the Instancing demo, and the vertexshader constant technique was actually faster than true instancing... I guess the 6800 drivers are still a bit suboptimal (I just downloaded them, the latest ones from the NVIDIA website).

Edit: of course we can all search-and-replace float with half ourselves in the .shd files... I did that, and it didn't seem to change performance much; I can't tell the difference.
 
Broken Hope said:
Using an X800XT I get around 205 fps average without AA or AF, and 140 with 6xAA and 16xAF. This is with it windowed, just starting the demo and not moving or changing anything.

This is more like it. I'd like to think that the 6800 with UltraShadow II would do even better. Oh well, something is definitely not right with how the 6800 and Forceware 61.77 handle this demo.

It would be very interesting if someone with Nvidia/6800 know-how would optimize this demo for the 6800 to see what it is capable of. Although we cannot expect every title to have optimized code paths for all hardware, it would be very interesting to see the competition pitted against each other in situations like these. With UltraShadow and UltraShadow II Nvidia definitely made a claim, it would be nice to see the 6800 and x800 step up to the plate and take a swing at this, each using optimal conditions.
 
Humus said:
I recall the 5x00 series cards had trouble with the R16F format as render target, so I changed that to RG16. See if it works better now.

GeForce cards support the following 4 formats:
G16R16F
A16B16G16R16F (limited addressing on FX cards)
R32F
A32B32G32R32F (limited addressing on FX cards)

Only Radeons support the following two:
R16F
G32R32F

GeForce cards support G16R16 but not as a render target.
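A format list like this lends itself to a simple fallback search. The sketch below is hypothetical: the table is transcribed from the lists above, the assumption that Radeons also handle the four GeForce formats is mine (the post only says the last two are Radeon-only), and the preference order is made up. In a real Direct3D 9 application the support check would be a query such as IDirect3D9::CheckDeviceFormat rather than a hard-coded table.

```python
# Hypothetical support table based on the lists in this post.
# ASSUMPTION: Radeons also support the four formats listed for GeForce.
RENDER_TARGET_FORMATS = {
    "geforce": ["G16R16F", "A16B16G16R16F", "R32F", "A32B32G32R32F"],
    "radeon": ["R16F", "G32R32F",
               "G16R16F", "A16B16G16R16F", "R32F", "A32B32G32R32F"],
}

def pick_format(vendor, preferred):
    """Return the first format in the preference list that the vendor supports."""
    supported = RENDER_TARGET_FORMATS[vendor]
    for fmt in preferred:
        if fmt in supported:
            return fmt
    return None

# Made-up preference order: smallest float format first.
preference = ["R16F", "G16R16F", "R32F"]
print(pick_format("radeon", preference))
print(pick_format("geforce", preference))
```

With this table the Radeon path keeps the single-channel R16F target, while the GeForce path falls through to a two-channel 16-bit float target.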

Edit: Oops, I replied not realizing it's a multi-page topic, but still it's worth noting.
 
wireframe said:
I also noticed that this demo does not have the same shadowing fidelity as the previous version (SoftShadows I). In version 2 I can see clear jumps in the shadow gradient whereas SoftShadows I was smooth enough to be considered perfect. Perhaps the glossy environment helped give a smooth appearance. On the other hand, perhaps the fix to make it work on the 6800 brought with it some unwanted side effects. I have not seen it rendered on any hardware other than the 6800, so I am not sure how it is supposed to look. Could someone with an X800 post a screen capture?

This is because it's a limited number of samples. It's the same output on the X800 as well. The other soft shadows demo used lightmaps, which are precomputed and thus don't really have any limit on how many samples you can use. I used 400 in the previous demo, versus 11 in this demo.
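A rough way to see why 400 samples look smooth and 11 do not is to express one shadow step in 8-bit display levels. This is a back-of-the-envelope sketch that assumes the shadow term spans the full 0..255 range and that samples are equally weighted; in the actual demo the light colour and ambient term would scale the step down. The function name is invented for the example.

```python
def step_in_8bit_levels(num_samples):
    """Size of one shadow-intensity step, measured in 8-bit display levels."""
    return 255.0 / num_samples

for n in (11, 400):
    # 11 samples: each step is about 23 display levels, easily visible.
    # 400 samples: each step is smaller than one display level.
    print(n, step_in_8bit_levels(n))
```

Under these assumptions the precomputed lightmaps step by less than the display can resolve, whereas each of the 11-sample steps is a clearly visible jump.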
 
Scali said:
Well that's interesting. Sounds like NVIDIA did something wrong, either in their driver, or in their hardware.

In the GF3/4 I know they didn't have any clip plane support in hardware, so it was implemented with texkill, at least in OpenGL. So using clip planes would take a TMU, and if you enabled too many clip planes, you'd end up in software. Whether this is still what they do in GFFX and up I don't know, but it would be good to know. A good way to check this would be to simply comment out the depth bias code in this demo and recompile it. If there's no z-fighting, then they are probably still using texkill.
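For anyone unfamiliar with the texkill trick, here is a minimal sketch of the idea: the signed distance to the clip plane is computed per vertex, interpolated across the primitive like a texture coordinate, and pixels where it goes negative are discarded. It is a simplified one-dimensional illustration with invented names, not the driver's actual implementation.

```python
def clip_distance(plane, position):
    """Signed distance-like value for a plane given as (nx, ny, nz, d)."""
    nx, ny, nz, d = plane
    return nx * position[0] + ny * position[1] + nz * position[2] + d

def shade_span(plane, start, end, num_pixels):
    """Walk the pixels between two vertices and report which ones survive."""
    d0 = clip_distance(plane, start)  # computed once per vertex
    d1 = clip_distance(plane, end)
    survivors = []
    for i in range(num_pixels):
        t = (i + 0.5) / num_pixels   # pixel centre along the span
        dist = d0 + t * (d1 - d0)    # interpolated like a texture coordinate
        survivors.append(dist >= 0)  # texkill discards pixels with a negative value
    return survivors

# Plane x <= 1, span from x = 0 to x = 2 covered by 4 pixels.
print(shade_span((-1, 0, 0, 1), (0, 0, 0), (2, 0, 0), 4))
```

Note that the geometry itself is untouched here and every pixel is still visited before being killed. That would be consistent with both the z-fighting test suggested above (the depth values stay identical to an unclipped pass) and with clip planes costing performance instead of saving it.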
 
wireframe said:
Broken Hope said:
Using an X800XT I get around 205 fps average without AA or AF, and 140 with 6xAA and 16xAF. This is with it windowed, just starting the demo and not moving or changing anything.

This is more like it. I'd like to think that the 6800 with UltraShadow II would do even better.

Ultra Shadow isn't going to help this technique any since it's for stencil shadows and not shadow mapping.

On my X800pro I'm getting 145fps with no AA/AF and 102fps with 6xAA/16xAF.
 
Hyp-X said:
GeForce cards support the following 4 formats:
G16R16F
A16B16G16R16F (limited addressing on FX cards)
R32F
A32B32G32R32F (limited addressing on FX cards)

Only Radeons support the following two:
R16F
G32R32F

GeForce cards support G16R16 but not as a render target.

Edit: Oops, I replied not realizing it's a multi-page topic, but still it's worth noting.

Good info, thanks. Now that I think of it, it would probably be helpful in the future if someone with a 6800 would send a capsviewer dump to me.
 
This should have most of the capabilities of the 6800 listed:

http://developer.nvidia.com/object/gpu_programming_guide.html

There's a nice table of all supported texture formats and the capabilities supported for each one. A quick scan through the caps seems to indicate that other than texture formats, the 6800 appears to have a "yes" for every cap. Here's a printout of the caps that have number values:

Code:
         DeviceType                                        1
         AdapterOrdinal                                    0
         MaxTextureWidth                                   4,096
         MaxTextureHeight                                  4,096
         MaxVolumeExtent                                   511
         MaxTextureRepeat                                  8,192
         MaxTextureAspectRatio                             4,096
         MaxAnisotropy                                     16
         MaxVertexW                                        1E+010
         GuardBandLeft                                     -1E+008
         GuardBandTop                                      -1E+008
         GuardBandRight                                    1E+008
         GuardBandBottom                                   1E+008
         ExtentsAdjust                                     0
         MaxTextureBlendStages                             8
         MaxSimultaneousTextures                           8
         MaxActiveLights                                   8
         MaxUserClipPlanes                                 6
         MaxVertexBlendMatrices                            4
         MaxVertexBlendMatrixIndex                         0
         MaxPointSize                                      8192
         MaxPrimitiveCount                                 1,048,575
         MaxVertexIndex                                    1,048,575
         MaxStreams                                        16
         MaxStreamStride                                   255
         VertexShaderVersion                               3.0
         MaxVertexShaderConst                              256
         PixelShaderVersion                                3.0
         PixelShader1xMaxValue                             65504
         MaxNpatchTessellationLevel                        0
         MasterAdapterOrdinal                              0
         AdapterOrdinalInGroup                             0
         NumberOfAdaptersInGroup                           1
         NumSimultaneousRTs                                4
         MaxVShaderInstructionsExecuted                    65,535
         MaxPShaderInstructionsExecuted                    65,535
         MaxVertexShader30InstructionSlots                 544
         MaxPixelShader30InstructionSlots                  4,096

As a side point, though, it seems the pixel shader 3.0 instruction slots number has been slowly increasing. I think with the drivers that came with my card it was 512, and it later moved to 2048 with the 61.77 drivers, if I remember correctly.
 
Humus,

Should we consider the current (4th release fixed for 6800) the final for SoftShadow II or are you planning on investigating performance on the 6800 and have a go at getting it up to par if possible?
 