GFFX Tests / Test Apps

Dave Baumann
OK, I've had a number of requests for tests and little apps to be run on the GFFX over the past few days and I've lost track of what they were as I'm still running the normal gaming tests for the preview.

So, if any of you had a request for a test to be run can you put them here and I'll see if I can oblige over the next few days. I can't say you'll see the results immediately, or that they'll even be in the preview, but I'll try and put them up sooner or later, possibly with comparisons to other parts.

Try to keep discussion to a minimum in this thread and just post the tests/benchmarks you want run (with links if need be) and the criteria for the test.

FYI - Game testing for the GFFX Preview is complete; I just need to run the GF4 numbers and then write it up.
 
Could you run this? It tests how fast a graphics card rejects Z-failed and/or stencil-failed pixels.

In normal mode it draws quads from back to front and then from front to back; in reversed-Z mode it changes the Z func to "greater". It also has three stencil test modes: normal stencil test, normal stencil test with a stencil write on Z fail, and all-failed stencil test (no pixels rendered).
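For reference, here is a rough D3D9 sketch of the kind of state setup each pass would involve - illustrative only, not the tester's actual code; the exact compare functions and the stencil op used for the "write on Z fail" mode are my guesses:

Code:
#include <d3d9.h>

// Illustrative sketch of one rejection pass. In normal mode the later quads fail
// the (default) less-equal Z test; in reversed-Z mode the func is flipped to
// GREATER so they fail in the other drawing order.
void SetupRejectionPass(IDirect3DDevice9* dev, bool reversedZ, int stencilMode)
{
    dev->SetRenderState(D3DRS_ZENABLE, TRUE);
    dev->SetRenderState(D3DRS_ZFUNC, reversedZ ? D3DCMP_GREATER : D3DCMP_LESSEQUAL);

    dev->SetRenderState(D3DRS_STENCILENABLE, stencilMode != 0);
    if (stencilMode == 1) {
        // "Normal stencil test": always passes, nothing written on Z fail.
        dev->SetRenderState(D3DRS_STENCILFUNC, D3DCMP_ALWAYS);
        dev->SetRenderState(D3DRS_STENCILZFAIL, D3DSTENCILOP_KEEP);
    } else if (stencilMode == 2) {
        // "Stencil write on Z fail": assumed here to increment when the Z test fails.
        dev->SetRenderState(D3DRS_STENCILFUNC, D3DCMP_ALWAYS);
        dev->SetRenderState(D3DRS_STENCILZFAIL, D3DSTENCILOP_INCR);
    } else if (stencilMode == 3) {
        // "All failed stencil test": every pixel fails, so nothing is rendered.
        dev->SetRenderState(D3DRS_STENCILFUNC, D3DCMP_NEVER);
    }
}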
 
The results of this test on the GeForce FX (thanks, MDolenc) have already been shown in the forums, but it is perhaps the single best "test" for revealing the FX's pipeline architecture.

If you're not going to include this test in the review then you don't need to bother, because we already know the results. I only mention it because, in a single set of results, it lays out the characteristics of the FX pipeline rather nicely and could be useful for your preview. One thing that's missing (and would be useful) is seeing how fast it runs "simple" PS 1.4 shaders...

Below is a sample of results that someone ran on a Quadro FX:

Fillrate Tester
--------------------------
Display adapter: NVIDIA Quadro FX 2000
Driver version: 6.14.1.4290
Display mode: 1024x768x32bpp
--------------------------

Color writes enabled, z-writes enabled:
FFP - Pure fillrate - 1528.705200M pixels/sec
FFP - Single texture - 1236.259644M pixels/sec
FFP - Dual texture - 991.728027M pixels/sec
FFP - Triple texture - 586.592285M pixels/sec
FFP - Quad texture - 557.524841M pixels/sec
PS_2_0 - Per pixel lighting - 64.196724M pixels/sec
PS_2_0 PP - Per pixel lighting - 64.197914M pixels/sec
PS_1_1 - Simple - 780.622070M pixels/sec
PS_2_0 - Simple - 498.688599M pixels/sec

Color writes enabled, z-writes disabled:
FFP - Pure fillrate - 1523.353516M pixels/sec
FFP - Single texture - 1224.495850M pixels/sec
FFP - Dual texture - 1015.775146M pixels/sec
FFP - Triple texture - 586.473450M pixels/sec
FFP - Quad texture - 557.536377M pixels/sec
PS_2_0 - Per pixel lighting - 64.197647M pixels/sec
PS_2_0 PP - Per pixel lighting - 64.205772M pixels/sec
PS_1_1 - Simple - 780.526428M pixels/sec
PS_2_0 - Simple - 498.690155M pixels/sec

Color writes disabled, z-writes enabled:
FFP - Pure fillrate - 2909.496338M pixels/sec
FFP - Single texture - 2907.544434M pixels/sec
FFP - Dual texture - 2906.712646M pixels/sec
FFP - Triple texture - 2907.811523M pixels/sec
FFP - Quad texture - 2813.688232M pixels/sec
PS_2_0 - Per pixel lighting - 1350.793213M pixels/sec
PS_2_0 PP - Per pixel lighting - 1795.684937M pixels/sec
PS_1_1 - Simple - 2907.509277M pixels/sec
PS_2_0 - Simple - 2907.678955M pixels/sec
 
There is my FSAA Viewer, which might be interesting, even though it seems we know what all the modes are anyway.

http://www.users.on.net/triforce/d3d_fsaaviewer/d3d_fsaaviewer-3.zip

Comparing it to Basic's OpenGL-based programs might be interesting too...

Dave: When I said "Done" previously, I meant I had actually used yours, because it gets all the modes from DX. I'll admit I haven't checked against Basic's OGL version, as my assumption was that they would be the same.
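(For anyone curious, "gets all the modes from DX" boils down to a simple runtime query - a rough sketch below, not the viewer's actual code; the back-buffer format and device type here are just illustrative:)

Code:
#include <d3d9.h>
#include <stdio.h>

// Ask the D3D9 runtime which multisample types the adapter exposes for a given
// back-buffer format.
void ListAAModes(IDirect3D9* d3d)
{
    for (int samples = 2; samples <= 16; ++samples) {
        DWORD levels = 0;
        if (SUCCEEDED(d3d->CheckDeviceMultiSampleType(
                D3DADAPTER_DEFAULT, D3DDEVTYPE_HAL, D3DFMT_X8R8G8B8, FALSE,
                (D3DMULTISAMPLE_TYPE)samples, &levels)))
            printf("%dx multisampling supported (%lu quality levels)\n",
                   samples, (unsigned long)levels);
    }
}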
 
In brief, could you please:

First, use the GeometryProcessingSpeed benchmark archive from the Digit-Life article at shader level 2 (at least that one shader level setting, depending on your interest and time) to compare the results for FFP (fixed T&L), VS11 (Vertex Shaders 1.1 and fixed-function blend stages), and VS20 (Vertex Shaders 2.0 and fixed-function blend stages).

Then contrast the PS11 (Vertex Shaders 1.1 and Pixel Shaders 1.1), PS14 (Vertex Shaders 1.1 and Pixel Shaders 1.4), and PS20 (Vertex Shaders 2.0 and Pixel Shaders 2.0) results with the VS11 and VS20 results (matched by VS version).
It might also be handy to note the precision the drivers report for each PS version with whichever driver set is used for testing, if possible.

The .bat files show the syntax, and I think there is a readme with further detail as well.
....

Below is a cut-and-paste of my last post about this in the other thread, which "linkifies" my original thoughts (I hope the wording is clear... I'm still a bit illness-fuddled).

...

Is my first suggestion about testing vertex processing under different circumstances just "too whacko"?

If so, could someone tell me why?... I'm a bit under the weather and I'm not likely to be able to figure it out on my own anytime soon. :-?

If not, could you give my suggestion for testing a try, Wavey?

I noticed that VS20+PS20 gives higher fps (the same as for fixed-function lighting) than VS11+PS11 in their testing at the simplest "shader level" (ambient lighting only), which could indicate that some processing unit used for PS 1.1 (but not for PS 2.0 or FF) can be allocated to some simple lighting processing... this might also fit the GF FX's FF results in general. This idea can perhaps be checked against the VS11-with-FF-color-processing test results. It also makes me wonder whether outputting to a floating-point color buffer might be necessary to get around any driver shortcuts that might otherwise be implemented... but in any case, my test suggestion seems like it would have to use shader level 2 or higher.

Also, the NV30's VS20+PS20 results are near the GF4 Ti 4600's VS11+PS11 results of the same complexity for shader levels 2 and higher. The results seem to roughly correlate with 5/3 (clock speed) * 2/3 (losing a vertex processing unit to FP32 processing) for the GF FX versus the GF4 Ti 4600, with perhaps some efficiency loss from having fewer vertex processing units, and that seems like one conceivable explanation for the other results (the ones besides the shader level 1 test).
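Spelling that rough estimate out (assuming 500 MHz for the GF FX Ultra and 300 MHz for the Ti 4600, and 2 of 3 vertex units left for vertex work, as above):

500/300 * 2/3 = 5/3 * 2/3 = 10/9, or roughly a 1.11x (about 10%) advantage

...which would fit the GF FX only landing "near" the Ti 4600 results rather than well ahead of them.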
 
Maybe you could ask Nvidia to either supply a driver with FP16 as the default or say what tweak is required to get it, as comparing score results based on FX12 vs. FP24 seems a tad unbalanced to me. At least FP16 vs. FP24 would be a closer match.

It would be interesting to see how 16-bit AA images compare too!
 
I'm lobbying for Humus' Mandelbrot Shader demo as an example of a compute-bound shader. Comparing screenshots with the R300 running FP24 will show whether it's running FP16 or FP32 under DX9. If it is running FP32, using this version of the shader file mandel.fsh should test whether the partial precision hint in DX9 can force it to FP16.
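(A hedged sketch of one way to exercise the hint from the application side, without editing the shader source - the file name "mandel.fx" and entry point "main" below are placeholders, not Humus' actual files: compile the same HLSL twice, once normally and once with D3DXSHADER_PARTIALPRECISION, which asks the compiler to mark the code as acceptable at partial (FP16) precision, then compare the output and framerate of the two binaries.)

Code:
#include <d3dx9.h>

// Compile the shader with or without the DX9 partial-precision hint.
// Pass flags = 0 for full precision, or D3DXSHADER_PARTIALPRECISION
// to indicate FP16 results are acceptable.
LPD3DXBUFFER CompileMandel(DWORD flags)
{
    LPD3DXBUFFER code = NULL, errors = NULL;
    D3DXCompileShaderFromFile("mandel.fx", NULL, NULL, "main", "ps_2_0",
                              flags, &code, &errors, NULL);
    if (errors) errors->Release();
    return code;   // NULL if compilation failed
}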
 
antlers4 said:
I'm lobbying for Humus' Mandelbrot Shader demo as an example of a compute-bound shader. Comparing screenshots with the R300 running FP24 will show whether it's running FP16 or FP32 under DX9. If it is running FP32, using this version of the shader file mandel.fsh should test whether the partial precision hint in DX9 can force it to FP16.

That's a good idea, but make sure, Dave, that you zoom in enough to see the precision artifacts, which look like rectangular blocks. Going by the mantissa sizes, FP24 should have about 64 times the resolution of FP16, and FP32 about 128 times the resolution of FP24, so the difference between the various precisions will show up clearly as different sizes of blocks. I was having fun with 80-bit FP numbers in my CPU version and needed well over a trillion times zoom to get any artifacts.
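(For reference, those factors come straight from the mantissa widths - taking FP16 as s10e5, ATI's FP24 as s16e7, and FP32 as s23e8:

2^(16 - 10) = 64 and 2^(23 - 16) = 128

so each step up in precision shrinks the blocks by that factor.)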
 
I wouldn't mind seeing the latest MIPS rate with my benchmark - the updated version (with the 505-instruction shader). I don't care if it's in the preview or anything; I'm just kinda curious about the extra-long shaders.
 
I'd like to see how it performs in XSI. You can download the demo from softimage.com, I think; it's supposed to support the FX shader system in the realtime preview.
 
If there is a way to hack the ATI PS1.4 demos originally released for the Radeon 8500, that would be interesting. I guess it won't work, because you can expect ATI to have written them exclusively for the Radeon 8500, although they do work on the Radeon 9700 Pro without a problem. Perhaps some enterprising soul could assist (Rage3D guys?).

Any benchmarks that also rely pretty heavily on the CPU would be good to see too - just to see which card demands more from the CPU.

If it is at all possible to run the GFFX on a low-end system like a 1.6 GHz P4 or a 1 GHz AMD Athlon, that would be interesting from a scaling point of view.

I would also love to see the shader benchmark from tb (www.tommi-systems.com) employed, but it uses an ATI demo, so I don't know what information you would actually gain from it (Brent ran a few on these forums... an expansion of that would be great).

P.S. Do you have the Ultra or the non-Ultra?
 
Tahir said:
P.S. Do you have the Ultra or the non-Ultra?

Assuming the only differences between the two are the core and memory speeds, it looks like he has the Ultra (note the thread regarding 500 MHz DDR-II latency).
 
I wouldn't mind seeing how fast you can send the board down the block. Put it on a skateboard with a mobile power supply and see how far the fan will push it :)
 
Tahir said:
If there is a way to hack the ATI PS1.4 demos originally released for the Radeon 8500, that would be interesting. I guess it won't work, because you can expect ATI to have written them exclusively for the Radeon 8500, although they do work on the Radeon 9700 Pro without a problem. Perhaps some enterprising soul could assist (Rage3D guys?).

ATI's demos (from the 8500 ones onwards, anyway) aren't card-specific AFAIK. Provided the card can use the relevant PS version, it will run the demos. I'm sure I remember one of the ATI people saying here that any PS 2.0 card will be able to run the 9700 demos.
 
Can you test Battlefield 1942? How about GTA3? I don't think anyone really tests GTA3, but that is probably because it is not reviewer-friendly.

Dave: This is not about game testing; this is about synthetic tests / shader apps that people have coded up or that can be found.
 
I know these are dull tests but how about:

Codecreatures
FableMark
ST Temple Demo

:?:

Dave: Codecreatures - download link? FableMark - already in. Temple - no, but VillageMark is in (a newer version that runs at about 200 FPS!!)
 
1. Before/after shader tests (PS 1.4 and 2.0 in particular) regarding the new "3DMark03" drivers. (Maybe ShaderMark?) I'm curious to know whether the speed gains really were just a matter of special "benchmark code" in the drivers or whether there are some general-purpose shader optimizations in there.

2. If possible, a test of "the same" (synthetic) shader, but written in:
  • DX9
  • ARB_fragment_program
  • NV_fragment_program
At all the different precisions available in each language, if possible. Given Carmack's comments, I'm really curious whether the apparently huge performance advantage of NV_fragment_program over the other two is due to precision differences, a lack of compiler optimization, or something else. It would be especially great if this could cover a few different shaders, to see whether the performance difference varies with instruction mix, length, etc.
 