New GLSL / Pbuffer benchmark [Update: version 1.4 / ORCv0.4]

R350/XP2400+ (Cat 5.3/Win2K)

Code:
GL filter framework 1.2999 test application by Peter Thoman 2004-2005

Gui initialized successfully.
DevIL initialized successfully.
 - DevIL Version: 167
OpenGL initialized successfully.
ILUT OpenGL mode set successfully.
Loaded required OpenGL extensions for GLPixelShader.
Loaded required OpenGL extensions for GLRenderTexture.
Loaded required OpenGL extensions for GLFilterStep.
Initialization complete.

Press return key to start benchmark...



Testing 32x32 image:
BufferCreateINT: msecs: 719 || ms/i: 119.833 || i/s: 8.34492
BufferCreateINT16: msecs: 703 || ms/i: 117.167 || i/s: 8.53485
BufferCreateFP16: msecs: 703 || ms/i: 117.167 || i/s: 8.53485
BufferCreateFP32: msecs: 703 || ms/i: 117.167 || i/s: 8.53485
JustCopy: msecs: 750 || ms/i: 0.375 || i/s: 2666.67
SimpleSmooth: msecs: 813 || ms/i: 0.4065 || i/s: 2460.02
TexNoise: msecs: 828 || ms/i: 0.414 || i/s: 2415.46
3x3Conv: msecs: 437 || ms/i: 0.437 || i/s: 2288.33
TEncode: msecs: 406 || ms/i: 0.406 || i/s: 2463.05
TDecode: msecs: 953 || ms/i: 0.953 || i/s: 1049.32
LinDiffINT: msecs: 1031 || ms/i: 0.5155 || i/s: 1939.86
LinDiffINT16: msecs: 1047 || ms/i: 0.5235 || i/s: 1910.22
LinDiffFP16: msecs: 1047 || ms/i: 0.5235 || i/s: 1910.22
LinDiffFP32: msecs: 1047 || ms/i: 0.5235 || i/s: 1910.22
PMTEncoded: msecs: 1750 || ms/i: 1.75 || i/s: 571.429
PMStandard: msecs: 1688 || ms/i: 1.688 || i/s: 592.417
PMBuffered: msecs: 156 || ms/i: 0.312 || i/s: 3205.13

Testing 64x64 image:
BufferCreateINT: msecs: 734 || ms/i: 122.333 || i/s: 8.17439
BufferCreateINT16: msecs: 735 || ms/i: 122.5 || i/s: 8.16327
BufferCreateFP16: msecs: 734 || ms/i: 122.333 || i/s: 8.17439
BufferCreateFP32: msecs: 734 || ms/i: 122.333 || i/s: 8.17439
JustCopy: msecs: 766 || ms/i: 0.383 || i/s: 2610.97
SimpleSmooth: msecs: 782 || ms/i: 0.391 || i/s: 2557.54
TexNoise: msecs: 812 || ms/i: 0.406 || i/s: 2463.05
3x3Conv: msecs: 438 || ms/i: 0.438 || i/s: 2283.11
TEncode: msecs: 485 || ms/i: 0.485 || i/s: 2061.86
TDecode: msecs: 1047 || ms/i: 1.047 || i/s: 955.11
LinDiffINT: msecs: 1078 || ms/i: 0.539 || i/s: 1855.29
LinDiffINT16: msecs: 1125 || ms/i: 0.5625 || i/s: 1777.78
LinDiffFP16: msecs: 938 || ms/i: 0.469 || i/s: 2132.2
LinDiffFP32: msecs: 1109 || ms/i: 0.5545 || i/s: 1803.43
PMTEncoded: msecs: 1485 || ms/i: 1.485 || i/s: 673.401
PMStandard: msecs: 1484 || ms/i: 1.484 || i/s: 673.854
PMBuffered: msecs: 156 || ms/i: 0.312 || i/s: 3205.13

Testing 128x128 image:
BufferCreateINT: msecs: 735 || ms/i: 122.5 || i/s: 8.16327
BufferCreateINT16: msecs: 734 || ms/i: 122.333 || i/s: 8.17439
BufferCreateFP16: msecs: 735 || ms/i: 122.5 || i/s: 8.16327
BufferCreateFP32: msecs: 734 || ms/i: 122.333 || i/s: 8.17439
JustCopy: msecs: 812 || ms/i: 0.406 || i/s: 2463.05
SimpleSmooth: msecs: 813 || ms/i: 0.4065 || i/s: 2460.02
TexNoise: msecs: 875 || ms/i: 0.4375 || i/s: 2285.71
3x3Conv: msecs: 437 || ms/i: 0.437 || i/s: 2288.33
TEncode: msecs: 406 || ms/i: 0.406 || i/s: 2463.05
TDecode: msecs: 969 || ms/i: 0.969 || i/s: 1031.99
LinDiffINT: msecs: 969 || ms/i: 0.4845 || i/s: 2063.98
LinDiffINT16: msecs: 937 || ms/i: 0.4685 || i/s: 2134.47
LinDiffFP16: msecs: 938 || ms/i: 0.469 || i/s: 2132.2
LinDiffFP32: msecs: 968 || ms/i: 0.484 || i/s: 2066.12
PMTEncoded: msecs: 1563 || ms/i: 1.563 || i/s: 639.795
PMStandard: msecs: 1829 || ms/i: 1.829 || i/s: 546.747
PMBuffered: msecs: 187 || ms/i: 0.374 || i/s: 2673.8

Testing 256x256 image:
BufferCreateINT: msecs: 734 || ms/i: 122.333 || i/s: 8.17439
BufferCreateINT16: msecs: 734 || ms/i: 122.333 || i/s: 8.17439
BufferCreateFP16: msecs: 750 || ms/i: 125 || i/s: 8
BufferCreateFP32: msecs: 735 || ms/i: 122.5 || i/s: 8.16327
JustCopy: msecs: 828 || ms/i: 0.414 || i/s: 2415.46
SimpleSmooth: msecs: 844 || ms/i: 0.422 || i/s: 2369.67
TexNoise: msecs: 890 || ms/i: 0.445 || i/s: 2247.19
3x3Conv: msecs: 766 || ms/i: 0.766 || i/s: 1305.48
TEncode: msecs: 485 || ms/i: 0.485 || i/s: 2061.86
TDecode: msecs: 1078 || ms/i: 1.078 || i/s: 927.644
LinDiffINT: msecs: 1140 || ms/i: 0.57 || i/s: 1754.39
LinDiffINT16: msecs: 1171 || ms/i: 0.5855 || i/s: 1707.94
LinDiffFP16: msecs: 1125 || ms/i: 0.5625 || i/s: 1777.78
LinDiffFP32: msecs: 1172 || ms/i: 0.586 || i/s: 1706.48
PMTEncoded: msecs: 1562 || ms/i: 1.562 || i/s: 640.205
PMStandard: msecs: 1562 || ms/i: 1.562 || i/s: 640.205
PMBuffered: msecs: 672 || ms/i: 1.344 || i/s: 744.048

Testing 512x512 image:
BufferCreateINT: msecs: 735 || ms/i: 122.5 || i/s: 8.16327
BufferCreateINT16: msecs: 734 || ms/i: 122.333 || i/s: 8.17439
BufferCreateFP16: msecs: 750 || ms/i: 125 || i/s: 8
BufferCreateFP32: msecs: 735 || ms/i: 122.5 || i/s: 8.16327
JustCopy: msecs: 485 || ms/i: 0.485 || i/s: 2061.86
SimpleSmooth: msecs: 765 || ms/i: 0.765 || i/s: 1307.19
TexNoise: msecs: 703 || ms/i: 0.703 || i/s: 1422.48
3x3Conv: msecs: 969 || ms/i: 1.938 || i/s: 515.996
TEncode: msecs: 219 || ms/i: 0.438 || i/s: 2283.11
TDecode: msecs: 500 || ms/i: 1 || i/s: 1000
LinDiffINT: msecs: 1079 || ms/i: 1.079 || i/s: 926.784
LinDiffINT16: msecs: 1078 || ms/i: 1.078 || i/s: 927.644
LinDiffFP16: msecs: 1078 || ms/i: 1.078 || i/s: 927.644
LinDiffFP32: msecs: 1265 || ms/i: 1.265 || i/s: 790.514
PMTEncoded: msecs: 1016 || ms/i: 2.032 || i/s: 492.126
PMStandard: msecs: 2391 || ms/i: 4.782 || i/s: 209.118
PMBuffered: msecs: 515 || ms/i: 2.06 || i/s: 485.437

Testing 1024x1024 image:
BufferCreateINT: msecs: 750 || ms/i: 125 || i/s: 8
BufferCreateINT16: msecs: 750 || ms/i: 125 || i/s: 8
BufferCreateFP16: msecs: 750 || ms/i: 125 || i/s: 8
BufferCreateFP32: msecs: 734 || ms/i: 122.333 || i/s: 8.17439
JustCopy: msecs: 1234 || ms/i: 1.234 || i/s: 810.373
SimpleSmooth: msecs: 3484 || ms/i: 3.484 || i/s: 287.026
TexNoise: msecs: 2469 || ms/i: 2.469 || i/s: 405.022
3x3Conv: msecs: 4047 || ms/i: 8.094 || i/s: 123.548
TEncode: msecs: 234 || ms/i: 0.468 || i/s: 2136.75
TDecode: msecs: 1641 || ms/i: 3.282 || i/s: 304.692
LinDiffINT: msecs: 4235 || ms/i: 4.235 || i/s: 236.128
LinDiffINT16: msecs: 4234 || ms/i: 4.234 || i/s: 236.183
LinDiffFP16: msecs: 4235 || ms/i: 4.235 || i/s: 236.128
LinDiffFP32: msecs: 5031 || ms/i: 5.031 || i/s: 198.768
PMTEncoded: msecs: 3859 || ms/i: 7.718 || i/s: 129.567
PMStandard: msecs: 9391 || ms/i: 18.782 || i/s: 53.2425
PMBuffered: msecs: 5703 || ms/i: 22.812 || i/s: 43.8366

Finished. Press return key to close...
                Don't forget to copy the results!
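
For anyone who wants to compare these dumps without eyeballing them, here is a minimal Python sketch that parses one result row. It assumes every row follows the `name: msecs: … || ms/i: … || i/s: …` layout seen above. The names `ROW`, `parse_row` and `iterations` are made up for this example and are not part of the benchmark.

```python
import re

# Pattern for one result row as printed in the logs above, e.g.
# "JustCopy: msecs: 750 || ms/i: 0.375 || i/s: 2666.67"
# (layout inferred from the posted output, not from the benchmark source)
ROW = re.compile(
    r"^(?P<test>\w+): msecs: (?P<msecs>[\d.]+) \|\| "
    r"ms/i: (?P<ms_per_iter>[\d.]+) \|\| i/s: (?P<iter_per_sec>[\d.]+)"
)


def parse_row(line):
    """Return (test, msecs, ms_per_iter, iter_per_sec), or None if no match."""
    m = ROW.match(line.strip())
    if m is None:
        return None
    return (
        m["test"],
        float(m["msecs"]),
        float(m["ms_per_iter"]),
        float(m["iter_per_sec"]),
    )


def iterations(msecs, ms_per_iter):
    """Iteration count implied by total time divided by time per iteration."""
    return round(msecs / ms_per_iter)


test, msecs, ms_i, i_s = parse_row(
    "JustCopy: msecs: 750 || ms/i: 0.375 || i/s: 2666.67"
)
print(test, iterations(msecs, ms_i), round(1000.0 / ms_i, 2))
```

In the rows posted here, the i/s column matches 1000 divided by the ms/i column, so the two carry the same information. Dividing msecs by ms/i suggests how many iterations each test ran.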
 
Broken Hope said:
There's something weird about this test on my system.

First of all, it won't let me copy the results when it's finished: the output part of the program stops responding, and I can't highlight anything in the command window to copy it.

But the main problem is that I seem to be getting worse results than the 9700 Pro and the X800 Pro, which shouldn't be happening, since I'm running an X850XT with an A64 3500+.

Yes, I'm having the same "can't copy from the window" problem when it finishes. I was very distraught when I finally hit the return button in the hopes it was going to present the results to me some other way and it all disappeared.

That last test really did take a long time. I would have been sure it was hanging if Wireframe hadn't warned me upstream. If that is the expected behavior, I highly recommend you put some kind of warning message in there at that point.

But I see you already have 6800GT results, so perhaps you don't need mine all that much anyway (running on an Intel 2.4c @ 2.88, the standard ASUS AI 20% overclock).
 
Athlon 64 3000+ @ 2.3GHz / X800XT PE, default clocks

Code:
GL filter framework 1.2999 test application by Peter Thoman 2004-2005

Gui initialized successfully.
DevIL initialized successfully.
 - DevIL Version: 167
OpenGL initialized successfully.
ILUT OpenGL mode set successfully.
Loaded required OpenGL extensions for GLPixelShader.
Loaded required OpenGL extensions for GLRenderTexture.
Loaded required OpenGL extensions for GLFilterStep.
Initialization complete.

Press return key to start benchmark...



Testing 32x32 image:
BufferCreateINT: msecs: 78 || ms/i: 13 || i/s: 76.9231
BufferCreateINT16: msecs: 63 || ms/i: 10.5 || i/s: 95.2381
BufferCreateFP16: msecs: 62 || ms/i: 10.3333 || i/s: 96.7742
BufferCreateFP32: msecs: 63 || ms/i: 10.5 || i/s: 95.2381
JustCopy: msecs: 328 || ms/i: 0.164 || i/s: 6097.56
SimpleSmooth: msecs: 328 || ms/i: 0.164 || i/s: 6097.56
TexNoise: msecs: 344 || ms/i: 0.172 || i/s: 5813.95
3x3Conv: msecs: 203 || ms/i: 0.203 || i/s: 4926.11
TEncode: msecs: 172 || ms/i: 0.172 || i/s: 5813.95
TDecode: msecs: 250 || ms/i: 0.25 || i/s: 4000
LinDiffINT: msecs: 328 || ms/i: 0.164 || i/s: 6097.56
LinDiffINT16: msecs: 391 || ms/i: 0.1955 || i/s: 5115.09
LinDiffFP16: msecs: 391 || ms/i: 0.1955 || i/s: 5115.09
LinDiffFP32: msecs: 407 || ms/i: 0.2035 || i/s: 4914
PMTEncoded: msecs: 563 || ms/i: 0.563 || i/s: 1776.2
PMStandard: msecs: 625 || ms/i: 0.625 || i/s: 1600
PMBuffered: msecs: 93 || ms/i: 0.186 || i/s: 5376.34

Testing 64x64 image:
BufferCreateINT: msecs: 79 || ms/i: 13.1667 || i/s: 75.9494
BufferCreateINT16: msecs: 78 || ms/i: 13 || i/s: 76.9231
BufferCreateFP16: msecs: 62 || ms/i: 10.3333 || i/s: 96.7742
BufferCreateFP32: msecs: 78 || ms/i: 13 || i/s: 76.9231
JustCopy: msecs: 375 || ms/i: 0.1875 || i/s: 5333.33
SimpleSmooth: msecs: 390 || ms/i: 0.195 || i/s: 5128.21
TexNoise: msecs: 406 || ms/i: 0.203 || i/s: 4926.11
3x3Conv: msecs: 204 || ms/i: 0.204 || i/s: 4901.96
TEncode: msecs: 156 || ms/i: 0.156 || i/s: 6410.26
TDecode: msecs: 234 || ms/i: 0.234 || i/s: 4273.5
LinDiffINT: msecs: 391 || ms/i: 0.1955 || i/s: 5115.09
LinDiffINT16: msecs: 391 || ms/i: 0.1955 || i/s: 5115.09
LinDiffFP16: msecs: 391 || ms/i: 0.1955 || i/s: 5115.09
LinDiffFP32: msecs: 344 || ms/i: 0.172 || i/s: 5813.95
PMTEncoded: msecs: 547 || ms/i: 0.547 || i/s: 1828.15
PMStandard: msecs: 532 || ms/i: 0.532 || i/s: 1879.7
PMBuffered: msecs: 94 || ms/i: 0.188 || i/s: 5319.15

Testing 128x128 image:
BufferCreateINT: msecs: 63 || ms/i: 10.5 || i/s: 95.2381
BufferCreateINT16: msecs: 78 || ms/i: 13 || i/s: 76.9231
BufferCreateFP16: msecs: 78 || ms/i: 13 || i/s: 76.9231
BufferCreateFP32: msecs: 78 || ms/i: 13 || i/s: 76.9231
JustCopy: msecs: 328 || ms/i: 0.164 || i/s: 6097.56
SimpleSmooth: msecs: 328 || ms/i: 0.164 || i/s: 6097.56
TexNoise: msecs: 328 || ms/i: 0.164 || i/s: 6097.56
3x3Conv: msecs: 187 || ms/i: 0.187 || i/s: 5347.59
TEncode: msecs: 172 || ms/i: 0.172 || i/s: 5813.95
TDecode: msecs: 234 || ms/i: 0.234 || i/s: 4273.5
LinDiffINT: msecs: 344 || ms/i: 0.172 || i/s: 5813.95
LinDiffINT16: msecs: 343 || ms/i: 0.1715 || i/s: 5830.9
LinDiffFP16: msecs: 359 || ms/i: 0.1795 || i/s: 5571.03
LinDiffFP32: msecs: 360 || ms/i: 0.18 || i/s: 5555.56
PMTEncoded: msecs: 656 || ms/i: 0.656 || i/s: 1524.39
PMStandard: msecs: 657 || ms/i: 0.657 || i/s: 1522.07
PMBuffered: msecs: 94 || ms/i: 0.188 || i/s: 5319.15

Testing 256x256 image:
BufferCreateINT: msecs: 62 || ms/i: 10.3333 || i/s: 96.7742
BufferCreateINT16: msecs: 78 || ms/i: 13 || i/s: 76.9231
BufferCreateFP16: msecs: 78 || ms/i: 13 || i/s: 76.9231
BufferCreateFP32: msecs: 79 || ms/i: 13.1667 || i/s: 75.9494
JustCopy: msecs: 390 || ms/i: 0.195 || i/s: 5128.21
SimpleSmooth: msecs: 407 || ms/i: 0.2035 || i/s: 4914
TexNoise: msecs: 422 || ms/i: 0.211 || i/s: 4739.34
3x3Conv: msecs: 312 || ms/i: 0.312 || i/s: 3205.13
TEncode: msecs: 188 || ms/i: 0.188 || i/s: 5319.15
TDecode: msecs: 312 || ms/i: 0.312 || i/s: 3205.13
LinDiffINT: msecs: 407 || ms/i: 0.2035 || i/s: 4914
LinDiffINT16: msecs: 344 || ms/i: 0.172 || i/s: 5813.95
LinDiffFP16: msecs: 360 || ms/i: 0.18 || i/s: 5555.56
LinDiffFP32: msecs: 344 || ms/i: 0.172 || i/s: 5813.95
PMTEncoded: msecs: 562 || ms/i: 0.562 || i/s: 1779.36
PMStandard: msecs: 578 || ms/i: 0.578 || i/s: 1730.1
PMBuffered: msecs: 297 || ms/i: 0.594 || i/s: 1683.5

Testing 512x512 image:
BufferCreateINT: msecs: 62 || ms/i: 10.3333 || i/s: 96.7742
BufferCreateINT16: msecs: 62 || ms/i: 10.3333 || i/s: 96.7742
BufferCreateFP16: msecs: 78 || ms/i: 13 || i/s: 76.9231
BufferCreateFP32: msecs: 78 || ms/i: 13 || i/s: 76.9231
JustCopy: msecs: 187 || ms/i: 0.187 || i/s: 5347.59
SimpleSmooth: msecs: 360 || ms/i: 0.36 || i/s: 2777.78
TexNoise: msecs: 343 || ms/i: 0.343 || i/s: 2915.45
3x3Conv: msecs: 485 || ms/i: 0.97 || i/s: 1030.93
TEncode: msecs: 93 || ms/i: 0.186 || i/s: 5376.34
TDecode: msecs: 141 || ms/i: 0.282 || i/s: 3546.1
LinDiffINT: msecs: 406 || ms/i: 0.406 || i/s: 2463.05
LinDiffINT16: msecs: 406 || ms/i: 0.406 || i/s: 2463.05
LinDiffFP16: msecs: 406 || ms/i: 0.406 || i/s: 2463.05
LinDiffFP32: msecs: 594 || ms/i: 0.594 || i/s: 1683.5
PMTEncoded: msecs: 407 || ms/i: 0.814 || i/s: 1228.5
PMStandard: msecs: 1031 || ms/i: 2.062 || i/s: 484.966
PMBuffered: msecs: 265 || ms/i: 1.06 || i/s: 943.396

Testing 1024x1024 image:
BufferCreateINT: msecs: 63 || ms/i: 10.5 || i/s: 95.2381
BufferCreateINT16: msecs: 79 || ms/i: 13.1667 || i/s: 75.9494
BufferCreateFP16: msecs: 62 || ms/i: 10.3333 || i/s: 96.7742
BufferCreateFP32: msecs: 94 || ms/i: 15.6667 || i/s: 63.8298
JustCopy: msecs: 672 || ms/i: 0.672 || i/s: 1488.1
SimpleSmooth: msecs: 1391 || ms/i: 1.391 || i/s: 718.907
TexNoise: msecs: 1109 || ms/i: 1.109 || i/s: 901.713
3x3Conv: msecs: 1828 || ms/i: 3.656 || i/s: 273.523
TEncode: msecs: 78 || ms/i: 0.156 || i/s: 6410.26
TDecode: msecs: 641 || ms/i: 1.282 || i/s: 780.031
LinDiffINT: msecs: 1562 || ms/i: 1.562 || i/s: 640.205
LinDiffINT16: msecs: 1594 || ms/i: 1.594 || i/s: 627.353
LinDiffFP16: msecs: 1594 || ms/i: 1.594 || i/s: 627.353
LinDiffFP32: msecs: 2781 || ms/i: 2.781 || i/s: 359.583
PMTEncoded: msecs: 1453 || ms/i: 2.906 || i/s: 344.116
PMStandard: msecs: 4125 || ms/i: 8.25 || i/s: 121.212
PMBuffered: msecs: 2844 || ms/i: 11.376 || i/s: 87.9044

Finished. Press return key to close...
                Don't forget to copy the results!
 
Oh. Duh. I've gotten so used to using Mr. Mouse that it never occurred to me at first to try it any other way. Using the menu in the window frame, select all, copy, got it. Again --Duh!

Code:
GL filter framework 1.2999 test application by Peter Thoman 2004-2005

Gui initialized successfully.
DevIL initialized successfully.
 - DevIL Version: 167
OpenGL initialized successfully.
ILUT OpenGL mode set successfully.
Loaded required OpenGL extensions for GLPixelShader.
Loaded required OpenGL extensions for GLRenderTexture.
Loaded required OpenGL extensions for GLFilterStep.
Initialization complete.

Press return key to start benchmark...



Testing 32x32 image:
BufferCreateINT: msecs: 156 || ms/i: 26 || i/s: 38.4615
No suitable INT format found. Trying FP... (Flaky 6600 workaround)

BufferCreateINT16: msecs: 203 || ms/i: 33.8333 || i/s: 29.5567
BufferCreateFP16: msecs: 109 || ms/i: 18.1667 || i/s: 55.0459
BufferCreateFP32: msecs: 125 || ms/i: 20.8333 || i/s: 48
JustCopy: msecs: 203 || ms/i: 0.1015 || i/s: 9852.22
SimpleSmooth: msecs: 297 || ms/i: 0.1485 || i/s: 6734.01
TexNoise: msecs: 297 || ms/i: 0.1485 || i/s: 6734.01
3x3Conv: msecs: 297 || ms/i: 0.297 || i/s: 3367
TEncode: msecs: 187 || ms/i: 0.187 || i/s: 5347.59
TDecode: msecs: 203 || ms/i: 0.203 || i/s: 4926.11
LinDiffINT: msecs: 281 || ms/i: 0.1405 || i/s: 7117.44
LinDiffINT16: msecs: 297 || ms/i: 0.1485 || i/s: 6734.01
LinDiffFP16: msecs: 281 || ms/i: 0.1405 || i/s: 7117.44
LinDiffFP32: msecs: 281 || ms/i: 0.1405 || i/s: 7117.44
PMTEncoded: msecs: 547 || ms/i: 0.547 || i/s: 1828.15
PMStandard: msecs: 437 || ms/i: 0.437 || i/s: 2288.33
PMBuffered: msecs: 31 || ms/i: 0.062 || i/s: 16129

Testing 64x64 image:
BufferCreateINT: msecs: 125 || ms/i: 20.8333 || i/s: 48
BufferCreateINT16: msecs: 203 || ms/i: 33.8333 || i/s: 29.5567
BufferCreateFP16: msecs: 125 || ms/i: 20.8333 || i/s: 48
BufferCreateFP32: msecs: 125 || ms/i: 20.8333 || i/s: 48
JustCopy: msecs: 218 || ms/i: 0.109 || i/s: 9174.31
SimpleSmooth: msecs: 204 || ms/i: 0.102 || i/s: 9803.92
TexNoise: msecs: 234 || ms/i: 0.117 || i/s: 8547.01
3x3Conv: msecs: 109 || ms/i: 0.109 || i/s: 9174.31
TEncode: msecs: 109 || ms/i: 0.109 || i/s: 9174.31
TDecode: msecs: 141 || ms/i: 0.141 || i/s: 7092.2
LinDiffINT: msecs: 282 || ms/i: 0.141 || i/s: 7092.2
LinDiffINT16: msecs: 266 || ms/i: 0.133 || i/s: 7518.8
LinDiffFP16: msecs: 265 || ms/i: 0.1325 || i/s: 7547.17
LinDiffFP32: msecs: 266 || ms/i: 0.133 || i/s: 7518.8
PMTEncoded: msecs: 453 || ms/i: 0.453 || i/s: 2207.51
PMStandard: msecs: 454 || ms/i: 0.454 || i/s: 2202.64
PMBuffered: msecs: 16 || ms/i: 0.032 || i/s: 31250

Testing 128x128 image:
BufferCreateINT: msecs: 125 || ms/i: 20.8333 || i/s: 48
BufferCreateINT16: msecs: 219 || ms/i: 36.5 || i/s: 27.3973
BufferCreateFP16: msecs: 109 || ms/i: 18.1667 || i/s: 55.0459
BufferCreateFP32: msecs: 125 || ms/i: 20.8333 || i/s: 48
JustCopy: msecs: 203 || ms/i: 0.1015 || i/s: 9852.22
SimpleSmooth: msecs: 204 || ms/i: 0.102 || i/s: 9803.92
TexNoise: msecs: 218 || ms/i: 0.109 || i/s: 9174.31
3x3Conv: msecs: 125 || ms/i: 0.125 || i/s: 8000
TEncode: msecs: 109 || ms/i: 0.109 || i/s: 9174.31
TDecode: msecs: 125 || ms/i: 0.125 || i/s: 8000
LinDiffINT: msecs: 266 || ms/i: 0.133 || i/s: 7518.8
LinDiffINT16: msecs: 281 || ms/i: 0.1405 || i/s: 7117.44
LinDiffFP16: msecs: 266 || ms/i: 0.133 || i/s: 7518.8
LinDiffFP32: msecs: 422 || ms/i: 0.211 || i/s: 4739.34
PMTEncoded: msecs: 437 || ms/i: 0.437 || i/s: 2288.33
PMStandard: msecs: 657 || ms/i: 0.657 || i/s: 1522.07
PMBuffered: msecs: 125 || ms/i: 0.25 || i/s: 4000

Testing 256x256 image:
BufferCreateINT: msecs: 125 || ms/i: 20.8333 || i/s: 48
BufferCreateINT16: msecs: 203 || ms/i: 33.8333 || i/s: 29.5567
BufferCreateFP16: msecs: 125 || ms/i: 20.8333 || i/s: 48
BufferCreateFP32: msecs: 109 || ms/i: 18.1667 || i/s: 55.0459
JustCopy: msecs: 219 || ms/i: 0.1095 || i/s: 9132.42
SimpleSmooth: msecs: 218 || ms/i: 0.109 || i/s: 9174.31
TexNoise: msecs: 219 || ms/i: 0.1095 || i/s: 9132.42
3x3Conv: msecs: 203 || ms/i: 0.203 || i/s: 4926.11
TEncode: msecs: 109 || ms/i: 0.109 || i/s: 9174.31
TDecode: msecs: 141 || ms/i: 0.141 || i/s: 7092.2
LinDiffINT: msecs: 281 || ms/i: 0.1405 || i/s: 7117.44
LinDiffINT16: msecs: 500 || ms/i: 0.25 || i/s: 4000
LinDiffFP16: msecs: 500 || ms/i: 0.25 || i/s: 4000
LinDiffFP32: msecs: 1625 || ms/i: 0.8125 || i/s: 1230.77
PMTEncoded: msecs: 750 || ms/i: 0.75 || i/s: 1333.33
PMStandard: msecs: 2500 || ms/i: 2.5 || i/s: 400
PMBuffered: msecs: 719 || ms/i: 1.438 || i/s: 695.41

Testing 512x512 image:
BufferCreateINT: msecs: 125 || ms/i: 20.8333 || i/s: 48
BufferCreateINT16: msecs: 203 || ms/i: 33.8333 || i/s: 29.5567
BufferCreateFP16: msecs: 125 || ms/i: 20.8333 || i/s: 48
BufferCreateFP32: msecs: 125 || ms/i: 20.8333 || i/s: 48
JustCopy: msecs: 187 || ms/i: 0.187 || i/s: 5347.59
SimpleSmooth: msecs: 359 || ms/i: 0.359 || i/s: 2785.52
TexNoise: msecs: 266 || ms/i: 0.266 || i/s: 3759.4
3x3Conv: msecs: 343 || ms/i: 0.686 || i/s: 1457.73
TEncode: msecs: 78 || ms/i: 0.156 || i/s: 6410.26
TDecode: msecs: 234 || ms/i: 0.468 || i/s: 2136.75
LinDiffINT: msecs: 328 || ms/i: 0.328 || i/s: 3048.78
LinDiffINT16: msecs: 828 || ms/i: 0.828 || i/s: 1207.73
LinDiffFP16: msecs: 890 || ms/i: 0.89 || i/s: 1123.6
LinDiffFP32: msecs: 2922 || ms/i: 2.922 || i/s: 342.231
PMTEncoded: msecs: 1375 || ms/i: 2.75 || i/s: 363.636
PMStandard: msecs: 4640 || ms/i: 9.28 || i/s: 107.759
PMBuffered: msecs: 1000 || ms/i: 4 || i/s: 250

Testing 1024x1024 image:
BufferCreateINT: msecs: 203 || ms/i: 33.8333 || i/s: 29.5567
BufferCreateINT16: msecs: 391 || ms/i: 65.1667 || i/s: 15.3453
BufferCreateFP16: msecs: 157 || ms/i: 26.1667 || i/s: 38.2166
BufferCreateFP32: msecs: 172 || ms/i: 28.6667 || i/s: 34.8837
JustCopy: msecs: 859 || ms/i: 0.859 || i/s: 1164.14
SimpleSmooth: msecs: 1688 || ms/i: 1.688 || i/s: 592.417
TexNoise: msecs: 984 || ms/i: 0.984 || i/s: 1016.26
3x3Conv: msecs: 1531 || ms/i: 3.062 || i/s: 326.584
TEncode: msecs: 328 || ms/i: 0.656 || i/s: 1524.39
TDecode: msecs: 953 || ms/i: 1.906 || i/s: 524.659
LinDiffINT: msecs: 1296 || ms/i: 1.296 || i/s: 771.605
LinDiffINT16: msecs: 3672 || ms/i: 3.672 || i/s: 272.331
LinDiffFP16: msecs: 3641 || ms/i: 3.641 || i/s: 274.65
LinDiffFP32: msecs: 11906 || ms/i: 11.906 || i/s: 83.9913
PMTEncoded: msecs: 5234 || ms/i: 10.468 || i/s: 95.5292
PMStandard: msecs: 18656 || ms/i: 37.312 || i/s: 26.801
PMBuffered: msecs: 495016 || ms/i: 1980.06 || i/s: 0.505034

Finished. Press return key to close...
                Don't forget to copy the results!

Intel 2.4c @ 2.88 Win XP Home SP2, Forceware 71.84.
BFG 6800GT OC 395/1.05 AGP
1024MB system ram
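
A rough way to put the different image sizes on one scale is to convert ms/i into pixel throughput. The sketch below assumes that one iteration is a single full pass over the image, which the log does not state, so treat the result as an estimate. `megapixels_per_second` is a hypothetical helper name.

```python
def megapixels_per_second(width, height, ms_per_iter):
    """Pixels processed per second, in millions.

    Assumes one iteration touches every pixel of the image exactly once;
    that is a guess about the benchmark, not something the log confirms.
    """
    pixels_per_ms = (width * height) / ms_per_iter
    return pixels_per_ms * 1000.0 / 1e6


# JustCopy ms/i values taken from the 6800GT run above.
small = megapixels_per_second(32, 32, 0.1015)
large = megapixels_per_second(1024, 1024, 0.859)
print(f"32x32:     {small:8.1f} Mpix/s")
print(f"1024x1024: {large:8.1f} Mpix/s")
```

The ms/i for JustCopy barely moves between 32x32 and 256x256 in that run, so the small sizes look dominated by per-iteration overhead rather than by pixel work. Only the largest images say much about fill rate.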
 
P4 3.2C HT + R9800Pro 128MB (Catalyst 5.3)

Code:
GL filter framework 1.2999 test application by Peter Thoman 2004-2005

Gui initialized successfully.
DevIL initialized successfully.
 - DevIL Version: 167
OpenGL initialized successfully.
ILUT OpenGL mode set successfully.
Loaded required OpenGL extensions for GLPixelShader.
Loaded required OpenGL extensions for GLRenderTexture.
Loaded required OpenGL extensions for GLFilterStep.
Initialization complete.

Press return key to start benchmark...



Testing 32x32 image:
BufferCreateINT: msecs: 109 || ms/i: 18.1667 || i/s: 55.0459
BufferCreateINT16: msecs: 94 || ms/i: 15.6667 || i/s: 63.8298
BufferCreateFP16: msecs: 94 || ms/i: 15.6667 || i/s: 63.8298
BufferCreateFP32: msecs: 93 || ms/i: 15.5 || i/s: 64.5161
JustCopy: msecs: 609 || ms/i: 0.3045 || i/s: 3284.07
SimpleSmooth: msecs: 625 || ms/i: 0.3125 || i/s: 3200
TexNoise: msecs: 641 || ms/i: 0.3205 || i/s: 3120.12
3x3Conv: msecs: 344 || ms/i: 0.344 || i/s: 2906.98
TEncode: msecs: 313 || ms/i: 0.313 || i/s: 3194.89
TDecode: msecs: 531 || ms/i: 0.531 || i/s: 1883.24
LinDiffINT: msecs: 734 || ms/i: 0.367 || i/s: 2724.8
LinDiffINT16: msecs: 719 || ms/i: 0.3595 || i/s: 2781.64
LinDiffFP16: msecs: 719 || ms/i: 0.3595 || i/s: 2781.64
LinDiffFP32: msecs: 735 || ms/i: 0.3675 || i/s: 2721.09
PMTEncoded: msecs: 1031 || ms/i: 1.031 || i/s: 969.932
PMStandard: msecs: 1141 || ms/i: 1.141 || i/s: 876.424
PMBuffered: msecs: 125 || ms/i: 0.25 || i/s: 4000

Testing 64x64 image:
BufferCreateINT: msecs: 110 || ms/i: 18.3333 || i/s: 54.5455
BufferCreateINT16: msecs: 109 || ms/i: 18.1667 || i/s: 55.0459
BufferCreateFP16: msecs: 94 || ms/i: 15.6667 || i/s: 63.8298
BufferCreateFP32: msecs: 94 || ms/i: 15.6667 || i/s: 63.8298
JustCopy: msecs: 687 || ms/i: 0.3435 || i/s: 2911.21
SimpleSmooth: msecs: 719 || ms/i: 0.3595 || i/s: 2781.64
TexNoise: msecs: 750 || ms/i: 0.375 || i/s: 2666.67
3x3Conv: msecs: 375 || ms/i: 0.375 || i/s: 2666.67
TEncode: msecs: 344 || ms/i: 0.344 || i/s: 2906.98
TDecode: msecs: 578 || ms/i: 0.578 || i/s: 1730.1
LinDiffINT: msecs: 735 || ms/i: 0.3675 || i/s: 2721.09
LinDiffINT16: msecs: 766 || ms/i: 0.383 || i/s: 2610.97
LinDiffFP16: msecs: 750 || ms/i: 0.375 || i/s: 2666.67
LinDiffFP32: msecs: 735 || ms/i: 0.3675 || i/s: 2721.09
PMTEncoded: msecs: 1062 || ms/i: 1.062 || i/s: 941.62
PMStandard: msecs: 1031 || ms/i: 1.031 || i/s: 969.932
PMBuffered: msecs: 125 || ms/i: 0.25 || i/s: 4000

Testing 128x128 image:
BufferCreateINT: msecs: 110 || ms/i: 18.3333 || i/s: 54.5455
BufferCreateINT16: msecs: 93 || ms/i: 15.5 || i/s: 64.5161
BufferCreateFP16: msecs: 94 || ms/i: 15.6667 || i/s: 63.8298
BufferCreateFP32: msecs: 109 || ms/i: 18.1667 || i/s: 55.0459
JustCopy: msecs: 625 || ms/i: 0.3125 || i/s: 3200
SimpleSmooth: msecs: 641 || ms/i: 0.3205 || i/s: 3120.12
TexNoise: msecs: 656 || ms/i: 0.328 || i/s: 3048.78
3x3Conv: msecs: 344 || ms/i: 0.344 || i/s: 2906.98
TEncode: msecs: 343 || ms/i: 0.343 || i/s: 2915.45
TDecode: msecs: 594 || ms/i: 0.594 || i/s: 1683.5
LinDiffINT: msecs: 750 || ms/i: 0.375 || i/s: 2666.67
LinDiffINT16: msecs: 687 || ms/i: 0.3435 || i/s: 2911.21
LinDiffFP16: msecs: 750 || ms/i: 0.375 || i/s: 2666.67
LinDiffFP32: msecs: 734 || ms/i: 0.367 || i/s: 2724.8
PMTEncoded: msecs: 1187 || ms/i: 1.187 || i/s: 842.46
PMStandard: msecs: 1172 || ms/i: 1.172 || i/s: 853.242
PMBuffered: msecs: 188 || ms/i: 0.376 || i/s: 2659.57

Testing 256x256 image:
BufferCreateINT: msecs: 94 || ms/i: 15.6667 || i/s: 63.8298
BufferCreateINT16: msecs: 109 || ms/i: 18.1667 || i/s: 55.0459
BufferCreateFP16: msecs: 109 || ms/i: 18.1667 || i/s: 55.0459
BufferCreateFP32: msecs: 110 || ms/i: 18.3333 || i/s: 54.5455
JustCopy: msecs: 719 || ms/i: 0.3595 || i/s: 2781.64
SimpleSmooth: msecs: 734 || ms/i: 0.367 || i/s: 2724.8
TexNoise: msecs: 750 || ms/i: 0.375 || i/s: 2666.67
3x3Conv: msecs: 766 || ms/i: 0.766 || i/s: 1305.48
TEncode: msecs: 344 || ms/i: 0.344 || i/s: 2906.98
TDecode: msecs: 609 || ms/i: 0.609 || i/s: 1642.04
LinDiffINT: msecs: 750 || ms/i: 0.375 || i/s: 2666.67
LinDiffINT16: msecs: 766 || ms/i: 0.383 || i/s: 2610.97
LinDiffFP16: msecs: 766 || ms/i: 0.383 || i/s: 2610.97
LinDiffFP32: msecs: 812 || ms/i: 0.406 || i/s: 2463.05
PMTEncoded: msecs: 1078 || ms/i: 1.078 || i/s: 927.644
PMStandard: msecs: 1390 || ms/i: 1.39 || i/s: 719.424
PMBuffered: msecs: 672 || ms/i: 1.344 || i/s: 744.048

Testing 512x512 image:
BufferCreateINT: msecs: 110 || ms/i: 18.3333 || i/s: 54.5455
BufferCreateINT16: msecs: 109 || ms/i: 18.1667 || i/s: 55.0459
BufferCreateFP16: msecs: 109 || ms/i: 18.1667 || i/s: 55.0459
BufferCreateFP32: msecs: 94 || ms/i: 15.6667 || i/s: 63.8298
JustCopy: msecs: 344 || ms/i: 0.344 || i/s: 2906.98
SimpleSmooth: msecs: 890 || ms/i: 0.89 || i/s: 1123.6
TexNoise: msecs: 656 || ms/i: 0.656 || i/s: 1524.39
3x3Conv: msecs: 1032 || ms/i: 2.064 || i/s: 484.496
TEncode: msecs: 172 || ms/i: 0.344 || i/s: 2906.98
TDecode: msecs: 360 || ms/i: 0.72 || i/s: 1388.89
LinDiffINT: msecs: 1094 || ms/i: 1.094 || i/s: 914.077
LinDiffINT16: msecs: 1078 || ms/i: 1.078 || i/s: 927.644
LinDiffFP16: msecs: 1078 || ms/i: 1.078 || i/s: 927.644
LinDiffFP32: msecs: 1281 || ms/i: 1.281 || i/s: 780.64
PMTEncoded: msecs: 1000 || ms/i: 2 || i/s: 500
PMStandard: msecs: 2391 || ms/i: 4.782 || i/s: 209.118
PMBuffered: msecs: 547 || ms/i: 2.188 || i/s: 457.038

Testing 1024x1024 image:
BufferCreateINT: msecs: 110 || ms/i: 18.3333 || i/s: 54.5455
BufferCreateINT16: msecs: 109 || ms/i: 18.1667 || i/s: 55.0459
BufferCreateFP16: msecs: 109 || ms/i: 18.1667 || i/s: 55.0459
BufferCreateFP32: msecs: 110 || ms/i: 18.3333 || i/s: 54.5455
JustCopy: msecs: 1265 || ms/i: 1.265 || i/s: 790.514
SimpleSmooth: msecs: 3500 || ms/i: 3.5 || i/s: 285.714
TexNoise: msecs: 2484 || ms/i: 2.484 || i/s: 402.576
3x3Conv: msecs: 4062 || ms/i: 8.124 || i/s: 123.092
TEncode: msecs: 187 || ms/i: 0.374 || i/s: 2673.8
TDecode: msecs: 1719 || ms/i: 3.438 || i/s: 290.867
LinDiffINT: msecs: 4250 || ms/i: 4.25 || i/s: 235.294
LinDiffINT16: msecs: 4250 || ms/i: 4.25 || i/s: 235.294
LinDiffFP16: msecs: 4250 || ms/i: 4.25 || i/s: 235.294
LinDiffFP32: msecs: 5063 || ms/i: 5.063 || i/s: 197.511
PMTEncoded: msecs: 3859 || ms/i: 7.718 || i/s: 129.567
PMStandard: msecs: 9422 || ms/i: 18.844 || i/s: 53.0673
PMBuffered: msecs: 3141 || ms/i: 12.564 || i/s: 79.5925

Finished. Press return key to close...
                Don't forget to copy the results!
 
Could someone explain why I seem to be getting worse results than just about everyone, when I'm running an A64 3500+ and an X850XT?

Code:
GL filter framework 1.2999 test application by Peter Thoman 2004-2005

Gui initialized successfully.
DevIL initialized successfully.
 - DevIL Version: 167
OpenGL initialized successfully.
ILUT OpenGL mode set successfully.
Loaded required OpenGL extensions for GLPixelShader.
Loaded required OpenGL extensions for GLRenderTexture.
Loaded required OpenGL extensions for GLFilterStep.
Initialization complete.

Press return key to start benchmark...



Testing 32x32 image:
BufferCreateINT: msecs: 656 || ms/i: 109.333 || i/s: 9.14634
BufferCreateINT16: msecs: 657 || ms/i: 109.5 || i/s: 9.13242
BufferCreateFP16: msecs: 671 || ms/i: 111.833 || i/s: 8.94188
BufferCreateFP32: msecs: 657 || ms/i: 109.5 || i/s: 9.13242
JustCopy: msecs: 313 || ms/i: 0.1565 || i/s: 6389.78
SimpleSmooth: msecs: 312 || ms/i: 0.156 || i/s: 6410.26
TexNoise: msecs: 328 || ms/i: 0.164 || i/s: 6097.56
3x3Conv: msecs: 188 || ms/i: 0.188 || i/s: 5319.15
TEncode: msecs: 156 || ms/i: 0.156 || i/s: 6410.26
TDecode: msecs: 235 || ms/i: 0.235 || i/s: 4255.32
LinDiffINT: msecs: 375 || ms/i: 0.1875 || i/s: 5333.33
LinDiffINT16: msecs: 375 || ms/i: 0.1875 || i/s: 5333.33
LinDiffFP16: msecs: 375 || ms/i: 0.1875 || i/s: 5333.33
LinDiffFP32: msecs: 375 || ms/i: 0.1875 || i/s: 5333.33
PMTEncoded: msecs: 532 || ms/i: 0.532 || i/s: 1879.7
PMStandard: msecs: 594 || ms/i: 0.594 || i/s: 1683.5
PMBuffered: msecs: 78 || ms/i: 0.156 || i/s: 6410.26

Testing 64x64 image:
BufferCreateINT: msecs: 656 || ms/i: 109.333 || i/s: 9.14634
BufferCreateINT16: msecs: 656 || ms/i: 109.333 || i/s: 9.14634
BufferCreateFP16: msecs: 656 || ms/i: 109.333 || i/s: 9.14634
BufferCreateFP32: msecs: 657 || ms/i: 109.5 || i/s: 9.13242
JustCopy: msecs: 360 || ms/i: 0.18 || i/s: 5555.56
SimpleSmooth: msecs: 375 || ms/i: 0.1875 || i/s: 5333.33
TexNoise: msecs: 390 || ms/i: 0.195 || i/s: 5128.21
3x3Conv: msecs: 203 || ms/i: 0.203 || i/s: 4926.11
TEncode: msecs: 188 || ms/i: 0.188 || i/s: 5319.15
TDecode: msecs: 265 || ms/i: 0.265 || i/s: 3773.58
LinDiffINT: msecs: 391 || ms/i: 0.1955 || i/s: 5115.09
LinDiffINT16: msecs: 375 || ms/i: 0.1875 || i/s: 5333.33
LinDiffFP16: msecs: 375 || ms/i: 0.1875 || i/s: 5333.33
LinDiffFP32: msecs: 375 || ms/i: 0.1875 || i/s: 5333.33
PMTEncoded: msecs: 515 || ms/i: 0.515 || i/s: 1941.75
PMStandard: msecs: 515 || ms/i: 0.515 || i/s: 1941.75
PMBuffered: msecs: 79 || ms/i: 0.158 || i/s: 6329.11

Testing 128x128 image:
BufferCreateINT: msecs: 656 || ms/i: 109.333 || i/s: 9.14634
BufferCreateINT16: msecs: 656 || ms/i: 109.333 || i/s: 9.14634
BufferCreateFP16: msecs: 657 || ms/i: 109.5 || i/s: 9.13242
BufferCreateFP32: msecs: 656 || ms/i: 109.333 || i/s: 9.14634
JustCopy: msecs: 312 || ms/i: 0.156 || i/s: 6410.26
SimpleSmooth: msecs: 313 || ms/i: 0.1565 || i/s: 6389.78
TexNoise: msecs: 328 || ms/i: 0.164 || i/s: 6097.56
3x3Conv: msecs: 172 || ms/i: 0.172 || i/s: 5813.95
TEncode: msecs: 156 || ms/i: 0.156 || i/s: 6410.26
TDecode: msecs: 235 || ms/i: 0.235 || i/s: 4255.32
LinDiffINT: msecs: 328 || ms/i: 0.164 || i/s: 6097.56
LinDiffINT16: msecs: 328 || ms/i: 0.164 || i/s: 6097.56
LinDiffFP16: msecs: 328 || ms/i: 0.164 || i/s: 6097.56
LinDiffFP32: msecs: 328 || ms/i: 0.164 || i/s: 6097.56
PMTEncoded: msecs: 515 || ms/i: 0.515 || i/s: 1941.75
PMStandard: msecs: 594 || ms/i: 0.594 || i/s: 1683.5
PMBuffered: msecs: 93 || ms/i: 0.186 || i/s: 5376.34

Testing 256x256 image:
BufferCreateINT: msecs: 671 || ms/i: 111.833 || i/s: 8.94188
BufferCreateINT16: msecs: 657 || ms/i: 109.5 || i/s: 9.13242
BufferCreateFP16: msecs: 656 || ms/i: 109.333 || i/s: 9.14634
BufferCreateFP32: msecs: 656 || ms/i: 109.333 || i/s: 9.14634
JustCopy: msecs: 312 || ms/i: 0.156 || i/s: 6410.26
SimpleSmooth: msecs: 313 || ms/i: 0.1565 || i/s: 6389.78
TexNoise: msecs: 328 || ms/i: 0.164 || i/s: 6097.56
3x3Conv: msecs: 359 || ms/i: 0.359 || i/s: 2785.52
TEncode: msecs: 172 || ms/i: 0.172 || i/s: 5813.95
TDecode: msecs: 250 || ms/i: 0.25 || i/s: 4000
LinDiffINT: msecs: 391 || ms/i: 0.1955 || i/s: 5115.09
LinDiffINT16: msecs: 390 || ms/i: 0.195 || i/s: 5128.21
LinDiffFP16: msecs: 391 || ms/i: 0.1955 || i/s: 5115.09
LinDiffFP32: msecs: 391 || ms/i: 0.1955 || i/s: 5115.09
PMTEncoded: msecs: 531 || ms/i: 0.531 || i/s: 1883.24
PMStandard: msecs: 578 || ms/i: 0.578 || i/s: 1730.1
PMBuffered: msecs: 297 || ms/i: 0.594 || i/s: 1683.5

Testing 512x512 image:
BufferCreateINT: msecs: 672 || ms/i: 112 || i/s: 8.92857
BufferCreateINT16: msecs: 656 || ms/i: 109.333 || i/s: 9.14634
BufferCreateFP16: msecs: 656 || ms/i: 109.333 || i/s: 9.14634
BufferCreateFP32: msecs: 656 || ms/i: 109.333 || i/s: 9.14634
JustCopy: msecs: 187 || ms/i: 0.187 || i/s: 5347.59
SimpleSmooth: msecs: 360 || ms/i: 0.36 || i/s: 2777.78
TexNoise: msecs: 250 || ms/i: 0.25 || i/s: 4000
3x3Conv: msecs: 485 || ms/i: 0.97 || i/s: 1030.93
TEncode: msecs: 78 || ms/i: 0.156 || i/s: 6410.26
TDecode: msecs: 125 || ms/i: 0.25 || i/s: 4000
LinDiffINT: msecs: 407 || ms/i: 0.407 || i/s: 2457
LinDiffINT16: msecs: 406 || ms/i: 0.406 || i/s: 2463.05
LinDiffFP16: msecs: 391 || ms/i: 0.391 || i/s: 2557.54
LinDiffFP32: msecs: 594 || ms/i: 0.594 || i/s: 1683.5
PMTEncoded: msecs: 390 || ms/i: 0.78 || i/s: 1282.05
PMStandard: msecs: 1016 || ms/i: 2.032 || i/s: 492.126
PMBuffered: msecs: 250 || ms/i: 1 || i/s: 1000

Testing 1024x1024 image:
BufferCreateINT: msecs: 657 || ms/i: 109.5 || i/s: 9.13242
BufferCreateINT16: msecs: 656 || ms/i: 109.333 || i/s: 9.14634
BufferCreateFP16: msecs: 656 || ms/i: 109.333 || i/s: 9.14634
BufferCreateFP32: msecs: 656 || ms/i: 109.333 || i/s: 9.14634
JustCopy: msecs: 703 || ms/i: 0.703 || i/s: 1422.48
SimpleSmooth: msecs: 1766 || ms/i: 1.766 || i/s: 566.251
TexNoise: msecs: 890 || ms/i: 0.89 || i/s: 1123.6
3x3Conv: msecs: 2141 || ms/i: 4.282 || i/s: 233.536
TEncode: msecs: 94 || ms/i: 0.188 || i/s: 5319.15
TDecode: msecs: 516 || ms/i: 1.032 || i/s: 968.992
LinDiffINT: msecs: 1610 || ms/i: 1.61 || i/s: 621.118
LinDiffINT16: msecs: 1579 || ms/i: 1.579 || i/s: 633.312
LinDiffFP16: msecs: 1578 || ms/i: 1.578 || i/s: 633.714
LinDiffFP32: msecs: 2391 || ms/i: 2.391 || i/s: 418.235
PMTEncoded: msecs: 1625 || ms/i: 3.25 || i/s: 307.692
PMStandard: msecs: 4531 || ms/i: 9.062 || i/s: 110.351
PMBuffered: msecs: 1016 || ms/i: 4.064 || i/s: 246.063

Finished. Press return key to close...
                Don't forget to copy the results!

I'm assuming lower is better, right?
 
fallguy said:
Yeah, I can't get it to copy. I can't highlight the text, for some reason.
Did you try the menu in the window frame? Edit > Select All, then Edit > Copy? That's what worked for me.
 
R9800se 128MB P4 2.8C 1Gig Ram
Code:
GL filter framework 1.2999 test application by Peter Thoman 2004-2005

Gui initialized successfully.
DevIL initialized successfully.
 - DevIL Version: 167
OpenGL initialized successfully.
ILUT OpenGL mode set successfully.
Loaded required OpenGL extensions for GLPixelShader.
Loaded required OpenGL extensions for GLRenderTexture.
Loaded required OpenGL extensions for GLFilterStep.
Initialization complete.

Press return key to start benchmark...



Testing 32x32 image:
BufferCreateINT: msecs: 782 || ms/i: 130.333 || i/s: 7.67263
BufferCreateINT16: msecs: 750 || ms/i: 125 || i/s: 8
BufferCreateFP16: msecs: 750 || ms/i: 125 || i/s: 8
BufferCreateFP32: msecs: 750 || ms/i: 125 || i/s: 8
JustCopy: msecs: 594 || ms/i: 0.297 || i/s: 3367
SimpleSmooth: msecs: 609 || ms/i: 0.3045 || i/s: 3284.07
TexNoise: msecs: 672 || ms/i: 0.336 || i/s: 2976.19
3x3Conv: msecs: 375 || ms/i: 0.375 || i/s: 2666.67
TEncode: msecs: 297 || ms/i: 0.297 || i/s: 3367
TDecode: msecs: 532 || ms/i: 0.532 || i/s: 1879.7
LinDiffINT: msecs: 703 || ms/i: 0.3515 || i/s: 2844.95
LinDiffINT16: msecs: 625 || ms/i: 0.3125 || i/s: 3200
LinDiffFP16: msecs: 703 || ms/i: 0.3515 || i/s: 2844.95
LinDiffFP32: msecs: 719 || ms/i: 0.3595 || i/s: 2781.64
PMTEncoded: msecs: 1016 || ms/i: 1.016 || i/s: 984.252
PMStandard: msecs: 1110 || ms/i: 1.11 || i/s: 900.901
PMBuffered: msecs: 125 || ms/i: 0.25 || i/s: 4000

Testing 64x64 image:
BufferCreateINT: msecs: 750 || ms/i: 125 || i/s: 8
BufferCreateINT16: msecs: 750 || ms/i: 125 || i/s: 8
BufferCreateFP16: msecs: 750 || ms/i: 125 || i/s: 8
BufferCreateFP32: msecs: 750 || ms/i: 125 || i/s: 8
JustCopy: msecs: 593 || ms/i: 0.2965 || i/s: 3372.68
SimpleSmooth: msecs: 594 || ms/i: 0.297 || i/s: 3367
TexNoise: msecs: 672 || ms/i: 0.336 || i/s: 2976.19
3x3Conv: msecs: 328 || ms/i: 0.328 || i/s: 3048.78
TEncode: msecs: 297 || ms/i: 0.297 || i/s: 3367
TDecode: msecs: 516 || ms/i: 0.516 || i/s: 1937.98
LinDiffINT: msecs: 625 || ms/i: 0.3125 || i/s: 3200
LinDiffINT16: msecs: 641 || ms/i: 0.3205 || i/s: 3120.12
LinDiffFP16: msecs: 641 || ms/i: 0.3205 || i/s: 3120.12
LinDiffFP32: msecs: 641 || ms/i: 0.3205 || i/s: 3120.12
PMTEncoded: msecs: 1016 || ms/i: 1.016 || i/s: 984.252
PMStandard: msecs: 969 || ms/i: 0.969 || i/s: 1031.99
PMBuffered: msecs: 140 || ms/i: 0.28 || i/s: 3571.43

Testing 128x128 image:
BufferCreateINT: msecs: 750 || ms/i: 125 || i/s: 8
BufferCreateINT16: msecs: 750 || ms/i: 125 || i/s: 8
BufferCreateFP16: msecs: 750 || ms/i: 125 || i/s: 8
BufferCreateFP32: msecs: 750 || ms/i: 125 || i/s: 8
JustCopy: msecs: 671 || ms/i: 0.3355 || i/s: 2980.63
SimpleSmooth: msecs: 704 || ms/i: 0.352 || i/s: 2840.91
TexNoise: msecs: 750 || ms/i: 0.375 || i/s: 2666.67
3x3Conv: msecs: 531 || ms/i: 0.531 || i/s: 1883.24
TEncode: msecs: 297 || ms/i: 0.297 || i/s: 3367
TDecode: msecs: 531 || ms/i: 0.531 || i/s: 1883.24
LinDiffINT: msecs: 719 || ms/i: 0.3595 || i/s: 2781.64
LinDiffINT16: msecs: 625 || ms/i: 0.3125 || i/s: 3200
LinDiffFP16: msecs: 718 || ms/i: 0.359 || i/s: 2785.52
LinDiffFP32: msecs: 718 || ms/i: 0.359 || i/s: 2785.52
PMTEncoded: msecs: 1172 || ms/i: 1.172 || i/s: 853.242
PMStandard: msecs: 1031 || ms/i: 1.031 || i/s: 969.932
PMBuffered: msecs: 484 || ms/i: 0.968 || i/s: 1033.06

Testing 256x256 image:
BufferCreateINT: msecs: 765 || ms/i: 127.5 || i/s: 7.84314
BufferCreateINT16: msecs: 750 || ms/i: 125 || i/s: 8
BufferCreateFP16: msecs: 750 || ms/i: 125 || i/s: 8
BufferCreateFP32: msecs: 750 || ms/i: 125 || i/s: 8
JustCopy: msecs: 704 || ms/i: 0.352 || i/s: 2840.91
SimpleSmooth: msecs: 1078 || ms/i: 0.539 || i/s: 1855.29
TexNoise: msecs: 797 || ms/i: 0.3985 || i/s: 2509.41
3x3Conv: msecs: 2046 || ms/i: 2.046 || i/s: 488.759
TEncode: msecs: 406 || ms/i: 0.406 || i/s: 2463.05
TDecode: msecs: 844 || ms/i: 0.844 || i/s: 1184.83
LinDiffINT: msecs: 1516 || ms/i: 0.758 || i/s: 1319.26
LinDiffINT16: msecs: 1516 || ms/i: 0.758 || i/s: 1319.26
LinDiffFP16: msecs: 1516 || ms/i: 0.758 || i/s: 1319.26
LinDiffFP32: msecs: 1781 || ms/i: 0.8905 || i/s: 1122.96
PMTEncoded: msecs: 1281 || ms/i: 1.281 || i/s: 780.64
PMStandard: msecs: 2672 || ms/i: 2.672 || i/s: 374.251
PMBuffered: msecs: 1859 || ms/i: 3.718 || i/s: 268.962

Testing 512x512 image:
BufferCreateINT: msecs: 750 || ms/i: 125 || i/s: 8
BufferCreateINT16: msecs: 750 || ms/i: 125 || i/s: 8
BufferCreateFP16: msecs: 750 || ms/i: 125 || i/s: 8
BufferCreateFP32: msecs: 750 || ms/i: 125 || i/s: 8
JustCopy: msecs: 890 || ms/i: 0.89 || i/s: 1123.6
SimpleSmooth: msecs: 2094 || ms/i: 2.094 || i/s: 477.555
TexNoise: msecs: 937 || ms/i: 0.937 || i/s: 1067.24
3x3Conv: msecs: 3969 || ms/i: 7.938 || i/s: 125.976
TEncode: msecs: 219 || ms/i: 0.438 || i/s: 2283.11
TDecode: msecs: 1609 || ms/i: 3.218 || i/s: 310.752
LinDiffINT: msecs: 2922 || ms/i: 2.922 || i/s: 342.231
LinDiffINT16: msecs: 2922 || ms/i: 2.922 || i/s: 342.231
LinDiffFP16: msecs: 2921 || ms/i: 2.921 || i/s: 342.349
LinDiffFP32: msecs: 3438 || ms/i: 3.438 || i/s: 290.867
PMTEncoded: msecs: 2391 || ms/i: 4.782 || i/s: 209.118
PMStandard: msecs: 5188 || ms/i: 10.376 || i/s: 96.3763
PMBuffered: msecs: 3500 || ms/i: 14 || i/s: 71.4286

Testing 1024x1024 image:
BufferCreateINT: msecs: 750 || ms/i: 125 || i/s: 8
BufferCreateINT16: msecs: 750 || ms/i: 125 || i/s: 8
BufferCreateFP16: msecs: 750 || ms/i: 125 || i/s: 8
BufferCreateFP32: msecs: 750 || ms/i: 125 || i/s: 8
JustCopy: msecs: 3485 || ms/i: 3.485 || i/s: 286.944
SimpleSmooth: msecs: 8266 || ms/i: 8.266 || i/s: 120.977
TexNoise: msecs: 3641 || ms/i: 3.641 || i/s: 274.65
3x3Conv: msecs: 15781 || ms/i: 31.562 || i/s: 31.6837
TEncode: msecs: 844 || ms/i: 1.688 || i/s: 592.417
TDecode: msecs: 6391 || ms/i: 12.782 || i/s: 78.235
LinDiffINT: msecs: 11609 || ms/i: 11.609 || i/s: 86.1401
LinDiffINT16: msecs: 11610 || ms/i: 11.61 || i/s: 86.1326
LinDiffFP16: msecs: 11609 || ms/i: 11.609 || i/s: 86.1401
LinDiffFP32: msecs: 13672 || ms/i: 13.672 || i/s: 73.1422
PMTEncoded: msecs: 9312 || ms/i: 18.624 || i/s: 53.6942
PMStandard: msecs: 20625 || ms/i: 41.25 || i/s: 24.2424
PMBuffered: msecs: 14906 || ms/i: 59.624 || i/s: 16.7718

Finished. Press return key to close...
                Don't forget to copy the results!
Edit: Added RAM details as requested.
 
Broken Hope said:
Could someone explain why I seem to be getting worse results than just about everyone, when I'm running an A64 3500+ and an X850XT?

Don't worry, your system is wickedly fast ;)

msecs and ms/i: lower is better
i/s: higher is better

I updated the first post to include the fixed version and copy instructions.
Thanks for all the results so far!
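
For anyone unsure how the three columns relate, here is a rough Python sketch (not part of the benchmark itself). `parse_result` and `iterations` are hypothetical helper names, and the iteration count is only inferred from msecs divided by ms/i, not stated anywhere in the tool's output. It suggests that msecs is the total time for a test-specific number of iterations, so ms/i and i/s are the values to compare across tests.

```python
import re

# Hypothetical helper: matches one result line as printed in the logs in this thread.
LINE_RE = re.compile(
    r"^(?P<name>\w+): msecs: (?P<msecs>[\d.]+) \|\| "
    r"ms/i: (?P<ms_per_iter>[\d.]+) \|\| i/s: (?P<iters_per_sec>[\d.]+)$"
)


def parse_result(line):
    """Split a benchmark line into its name and three numeric columns."""
    m = LINE_RE.match(line.strip())
    if m is None:
        raise ValueError(f"not a result line: {line!r}")
    return {
        "name": m.group("name"),
        "msecs": float(m.group("msecs")),
        "ms_per_iter": float(m.group("ms_per_iter")),
        "iters_per_sec": float(m.group("iters_per_sec")),
    }


def iterations(result):
    """Inferred iteration count: total time divided by time per iteration."""
    return round(result["msecs"] / result["ms_per_iter"])


# Two lines taken from a 1024x1024 run posted above.
pm = parse_result("PMTEncoded: msecs: 1625 || ms/i: 3.25 || i/s: 307.692")
bc = parse_result("BufferCreateINT: msecs: 657 || ms/i: 109.5 || i/s: 9.13242")

print(iterations(pm), iterations(bc))   # different iteration counts per test
print(1000.0 / pm["ms_per_iter"])       # i/s is just 1000 / (ms/i)
```

Since the two tests appear to run a different number of iterations, comparing raw msecs between different tests is misleading; comparing the same test between two systems is fine.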
 
Hmm, I really have no idea what the problem is with 6x00 cards in the very last benchmark... the only thing that may make some sense is AGP/PCIE swapping.

Could you include the amount of RAM on your card when reporting the results? That may shed some light on all of this.
 
geo said:
Oh yeah, btw, that Profit!! part, what's up with that? Are we speaking about karma or something more concrete? :D
http://en.wikipedia.org/wiki/Slashdot_subculture#Business_plans
Sorry, nothing tangible :D

You said your image processing framework, but frankly I don't know what that means and where you're going with this. Not that this is a prerequisite; just curious where my ticks are being donated. ;)
Erm, you could read
http://infmath.uibk.ac.at/teaching/...;table_id=tasks&men_task=fin_bak&sem= ;)

(direct pdf link: http://landesjugendtheater.at/misc/bakk1.pdf )
 
PeterT said:
Broken Hope said:
Could someone explain why I seem to be getting worse results than just about everyone, when I'm running an A64 3500+ and an X850XT?

Don't worry, your system is wickedly fast ;)

msecs and ms/i: lower is better
i/s: higher is better

I updated the first post to include the fixed version and copy instructions.
Thanks for all the results so far!

So how come my msecs seem to be so much higher than other people's?
 
geo said:
fallguy said:
Yeah, I can't get it to copy. I can't highlight the text, for some reason.
Did you try the menu in the window frame? Edit > Select All, then Edit > Copy? That's what worked for me.

Thanks!

Code:
GL filter framework 1.2999 test application by Peter Thoman 2004-2005

Gui initialized successfully.
DevIL initialized successfully.
 - DevIL Version: 167
OpenGL initialized successfully.
ILUT OpenGL mode set successfully.
Loaded required OpenGL extensions for GLPixelShader.
Loaded required OpenGL extensions for GLRenderTexture.
Loaded required OpenGL extensions for GLFilterStep.
Initialization complete.

Press return key to start benchmark...



Testing 32x32 image:
BufferCreateINT: msecs: 657 || ms/i: 109.5 || i/s: 9.13242
BufferCreateINT16: msecs: 656 || ms/i: 109.333 || i/s: 9.14634
BufferCreateFP16: msecs: 656 || ms/i: 109.333 || i/s: 9.14634
BufferCreateFP32: msecs: 656 || ms/i: 109.333 || i/s: 9.14634
JustCopy: msecs: 281 || ms/i: 0.1405 || i/s: 7117.44
SimpleSmooth: msecs: 281 || ms/i: 0.1405 || i/s: 7117.44
TexNoise: msecs: 313 || ms/i: 0.1565 || i/s: 6389.78
3x3Conv: msecs: 172 || ms/i: 0.172 || i/s: 5813.95
TEncode: msecs: 140 || ms/i: 0.14 || i/s: 7142.86
TDecode: msecs: 203 || ms/i: 0.203 || i/s: 4926.11
LinDiffINT: msecs: 328 || ms/i: 0.164 || i/s: 6097.56
LinDiffINT16: msecs: 328 || ms/i: 0.164 || i/s: 6097.56
LinDiffFP16: msecs: 328 || ms/i: 0.164 || i/s: 6097.56
LinDiffFP32: msecs: 328 || ms/i: 0.164 || i/s: 6097.56
PMTEncoded: msecs: 468 || ms/i: 0.468 || i/s: 2136.75
PMStandard: msecs: 516 || ms/i: 0.516 || i/s: 1937.98
PMBuffered: msecs: 78 || ms/i: 0.156 || i/s: 6410.26

Testing 64x64 image:
BufferCreateINT: msecs: 656 || ms/i: 109.333 || i/s: 9.14634
BufferCreateINT16: msecs: 657 || ms/i: 109.5 || i/s: 9.13242
BufferCreateFP16: msecs: 656 || ms/i: 109.333 || i/s: 9.14634
BufferCreateFP32: msecs: 656 || ms/i: 109.333 || i/s: 9.14634
JustCopy: msecs: 312 || ms/i: 0.156 || i/s: 6410.26
SimpleSmooth: msecs: 313 || ms/i: 0.1565 || i/s: 6389.78
TexNoise: msecs: 344 || ms/i: 0.172 || i/s: 5813.95
3x3Conv: msecs: 171 || ms/i: 0.171 || i/s: 5847.95
TEncode: msecs: 156 || ms/i: 0.156 || i/s: 6410.26
TDecode: msecs: 219 || ms/i: 0.219 || i/s: 4566.21
LinDiffINT: msecs: 328 || ms/i: 0.164 || i/s: 6097.56
LinDiffINT16: msecs: 328 || ms/i: 0.164 || i/s: 6097.56
LinDiffFP16: msecs: 328 || ms/i: 0.164 || i/s: 6097.56
LinDiffFP32: msecs: 328 || ms/i: 0.164 || i/s: 6097.56
PMTEncoded: msecs: 469 || ms/i: 0.469 || i/s: 2132.2
PMStandard: msecs: 469 || ms/i: 0.469 || i/s: 2132.2
PMBuffered: msecs: 78 || ms/i: 0.156 || i/s: 6410.26

Testing 128x128 image:
BufferCreateINT: msecs: 656 || ms/i: 109.333 || i/s: 9.14634
BufferCreateINT16: msecs: 657 || ms/i: 109.5 || i/s: 9.13242
BufferCreateFP16: msecs: 656 || ms/i: 109.333 || i/s: 9.14634
BufferCreateFP32: msecs: 656 || ms/i: 109.333 || i/s: 9.14634
JustCopy: msecs: 281 || ms/i: 0.1405 || i/s: 7117.44
SimpleSmooth: msecs: 297 || ms/i: 0.1485 || i/s: 6734.01
TexNoise: msecs: 297 || ms/i: 0.1485 || i/s: 6734.01
3x3Conv: msecs: 156 || ms/i: 0.156 || i/s: 6410.26
TEncode: msecs: 140 || ms/i: 0.14 || i/s: 7142.86
TDecode: msecs: 219 || ms/i: 0.219 || i/s: 4566.21
LinDiffINT: msecs: 312 || ms/i: 0.156 || i/s: 6410.26
LinDiffINT16: msecs: 313 || ms/i: 0.1565 || i/s: 6389.78
LinDiffFP16: msecs: 312 || ms/i: 0.156 || i/s: 6410.26
LinDiffFP32: msecs: 328 || ms/i: 0.164 || i/s: 6097.56
PMTEncoded: msecs: 516 || ms/i: 0.516 || i/s: 1937.98
PMStandard: msecs: 515 || ms/i: 0.515 || i/s: 1941.75
PMBuffered: msecs: 78 || ms/i: 0.156 || i/s: 6410.26

Testing 256x256 image:
BufferCreateINT: msecs: 672 || ms/i: 112 || i/s: 8.92857
BufferCreateINT16: msecs: 656 || ms/i: 109.333 || i/s: 9.14634
BufferCreateFP16: msecs: 657 || ms/i: 109.5 || i/s: 9.13242
BufferCreateFP32: msecs: 656 || ms/i: 109.333 || i/s: 9.14634
JustCopy: msecs: 312 || ms/i: 0.156 || i/s: 6410.26
SimpleSmooth: msecs: 313 || ms/i: 0.1565 || i/s: 6389.78
TexNoise: msecs: 328 || ms/i: 0.164 || i/s: 6097.56
3x3Conv: msecs: 297 || ms/i: 0.297 || i/s: 3367
TEncode: msecs: 172 || ms/i: 0.172 || i/s: 5813.95
TDecode: msecs: 219 || ms/i: 0.219 || i/s: 4566.21
LinDiffINT: msecs: 328 || ms/i: 0.164 || i/s: 6097.56
LinDiffINT16: msecs: 328 || ms/i: 0.164 || i/s: 6097.56
LinDiffFP16: msecs: 328 || ms/i: 0.164 || i/s: 6097.56
LinDiffFP32: msecs: 328 || ms/i: 0.164 || i/s: 6097.56
PMTEncoded: msecs: 485 || ms/i: 0.485 || i/s: 2061.86
PMStandard: msecs: 547 || ms/i: 0.547 || i/s: 1828.15
PMBuffered: msecs: 281 || ms/i: 0.562 || i/s: 1779.36

Testing 512x512 image:
BufferCreateINT: msecs: 672 || ms/i: 112 || i/s: 8.92857
BufferCreateINT16: msecs: 656 || ms/i: 109.333 || i/s: 9.14634
BufferCreateFP16: msecs: 657 || ms/i: 109.5 || i/s: 9.13242
BufferCreateFP32: msecs: 656 || ms/i: 109.333 || i/s: 9.14634
JustCopy: msecs: 171 || ms/i: 0.171 || i/s: 5847.95
SimpleSmooth: msecs: 344 || ms/i: 0.344 || i/s: 2906.98
TexNoise: msecs: 344 || ms/i: 0.344 || i/s: 2906.98
3x3Conv: msecs: 468 || ms/i: 0.936 || i/s: 1068.38
TEncode: msecs: 78 || ms/i: 0.156 || i/s: 6410.26
TDecode: msecs: 125 || ms/i: 0.25 || i/s: 4000
LinDiffINT: msecs: 391 || ms/i: 0.391 || i/s: 2557.54
LinDiffINT16: msecs: 391 || ms/i: 0.391 || i/s: 2557.54
LinDiffFP16: msecs: 375 || ms/i: 0.375 || i/s: 2666.67
LinDiffFP32: msecs: 547 || ms/i: 0.547 || i/s: 1828.15
PMTEncoded: msecs: 375 || ms/i: 0.75 || i/s: 1333.33
PMStandard: msecs: 969 || ms/i: 1.938 || i/s: 515.996
PMBuffered: msecs: 235 || ms/i: 0.94 || i/s: 1063.83

Testing 1024x1024 image:
BufferCreateINT: msecs: 657 || ms/i: 109.5 || i/s: 9.13242
BufferCreateINT16: msecs: 656 || ms/i: 109.333 || i/s: 9.14634
BufferCreateFP16: msecs: 656 || ms/i: 109.333 || i/s: 9.14634
BufferCreateFP32: msecs: 656 || ms/i: 109.333 || i/s: 9.14634
JustCopy: msecs: 703 || ms/i: 0.703 || i/s: 1422.48
SimpleSmooth: msecs: 1344 || ms/i: 1.344 || i/s: 744.048
TexNoise: msecs: 1156 || ms/i: 1.156 || i/s: 865.052
3x3Conv: msecs: 1750 || ms/i: 3.5 || i/s: 285.714
TEncode: msecs: 78 || ms/i: 0.156 || i/s: 6410.26
TDecode: msecs: 578 || ms/i: 1.156 || i/s: 865.052
LinDiffINT: msecs: 1484 || ms/i: 1.484 || i/s: 673.854
LinDiffINT16: msecs: 1516 || ms/i: 1.516 || i/s: 659.631
LinDiffFP16: msecs: 1516 || ms/i: 1.516 || i/s: 659.631
LinDiffFP32: msecs: 2234 || ms/i: 2.234 || i/s: 447.628
PMTEncoded: msecs: 1391 || ms/i: 2.782 || i/s: 359.454
PMStandard: msecs: 3938 || ms/i: 7.876 || i/s: 126.968
PMBuffered: msecs: 953 || ms/i: 3.812 || i/s: 262.329

Finished. Press return key to close...
                Don't forget to copy the results!

3500+@2.5, X850XT/PE.
 
Broken Hope said:
So how come my msecs seem to be so much higher than other people's?

I think you are looking at the wrong numbers. For example:
Your PMTEncoded msecs at 1024x1024: 1625
Wireframe's PMTEncoded msecs at 1024x1024: 4844
bloodbob's PMTEncoded msecs at 1024x1024: 6093
trinibwoy's PMTEncoded msecs at 1024x1024: 3860

jvd's system is the only (slightly) faster one, with: 1453 :oops:

See?
 
PeterT said:
Broken Hope said:
So how come my msecs seem to be so much higher than other people's?

I think you are looking at the wrong numbers. For example:
Your PMTEncoded msecs at 1024x1024: 1625
Wireframe's PMTEncoded msecs at 1024x1024: 4844
bloodbob's PMTEncoded msecs at 1024x1024: 6093
trinibwoy's PMTEncoded msecs at 1024x1024: 3860

jvd's system is the only (slightly) faster one, with: 1453 :oops:

See?

Yeah, I see that, but then there are results where I get something like

Testing 1024x1024 image:
BufferCreateINT: msecs: 657

and other people are getting something like 62 for the same result. Doesn't that make mine quite a bit slower?
 
Broken Hope said:
Yeah, I see that, but then there are results where I get something like

Testing 1024x1024 image:
BufferCreateINT: msecs: 657

and other people are getting something like 62 for the same result. Doesn't that make mine quite a bit slower?

Forget about the buffer creation stuff; that's more to satisfy my curiosity than anything else.
 
PeterT,

So what's the deal with the NV40/45? Is this due to some error in your code or are the results posted useful? It seems something is very wrong in the 1024x1024 test.

Look at the last values for Trinibwoy, Geo, and me. Not only does this one test take a minute to complete, which seems wrong, but look at how Trinibwoy gets a 'much better' score.

Trinibwoy: PMBuffered: msecs: 54594 || ms/i: 218.376 || i/s: 4.57926
Geo: PMBuffered: msecs: 495016 || ms/i: 1980.06 || i/s: 0.505034
Me: PMBuffered: msecs: 278422 || ms/i: 1113.69 || i/s: 0.897918

Looks like we're running out of resources somewhere and the bottleneck is shifted. Otherwise it would be difficult to explain how I get almost twice the score of Geo and Trinibwoy gets roughly 5 times my score, all on very similar hardware and clock speeds.

Keep in mind what I said about the temperature flux on the 6800 Ultra when I ran it. Last test is close to idling in terms of temperature.

BTW, think you could compile a looped version of this test? It doesn't need output. It would make a very nice stress test tool. This thing heats up the GPU fast.
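
For reference, the "almost twice" and "roughly 5 times" figures can be checked directly from the i/s column of the three PMBuffered lines quoted above. This is just a quick Python sketch; the dictionary keys and the `speedup` helper are made-up names, and "poster" stands for the "Me" line.

```python
# i/s values copied from the three PMBuffered lines quoted above.
pm_buffered_ips = {
    "Trinibwoy": 4.57926,
    "Geo": 0.505034,
    "poster": 0.897918,  # the line labelled "Me"
}


def speedup(faster, slower):
    """Ratio of iterations per second between two reported results."""
    return pm_buffered_ips[faster] / pm_buffered_ips[slower]


print(round(speedup("poster", "Geo"), 2))        # about 1.78, i.e. "almost twice"
print(round(speedup("Trinibwoy", "poster"), 2))  # about 5.1, i.e. "roughly 5 times"
```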
 
I thought about it some more, and I'm now quite sure that the NV4x-based cards start AGP/PCIE swapping in that test (it is also the only test where that is possible, due to some specifics of the implementation). So most of the time spent on that benchmark is probably just the driver waiting for some data upload to happen across the (comparatively) extremely SLOW AGP/PCIE upstream. That means that you should discount that result.

What IS strange is that for example madmartyau's 128 MB 9800 seems to manage just fine without swapping...

Some NV driver guy want to take a look at this? ;)
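
To get a feel for when swapping could plausibly kick in, here is a back-of-the-envelope Python sketch. Everything in it is an assumption for illustration: four channels (RGBA) per pixel, 1/2/2/4 bytes per channel for the INT/INT16/FP16/FP32 buffer types named in the logs, and no driver overhead or padding. How many buffers PMBuffered actually allocates is not stated in this thread, so the sketch only shows how many such buffers would fit in a given amount of video memory.

```python
# Assumed bytes per channel for each buffer type named in the benchmark output.
BYTES_PER_CHANNEL = {"INT": 1, "INT16": 2, "FP16": 2, "FP32": 4}
CHANNELS = 4  # assumption: RGBA buffers


def buffer_bytes(size, fmt):
    """Raw size in bytes of one square size x size buffer (no padding or overhead)."""
    return size * size * CHANNELS * BYTES_PER_CHANNEL[fmt]


def buffers_that_fit(vram_mib, size, fmt):
    """Upper bound on how many such buffers fit into vram_mib MiB of video memory."""
    return (vram_mib * 1024 * 1024) // buffer_bytes(size, fmt)


for fmt in ("INT", "INT16", "FP16", "FP32"):
    mib = buffer_bytes(1024, fmt) / (1024 * 1024)
    print(f"1024x1024 {fmt}: {mib:.0f} MiB each, "
          f"at most {buffers_that_fit(128, 1024, fmt)} in 128 MiB")
```

Under these assumptions a single 1024x1024 FP32 buffer is 16 MiB, so a handful of them already takes a large share of a 128 MB card, before the framebuffer and textures are counted. It does not explain why a 128 MB 9800 copes, which is why the RAM amounts in the reports would be useful.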
 
PeterT said:
What IS strange is that for example madmartyau's 128 MB 9800 seems to manage just fine without swapping...

Do you know which ForceWare revision your application needs as a minimum to run? I have a theory...maybe not so much a theory, but something that may be worth testing.

PS. Any predictions of how SLI will impact performance on your code? We need someone with SLI to post their numbers, but I'd like to read your prediction before such numbers are revealed :D
 