GFFX Tests / Test Apps

For reference:

With all my various "dirty system" tasks running...

Fillrate Tester
--------------------------
Display adapter: ALL-IN-WONDER RADEON 8500
Driver version: 4.14.1.3659
Display mode: 1024x768x32bpp
--------------------------

Color writes enabled, z-writes disabled:
FFP - Pure fillrate - 1149.041748M pixels/sec
FFP - Single texture - 947.861145M pixels/sec
FFP - Dual texture - 534.023438M pixels/sec
FFP - Triple texture - 243.391159M pixels/sec
FFP - Quad texture - 181.625107M pixels/sec
PS_2_0 - Per pixel lighting - Failed!
PS_2_0 PP - Per pixel lighting - Failed!
PS_1_1 - Simple - 1097.858154M pixels/sec
PS_1_4 - Simple - 595.922119M pixels/sec
PS_2_0 - Simple - Failed!

Color writes disabled, z-writes enabled:
FFP - Pure fillrate - 946.800964M pixels/sec
FFP - Single texture - 949.292480M pixels/sec
FFP - Dual texture - 608.752991M pixels/sec
FFP - Triple texture - 263.951935M pixels/sec
FFP - Quad texture - 177.574600M pixels/sec
PS_2_0 - Per pixel lighting - Failed!
PS_2_0 PP - Per pixel lighting - Failed!
PS_1_1 - Simple - 1086.077515M pixels/sec
PS_1_4 - Simple - 586.311218M pixels/sec
PS_2_0 - Simple - Failed!
 
Interesting, demalion, that the PS 1.4 simple test runs at about 1/2 the speed of the PS 1.1 test.

That goes a long way toward explaining how the Radeon 8500 merely "competes" with the GeForce3/4 in Doom3. (It may also be evidence of the much-discussed hardware issue in the 8500 that hurts performance of 1.4 shaders.)

Aside from the GeForceFX of course, I would love to see someone with a 9000 and a 9500/9700 run this test. It will be most interesting to see if either of those chips improves PS 1.4 performance relative to PS 1.1....

On the other hand...I don't know the specifics of the PS 1.4 test itself....it's conceivable that it "should" be roughly 1/2 the fill rate performance of the 1.1 test. ;)

MDolenc? Thoughts?
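To put a number on the "about 1/2" observation, here is a minimal Python sketch that divides the PS 1.4 figure by the PS 1.1 figure for both passes of the 8500 run above. The dictionary layout and the helper name `ps14_over_ps11` are only illustrative; the numbers are copied from the posted output.

```python
# M pixels/sec, copied from the ALL-IN-WONDER RADEON 8500 run above.
results = {
    "color writes on, z-writes off": {"PS_1_1": 1097.858154, "PS_1_4": 595.922119},
    "color writes off, z-writes on": {"PS_1_1": 1086.077515, "PS_1_4": 586.311218},
}


def ps14_over_ps11(run):
    """Return PS 1.4 throughput as a fraction of PS 1.1 throughput."""
    return run["PS_1_4"] / run["PS_1_1"]


ratios = {mode: ps14_over_ps11(run) for mode, run in results.items()}

for mode, ratio in ratios.items():
    print(f"{mode}: PS 1.4 runs at {ratio:.1%} of PS 1.1")
```

Both passes land a little above one half, which fits the reading that the 1.4 test simply does about twice the work per pixel on this chip.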
 
Here's my results with a Radeon 9000...


Fillrate Tester
--------------------------
Display adapter: RADEON 9000 SERIES
Driver version: 4.14.1.3659
Display mode: 1024x768x32bpp
--------------------------

Color writes enabled, z-writes disabled:
FFP - Pure fillrate - 986.831970M pixels/sec
FFP - Single texture - 955.843750M pixels/sec
FFP - Dual texture - 472.467651M pixels/sec
FFP - Triple texture - 288.102722M pixels/sec
FFP - Quad texture - 188.738724M pixels/sec
PS_2_0 - Per pixel lighting - Failed!
PS_2_0 PP - Per pixel lighting - Failed!
PS_1_1 - Simple - 519.341431M pixels/sec
PS_1_4 - Simple - 342.122009M pixels/sec
PS_2_0 - Simple - Failed!

Color writes disabled, z-writes enabled:
FFP - Pure fillrate - 994.819519M pixels/sec
FFP - Single texture - 1005.155029M pixels/sec
FFP - Dual texture - 486.977325M pixels/sec
FFP - Triple texture - 292.988586M pixels/sec
FFP - Quad texture - 184.690857M pixels/sec
PS_2_0 - Per pixel lighting - Failed!
PS_2_0 PP - Per pixel lighting - Failed!
PS_1_1 - Simple - 508.653778M pixels/sec
PS_1_4 - Simple - 337.604645M pixels/sec
PS_2_0 - Simple - Failed!
 
PS1.1 routine
ps_1_1

def c0, 0.3f, 0.7f, 0.2f, 0.4f

add r0, c0, -v0
add r0, r0, v1

PS1.4 routine
ps_1_4

def c0, 0.3f, 0.7f, 0.2f, 0.4f

texcrd r1.xyz, t0
texcrd r2.xyz, t1

add r3.xyz, c0, -r1
add r3.xyz, r3, r2

phase

mov r0.rgb, r3
+mov r0.a, c0.a
 
I just remain puzzled that my 8500 matches the PS 1.1 simple performance of the newer cards (Quadro FX 2000 and 9700 Pro...no 5800 ultra results with PS 1.1 simple listed that I saw)...

I'd think clock speed would let the 9700 pull ahead.

I am also puzzled as to why setting the Optimize (PVSCode and TexStages) to 0 gave me a boost (I guess my drivers on my slow CPU slow down the process by trying to analyze the code? Looks like the optimizer needs optimizing...the PS1.1 shader is very short! :p ).

Just FYI (OptimizeXXX registry entries set to 0):

Fillrate Tester
--------------------------
Display adapter: ALL-IN-WONDER RADEON 8500
Driver version: 4.14.1.3659
Display mode: 1024x768x32bpp
--------------------------

Color writes enabled, z-writes disabled:
FFP - Pure fillrate - 1249.172607M pixels/sec
FFP - Single texture - 1034.936646M pixels/sec
FFP - Dual texture - 563.419495M pixels/sec
FFP - Triple texture - 255.208725M pixels/sec
FFP - Quad texture - 168.004639M pixels/sec
PS_2_0 - Per pixel lighting - Failed!
PS_2_0 PP - Per pixel lighting - Failed!
PS_1_1 - Simple - 1233.932617M pixels/sec
PS_1_4 - Simple - 642.095886M pixels/sec
PS_2_0 - Simple - Failed!

Color writes disabled, z-writes enabled:
FFP - Pure fillrate - 1044.097412M pixels/sec
FFP - Single texture - 1095.797852M pixels/sec
FFP - Dual texture - 644.222656M pixels/sec
FFP - Triple texture - 283.257172M pixels/sec
FFP - Quad texture - 172.952133M pixels/sec
PS_2_0 - Per pixel lighting - Failed!
PS_2_0 PP - Per pixel lighting - Failed!
PS_1_1 - Simple - 1266.392456M pixels/sec
PS_1_4 - Simple - 630.688477M pixels/sec
PS_2_0 - Simple - Failed!
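For anyone who wants the size of the OptimizeXXX = 0 boost without eyeballing two listings, here is a rough Python sketch comparing the two 8500 runs (color writes enabled pass only). The helper name `percent_change` is made up for illustration, and since the system had other tasks running, small differences may just be run-to-run noise.

```python
# M pixels/sec, color writes enabled, copied from the two AIW 8500 runs.
default_run = {
    "FFP - Pure fillrate": 1149.041748,
    "FFP - Single texture": 947.861145,
    "FFP - Dual texture": 534.023438,
    "FFP - Triple texture": 243.391159,
    "FFP - Quad texture": 181.625107,
    "PS_1_1 - Simple": 1097.858154,
    "PS_1_4 - Simple": 595.922119,
}
optimize_off_run = {
    "FFP - Pure fillrate": 1249.172607,
    "FFP - Single texture": 1034.936646,
    "FFP - Dual texture": 563.419495,
    "FFP - Triple texture": 255.208725,
    "FFP - Quad texture": 168.004639,
    "PS_1_1 - Simple": 1233.932617,
    "PS_1_4 - Simple": 642.095886,
}


def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


changes = {
    name: percent_change(default_run[name], optimize_off_run[name])
    for name in default_run
}

for name, pct in changes.items():
    print(f"{name:22s} {pct:+6.1f}%")
```

Most tests gain, with PS 1.1 simple gaining the most, while quad texture actually comes out slower in the second run.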
 
DaveBaumann said:
Try and keep discussion to a minimum in this thread and just post the tests/benchmarks you want run (with links if needs be) and the criteria for the test.
 
Fillrate Tester
--------------------------
Display adapter: NVIDIA Quadro FX 2000
Driver version: 6.14.1.4290
Display mode: 1024x768x32bpp
--------------------------

Color writes enabled, z-writes disabled:
FFP - Pure fillrate - 1512.724487M pixels/sec
FFP - Single texture - 1217.574219M pixels/sec
FFP - Dual texture - 975.951599M pixels/sec
FFP - Triple texture - 570.073181M pixels/sec
FFP - Quad texture - 542.109619M pixels/sec
PS_2_0 - Per pixel lighting - 58.340889M pixels/sec
PS_2_0 PP - Per pixel lighting - 58.350346M pixels/sec
PS_1_1 - Simple - 766.042053M pixels/sec
PS_1_4 - Simple - 480.761139M pixels/sec
PS_2_0 - Simple - 483.617737M pixels/sec

Color writes disabled, z-writes enabled:
FFP - Pure fillrate - 2893.234375M pixels/sec
FFP - Single texture - 2890.013672M pixels/sec
FFP - Dual texture - 2887.535156M pixels/sec
FFP - Triple texture - 2885.977051M pixels/sec
FFP - Quad texture - 2744.649414M pixels/sec
PS_2_0 - Per pixel lighting - 1999.308838M pixels/sec
PS_2_0 PP - Per pixel lighting - 1998.668091M pixels/sec
PS_1_1 - Simple - 2174.374512M pixels/sec
PS_1_4 - Simple - 2174.647461M pixels/sec
PS_2_0 - Simple - 2173.879883M pixels/sec
 
Fillrate Tester
--------------------------
Display adapter: RADEON 9700 SERIES 325/310
Driver version: 6.14.1.6292
Display mode: 1024x768x32bpp
--------------------------

Color writes enabled, z-writes disabled:
FFP - Pure fillrate - 1786.334229M pixels/sec
FFP - Single texture - 1962.872559M pixels/sec
FFP - Dual texture - 1028.006592M pixels/sec
FFP - Triple texture - 680.939575M pixels/sec
FFP - Quad texture - 498.430298M pixels/sec
PS_2_0 - Per pixel lighting - 188.913147M pixels/sec
PS_2_0 PP - Per pixel lighting - 188.947739M pixels/sec
PS_1_1 - Simple - 1230.885376M pixels/sec
PS_1_4 - Simple - 1223.748901M pixels/sec
PS_2_0 - Simple - 1223.748047M pixels/sec

Color writes disabled, z-writes enabled:
FFP - Pure fillrate - 2373.947266M pixels/sec
FFP - Single texture - 2369.830566M pixels/sec
FFP - Dual texture - 1207.244263M pixels/sec
FFP - Triple texture - 788.392456M pixels/sec
FFP - Quad texture - 578.483704M pixels/sec
PS_2_0 - Per pixel lighting - 188.934631M pixels/sec
PS_2_0 PP - Per pixel lighting - 188.935257M pixels/sec
PS_1_1 - Simple - 1232.245605M pixels/sec
PS_1_4 - Simple - 1225.064209M pixels/sec
PS_2_0 - Simple - 1225.133789M pixels/sec
 
Here are the results of my 9500 Pro

By the way - system stats:
Fresh boot, Fresh install Win XP pro
SP1, DX9, Cat 3.1 (official release)
Disk - fresh defrag
Asus - P4B533 Mobo
P4 1.6A OC 2.4
512MB Ram
120 GB Disk Drive
Intel App accelerator software installed

Fillrate Tester
--------------------------
Display adapter: RADEON 9700 & 9500 SERIES
Driver version: 6.14.1.6292
Display mode: 1024x768x32bpp
--------------------------

Color writes enabled, z-writes disabled:
FFP - Pure fillrate - 1755.308960M pixels/sec
FFP - Single texture - 1048.688477M pixels/sec
FFP - Dual texture - 651.341370M pixels/sec
FFP - Triple texture - 381.601257M pixels/sec
FFP - Quad texture - 277.572784M pixels/sec
PS_2_0 - Per pixel lighting - 160.131592M pixels/sec
PS_2_0 PP - Per pixel lighting - 160.132950M pixels/sec
PS_1_1 - Simple - 1046.095947M pixels/sec
PS_1_4 - Simple - 1038.867188M pixels/sec
PS_2_0 - Simple - 1038.838501M pixels/sec

Color writes disabled, z-writes enabled:
FFP - Pure fillrate - 2011.864990M pixels/sec
FFP - Single texture - 1938.748535M pixels/sec
FFP - Dual texture - 984.339905M pixels/sec
FFP - Triple texture - 508.533905M pixels/sec
FFP - Quad texture - 348.269348M pixels/sec
PS_2_0 - Per pixel lighting - 160.126846M pixels/sec
PS_2_0 PP - Per pixel lighting - 160.129547M pixels/sec
PS_1_1 - Simple - 1046.120728M pixels/sec
PS_1_4 - Simple - 1038.851807M pixels/sec
PS_2_0 - Simple - 1038.853638M pixels/sec
 
Amazing;

the R9000 is nearly as fast as the R8500 in Dual-Texture mode.

R8500 4 x 2 = FFP - Dual texture - 563.419495M pixels/sec
R9000 4 x 1 = FFP - Dual texture - 472.467651M pixels/sec

So now, we only need to know the actual speed (MHz) of each card to draw a conclusion. ;)
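A small Python sketch of that comparison follows. The core clocks are not given anywhere in this thread, so the clock ratio is left as a parameter rather than filled in; `per_clock_ratio` is a hypothetical helper name. Note that the 563.42 figure quoted above comes from the 8500 run with the OptimizeXXX entries set to 0; the default 8500 run gave 534.02.

```python
# FFP dual texture, M pixels/sec, color writes enabled, from the runs above.
R8500_DUAL_DEFAULT = 534.023438   # first AIW 8500 run
R8500_DUAL_OPT0 = 563.419495      # AIW 8500 run with OptimizeXXX = 0
R9000_DUAL = 472.467651           # Radeon 9000 run


def raw_ratio(r9000, r8500):
    """R9000 dual-texture throughput as a fraction of the R8500's."""
    return r9000 / r8500


def per_clock_ratio(r9000, r8500, clock_ratio):
    """Same ratio normalised per clock.

    clock_ratio = (R9000 core MHz) / (R8500 core MHz). This value is NOT
    stated in the thread, so the caller has to supply it.
    """
    return raw_ratio(r9000, r8500) / clock_ratio


print(f"vs OptimizeXXX=0 run: {raw_ratio(R9000_DUAL, R8500_DUAL_OPT0):.3f}")
print(f"vs default run:       {raw_ratio(R9000_DUAL, R8500_DUAL_DEFAULT):.3f}")
```

Once the real clocks are known, passing their ratio to `per_clock_ratio` shows whether the 9000 does more or less dual-texture work per clock than the 8500; a result of 1.0 would mean they are level.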
 
Fillrate Tester
--------------------------
Display adapter: RADEON 9700 SERIES
Driver version: 4.14.1.150
Display mode: 1024x768x32bpp
--------------------------

Color writes enabled, z-writes disabled:
FFP - Pure fillrate - 2295.303223M pixels/sec
FFP - Single texture - 1989.129761M pixels/sec
FFP - Dual texture - 1036.949707M pixels/sec
FFP - Triple texture - 691.137817M pixels/sec
FFP - Quad texture - 501.286133M pixels/sec
PS_2_0 - Per pixel lighting - 189.051788M pixels/sec
PS_2_0 PP - Per pixel lighting - 189.035690M pixels/sec
PS_1_1 - Simple - 1231.847168M pixels/sec
PS_1_4 - Simple - 1224.331055M pixels/sec
PS_2_0 - Simple - 1224.244385M pixels/sec

Color writes disabled, z-writes enabled:
FFP - Pure fillrate - 2376.586914M pixels/sec
FFP - Single texture - 2370.298584M pixels/sec
FFP - Dual texture - 1212.695801M pixels/sec
FFP - Triple texture - 791.838074M pixels/sec
FFP - Quad texture - 581.834717M pixels/sec
PS_2_0 - Per pixel lighting - 189.063019M pixels/sec
PS_2_0 PP - Per pixel lighting - 189.078094M pixels/sec
PS_1_1 - Simple - 1231.904419M pixels/sec
PS_1_4 - Simple - 1224.646606M pixels/sec
PS_2_0 - Simple - 1224.536743M pixels/sec

OC results, 398/375

Fillrate Tester
--------------------------
Display adapter: RADEON 9700 SERIES
Driver version: 4.14.1.150
Display mode: 1024x768x32bpp
--------------------------

Color writes enabled, z-writes disabled:
FFP - Pure fillrate - 2774.083008M pixels/sec
FFP - Single texture - 2422.042480M pixels/sec
FFP - Dual texture - 1264.992432M pixels/sec
FFP - Triple texture - 838.960815M pixels/sec
FFP - Quad texture - 614.590210M pixels/sec
PS_2_0 - Per pixel lighting - 235.501831M pixels/sec
PS_2_0 PP - Per pixel lighting - 233.968399M pixels/sec
PS_1_1 - Simple - 1513.464478M pixels/sec
PS_1_4 - Simple - 1506.541016M pixels/sec
PS_2_0 - Simple - 1508.824829M pixels/sec

Color writes disabled, z-writes enabled:
FFP - Pure fillrate - 2911.676270M pixels/sec
FFP - Single texture - 2906.829590M pixels/sec
FFP - Dual texture - 1488.733276M pixels/sec
FFP - Triple texture - 973.073914M pixels/sec
FFP - Quad texture - 714.138245M pixels/sec
PS_2_0 - Per pixel lighting - 233.962234M pixels/sec
PS_2_0 PP - Per pixel lighting - 233.967163M pixels/sec
PS_1_1 - Simple - 1513.460449M pixels/sec
PS_1_4 - Simple - 1506.160278M pixels/sec
PS_2_0 - Simple - 1506.145020M pixels/sec
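As a quick check on how the overclock scales, here is a Python sketch comparing the two 9700 runs above (color writes enabled pass) against the core clock ratio. The 398 MHz figure is from this post; the 325 MHz stock core clock is an assumption, taken from the other 9700 run in this thread that is labelled 325/310 and posts near-identical numbers.

```python
# M pixels/sec, color writes enabled, copied from the two runs above.
stock = {
    "FFP - Pure fillrate": 2295.303223,
    "FFP - Single texture": 1989.129761,
    "PS_1_1 - Simple": 1231.847168,
    "PS_2_0 - Simple": 1224.244385,
}
overclocked = {
    "FFP - Pure fillrate": 2774.083008,
    "FFP - Single texture": 2422.042480,
    "PS_1_1 - Simple": 1513.464478,
    "PS_2_0 - Simple": 1508.824829,
}

STOCK_CORE_MHZ = 325.0  # ASSUMPTION: not stated for this run
OC_CORE_MHZ = 398.0     # from "OC results, 398/375"

clock_ratio = OC_CORE_MHZ / STOCK_CORE_MHZ
scaling = {name: overclocked[name] / stock[name] for name in stock}

print(f"core clock ratio: {clock_ratio:.3f}")
for name, factor in scaling.items():
    print(f"{name:22s} x{factor:.3f}")
```

Under that assumption every listed test scales within a few percent of the core clock ratio, which is what you would expect if these tests are mostly core limited.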
 
A few interesting notes:

1) The Radeon 9500/9700 seems to have "fixed" the issues with the PS 1.4 path that the 8500 has, and that the 9000 has to a lesser extent. The 9500/9700 is executing the simple PS 1_4 shader just as fast as the PS 1_1 shader; the 8500 and 9000 are not.

I would guess that the 9500 non-Pro may end up being a significantly better Doom3 card than the 8500/9000, despite the fact that it has similar fillrate and bandwidth.

2) The 9500 Pro is clearly bandwidth limited when applying textures (not unexpected), though with pixel shading, bandwidth may be less important. Pixel shading performance of the 9500 is very close to the 9700.

It will be very interesting to see how different driver revisions affect the pixel shading scores of the GeforceFX....
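Both notes can be checked against the posted numbers with a short Python sketch (color writes enabled pass; the 9700 figures are from the run labelled 325/310). The data layout is only illustrative.

```python
# (PS_1_1 simple, PS_1_4 simple) in M pixels/sec, from the runs in this thread.
simple_shaders = {
    "Radeon 8500": (1097.858154, 595.922119),
    "Radeon 9000": (519.341431, 342.122009),
    "Radeon 9500 Pro": (1046.095947, 1038.867188),
    "Radeon 9700": (1230.885376, 1223.748901),
}

# Note 1: PS 1.4 speed as a fraction of PS 1.1 speed, per card.
ratio_14_11 = {card: ps14 / ps11 for card, (ps11, ps14) in simple_shaders.items()}

# Note 2: 9500 Pro result as a fraction of the 9700 result, per test.
r9500_pro = {
    "FFP - Single texture": 1048.688477,
    "PS_1_1 - Simple": 1046.095947,
    "PS_2_0 - Per pixel lighting": 160.131592,
}
r9700 = {
    "FFP - Single texture": 1962.872559,
    "PS_1_1 - Simple": 1230.885376,
    "PS_2_0 - Per pixel lighting": 188.913147,
}
pro_vs_9700 = {test: r9500_pro[test] / r9700[test] for test in r9500_pro}

for card, ratio in ratio_14_11.items():
    print(f"{card:16s} PS1.4/PS1.1 = {ratio:.3f}")
for test, ratio in pro_vs_9700.items():
    print(f"{test:28s} 9500 Pro / 9700 = {ratio:.3f}")
```

The 9500 Pro and 9700 sit within about 1% of parity between PS 1.4 and PS 1.1, while the 8500 and 9000 fall well short. Against the 9700, the 9500 Pro manages roughly half the single-texture rate but about 85% of the shader rates.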
 
About "Radeon 8500 ps_1_4 issue":
There really is no issue... As I wanted to see at what precision the GeForce FX executes ps_1_4 shaders (and I think we can now say that at least phase 1 is float), I jumped through a few hoops to move all the computation into the first phase. Since v0 and v1 are not available there, I had to use texture coordinates... As we can't yet simply use texture coordinates anywhere we want, I had to use two additional "texcrd" instructions (so ps_1_4 has effectively 2x the instruction count of ps_1_1 or ps_2_0). These "texcrd" instructions get scrapped really quickly on ps_2_0 parts (since they can use texture coordinates anywhere), but the Radeon 8500 doesn't do that. It does seem, however, that the Radeon 8500 can't do a tex* + math instruction in one clock.
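The instruction count point can be illustrated by counting the two listings posted earlier in the thread. This Python sketch is only a line counter, not a model of how any hardware schedules the shader: it skips the version line, `def` and `phase`, and treats a line starting with `+` as co-issued with the previous instruction. The function name `count_issue_slots` is made up.

```python
PS_1_1 = """
ps_1_1
def c0, 0.3f, 0.7f, 0.2f, 0.4f
add r0, c0, -v0
add r0, r0, v1
"""

PS_1_4 = """
ps_1_4
def c0, 0.3f, 0.7f, 0.2f, 0.4f
texcrd r1.xyz, t0
texcrd r2.xyz, t1
add r3.xyz, c0, -r1
add r3.xyz, r3, r2
phase
mov r0.rgb, r3
+mov r0.a, c0.a
"""


def count_issue_slots(source):
    """Return a list with the number of instruction slots in each phase."""
    phases = [0]
    for raw in source.splitlines():
        line = raw.strip()
        if not line or line.startswith(("ps_", "def ")):
            continue  # blank line, version token, or constant definition
        if line == "phase":
            phases.append(0)  # start counting the second phase
            continue
        if line.startswith("+"):
            continue  # co-issued with the previous instruction
        phases[-1] += 1
    return phases


print("ps_1_1:", count_issue_slots(PS_1_1))
print("ps_1_4:", count_issue_slots(PS_1_4))
```

The first phase of the ps_1_4 version has four instructions (two `texcrd` plus two `add`) against two in ps_1_1, which is the "effectively 2x" mentioned above; the second phase adds one co-issued `mov` pair on top.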
 
Hmmm....8500 results are odd in general....its dual texturing rate is just over half of the single textured pixel rate.

They should be pretty close to being the same, unless the 8500 is horrendously bandwidth starved?
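For comparison, here is a Python sketch of the dual-to-single texture ratio for every card posted in this thread (color writes enabled pass, default 8500 run, 9700 run labelled 325/310). It says nothing about why the ratio is what it is; it just lines up the numbers.

```python
# (FFP single texture, FFP dual texture) in M pixels/sec, from the runs above.
texturing = {
    "Radeon 8500": (947.861145, 534.023438),
    "Radeon 9000": (955.843750, 472.467651),
    "Quadro FX 2000": (1217.574219, 975.951599),
    "Radeon 9700": (1962.872559, 1028.006592),
    "Radeon 9500 Pro": (1048.688477, 651.341370),
}

dual_over_single = {
    card: dual / single for card, (single, dual) in texturing.items()
}

for card, ratio in dual_over_single.items():
    print(f"{card:16s} dual/single = {ratio:.3f}")
```

On these figures the 8500 is not alone: the 9000 and 9700 also drop to about half their single-texture rate, and only the Quadro FX 2000 keeps around 80%.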
 
Still not WHQL though. Warp2search did some tests and they are still missing visual effects in 3DMark03. I suspect it will be a long time until we see WHQL drivers for the NV30.
 