About GeForce 7 Series architecture

Link

Look at the link and compare the two GPUs. One of the reasons is that each ALU in NVIDIA's 7 series has a mini-ALU to assist it. Although the 7600 GT has fewer pipes than the 6800 GT, it still boasts a higher number of shader operations and a better texture fillrate. If I had to guess, the bandwidth on the 6800 GT goes to waste, so it is of no use; however, I think the opposite is true when using AA. I really don't know, so don't take my word as solid. There are more knowledgeable people around here, so just wait for someone's answer.
Evidently. According to this comparison (based on reviews from computerbase.de), the GF6800 wasn't able to utilize its bandwidth...

perf_per_gbps.png
 
Hmm, I got it.


And about the mini-ALUs, what do they do?


So the primary ALU's MADD capability improves shading performance, but does this capability improve texture fillrate too?
 
And about the mini-ALUs, what do they do?
I believe NV40 and G70 have the same mini-ALUs. They're used to perform operations other than multiplication and addition, for example division, square root, sine, exponent, etc. The mini-ALUs work on just one component (instead of a whole vector), hence the name.
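To illustrate the split described above, here is a minimal Python sketch. It is only a conceptual model, not how the hardware is actually wired, and the names `madd_vec4` and `mini_alu_sqrt` are made up for this example. The main ALU applies a multiply-add across all four components, while the mini-ALU style operation touches a single component.

```python
import math

def madd_vec4(a, b, c):
    """Component-wise multiply-add on a 4-component vector: a * b + c."""
    return tuple(x * y + z for x, y, z in zip(a, b, c))

def mini_alu_sqrt(v, component):
    """Scalar special function (square root) applied to one component only."""
    return math.sqrt(v[component])

# Hypothetical input values, chosen only to keep the arithmetic easy to follow.
color = madd_vec4((1.0, 2.0, 3.0, 4.0),
                  (0.5, 0.5, 0.5, 0.5),
                  (1.0, 1.0, 1.0, 2.0))
length = mini_alu_sqrt(color, 3)  # operates on the fourth component only
```

The vector operation produces four results per call and the scalar one produces a single result, which is the distinction the "mini" in the name refers to.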
So the primary ALU's MADD capability improves shading performance, but does this capability improve texture fillrate too?
No. Both NV40 and G70 have pixel pipelines where first a texture is sampled and then the ALUs can perform arithmetic operations. So texture fillrate depends only on the number of pipelines and the clock frequency.
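As a rough worked example of that relationship, the sketch below multiplies pipeline count by core clock. The figures are assumptions not taken from this thread: the commonly cited reference specs of 16 pipelines at 350 MHz for the 6800 GT and 12 pipelines at 560 MHz for the 7600 GT, with one texture sample per pipeline per clock. The function name is also just illustrative.

```python
def texture_fillrate_mtexels(pipelines, core_clock_mhz):
    """Theoretical peak texture fillrate in megatexels per second.

    Assumes one texture sample per pipeline per clock cycle.
    """
    return pipelines * core_clock_mhz

# Assumed reference specs: (pixel pipelines, core clock in MHz).
cards = {"6800 GT": (16, 350), "7600 GT": (12, 560)}

fillrates = {name: texture_fillrate_mtexels(pipes, clock)
             for name, (pipes, clock) in cards.items()}

for name, rate in fillrates.items():
    print(f"{name}: {rate} MTexels/s")
```

Under these assumptions the higher clock of the 7600 GT more than makes up for its four missing pipelines, which fits the earlier remark about it having the better texture fillrate.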
 
Evidently. According to this comparison (based on reviews from computerbase.de), the GF6800 wasn't able to utilize its bandwidth...

perf_per_gbps.png

I'd like to see that graph with 4x AA too. (I have a 6800 GT myself and only use it with 4x AA or better; I think the bandwidth makes some sense then.)
 
Just in case anyone's interested (since this thread started out as 6800 GT vs 7600 GT):

Some results from this benchmark:
http://forum.beyond3d.com/showthread.php?t=41647

Both were run on a P4 3 GHz with Win XP (slightly different driver versions).

left = 6800 GT, right = 7600 GT (times in microseconds, lower = better)

------------------
-- size: 3

BENCHMARK
Jacobi iteration: 18.7 micros -- 7.8125 micros
Residual calculation: 18.8 micros -- 6.25 micros
Restriction: fw: 14.1 micros -- 7.8125 micros
Interpolation + add: 14 micros -- 6.25 micros
VCycle: 828 micros -- 609.375 micros


BENCHMARK
Jacobi iteration: 12.5 micros -- 6.25 micros
Residual calculation: 12.5 micros -- 6.25 micros
Restriction: fw: 9.4 micros -- 7.8125 micros
Interpolation + add: 9.4 micros -- 6.25 micros
VCycle: 797 micros -- 609.375 micros


BENCHMARK
Jacobi iteration: 10.9 micros -- 6.25 micros
Residual calculation: 12.5 micros -- 6.25 micros
Restriction: fw: 9.4 micros -- 7.8125 micros
Interpolation + add: 9.4 micros -- 6.25 micros
VCycle: 797 micros -- 625 micros


------------------
-- size: 1023

BENCHMARK
Jacobi iteration: 14218.5 micros -- 8804.69 micros
Residual calculation: 14515.5 micros -- 9437.5 micros
Restriction: fw: 12484.5 micros -- 7312.5 micros
Interpolation + add: 11429.5 micros -- 5968.75 micros
VCycle: 165470 micros -- 114453 micros


BENCHMARK
Jacobi iteration: 14101.5 micros -- 8781.25 micros
Residual calculation: 14500 micros -- 9421.88 micros
Restriction: fw: 12539 micros -- 7335.94 micros
Interpolation + add: 11476.5 micros -- 5968.75 micros
VCycle: 168360 micros -- 114453 micros


BENCHMARK
Jacobi iteration: 14164 micros -- 8765.63 micros
Residual calculation: 14445.5 micros -- 9453.13 micros
Restriction: fw: 12508 micros -- 7328.13 micros
Interpolation + add: 11468.5 micros -- 5976.56 micros
VCycle: 165315 micros -- 114375 micros
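
For anyone who wants a single number out of the size 1023 runs, here is a small Python sketch that averages the three runs per card and takes the ratio. The timings are copied from the results above; the helper name `speedup` is just for this example, and three runs on slightly different drivers is a thin basis, so treat the result as a rough indication only.

```python
from statistics import mean

# Size 1023 timings in microseconds, three runs each, copied from above.
timings = {
    "Jacobi iteration": {
        "6800gt": [14218.5, 14101.5, 14164.0],
        "7600gt": [8804.69, 8781.25, 8765.63],
    },
    "VCycle": {
        "6800gt": [165470.0, 168360.0, 165315.0],
        "7600gt": [114453.0, 114453.0, 114375.0],
    },
}

def speedup(runs):
    """Ratio of mean 6800gt time to mean 7600gt time (above 1 means the 7600gt is faster)."""
    return mean(runs["6800gt"]) / mean(runs["7600gt"])

for name, runs in timings.items():
    print(f"{name}: 7600gt is {speedup(runs):.2f}x faster")
```

On these numbers the 7600gt comes out roughly 1.6x faster on the Jacobi iteration and roughly 1.45x faster on the full VCycle.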
 