8800GTX Shadermark results

You're missing my point, which is that it is entirely self-evident (unless your name is Fuad), and thus redundant, that, all other things being equal, a faster GPU will increase the likelihood of an application becoming CPU-limited.

I just find it strange that it is suddenly being 'discovered' as being particularly special for the 8800GT(X|S).

Especially for a GTS which is allegedly about as fast as a 7950GX2.

(Unless, of course, there is a fundamental architectural difference in the driver that increases the CPU load for an 8800GTS. Now that would be a much more interesting thing to look at, but without context we can't know.)
Very well put. While I was reading @XS, my annoyance knew no end. I mean, those clueless folks compare the 3DMark06 scores of G80 with a Core 2 Duo and a Core 2 Quad respectively, then conclude that "G80 is very CPU-dependent." Are they really that stupid, or do their discussions always run in circles around 3DMark?

Yeah, a faster GPU needs a faster CPU, but there is nothing new about that, and today's CPUs are plenty fast enough to feed even SLI'ed G80s.
 
Well, I think you are correct in that the ratios haven't gone down; if anything, I bet more TEX power is available per ALU. Likewise, ROP power has gone up, at least effective ROP power (MSAA, HDR, memory bandwidth). Whether you view this as a good thing depends on your assumptions about what developers will do with TEX:ALU. I think the R580, Xenos, and even RSX/G71 already have massive amounts of ALU power that has gone unused, in part because figuring out what to do with it while maximizing efficiency is hard. Using more texturing, fillrate, and bandwidth is much easier to figure out from a developer's perspective.
Yeah, it's tough to say which is better for 3D graphics progress.

Personally, I love high texture throughput for PRT and spherical harmonic techniques. You're basically just loading a texture and doing a dot product over and over, and precomputation in general is always a powerful technique.
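To illustrate, here is a minimal CPU-side C++ stand-in for what the pixel shader does per pixel; FetchTransferCoeffs is a hypothetical placeholder for the texture reads of the precomputed transfer coefficients:

```cpp
#include <array>

// 3 SH bands -> 9 coefficients per texel; the transfer vector is precomputed
// offline and stored in textures, so the runtime work is fetches plus a dot product.
constexpr int kShCoeffs = 9;
using ShVector = std::array<float, kShCoeffs>;

// Hypothetical stand-in for reading the precomputed PRT transfer coefficients.
ShVector FetchTransferCoeffs(int /*u*/, int /*v*/)
{
    return ShVector{};  // dummy data; on the GPU this is pure texture bandwidth
}

float ShadePixel(int u, int v, const ShVector& lightingSh)
{
    ShVector transfer = FetchTransferCoeffs(u, v);
    float radiance = 0.0f;
    for (int i = 0; i < kShCoeffs; ++i)      // the "dot product over and over"
        radiance += transfer[i] * lightingSh[i];
    return radiance;
}
```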

A lot of pixel shaders seem to be throwing in ops for next to no reason (i.e. little visual benefit). You can see why ATI is going the route they are, but I think texturing offers more potential for realism. The biggest difference I see between CGI and realtime is neighbourhood transfer of radiance/occlusion, and data transfer is critical in overcoming this hurdle.

That said, math is still pretty useful. If it really is as cheap as it appeared to be in the R520-R580 transition, it's not so bad if it goes underutilized most of the time, since the peak processing ability comes in handy.
 
Mintmaster - concerning free trilinear - doesn't DX10 mandate FP32 filtering? If so, perhaps just 1 FP32 bilinear sample per clock is supported, and the cost of the HW for another FP16 bilinear sample (at a related texture address) is relatively small, or at least small compared to the additional texture addressing HW required for the usual "1 texture address and bilinear sample per clock" setups.

Assuming decent levels of AF and 8/16-bit-per-component textures as the common case, my bet would be that the second bilinear sample unit won't be idle very much of the time. In other words, if the design target is high IQ (reasonable, considering how long NV has been getting bashed on this point), being able to perform 2 bilinear samples per clock (free trilinear, 2x AF) doesn't seem weird at all.
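As a reference point, here's a plain C++ sketch of why a second bilinear sample per clock makes trilinear essentially free; BilinearSample is a hypothetical placeholder for the texture unit:

```cpp
struct Color { float r, g, b, a; };

// Hypothetical placeholder for one bilinear fetch from a given mip level.
Color BilinearSample(int /*mipLevel*/, float /*u*/, float /*v*/)
{
    return { 0.5f, 0.5f, 0.5f, 1.0f };  // dummy data
}

Color Lerp(const Color& a, const Color& b, float t)
{
    return { a.r + (b.r - a.r) * t, a.g + (b.g - a.g) * t,
             a.b + (b.b - a.b) * t, a.a + (b.a - a.a) * t };
}

// Trilinear is just a lerp between bilinear samples from the two nearest mip
// levels; a unit that can issue both fetches in the same clock hides sample #2.
Color TrilinearSample(float u, float v, float lod)
{
    int   mip  = static_cast<int>(lod);
    float frac = lod - mip;                    // blend weight between mip levels
    Color lo = BilinearSample(mip,     u, v);  // bilinear sample #1
    Color hi = BilinearSample(mip + 1, u, v);  // bilinear sample #2
    return Lerp(lo, hi, frac);
}
```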

Long term, I think that transistor count is going to scale up faster than bandwidth, and that this will naturally lead to higher ALU to TEX ratio (will be interesting to see if 65nm G8x bears this out).
 
He is counting the presumed/speculated MADD and MUL units in each G80 ALU.

MADD = 2 FLOPs
MUL = 1 FLOP.

Also, on G70 he is thinking:

24 MADD x 16 FLOPs = 48 MADD x 8 FLOPs.

Correct; my mistake was the "double pumped" thingy, which for G7x should have read "dual issue"; oh, and of course I forgot to deduct the texture OPs from one of the "sub-ALUs" too.

If there's nothing wrong with my layman's math, then it should rather read 250 GFLOPs minus X% for texture OPs vs. 518 GFLOPs.
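Spelled out (assuming the commonly quoted 650MHz G71 clock, the 1.35GHz G80 shader clock, and the presumed MADD+MUL per G80 ALU), the arithmetic behind those two figures is roughly:

```
G71: 24 pipes x 2 vec4 MADDs x (4 components x 2 FLOPs) = 384 FLOPs/clock
     384 x 0.65GHz ~ 250 GFLOPs, minus X% for the texture OPs sharing a sub-ALU
G80: 128 scalar ALUs x (MADD = 2 FLOPs + MUL = 1 FLOP)  = 384 FLOPs/clock
     384 x 1.35GHz ~ 518 GFLOPs
```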
 
By the way, on what exactly are you guys basing any "free trilinear" assumptions? The leaked Archmark results? If so, one of the screenshots taken at a 45-degree angle in another thread shows an improved version of "brilinear", unless my eyes have gone bad. If Archmark was run with that enabled, it's no surprise to me that trilinear seems free; only I'd rather call it "free brilinear".
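For what it's worth, here's a rough C++ sketch of what "brilinear" does compared to real trilinear; the band width is a made-up illustrative value, not a measured driver setting:

```cpp
// Pure trilinear blends between the two mip levels across the whole range;
// "brilinear" only blends inside a narrow band around the transition and snaps
// to a single bilinear fetch everywhere else, saving the second fetch for most pixels.
float BrilinearBlendWeight(float lodFraction, float band = 0.25f)
{
    float lo = 0.5f - band * 0.5f;          // start of the blend band
    float hi = 0.5f + band * 0.5f;          // end of the blend band
    if (lodFraction <= lo) return 0.0f;     // use nearer mip only (1 fetch)
    if (lodFraction >= hi) return 1.0f;     // use farther mip only (1 fetch)
    return (lodFraction - lo) / (hi - lo);  // blend only inside the narrow band
}
```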

So far the rumours concerning the texturing units and their arrangement could suggest very fast and efficient AF; whether trilinear is really free remains to be seen, and I personally need a tad more data than someone running random tests without being absolutely sure what the results stand for.

And thank God for Mintmaster pointing out that the primary cause of shimmering is underfiltering and not higher angle dependency. If you turn on all optimisations on G7x you're close to shimmering hell, and not even supersampling is going to change anything.

There were real in-game screenshots somewhere from Oblivion; granted, proper judgement of AF can only happen in motion, but at least in a still screenshot the result seems equal to what a best-case scenario of that "Iwannabeacircle" AF thingy of the filtering tester shows.

Especially if the last one should be true, I tip my hat first to ATI and then to NVIDIA; and since I can be pedantic, I don't think that, with the usual drop that highest-quality, non-optimised AF has, there's much of a need to still have any optimisations operational by default.
 
Ailuros - Yes, I was basing it on the Archmark numbers, and yes I'm assuming *a lot*. I will speculate until the very end though! :D I very much hope that you're right and high quality filtering will become the default.

DaveBaumann - thanks for the info... that casts some serious doubt on my theory. What are the FP32 filtering use cases anyway (besides braindead "true" HDR implementations)?
 
Texture stage optimizations are perhaps a bigger culprit for shimmering than anything else, IMO. That's why it's much more noticeable in some games and not others. Some software simply isn't as sensitive to these optimizations as others. On the G7x you could get straight bilinear 2x AF on all but the primary texture stage. Personally, I really hope to see texture stage opts go the way of the dinosaur. As games evolve, I find the optimization less and less acceptable.
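Roughly what that looks like if you express it app-side in D3D9 sampler states (illustration only; the real optimization is applied silently by the driver, and the exact filter/AF degrees per stage are assumptions):

```cpp
#include <d3d9.h>

void ApplyStageOptimization(IDirect3DDevice9* dev, DWORD numStages)
{
    // Primary texture stage keeps full quality: trilinear plus high AF.
    dev->SetSamplerState(0, D3DSAMP_MINFILTER,     D3DTEXF_ANISOTROPIC);
    dev->SetSamplerState(0, D3DSAMP_MAGFILTER,     D3DTEXF_LINEAR);
    dev->SetSamplerState(0, D3DSAMP_MIPFILTER,     D3DTEXF_LINEAR);
    dev->SetSamplerState(0, D3DSAMP_MAXANISOTROPY, 16);

    // Every other stage gets dropped to bilinear with 2x AF at best.
    for (DWORD s = 1; s < numStages; ++s)
    {
        dev->SetSamplerState(s, D3DSAMP_MINFILTER,     D3DTEXF_ANISOTROPIC);
        dev->SetSamplerState(s, D3DSAMP_MAGFILTER,     D3DTEXF_LINEAR);
        dev->SetSamplerState(s, D3DSAMP_MIPFILTER,     D3DTEXF_POINT);  // no trilinear blend
        dev->SetSamplerState(s, D3DSAMP_MAXANISOTROPY, 2);
    }
}
```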

Chris
 
It is possible that trilinear filtering may be free without anisotropic filtering, but incur a hit with anisotropic enabled.
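A quick sample-count illustration of why that could happen (assuming a straightforward footprint-based AF implementation):

```
trilinear, no AF   : 2 bilinear fetches/pixel  -> hidden by a 2-per-clock unit
bilinear  + 16x AF : up to 16 bilinear fetches/pixel
trilinear + 16x AF : up to 2 x 16 = 32 bilinear fetches/pixel -> the second
                     fetch per AF tap can no longer be hidden for free
```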
 