Futuremark: 3DMark06

Thanks, Chal. Seeing all those _PPs in that shader reminded me that they may give the 7800GTX some performance help in 3DMark06, though perhaps, as you say, it is small.
 
Yes.

There's the register bandwidth problem to contend with in G70 - maximum of four FP32 registers as operands in any one clock - which will get in the way of certain combinations of instructions, the obvious one being dual-issued MADs:

MAD r0, r1, r2, r3
MAD r2, r1, r4, r5

can't be issued if all the source registers are FP32s - but it's fine if they're all FP16s (or some mixture of FP32/FP16, since two FP16s actually "fit" into one FP32).
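
To make the counting concrete, here is a rough sketch of that operand budget. It is only a toy model of the rule as stated in this post, not measured G70 behaviour; the function name, the half-slot accounting and the choice to count a register shared by both MADs only once are all assumptions.

```python
# Hypothetical model of the operand limit described above; not measured G70 behaviour.
HALF_SLOTS_PER_CLOCK = 8       # four FP32 operands, each worth two FP16 halves
COST = {"fp32": 2, "fp16": 1}  # an FP16 operand takes half an FP32 slot

def can_dual_issue(instr_a, instr_b, precision):
    """instr = (dest, src0, src1, src2); precision maps register -> 'fp32' or 'fp16'."""
    # Assumption: a register read by both instructions is fetched only once.
    sources = set(instr_a[1:]) | set(instr_b[1:])
    needed = sum(COST[precision[reg]] for reg in sources)
    return needed <= HALF_SLOTS_PER_CLOCK

MAD_A = ("r0", "r1", "r2", "r3")  # MAD r0, r1, r2, r3
MAD_B = ("r2", "r1", "r4", "r5")  # MAD r2, r1, r4, r5
SOURCES = ["r1", "r2", "r3", "r4", "r5"]

all_fp32 = {reg: "fp32" for reg in SOURCES}
all_fp16 = {reg: "fp16" for reg in SOURCES}
mixed = dict(all_fp32, r4="fp16", r5="fp16")  # three FP32 sources, two FP16

for label, prec in [("all FP32", all_fp32), ("all FP16", all_fp16), ("mixed", mixed)]:
    print(label, can_dual_issue(MAD_A, MAD_B, prec))
```

Under these assumptions the all-FP32 pair needs five FP32 slots and is rejected, while the all-FP16 pair and the three-FP32/two-FP16 mix fit. Counting the shared r1 twice would not change the all-FP32 verdict.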

Jawed
 
Follow up question for the experts. How much does _PP help the 7800GTX in the shader above?
For any willing to speculate...
 
ERK said:
Follow up question for the experts. How much does _PP help the 7800GTX in the shader above?
For any willing to speculate...
Not too much, I guess, and certainly less than on the NV4x cards. The NV3x series needed it badly and the NV4x only slightly (though in some cases it did help), but hopefully it doesn't matter too much for the G70. More than from _pp, nVIDIA cards gain performance from using lookup tables over straight math. Also, the DST thing is a big performance gainer.
 
Chalnoth said:
Probably soft shadows.
Probably, but it wouldn't make a difference if they're using 8 test samples (which is probably why they're using DB! Mwahahahaha). I don't see the water using DB either (from what's given in their whitepaper - 2 scrolling normal maps and 4 Gerstner wave functions). The Heterogeneous Fog is another likely candidate, though they clearly mention that they're able to get away with just 5 samples (which wouldn't require DB).
 
poly-gone said:
Probably, but it wouldn't make a difference if they're using 8 test samples (which is probably why they're using DB! Mwahahahaha).
I believe Nick mentioned somewhere in this thread that they use a custom 16-sample pattern.
 
Chalnoth said:
I believe Nick mentioned somewhere in this thread that they use a custom 16-sample pattern.
Yes, that can be done by encoding the offsets in a 3D volume map. You use 8 test samples to check if the pixel is fully shadowed (then exit the shader quickly), otherwise fetch the remaining 8 samples to soften the edges. It doesn't require DB though, if you skip the testing.
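
As a rough CPU-side sketch of that two-phase idea (Python rather than shader code, so the early return stands in for the dynamic branch): `sample_fn`, the offset lists and the return shape are hypothetical names, and the early out for fully lit pixels is an added assumption, since the description above only mentions the fully shadowed case.

```python
def soft_shadow(sample_fn, test_offsets, extra_offsets):
    """Return (shadow_fraction, samples_taken).

    sample_fn(offset) is a hypothetical shadow-map lookup returning
    1 if that sample is in shadow and 0 if it is lit.
    """
    test = [sample_fn(o) for o in test_offsets]
    hits = sum(test)
    if hits == len(test):
        return 1.0, len(test)  # every test sample shadowed: exit early
    if hits == 0:
        return 0.0, len(test)  # assumption: fully lit pixels exit early too
    # Test samples disagree, so fetch the remaining samples to soften the edge.
    extra = [sample_fn(o) for o in extra_offsets]
    total = len(test) + len(extra)
    return (hits + sum(extra)) / total, total

# Toy usage: offsets are plain integers, and samples 0..3 land in shadow.
print(soft_shadow(lambda o: 1 if o < 4 else 0, list(range(8)), list(range(8, 16))))
```

Without the two early returns this is just a fixed 16-sample filter, which is the no-DB variant.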
 
Slightly off topic, but could someone explain why Intel dual-core CPUs are providing a significant boost to SM2.0 and HDR/SM3.0 scores compared with AMD dual-core CPUs? I just saw the review over at AMDZone:

http://www.amdzone.com/modules.php?...s&file=index&req=viewarticle&artid=229&page=3

Breakdown of results:

SM2.0
P4 840D + 7800GT: 1458
FX-60 + 7800GT: 1266

P4 840D + X1800XL: 1251
FX-60 + X1800XL: 1178

HDR/SM3.0
P4 840D + 7800GT: 1451
FX-60 + 7800GT: 1264

P4 840D + X1800XL: 1317
FX-60 + X1800XL: 1212

The CPU scores show a completely different picture

CPU Score
P4 840D + 7800GT: 1416
FX-60 + 7800GT: 1891

P4 840D + X1800XL: 1388
FX-60 + X1800XL: 1863
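
For anyone who wants the gaps as percentages, a quick sketch that only restates the numbers listed above; the dictionary layout and the `pct_diff` helper are just illustrative names.

```python
# Scores copied from the breakdown above as (P4 840D, FX-60) pairs.
scores = {
    "SM2.0":      {"7800GT": (1458, 1266), "X1800XL": (1251, 1178)},
    "HDR/SM3.0":  {"7800GT": (1451, 1264), "X1800XL": (1317, 1212)},
    "CPU Score":  {"7800GT": (1416, 1891), "X1800XL": (1388, 1863)},
}

def pct_diff(p4, fx60):
    """Percentage by which the P4 840D result differs from the FX-60 result."""
    return (p4 / fx60 - 1.0) * 100.0

for test, cards in scores.items():
    for card, (p4, fx60) in cards.items():
        print(f"{test:10s} {card:8s} P4 vs FX-60: {pct_diff(p4, fx60):+.1f}%")
```

That works out to the P4 840D being roughly 15% and 6% ahead in SM2.0, 15% and 9% ahead in HDR/SM3.0, and about 25% behind in the CPU score.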

-EDIT-

silly mistake :)
 
You've got a typo in your CPU score results, mongoled :)

But yeah, that definitely seems very strange to me.
 
Chalnoth said:
You've got a typo in your CPU score results, mongoled :)

But yeah, that definitely seems very strange to me.

Probably has to do with the chipsets, Intel's tend to be very good and fast.
 
ANova said:
Probably has to do with the chipsets, Intel's tend to be very good and fast.
Both used the nForce4. So it more likely has something to do with hyperthreading. But what, I don't know.
 
Chalnoth said:
You've got a typo in your CPU score results, mongoled :)

But yeah, that definitely seems very strange to me.
Thanks for that, sorted it out. I'm still interested to see if someone else can shed more light on this, as the difference in the scores is quite obvious. Would the CPUs be doing work with regard to SM2.0 and SM3.0?
 
Fox5 said:
Weren't both cpus used dual core cpus?
Yes. But that doesn't necessarily mean that hyperthreading wouldn't have had an effect. It is, after all, the primary advantage available for the P4.
 
poly-gone said:
Yes, that can be done by encoding the offsets in a 3D volume map. You use 8 test samples to check if the pixel is fully shadowed (then exit the shader quickly), otherwise fetch the remaining 8 samples to soften the edges.
Just because 8 samples lie within, or out of, the shadow doesn't mean all 16 will! A better solution is to generate an edge mask so that only pixels within the edge mask get all 16 samples. Pixels outside the edge mask only need 1 sample to determine whether they are shadowed or not.
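
A minimal sketch of that edge-mask variant, again as CPU-side Python: how the mask is generated is not covered here, so it comes in as a plain boolean, and `sample_fn`, `center` and `offsets` are hypothetical names.

```python
def shadow_with_edge_mask(in_edge_mask, sample_fn, center, offsets):
    """Return (shadow_fraction, samples_taken).

    in_edge_mask: True if the pixel lies inside a precomputed edge mask
                  (mask generation is assumed to happen elsewhere).
    sample_fn(offset): hypothetical lookup, 1 if shadowed, 0 if lit.
    """
    if not in_edge_mask:
        # Away from shadow edges a single sample decides shadowed vs. lit.
        return float(sample_fn(center)), 1
    # Inside the edge mask, take the full sample set to soften the edge.
    values = [sample_fn(o) for o in offsets]
    return sum(values) / len(values), len(values)

# Toy usage with 16 integer offsets, half of them in shadow.
print(shadow_with_edge_mask(True, lambda o: o % 2, 0, list(range(16))))
print(shadow_with_edge_mask(False, lambda o: 1, 0, list(range(16))))
```

The per-pixel cost is then 1 or 16 samples depending on the mask, rather than 8 or 16 depending on what the test samples happen to hit.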
 
Cheat! Cheat! Cheat! Cheat! Cheat! Cheat!

geo said:
I think it's time for Ati to admit that they lost the IQ crown in favor of the technologically superior nVIDIA™ cards.

This bug on a particular surface in a particular test of 3DMARKS 2006™ is a clear indication that Ati are not only cheating with their drivers, but are also deceiving their customers, their friends and all of humankind, plus Rys (since he's not exactly a part of humankind, he's like an evolved hamster according to some folks).

Here's another piece of unrelated and inconclusive anecdotal evidence to go with the TR one: the other day I saw an ati logo in a magazine, and the logo clearly had an image quality issue, like color dithering or something.
You know what that means? Ati cheats with their Drivers. Yeah, even the paper drivers!

I think it's time to call for a general boycott of all Ati products. I'm starting an Online Petition right away!

And before someone points out that it's maybe a simple anecdotal driver bug, and that the TR wanted a few cheap clicks, let me tell you this, you folks of little faith... You're probably right.
 
Funny that, don't we already have Hanners's word that DF24/Fetch4 isn't actually working on X1900XT and is only working on X1600XT (no comments either way for X1300)?

Jawed
 