Futuremark: 3DMark06

Unknown Soldier said:
I think that's pretty interesting. A single core 3500+ that beats a dual core X2 4200+

Looks like the 4200+ completely demolishes the 3500+, 1655 to 866. If you're referring to the few extra points on the GPU tests, those are probably well within the margin of error.
 
Pete said:
"Beats" is a pretty strong word. The difference in the 3D tests is a mere 1-2%, practically margin of error.

Indeed, and I should mention those scores were on different motherboards (Both A8N-SLI based, but still), so there are likely different memory timings in play there for starters. It was just a quick and dirty comparison really.
 
Has anyone tried to bench Nvidia hardware on the SM2 codepath yet? I get a small, consistent fps increase on both NV40 and G70 in GT1 with it. Of course, the SM3.0 tests can't run on it...

As for the arguments over uniquely accelerated features, I think they should be held in check until we see more benches using 3DM06's thoughtfully-included switches to disable "hardware" shadow mapping and FP filtering.

My results are more or less in line with Hanners'. Disabling HSM hurts the 7800 GTX by ~24.5% in GT1 and ~18% in GT2. Forcing shader FP filtering induces a 6% performance hit in GT1 and an 8% one in GT2 on the same card.
 
Dave Baumann said:
X1800 is the odd one out of the X1000 line - it, and only it, does not support Fetch4.

You're right, my mistake. Six pages later.

So we know that the X1900 supports fetch4, but does it support D24X8?
 
Pete said:
Thought I'd kick in my 9800P scores, just to see how much my AXP 2400+ (@2000MHz, 256kB, single-channel DDR266) is crippling me:

545 3DMarks
SM2.0: 246
CPU: 654

Weird. My score is slightly better, even though you have a better video card. Did you set "mipmap detail level" to "performance" instead of "quality" in the Catalyst Control Center? That could explain it.


System: XP2800+ (Barton 2.08GHz), 9700 Pro, 1024MB PC2700 (2x 512MB).

3DMark06 score: 652

SM2.0 score: 312
CPU score: 730


I decided to try the older 3DMarks with Catalyst 6.1 drivers as well:
3DMark05 score: from 2274 to 2382
3DMark03 score: from 5062 to 5059
3DMark01 score: from 14492 to 13099 =O !
 
7800 GTX 512 (580/1730), AMD X2 4800+, 2 GB, ASUS A8N-SLI Premium, Driver Version 81.98, High Quality

5479 3DMark06 Score
2242 SM 2.0
2288 HDR/SM 3.0
1852 CPU

:D
 
Won't someone with a GTX pretty please force HQ 16x AF and list scores, pretty please?

Here u go:

Without 16xAF
3430 total
SM 2: 1715
SM 3: 1608
CPU: 654

With 16xAF:
3069 total
SM 2: 1347
SM 3: 1562
CPU: 652

The scores are probably worse than what I should be getting with a 3000+ and a 256MB 7800GTX (not OCed at all, of course :p), but I didn't bother to reboot and I had tons of apps running in the background (WMP, Folding, Azureus, etc.).

I dunno, a slow CPU can probably hamper gameplay as much as a slow GPU. I'm of the opinion that framerate comes first, everything else second. One could argue whether DC will show as much of an improvement as SC, of course, but I think 360 and PS3 should make exploiting DC quite common--and possibly similar to fancy effects, if the majority of the PC market is SC and so the second core is just used for neat tricks like more boxes or boulders.

I'd like to see more testing done on Q4, considering how much of an improvement it sees with DCs. Specifically, we'd want to test without AA and preferably at 12x10, to more closely mimic 3DM's vision of future games.

As for the arguments over uniquely accelerated features, I think they should be held in check until we see more benches using 3DM06's thoughtfully-included switches to disable "hardware" shadow mapping and FP filtering.

Hanners' limited (NV-only) testing does seem to suggest that ATI wasn't too silly to skip "fixed" FP16 filtering (Nick wasn't kidding when he said their SW fallback was "highly efficient"), though I'd like to see SS compares to examine IQ differences, if any. OTOH, NV's huge hits w/o HSM (25% on a 6800GT, 17% on a 7800GT) beg the question why FM couldn't have implemented a SW-based HDR AA workaround and considered it an equivalent situation?

Ah, FM says, but AA isn't part of their standard suite, just an option. Well, isn't HDR part of SM3's standard suite? HSM? FP filtering? If the answer is that 3DM isn't a D3D test but a gamer's test, then surely gamers use AA as much as they would fancy shadows, as an IQ enhancer?

I'm cool with most of the test. I think it's eminently fair to take advantage of HW features, as surely game devs would do the same. Only the lack of AA score reporting on the GF 6 and 7 puzzles me. Though we can calculate it by hand, we shouldn't have to, and forcing us to do so only diminishes the relevance of the holistic "3DMarks" score.

Yay a logical post for once!
 
Pete said:
OTOH, NV's huge hits w/o HSM (25% on a 6800GT, 17% on a 7800GT) beg the question why FM couldn't have implemented a SW-based HDR AA workaround and considered it an equivalent situation?

Ah, FM says, but AA isn't part of their standard suite, just an option. Well, isn't HDR part of SM3's standard suite? HSM? FP filtering? If the answer is that 3DM isn't a D3D test but a gamer's test, then surely gamers use AA as much as they would fancy shadows, as an IQ enhancer?
Er, it's very easy to implement FP filtering in the shader for simple situations (and since FP filtering is probably only used for tonemapping, this should be extremely easy). It's impossible to implement multisampling AA in the shader.

One could obviously implement supersampling AA in software, but this would hardly be equivalent either in performance or in visual quality.
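For what it's worth, the "filter in the shader" fallback Chalnoth describes is just the standard bilinear weighting done manually. A toy Python sketch of the math (my own mock-up; the function name and the list-of-lists "texture" are purely illustrative, not anything from FM or the drivers):

```python
# Toy sketch of manual (shader-style) bilinear filtering of a float texture.
# A real shader would take four point samples and blend them exactly like this.

def bilinear_sample(tex, u, v):
    """Sample tex (a list of rows of floats) at fractional texel coords (u, v)."""
    h, w = len(tex), len(tex[0])
    x0, y0 = int(u), int(v)                       # top-left texel of the 2x2 footprint
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = u - x0, v - y0                       # fractional blend weights
    top = tex[y0][x0] * (1 - fx) + tex[y0][x1] * fx
    bot = tex[y1][x0] * (1 - fx) + tex[y1][x1] * fx
    return top * (1 - fy) + bot * fy

tex = [[0.0, 1.0],
       [0.0, 1.0]]
print(bilinear_sample(tex, 0.5, 0.0))  # halfway between texels: 0.5
```

Four point fetches plus three lerps per sample, which squares with the hit being measurable but small when it's only needed for something like tonemapping.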
 
I believe that 6 and 7 series can generate MSAA samples for an FP16 backbuffer - it's simply that the samples cannot be resolved in hardware, nor can a backbuffer blend be done in hardware.

So, the question is, can MSAA and FP blending be made to work in this scenario?

Jawed
 
mrcorbo said:
Re: CPU scaling

Keep in mind that if you change your CPU speed by overclocking the HT bus, you are not only increasing your CPU performance, but also memory and HT performance as well. The proper way to test this is as suggested: underclock your CPU by lowering the multiplier.

HT has nothing to do with memory bus performance on the A64 architecture.
 
Jawed said:
I believe that 6 and 7 series can generate MSAA samples for an FP16 backbuffer - it's simply that the samples cannot be resolved in hardware, nor can a backbuffer blend be done in hardware.
If you can generate the samples, then clearly it would be possible to resolve them, provided you can access the framebuffer as just a buffer that is higher in resolution than the desired resolution (i.e. a 1280x960 4xAA framebuffer would be stored as 2560x1920).

The question is, why do you think that the NV4x can generate MSAA samples for an FP16 backbuffer?
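A software resolve of the kind described above would just be a box filter over the oversized buffer. A toy Python sketch under that 4xAA assumption, where each target pixel owns a 2x2 block of samples (my illustration of the concept, not how NV4x actually lays out samples):

```python
# Toy software MSAA resolve: average each 2x2 sample block of a double-width,
# double-height sample buffer down to one output pixel.

def resolve_4x(samples):
    """Box-filter a 2H x 2W sample buffer down to an H x W image."""
    H, W = len(samples) // 2, len(samples[0]) // 2
    out = []
    for y in range(H):
        row = []
        for x in range(W):
            block = (samples[2 * y][2 * x] + samples[2 * y][2 * x + 1] +
                     samples[2 * y + 1][2 * x] + samples[2 * y + 1][2 * x + 1])
            row.append(block / 4.0)
        out.append(row)
    return out

# A 2x2 target rendered at 4x4: an edge crossing a pixel's samples averages out.
buf = [[1.0, 1.0, 1.0, 0.0],
       [1.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 0.0, 0.0],
       [0.0, 0.0, 0.0, 0.0]]
print(resolve_4x(buf))  # [[1.0, 0.25], [0.0, 0.0]]
```

The resolve itself is trivial; as the posts above note, the hard parts are generating and blending the FP16 samples in the first place.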
 
15 minutes of vain

trinibwoy said:
Looks like somebody just grew a third leg....:eek:
It's been there before. I swear!

Ok, I know. Bragging about 3DMark scores is sooo vain. :oops: But this is actually the first time a new 3DMark comes out that I don't feel like upgrading instantly.
 
Pete said:
Thought I'd kick in my 9800P scores, just to see how much my AXP 2400+ (@2000MHz, 256kB, single-channel DDR266) is crippling me:

545 3DMarks
SM2.0: 246
CPU: 654

Honestly, not too bad compared to a dual 2.1GHz A64 w/ roughly three times the RAM bandwidth (64b DDR266 -> 128b DDR400), and merely iffy compared to a 2.4GHz P-M. Too bad the difference isn't as small in games, where I'm sure faster RAM or more L2 would greatly benefit me.

Nocturn's XT's extra VRAM likely contributes to his much higher SM2 score.

What are your video card clocks, Pete? When I get home I will rerun the tests with my card at the fastest it can go (I think 375/355). It's summer here, so I downclock most things unless needed. It's really a 9800SE on a 9700 board, softmodded. I did that test whilst grabbing dinner.

But the CPU doesn't seem to be weighing in as much as people are saying. NocturnDragon's 3DMark score smashes mine, yet the CPU score for yours is 1/3 of mine.
 
Freak'n Big Panda said:
Here u go:

Without 16xAF
3430 total
SM 2: 1715
SM 3: 1608
CPU: 654

With 16xAF:
SM 2: 1347
SM 3: 1562


The scores are probably worse than what I should be getting with a 3000+ and a 256MB 7800GTX (not OCed at all, of course :p), but I didn't bother to reboot and I had tons of apps running in the background (WMP, Folding, Azureus, etc.).

256 GTX Without 16xAF
SM 2: 1715
SM 3: 1608

vs http://www.anandtech.com/video/showdoc.aspx?i=2675&p=3 numbers
1747 in SM2.0
1729 in SM3.0/HDR

With 16xAF:
SM 2: 1347
SM 3: 1562

vs X1800XT HQ16AF Forced:

1615 SM2.0 Score
1749 HDR/SM3.0 score

Your 3DMark score is not off by much, as the default settings don't appear overly affected, per other results.

I find the difference between 16x AF on ATI vs 16x AF on Nvidia very interesting, if not a bit misleading compared to default scoring. On one hand we have the 256MB GTX actually leading the X1800, as FiringSquad pointed out, going against the grain, and once you enable HQ you have the X1800 taking a rather substantial lead. Feel free to re-bench if you like, but I don't think what you had running in the background affected your graphics scores much. The HDR tests seem the most CPU-limited, so I'm sure you are losing some points, but I think I got what I was looking for. My CPU is a 2.0 and the one used at Anand was an FX-55.

I think Nvidia has done a bit more homework on optimizations, is all.

If anyone else would care to do the same, I wouldn't mind seeing whether this is a consistent drop the G70 cores are taking.
 
Chalnoth said:
If you can generate the samples, then clearly it would be possible to resolve them, provided you can access the framebuffer as just a buffer that is higher in resolution than the desired resolution (i.e. a 1280x960 4xAA framebuffer would be stored as 2560x1920).
Theoretically, yes - but I don't think reading the samples from Z while rendering is a trivial matter (in order to decide from the samples' Z values which samples to overwrite with the newly generated pixel's samples). As a non-dev this is simply beyond my ken :oops: It might, for example, only start to be possible by doing a z-prepass to build Z which is dumped into a texture, in order that it can be queried while doing the colour pass... erm...

The question is, why do you think that the NV4x can generate MSAA samples for an FP16 backbuffer?
Apparently G70 can, as this was a topic that came up as part of the discussion of nAo's implementation of HDR in Heavenly Sword (PS3 title). I presume it applies to NV40, too, as their ROPs are generally seen as common in function.

It was news to me, anyway - and I hope I've interpreted it correctly. It would be nice to get more insight because the concept was left hanging in the air.

Jawed
 
Jawed said:
Theoretically, yes - but I don't think reading the samples from Z while rendering is a trivial matter
Well, if you can't do that, then you can't generate the multisample buffer in the first place.
 
Control Panel set to HQ. Forceware 82.12. 7800GTX @ 490/1300.

Setting                            SM2.0          SM3.0/HDR
Default                            1781           1767
Software FP16 filtering            1779 (0%)      1648 (6.7%)
Hardware shadow mapping disabled   1463 (17.9%)   1769 (0%)
16x AF                             1394 (21.7%)   1476 (16.5%)

Percentages are calculated against the default scores, so they represent the performance hit in each situation.
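The arithmetic, for anyone double-checking: hit = (default - score) / default. A quick Python sanity check (my own helper, obviously not part of 3DMark):

```python
# Sanity-check the performance-hit percentages against the default scores.

def hit_pct(default, score):
    return round(100.0 * (default - score) / default, 1)

print(hit_pct(1781, 1463))  # 17.9 -> SM2.0 with hardware shadow mapping disabled
print(hit_pct(1767, 1648))  # 6.7  -> SM3.0/HDR with software FP16 filtering
print(hit_pct(1781, 1394))  # 21.7 -> SM2.0 with 16x AF
print(hit_pct(1767, 1476))  # 16.5 -> SM3.0/HDR with 16x AF
```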
 