G70 Core Clock Variance

  • Thread starter Deleted member 2197
ALT-F13 said:
So, as you can see, the performance difference is really not linear
It never will be. Other limitations frequently come into play, such as memory bandwidth.
 
Chalnoth said:
Sharkfood said:
My only concern is what the sustainable (as in over a number of hours) performance output is in 'average' conditions, if there is indeed some form of throttling going on in the chip.
I would think that current benchmarks already rule out this as a problem.
How so? How many benchmarkers looped tests for multiple hours to check that iteration 1 performance matched, say, iteration 50?

Link please... I haven't seen a single site or user posting such tests.
 
ALT-F13 said:
I did some testing with 3DMark 2k1 Game 4 - Nature, as it is the most precise gfx testing tool for me.

Single Leadtek card, 'cos my SLI setup don't go higher than 485 on gpus. Forceware 80.40, all stock settings, memory clock always 1200 MHz.

MHz - FPS - FPS delta

430 - 239.4 - 0.0
440 - 240.6 - 1.2
450 - 243.2 - 2.6
460 - 245.5 - 2.3
470 - 246.2 - 0.7
480 - 248.6 - 2.4
490 - 249.5 - 0.9
500 - 251.6 - 2.1
510 - 252.7 - 0.9
520 - 253.3 - 0.6

530 is too much for the card.

So, as you can see, the performance difference is really not linear (although this test is extremely precise!) and FPS delta jumps from 0.6 to 2.6 for each 10 MHz. BUT! Predicted clocks for bumps should be 430+27=457, 457+27=484 and 484+27=511. Instead of this we can see highest bumps (over 2 FPS/10MHz) between 440 and 450, 450 and 460, 470 and 480, 490 and 500 which don't fit in theory :(

Hi Alt-F13

That's interesting. Can you try 3DMark03 GT4 (Nature) and see if it is the same?

Regards

Andy ( zakelwe on XS.org)
 
ALT, you shouldn't expect the "jumps" to fall exactly at N*27 MHz clocks. There are different possible approaches to feedback divider selection; for example, they can select the divider that generates the clock with the smallest delta between the target and the real one:

e.g. when you set 440, abs(440-432) is less than abs(440-459), so the first one is used.
And when you set 450, abs(450-432) is greater than abs(450-459), so the second one is used.
It's just a question of which algorithm generates the dividers for the requested clock.

Also, I'd still recommend using pure fillrate tests to verify the ROP clock, rather than a synthetic test bottlenecked by multiple parts of the chip.
 
Sharkfood said:
How so? How many benchmarkers looped tests for multiple hours to check that iteration 1 performance matched, say, iteration 50?
I'm not sure why you'd expect that to make a difference. From the interview, clocks are adjusted based upon usage, not temperature.
 
trinibwoy said:
digitalwanderer said:
bigz said:
And he basically didn't tell you anything about it. :?

What do you mean? He said the variable clocks were derived from mobile tech, are the reason for the lower power consumption and heat on the G70, that there are more than just 3 and that they will be providing info to Unwinder on which clocks are most suitable for end-user overclocking. What were you expecting - schematics?

To be honest, I'm a bit sceptical about "more than 3 clock domains". The G70 BIOS internals offer the simplest way to verify the number of programmable clock domains in the core: there is a PLL programming routine which _must_ switch all domains to the bus clock during PLL programming (to avoid undesired clock frequency jumps, since the post divider and the reference / feedback dividers are located in different registers, so the whole PLL cannot be programmed in a single step). The routine I'm talking about switches clock sources for just 3 domains.
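
To illustrate why such a routine would park the domains on the bus clock first, here is a minimal sketch using a generic textbook PLL model, f_out = f_ref * N / (M * P). The 27 MHz reference, the divider values and the two-register split below are illustrative assumptions only, not G70's actual register layout or BIOS code.

```python
REF_MHZ = 27.0  # assumed reference clock, chosen only to match the 27 MHz steps in this thread


def pll_out(n, m, p, ref=REF_MHZ):
    """Generic PLL model: feedback divider n, reference divider m, post divider p."""
    return ref * n / (m * p)


def transient_clocks(old, new):
    """Output seen between the two register writes, for either write order.

    Assumes n and m live in one register and p in another, so one write
    always lands before the other.
    """
    dividers_first = pll_out(new["n"], new["m"], old["p"])  # n/m updated, p still old
    post_first = pll_out(old["n"], old["m"], new["p"])      # p updated, n/m still old
    return dividers_first, post_first


# Hypothetical divider sets for a 432 MHz -> 459 MHz change
old = {"n": 32, "m": 1, "p": 2}
new = {"n": 17, "m": 1, "p": 1}

print(pll_out(**old), "->", pll_out(**new))  # 432.0 -> 459.0
print(transient_clocks(old, new))            # (229.5, 864.0)
```

With these made-up values, either write order passes through a clock far from both the old and the new setting, which is the kind of jump that switching the domains to the bus clock would hide.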
 
trinibwoy said:
digitalwanderer said:
bigz said:
And he basically didn't tell you anything about it. :?

What do you mean? He said the variable clocks were derived from mobile tech, are the reason for the lower power consumption and heat on the G70, that there are more than just 3 and that they will be providing info to Unwinder on which clocks are most suitable for end-user overclocking. What were you expecting - schematics?

Also, I'm a bit sceptical about "new architecture" too. Currently I'm comparing NV47's clocking approach with NV40's. As far as I can see now, the previous high-end chip has those 3 independently clockable domains too, which are simply clocked synchronously from a single PLL. However, this is not proven yet. I'm experimenting with it now.
 
Unwinder said:
ALT, you shouldn't expect the "jumps" to fall exactly at N*27 MHz clocks. There are different possible approaches to feedback divider selection; for example, they can select the divider that generates the clock with the smallest delta between the target and the real one:

e.g. when you set 440, abs(440-432) is less than abs(440-459), so the first one is used.
And when you set 450, abs(450-432) is greater than abs(450-459), so the second one is used.
It's just a question of which algorithm generates the dividers for the requested clock.

Also, I'd still recommend using pure fillrate tests to verify the ROP clock, rather than a synthetic test bottlenecked by multiple parts of the chip.

Small update on that issue. I asked a tester with a 7800 (I don't have one and can only analyze its driver/BIOS) and a beta of RT 15.7 capable of 3-domain clock monitoring to reproduce this situation. The assumption was correct: the driver does seem to use the "min delta" clock generation rule:

Desired clock set via CoolBits / Real clock monitored by RT:

430 -> 432
440 -> 432
450 -> 459
460 -> 459
470 -> 459
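
The "min delta" rule can be sketched in a few lines. This is only a model of the behaviour reported above, assuming the real clock always lands on a multiple of 27 MHz; the function name and the tie-breaking choice are mine, not the driver's.

```python
STEP_MHZ = 27  # step size discussed in this thread, treated here as an assumption


def real_clock(target_mhz, step_mhz=STEP_MHZ):
    """Return the multiple of step_mhz closest to target_mhz (ties go to the lower one)."""
    lower = (target_mhz // step_mhz) * step_mhz
    upper = lower + step_mhz
    if abs(target_mhz - lower) <= abs(target_mhz - upper):
        return lower
    return upper


# Readings reported above: desired clock -> real clock monitored by RT
reported = {430: 432, 440: 432, 450: 459, 460: 459, 470: 459}
for desired, monitored in reported.items():
    print(desired, "->", real_clock(desired), "(reported:", monitored, ")")

# Extrapolation only: where the real clock would change across 430-520 in 10 MHz steps
settings = list(range(430, 530, 10))
changes = [(a, b) for a, b in zip(settings, settings[1:])
           if real_clock(a) != real_clock(b)]
print(changes)  # [(440, 450), (470, 480), (490, 500)]
```

The model reproduces all five monitored readings. If the same rule holds above 470, which has not been measured here, the real clock would change between 440 and 450, 470 and 480, and 490 and 500. That matches three of the four large FPS bumps in ALT-F13's table, while the 450 to 460 bump is not explained by it.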
 
Lemme make sure I'm getting this right since I'm still pre-coffee, but are you saying someone clocking their 7800 to 450-470 is always going to end up with a card clocked at 459? :?

(I think I'm reading it wrong, 'cause that just doesn't make a lot of sense!)
 
If you're reading it wrong, then so am I... I wonder, if you do auto-overclocking, which it picks to show you: the highest or the lowest in the range? :p
 
Unwinder said:
Also, I'm a bit sceptical about "new architecture" too. Currently I'm comparing NV47's clocking approach with NV40's. As far as I can see now, the previous high-end chip has those 3 independently clockable domains too, which are simply clocked synchronously from a single PLL. However, this is not proven yet. I'm experimenting with it now.
I wonder if the 6800 Ultra in Dell's XPS laptop has multiple clock domains, as nV did say they learned from their mobile parts, and that one had some surprisingly good performance.
 
digitalwanderer said:
Lemme make sure I'm getting this right since I'm still pre-coffee, but are you saying someone clocking their 7800 to 450-470 is always going to end up with a card clocked at 459? :?

(I think I'm reading it wrong, 'cause that just doesn't make a lot of sense!)
For the majority of the chip, yes, but not for the parts that are actually running at 450-470 MHz (and we're not too sure which these are, except they're probably "geometry", heh)

Uttar
 
Until you get to the next step, 2/3 of your card is working at the lower speed, but you will still be getting some improvement from the geometry domain until you jump to the next level.

Of course, for the overclocker, having everything in 1 MHz increments is the main thing, because you would have to be lucky for your card to max out exactly on one of the 27 MHz steps.
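
A rough way to put numbers on that, assuming the stepped domains can only run at multiples of 27 MHz and that a card has some true maximum stable clock for them. The function name and the example limits are hypothetical.

```python
STEP_MHZ = 27  # assumed step size for the stepped clock domains


def usable_step(max_stable_mhz, step_mhz=STEP_MHZ):
    """Highest step not exceeding the card's limit, and the MHz left unused."""
    best = (max_stable_mhz // step_mhz) * step_mhz
    return best, max_stable_mhz - best


# Hypothetical stability limits, for illustration only
for limit in (512, 513, 520):
    step, unused = usable_step(limit)
    print(f"limit {limit} MHz -> runs at {step} MHz, {unused} MHz unused")
```

Under these assumptions a card that tops out at 512 MHz gives up 26 MHz, while one that tops out at 513 MHz gives up nothing, which is the "lucky" case.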

Hopefully Unwinder can crack this and allow 1 MHz increments to the PLL for all domains.

That Unwinder is a clever chap. A lot of the guys on this forum know all the software stuff, but there are not as many here who know the hardware side like him :)

The interesting thing, of course, is why/how the geometry clock manages to get away with being clocked higher than the ROP or shader clocks. Anyone got any theories?
 
You want to limit the possible clocks for performance reasons, though. If you have a lot of different systems that are potentially out of sync, your caches between these systems would need to be of different sizes depending on the clock discrepancies.

So even if smaller increments would be viable, you may well find out that the pipelines will start stalling all over the place with the wrong combination of clockspeeds chosen.
 
Chalnoth said:
I'm not sure why you'd expect that to make a difference. From the interview, clocks are adjusted based upon usage, not temperature.
For all the reasons already outlined... so I'll take this answer as 'Nope, you didn't miss anything. There are no links to such tests'... Thanks, I thought I had missed one somewhere, with so many sites doing 7800 reviews.
 
I don't read all reviews thoroughly, so just because I'm not aware of something doesn't mean it isn't out there.

I just doubt that anybody has reported it, and I seriously doubt this is an issue.
 