Strange Intel behaviour

Before anyone screams at me, this is NOT the whole page nor even a good portion of it....just hopefully enough so someone can explain it to me. :oops:

What are the consequences of this "new" behaviour?

- Obviously, benchmarks can be fooled. As we mentioned, the effect is hardly noticeable in normal conditions, as the CPU may operate at its maximum multiplier. But what if the CPU is throttling? If the CPU uses TM2 thermal management (which decreases the frequency in case of overheating), the multiplier will decrease, but the reported frequency won't change. The user may not even notice that his CPU is throttling.
- If the TSC is not incremented according to the CPU frequency, what is its aim? Having a fixed-frequency timer? All timers on the PC are already based on a fixed frequency; the TSC is (was) the only exception. If the TSC does not do its job, it loses its purpose and becomes useless.

Digging deeper, we noticed that, on the Prescott F4x CPUs, the TSC already uses the boot multiplier as a reference, and keeps being incremented at this frequency, regardless of the changes that can occur through C1E or EIST. We tried to contact Intel to get more information, and to get an explanation regarding this change. As often with Intel, we obtained no response, regardless of the fact that this problem concerns CPUs that have already been on sale for a couple of months. Then we posted on Intel's developer forum here. Intel's answer is a nice workaround: "The answer to this includes Intel confidential information, so we are unable to post the resolution to this board." We obviously won't get our answer. So, we can only try to guess why Intel made this change:

1. Fool benchmarks? Very improbable, as the cheat would have been discovered one day. And when used in normal conditions, the problem does not affect the results.

2. Hide the real CPU frequency? As the use of clock modulation mechanisms becomes more general, CPUs tend to display frequencies that are below the specification they were sold for. For example, the Pentium M spends a lot of its time at its lowest multiplier, which does not affect its overall performance at all, but may cause concern among users. From a communication point of view, always displaying the stock frequency will avoid a lot of questions from users.

3. Limit overclocking? With two clocks running at different speeds, one at the maximum "rated" speed and one at the real speed, it will be easier to prevent the real clock from going x% higher than the rated speed.

4. Another technical reason? We already know that Intel plans to make great changes in the clock management of its dual-core CPUs. Each core should indeed be able to run at its own clock speed, and the two could differ. In this case, the TSC would be incremented at the same speed, whatever the individual speeds are. This could make things easier, but then why use this feature in single-core CPUs?

Whatever the reason, Intel does not want to give explanations, and did not think it relevant to mention this change in any publication, or even among developers. After we found this, Intel answered as follows:

---------------
The current PRM does not include a complete description for the latest Intel(r) Pentium(r) 4 Processor TSC operation. Intel is currently working on a clarification of the Programmers Reference Manual (PRM) in relation to, but not inclusive of, the following points.

For Intel(r) Pentium(r) 4 Processors with CPUID (Family, Model, Stepping) greater than 0xF30 the designed implementation of the TSC is for the counter to operate at a constant rate. This was implemented due to a request from Operating System Software vendor(s). That rate may be set by the maximum core-clock to bus-clock ratio of the processor or may be set by the frequency at which the processor is booted. The specific processor configuration will determine the exact behavior.

This constant TSC behavior ensures that the duration of each clock tick is uniform and supports the use of the TSC as a high resolution wall clock timer even while the processor core may change frequency. The use of the TSC as a wall clock timer has effectively been prioritized over other uses of the TSC. This is the architectural behavior for the TSC moving forward.

To count processor core clocks or to calculate the average processor frequency, Intel recommends using the PMON counters, monitoring data from the event counters over the period of time for which the average frequency is required. See PRM Volume 3, Chapter 15, Debugging and Performance Measuring, Section 15.10.9, and Appendix A, Performance Monitoring Events, for details on the Global_Power_Events event.
---------------
In a word, Intel confirms the change in TSC behaviour on the Prescott CPU line, and will update the documentation accordingly. We think this update should have been made at the same time they released the Prescott, not one year later. In addition, this change was, according to Intel, motivated by requests from OS vendors (namely Microsoft; who else could have such influence on Intel's chip design?). Very convenient, and impossible to check. The real reason is still a (marketing or technical?) mystery.
 
In response to the points they raise...

1.) Well, fooling benchmarks in that manner would only work against Intel. Let's use the FLOPS example they gave: FLOPS = number of operations / time.

So if the number of ops drops substantially but the time remains the same, the FLOPS figure gets hurt significantly more than if they altered the timer to correspond to the new clock.

2.) Absolutely, no one wants to buy a 4GHz P4 only to find out it runs at 2GHz when they really need the power due to thermal throttling.

3.) Same as above, except you no longer have the right to expect the product to be working at those frequencies.

4.) I imagine it all has to do with thermal throttling. And since Prescott is going to need to throttle, it would suck if some of your software began to act in unexpected ways just because the wall timer is no longer constant.

Overall, in my opinion (which is based only on information from this article), it's just a matter of Intel trying to keep the platform constant, and not openly letting on that it's throttling. So there's nothing malicious here about cheating in benchmarks; it's just a matter of not letting the user know that their chip is overheating and not running at the speed they thought it should be.
 
Killer-Kris said:
Overall, in my opinion (which is based only on information from this article), it's just a matter of Intel trying to keep the platform constant, and not openly letting on that it's throttling. So there's nothing malicious here about cheating in benchmarks; it's just a matter of not letting the user know that their chip is overheating and not running at the speed they thought it should be.
Thanks, that article was a bit above my head. :oops:
 
EIST sounds like it could possibly be a problem, since it actually downclocks the processor during normal operation, not just during times of thermal stress.

I'm rather curious how AMD handles this, since Cool and Quiet could also mess with timekeeping.
 
Applications that actually use the TSC for timekeeping rather than just cycle counting generally fail on the Athlon 64 when Cool'n'Quiet is enabled. They will also fail in the same manner, for the same reasons, on the Pentium M.
 
Doesn't seem to be anything new to me - similar issues exist on Pentium-M and Athlon 64 (in Cool'n'Quiet mode). It's possible this happens on a finer grain, so it's impossible to query ACPI or similar to derive the current power state, but it's almost certainly highly similar to the old problem.

Most apps for which it's a problem have fixes for it anyway (e.g. MAME has the 'Don't use RDTSC timing' option - before I found that, MAME on my laptop was a pretty variable experience :D)
 
Dio said:
Doesn't seem to be anything new to me - similar issues exist on Pentium-M and Athlon 64 (in Cool'n'Quiet mode). It's possible this happens on a finer grain, so it's impossible to query ACPI or similar to derive the current power state, but it's almost certainly highly similar to the old problem.

Most apps for which it's a problem have fixes for it anyway (e.g. MAME has the 'Don't use RDTSC timing' option - before I found that, MAME on my laptop was a pretty variable experience :D)
Well, it IS different with Intel's EIST. I think I've read about this before, and that's exactly the reason Intel changed the behaviour of the TSC: to allow programs which use the TSC for timing to continue to run without problems (since the frequency of the TSC never changes).
It's a pretty debatable move: instead of breaking apps which use the TSC for timing (which is already the case with all mobile CPUs and the A64), Intel now breaks apps which use the TSC for other purposes (e.g. cycle counting for various reasons).
 
mczak said:
.... Intel now breaks apps which use the TSC for other purposes (e.g. cycle counting for various reasons).

I'm curious about what type of apps do this and what the ramifications may be for them. I can't seem to think of any off the top of my head, with the exception of a FLOPS-like benchmark.
 
digitalwanderer said:
Killer-Kris said:
Overall, in my opinion (which is based only on information from this article), it's just a matter of Intel trying to keep the platform constant, and not openly letting on that it's throttling. So there's nothing malicious here about cheating in benchmarks; it's just a matter of not letting the user know that their chip is overheating and not running at the speed they thought it should be.
Thanks, that article was a bit above my head. :oops:

No problem, though I don't think it helped that the article sounded a little sensationalist. Granted, I tend to agree with the sentiment about hiding from the user the fact that the processor is not running at the speed it otherwise should be.
 
Well, I've always wondered how much Intel would want to increase clock speeds by just upping the clock and dividing it by two before actually running things. Then came the Pentium M, which made me think they had given up on clock speed altogether. But I hear a lot of complaints from resellers about the low clock speed of those chips. Which is not surprising, as Intel has made sure for decades on end that consumers only look at that clock speed as a speed and value reference.

In any case, we can conclude that they now have a mechanism to mask the actual speed of the chip. I'm curious what will happen next, as we know the clock rate won't pass 4 GHz as it stands.
 
I see this as a way of NOT breaking applications that use it to keep time. The varying result from RDTSC on CPUs that do clock throttling is problematic if you are using it to do timing. So Intel have 'fixed' it.
 
Killer-Kris said:
mczak said:
.... intel now breaks apps which use the TSC for other uses (e.g. cycle counting for various reasons).

I'm curious about what type of apps do this and what the ramifications may be for them. I can't seem to think of any off the top of my head, with the exception of a flops like benchmark.
It might have implications for profiling. Also, some apps might use this to figure out which code is faster (if they have different code paths for the same thing); if they are unlucky, the results might be wrong because there was a frequency change (thus code which would have been faster now used more "TSC cycles").
 
Colourless said:
I see this as a way of NOT breaking application that use it to keep time. The varying result from RDTSC on CPU that do clock throtteliing is problematic if you are using it to do timing. So Intel have 'fixed' it.

Yes, and this is how Transmeta does it. If you try to measure clock cycles through the TSC, you always get a fairly constant number, no matter what clock the CPU is actually running at (for example, when a 600MHz TM5600 runs at 300MHz, you still measure about 600MHz using the TSC).
 
mczak said:
It'll might have implications for profiling. Also, some apps might use this to figure out which code is faster (if they have different code paths for the same thing), if they are unhappy the results might be wrong because there was a frequency change (thus code which would have been faster now used more "TSC cycles".

If you are profiling and the CPU is constantly changing its clock rate, won't that just ruin your results anyway?
Furthermore, if the changing clocks are expected (such as with Foxton), then you should time with a constant-rate TSC rather than count clock cycles; otherwise you won't know the real performance of your programs.
 