C/C++ Millisecond timers?

I was wondering if anyone here has any knowledge of timing functions in C/C++ that give millisecond resolution. I haven't found anything decent through Google, and I'm just about at my wits' end. The best the standard library gives is seconds, but we really need better than that.
 
epicstruggle said:
hehe, you aren't very specific as to what environment you're working under.

Portability is a key concern, though this is primarily being run under linux, and potentially under FreeBSD.

I was hoping there was some obscure library I didn't know about, or that we were using the time.h functions incorrectly or something like that.

but:
http://www.codeproject.com/datetime/NanoSecondTimer.asp
should help you out. In case that particular project doesn't help you, there are a few other examples of timers/timing projects that you can look up via search [of that particular site].

hope one of them helps you out.

epic

That should help, since the clock_t clock() function didn't appear to be working. We'll just implement our own using that link you gave me.

Thank you for the help. I suppose this is just one more thing that justifies my favoring Java.
 
Ok, I found that struct timeb has a millisecond field that seems to be accurate and working.

Thank you again.
 
Generally, there is no single "good" method for accurate timing on the PC. Some of the methods have problems on weird-ass chipsets (think early VIA), some on dual-processor machines, and some on laptops that adjust their clock rate dynamically. So beware.
 
Killer-Kris said:
epicstruggle said:
hehe, you aren't very specific as to what environment you're working under.

Portability is a key concern, though this is primarily being run under linux, and potentially under FreeBSD.

I was hoping there was some obscure library I didn't know about, or that we were using the time.h functions incorrectly or something like that.

but:
http://www.codeproject.com/datetime/NanoSecondTimer.asp
should help you out. In case that particular project doesn't help you, there are a few other examples of timers/timing projects that you can look up via search [of that particular site].

hope one of them helps you out.

epic

That should help, since the clock_t clock() function didn't appear to be working. We'll just implement our own using that link you gave me.

Thank you for the help. I suppose this is just one more thing that justifies my favoring Java.
Glad to offer any help. Been a while (a long while) since I did any programming. Damn, my skills are rusting away. :(

If you get it to work, let us know what you used. All the best.

epic
 
Ok, once again I'd like to thank everyone for the help. What we ended up using goes roughly as follows...

#include <sys/types.h>
#include <sys/timeb.h>
#include <time.h>

struct timeb start, stop; // start and stop time structures
double elapsed;           // elapsed time in seconds

ftime(&start);

//code to be benchmarked

ftime(&stop);

elapsed = ((double)stop.time + (double)stop.millitm * 0.001)
        - ((double)start.time + (double)start.millitm * 0.001);
 
<hijack>
Haven't used these functions in ages -
are QueryPerformanceFrequency()/QueryPerformanceCounter() on Windows obsolete nowadays? (clock throttling and all that)
</hijack>
 
tobbe said:
<hijack>
Haven't used these functions in ages -
are QueryPerformanceFrequency()/QueryPerformanceCounter() on Windows obsolete nowadays? (clock throttling and all that)
</hijack>

As far as I know, it should be the preferred method, since it is the most precise and accounts for all that stuff behind the scenes (clock throttling, multiple CPUs...).
 
cristic said:
Killer-Kris, the function you are using is obsolete. You are better off using gettimeofday, like Humus is suggesting.

http://linux.ctyme.com/man/man0834.htm

Besides with gettimeofday it is easier to calculate the elapsed time.

I'll definitely keep that in mind for the next time I need to time something under Linux. But for this project, it isn't something that's going to stick around. We just needed to quickly convince someone that a different algorithm was in fact faster. And at least on the system we're working on, that function still works and seems pretty accurate.

tobbe said:
<hijack>
Haven't used these functions in ages -
are QueryPerformanceFrequency()/QueryPerformanceCounter() on Windows obsolete nowadays? (clock throttling and all that)
</hijack>

Not entirely with the newer processors. There was a thread over in the hardware forums the other day...
http://www.beyond3d.com/forum/viewtopic.php?t=20893
 
The real problem with QueryPerformanceCounter and QueryPerformanceFrequency is that they're so amazingly slow that they completely screw with the profile. It's not significant if you're measuring something that takes a really long time by itself, but if you're trying to time a block that's just a few hundred cycles or something, the Win32 functions will just kill you. Especially QueryPerformanceFrequency, which seems to vary extremely wildly from run to run as well. I've seen it measure my clock speed as everything from 7 GHz to 88 kHz. If there were a nice way to get the current exact clock speed within a cycle or two, that'd be nice, but no such luck.

You could try doing a long-running measurement yourself and just store the result, but that doesn't always work out, especially if you're on a CPU with active power-reduction schemes that alter clock speed dynamically. Even otherwise, there are always natural fluctuations of the clock speed as a result of heat, EMI and such.
 
ShootMyMonkey said:
The real problem with QueryPerformanceCounter and QueryPerformanceFrequency is that they're so amazingly slow that they completely screw with the profile. It's not significant if you're measuring something that takes a really long time by itself, but if you're trying to time a block that's just a few hundred cycles or something, the Win32 functions will just kill you.

Well, run that code like a gazillion times and divide the measured time by a gazillion. Running and timing a few instructions once will most likely tell you nothing.

ShootMyMonkey said:
Especially QueryPerformanceFrequency, which seems to vary extremely wildly from run to run as well. I've seen it measure my clock speed as everything from 7 GHz to 88 kHz.

If you are calling QueryPerformanceFrequency every time in your loop, then your algorithm is flawed. There's no need to do this, because the frequency of the performance counter will *not* change while the system is up and running. Call it once, store the result in a variable somewhere, and then use that variable. The QueryPerformance* functions take a while to execute because they take into account lots of things (clock throttling, multi-processor systems..) to give you an accurate result.
 
Well, run that code like a gazillion times and divide the measured time by a gazillion. Running and timing a few instructions once will most likely tell you nothing.
True, but the main point was just that for doing roughly the same thing, the fact that the Win32 functions will in addition do all this extra Win32-related stuff means that it's going to screw up the profile if you're trying to isolate something that's individually very small, but still gets run a lot.

If you are calling QueryPerformanceFrequency every time in your loop, then your algorithm is flawed.
Actually, I was referring to a separate software tool that someone wrote which uses QueryPerformanceFrequency to estimate frequency. Doing a refresh several times manually yields radically different numbers. In practice, you would want to store the frequency once, but even this isn't really accurate, for physical reasons -- though the scale of change is generally very small unless you're dealing with a machine that actually DOES clock throttling. Ideal, of course, would be to get the immediate frequency on every cycle through a profiled block and divide by the weighted average. Of course, that would be flat-out impossible short of having specialized "profiling" circuitry and profile flag markers in the hardware -- and while a lot of us would be fairly happy with that, I think it's not so often these days that people actually do cycle-level profiling on mainstream hardware.

If auto-vectorization ever really truly happens (and I don't mean things like ICC which basically auto-vectorize SPEC) and constructs like Cilk become more commonplace, I also think even the very performance-minded people will rely on the compiler more and more to just produce fast code for them. I've never been impressed by the optimization levels that come out of commercial compilers, but some of the stuff that I see in academia is actually pretty good -- the problem is that they all tend to solve one problem at a time and often tend to be focused on taking advantage of the style of the language (typically not C++ or even anything remotely similar).
 
cristic said:
ShootMyMonkey said:
The real problem with QueryPerformanceCounter and QueryPerformanceFrequency is that they're so amazingly slow that they completely screw with the profile. It's not significant if you're measuring something that takes a really long time by itself, but if you're trying to time a block that's just a few hundred cycles or something, the Win32 functions will just kill you.

Well, run that code like a gazillion times and divide the measured time by a gazillion. Running and timing a few instructions once will most likely tell you nothing.
It's not that simple. I can confirm what ShootMyMonkey said because I hit the same problem myself.
QueryPerformanceCounter takes -- on average (and that's the problem) -- one microsecond to complete. That time will be included in your measurements. Apart from the annoying direct consequence that code sprinkled with QPC-based profiling timers can be slowed down a whole lot, there's no simple way to compensate for the huge, inherent error margin, because it fluctuates wildly from call to call.

That's been a huge problem for me.
I originally used RDTSC for everything and had great, reliable results on my old (fixed-clock-speed) machine. But I figure that's not good enough anymore. OTOH, QPC isn't good enough for profiling.

The solution I came up with was to use QPC only for timing large things, in the 5ms+ ballpark. Seconds per frame for animation purposes is a prime candidate here. I still use RDTSC for profiling, but that's not active in "shipping" code. Whenever I need to collect accurate profiling data, I just deactivate "Cool'n'Quiet" and all is fine.

In my little benchmark project it turned out to be more complicated. There, I need to make sure that some specific loops run only for a limited amount of time, but it's unknown (at compile time) how many iterations that would take.

I originally had something like this:
Code:
RDTSCTimer t;
int frames = 0;

// prep benchmark

t.reset();
do
{
    // workload
    ++frames;
} while ((frames < 1000) && (t.elapsed_seconds() < 0.75));
double delta_t = t.elapsed_seconds();
Limiting runtime with a QPC-based timer takes a lot of time away from the actual benchmark workloads, while taking a benchmark result time with RDTSC is vulnerable to clock speed variations. So I had to do something.

I finally settled on a hybrid approach. Elapsed seconds is very fast to compute with my RDTSCTimer class. I figured that I don't need the time limit to be precise; a rough ballpark is enough, and I can easily tolerate +/-50 percent error there. The overall delta_t for the run needs to be precise OTOH, so I use QPC for that.
Like this:
Code:
RDTSCTimer t;
QPCTimer robust_t;
int frames = 0;

// prep benchmark

robust_t.reset();
t.reset();
do
{
    // workload
    ++frames;
} while ((frames < 1000) && (t.elapsed_seconds() < 0.75));
double delta_t = robust_t.elapsed_seconds();
 
zeckensack said:
QueryPerformanceCounter takes -- on average (and that's the problem) -- one microsecond to complete.
That is amazingly craptastic.

You'd think there'd be some read-only registers you could get to in a few cycles.

But I'm just an embedded jockey, so I think in terms like that. ;)
 
RussSchultz said:
zeckensack said:
QueryPerformanceCounter takes -- on average (and that's the problem) -- one microsecond to complete.
That is amazingly craptastic.

You'd think there'd be some read-only registers you could get to in a few cycles.

But I'm just an embedded jockey, so I think in terms like that. ;)
It sure sucks.
RDTSC is internal to the processor, and you can just read it into registers (in user mode).
QPC uses the "real-time clock", which is an extra hardware device that can only be read from kernel mode, so there's a lot of overhead. It doesn't help that it's probably somewhere on the (slow) SMBus.
 