C/C++ Millisecond timers?

Discussion in 'Rendering Technology and APIs' started by Killer-Kris, Mar 11, 2005.

  1. Killer-Kris

    Regular

    Joined:
    May 20, 2003
    Messages:
    540
    Likes Received:
    4
    I was wondering if anyone here has any knowledge about any timing functions in C/C++ that give millisecond resolution. I haven't found anything decent through google, and I'm at about my wits end. The best the standard library gives is seconds but we really need better than that.
     
  2. epicstruggle

    epicstruggle Passenger on Serenity
    Veteran

    Joined:
    Jul 24, 2002
    Messages:
    1,903
    Likes Received:
    45
    Location:
    Object in Space
hehe, you aren't very specific as to what environment you're working under.

    but:
    http://www.codeproject.com/datetime/NanoSecondTimer.asp
    should help you out. In case that particular project doesn't help you, there are a few other examples of timers/timing projects that you can look up via search [of that particular site].

    hope one of them helps you out.

    epic
     
  3. Killer-Kris

    Regular

    Joined:
    May 20, 2003
    Messages:
    540
    Likes Received:
    4
    Portability is a key concern, though this is primarily being run under linux, and potentially under FreeBSD.

    I was hoping there was some obscure library I didn't know about, or that we were using the time.h functions incorrectly or something like that.

That should help, since the clock_t clock() function didn't appear to be working for us. We'll just have to implement our own using that link you gave me.

    Thank you for the help. I suppose this is just one more thing that justifies my favoring java.
     
  4. Killer-Kris

    Regular

    Joined:
    May 20, 2003
    Messages:
    540
    Likes Received:
    4
Ok, I found that struct timeb has a millisecond field that seems to be accurate and to be working.

    Thank you again.
     
  5. Ray Adams

    Newcomer

    Joined:
    Jun 3, 2004
    Messages:
    20
    Likes Received:
    0
    If you think only about Windows, you can use GetTickCount.
     
  6. assen

    Veteran

    Joined:
    May 21, 2003
    Messages:
    1,377
    Likes Received:
    19
    Location:
    Skirts of Vitosha
    Generally, there is no single "good" method for accurate timing on the PC. Some of the methods have problems on weird-ass chipsets (think early VIA), some - on dual-processor machines, some - on laptops which adjust their clockrate dynamically. So beware.
     
  7. epicstruggle

    epicstruggle Passenger on Serenity
    Veteran

    Joined:
    Jul 24, 2002
    Messages:
    1,903
    Likes Received:
    45
    Location:
    Object in Space
    glad to offer any help. Been a while (a long while) since i did any programming. Damn my skills are rusting away. :(

    If you get it to work, let us know what you used. all the best.

    epic
     
  8. Humus

    Humus Crazy coder
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    3,217
    Likes Received:
    77
    Location:
    Stockholm, Sweden
  9. Killer-Kris

    Regular

    Joined:
    May 20, 2003
    Messages:
    540
    Likes Received:
    4
Ok, once again I'd like to thank everyone for the help. What we ended up using goes roughly as follows...

    #include <sys/types.h>
    #include <sys/timeb.h>
    #include <time.h>

    struct timeb start, stop; // start and stop time structures
    double elapsed;           // elapsed time in seconds

    ftime(&start);

    // code to be benchmarked

    ftime(&stop);

    elapsed = ((double) stop.time  + (double) stop.millitm  * 0.001)
            - ((double) start.time + (double) start.millitm * 0.001);
     
  10. cristic

    Newcomer

    Joined:
    Jan 30, 2004
    Messages:
    179
    Likes Received:
    0
    Location:
    nowhere near
    Killer-Kris, the function you are using is obsolete. You are better off using gettimeofday, like Humus is suggesting.

    http://linux.ctyme.com/man/man0834.htm

Besides, with gettimeofday it is easier to calculate the elapsed time.
     
  11. tobbe

    Newcomer

    Joined:
    Feb 8, 2002
    Messages:
    60
    Likes Received:
    0
    Location:
    Stockholm
    <hijack>
Haven't used these functions in ages -
    are QueryPerformanceFrequency()/QueryPerformanceCounter() on Windows obsolete nowadays? (clock throttling and all that)
    </hijack>
     
  12. epicstruggle

    epicstruggle Passenger on Serenity
    Veteran

    Joined:
    Jul 24, 2002
    Messages:
    1,903
    Likes Received:
    45
    Location:
    Object in Space
I'll throw my hat in with Humus and cristic. Should be easier.

    epic
     
  13. cristic

    Newcomer

    Joined:
    Jan 30, 2004
    Messages:
    179
    Likes Received:
    0
    Location:
    nowhere near
As far as I know, it should be the preferred method, since it is the most precise and accounts for all that stuff behind the scenes (clock throttling, multiple CPUs...).
     
  14. Killer-Kris

    Regular

    Joined:
    May 20, 2003
    Messages:
    540
    Likes Received:
    4
    I'll definitely keep that in mind for the next time I need to time something under Linux. But for this project, it isn't something that was going to be sticking around. We just needed to quickly convince someone that a different algorithm was in fact faster. And at least on the system we are working on that function still works and seems pretty accurate.

    Not entirely with the newer processors. There was a thread over in the hardware forums the other day...
    http://www.beyond3d.com/forum/viewtopic.php?t=20893
     
  15. ShootMyMonkey

    Veteran

    Joined:
    Mar 21, 2005
    Messages:
    1,177
    Likes Received:
    72
    The real problem with QueryPerformanceCounter and QueryPerformanceFrequency is that they're so amazingly slow that they completely screw with the profile. It's not significant if you're measuring something that takes a really long time by itself, but if you're trying to time a block that's just a few hundred cycles or something, the Win32 functions will just kill you. Especially QueryPerformanceFrequency, which seems to vary extremely wildly from run to run as well. I've seen it measure my clock speed as everything from 7 GHz to 88 KHz. If there was a nice way to get the current exact clock speed within a cycle or two, that'd be nice, but no such luck.

    You could try doing a long running measure yourself and just store the result, but that doesn't always work out. Especially if you're on a CPU with active power reduction schemes that alter clock speed dynamically. Even otherwise, there's always natural fluctuations of the clock speed as a result of heat and EMI and such.
     
  16. cristic

    Newcomer

    Joined:
    Jan 30, 2004
    Messages:
    179
    Likes Received:
    0
    Location:
    nowhere near
Well, run that code a gazillion times and divide the measured time by a gazillion. Running and timing a few instructions once will most likely tell you nothing.

    If you are calling QueryPerformanceFrequency every time in your loop, then your algorithm is flawed. There's no need to do this, because the frequency of the performance counter will *not* change while the system is up and running. Call it once, store the result in a variable somewhere, then use that variable. The QueryPerformance* functions take a while to execute because they take into account lots of things (clock throttling, multi-processor systems...) to give you an accurate result.
     
  17. ShootMyMonkey

    Veteran

    Joined:
    Mar 21, 2005
    Messages:
    1,177
    Likes Received:
    72
    True, but the main point was just that for doing roughly the same thing, the fact that the Win32 functions will in addition do all this extra Win32-related stuff means that it's going to screw up the profile if you're trying to isolate something that's individually very small, but still gets run a lot.

Actually, I was referring to a separate software tool that someone wrote which uses QueryPerformanceFrequency to estimate frequency. Manually refreshing it several times yields radically different numbers. In practice, you would want to store the frequency once, but even this isn't really accurate for physical reasons -- though the scale of change is generally very small unless you're dealing with a machine that actually DOES clock throttling. Ideal, of course, would be to get the immediate frequency on every cycle through a profiled block and just divide by the weighted average. Of course, that would be flat out impossible short of having specialized "profiling" circuitry and profile flag markers in the hardware -- while a lot of us would be fairly happy, I think it's not as often these days that people actually do cycle-level profiling on mainstream hardware.

    If auto-vectorization ever really truly happens (and I don't mean things like ICC which basically auto-vectorize SPEC) and constructs like Cilk become more commonplace, I also think even the very performance-minded people will rely on the compiler more and more to just produce fast code for them. I've never been impressed by the optimization levels that come out of commercial compilers, but some of the stuff that I see in academia is actually pretty good -- the problem is that they all tend to solve one problem at a time and often tend to be focused on taking advantage of the style of the language (typically not C++ or even anything remotely similar).
     
  18. Rolf N

    Rolf N Recurring Membmare
    Veteran

    Joined:
    Aug 18, 2003
    Messages:
    2,494
    Likes Received:
    55
    Location:
    yes
    It's not that simple. I can confirm what ShootMyMonkey said because I hit the same problem myself.
    QueryPerformanceCounter takes -- on average (and that's the problem) -- one microsecond to complete. That time will be included in your measurements. Apart from the annoying direct consequence that code sprinkled with QPC based profiling timers can be slowed down a whole lot, there's no simple way to compensate for the huge, inherent error margin, because it fluctuates wildly from call to call.

    That's been a huge problem for me.
    I originally used RDTSC for everything and had great, reliable results on my old (fixed clock speed) machine. But I figure that's not good enough anymore. OTOH QPC isn't good enough for profiling.

The solution I've come up with was to use QPC only for timing large things, in the 5ms+ ballpark. Seconds per frame for animation purposes is a prime candidate here. I still use RDTSC for profiling, but that's not active in "shipping" code. Whenever I need to collect accurate profiling data, I just deactivate "Cool'n'Quiet" and all is fine.

    In my little benchmark project it turned out to be more complicated. There, I need to make sure that some specific loops run only for a limited amount of time, but it's unknown (at compile time) how many iterations that would take.

    I originally had something like this:
Code:
    RDTSCTimer t;
    int frames = 0;

    // prep benchmark

    t.reset();
    do
    {
        // workload
        ++frames;
    } while ((frames < 1000) && (t.elapsed_seconds() < 0.75));
    double delta_t = t.elapsed_seconds();
    Limiting runtime with a QPC based timer takes a lot of time away from the actual benchmark workloads, while taking a benchmark result time with RDTSC is vulnerable to clock speed variations. So I had to do something.

    I finally settled on a hybrid approach. Elapsed seconds is very fast to compute on my RDTSCTimer class. I figured that I don't need the time limit to be precise. A rough ballpark is enough, I can easily tolerate +/-50 per cent error here. The overall delta_t for the run needs to be precise OTOH, so I use QPC for that.
    Like this:
Code:
    RDTSCTimer t;
    QPCTimer robust_t;
    int frames = 0;

    // prep benchmark

    robust_t.reset();
    t.reset();
    do
    {
        // workload
        ++frames;
    } while ((frames < 1000) && (t.elapsed_seconds() < 0.75));
    double delta_t = robust_t.elapsed_seconds();
     
  19. RussSchultz

    RussSchultz Professional Malcontent
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    2,855
    Likes Received:
    55
    Location:
    HTTP 404
    That is amazingly craptastic.

    You'd think there'd be some read only registers you could get to in a few cycles.

    But I'm just an embedded jockey, so I think in terms like that. ;)
     
  20. Rolf N

    Rolf N Recurring Membmare
    Veteran

    Joined:
    Aug 18, 2003
    Messages:
    2,494
    Likes Received:
    55
    Location:
    yes
    It sure sucks.
    RDTSC is internal to the processor, and you can just read it into registers (in user mode).
    QPC uses the "real time clock" which is an extra hardware device that can only be read from kernel mode, so there's a lot of overhead. It doesn't help that it's probably somewhere on the (slow) SMBus.
     