On TechReport's frame latency measurement and why gamers should care

Discussion in '3D Hardware, Software & Output Devices' started by Andrew Lauritzen, Jan 1, 2013.

  1. Dave Baumann

    The driver guys did flag this up to me last night; evidently FRAPS starts its capture at the point the application calls Present, not at the time the GPU actually renders the frame. For applications that are sufficiently GPU bound, given that DX can allow up to 3 frames of command buffers to be gathered, the high-latency spikes captured by FRAPS don't necessarily translate into uneven render times on the display for the end user. From our analysis this appears to be the case in Sleeping Dogs, where you see a high-latency frame quickly followed by a low-latency one: that pattern comes from how we batch some things in the driver, and FRAPS records it, but because the app (at these settings) is GPU bound it isn't representative of the rendered output, as we still have sufficient GPU workload in the queue.
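    To make that distinction concrete, here's a minimal sketch (the hook plumbing is hypothetical; Windows timing APIs assumed) of what a FRAPS-style tool actually records: the CPU-side delta between consecutive Present calls, which is not the time at which the GPU finishes the frame.
    Code:
    // What a FRAPS-style overlay measures: it hooks the application's
    // Present call and timestamps it on the CPU. The delta between two
    // consecutive Present calls is the reported "frame time".
    #include <windows.h>

    static LARGE_INTEGER g_freq;         // init once via QueryPerformanceFrequency
    static LARGE_INTEGER g_lastPresent;  // time of the previous Present call

    // Hypothetical callback invoked from a hooked Present().
    void OnPresent()
    {
        LARGE_INTEGER now;
        QueryPerformanceCounter(&now);
        double frameMs = 1000.0 * double(now.QuadPart - g_lastPresent.QuadPart)
                       / double(g_freq.QuadPart);
        g_lastPresent = now;
        // frameMs reflects when the CPU *submitted* the frame. With up to
        // three frames of command buffers queued, the GPU may render and
        // display those frames on a much more even cadence than this shows.
    }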
     
  2. lanek

    I've tried playing around with all of that (I use a CrossFire setup, so I can test with single and dual cards, and even NVIDIA in SLI, as I have access to many PCs here)...

    The more I play with the FRAPS graphs, the less I understand them. If you look at the graphs I posted above, I get some incredible FPS jumps in Hitman (with an average of 152 fps, I suddenly get spikes of 300-400 fps).

    What's really strange is that if I just read the min/max/average reported by FRAPS, it never shows this result (the max is 177 fps or something like that).
     
  3. Davros

    But would you notice jumps at that framerate, seeing as LCDs refresh at 60 Hz?
     
  4. caveman-jim

    FRAPS' min, average and max are all averages over a second's worth of frames: the min FPS is the slowest second in the set, the max is the fastest second in the set, and the average is the obvious. To get the slowest and fastest single-frame render times you have to go to the frametimes file. You can convert those millisecond values to FPS if that makes the data more comfortable to examine, but keep in mind that you didn't experience 400 fps; you experienced a single frame rendered at a 400 fps rate, pushed to a display that refreshes at 60 Hz (or 120 Hz, or whatever).
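    A minimal sketch of that conversion (assuming the frametimes CSV layout of a frame index plus a cumulative millisecond timestamp per row; the file name here is hypothetical):
    Code:
    #include <fstream>
    #include <iostream>
    #include <sstream>
    #include <string>
    #include <vector>

    int main()
    {
        // FRAPS-style frametimes file: one cumulative timestamp (ms) per frame.
        std::ifstream in("frametimes.csv");   // hypothetical file name
        std::string line;
        std::getline(in, line);               // skip the header row
        std::vector<double> stamps;
        while (std::getline(in, line)) {
            std::istringstream row(line);
            std::string frame, ms;
            std::getline(row, frame, ',');
            std::getline(row, ms, ',');
            stamps.push_back(std::stod(ms));
        }
        for (size_t i = 1; i < stamps.size(); ++i) {
            double frameMs = stamps[i] - stamps[i - 1];
            // "Instantaneous FPS" for a single frame: 1000 ms / frame time.
            std::cout << frameMs << " ms -> " << 1000.0 / frameMs << " fps\n";
        }
    }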
     
  5. caveman-jim

    Does this mean the long frame time was caused by the driver loading up the GPU with multiple successive frames (causing the initial slow frame), and that the rapid clearing of the queue appears to FRAPS as faster renders, which aren't actually full frame render times because the CPU side has moved on to more work while the GPU is still processing?
     
  6. Andrew Lauritzen

    But Dave, this is the major point of my original post... what FRAPS measures is also what the game simulation measures and uses to update the motion. Even if you don't starve the GPU and still deliver frames smoothly to the display, you will still notice obvious jitter, because the simulation is driven by the same underlying signal that FRAPS is measuring. Watch Scott's videos for an example... stuff jumps, then "slows down" to realign with the wall clock. Very distracting.

    And yes, games could handle this slightly better by "smoothing" the raw frame times to try to ride over spikes, but none that I know of do this (a sketch of the idea follows below). And even if they did, faster-paced games are trending towards one buffered frame (not three), so that doesn't leave much margin for spikes anyway.
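    For illustration, here's what such smoothing might look like; this is a hypothetical sketch, not taken from any shipping game:
    Code:
    // Exponential moving average over raw frame times: a single spike
    // nudges the simulation timestep instead of yanking it.
    struct FrameTimeFilter {
        double smoothedMs = 16.7;  // start near a 60 Hz frame time
        double alpha      = 0.1;   // smaller alpha = heavier smoothing

        double update(double rawFrameMs) {
            smoothedMs += alpha * (rawFrameMs - smoothedMs);
            return smoothedMs;
        }
    };

    // Usage in a game loop:
    //   double dt = filter.update(rawFrameMs) / 1000.0;
    //   simulation.step(dt);
    // The catch: the simulation clock now drifts from the wall clock
    // during a spike and has to re-converge, which is itself visible
    // if the smoothing is too heavy.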

    Thus, given current game render loops, it's really not okay to deliver this uneven back-pressure, as games will interpret it as instantaneous changes in render-pipe throughput and adjust their simulations accordingly (i.e. suddenly, producing jitter).
     
    #86 Andrew Lauritzen, Jan 10, 2013
    Last edited by a moderator: Jan 10, 2013
  7. MfA

    Is there any way the driver could simply lie and carry forward a bit of the latency to smooth this out?
     
  8. MfA

    How do you force this on the driver?
     
  9. Andrew Lauritzen

    Ideally IDXGIDevice1::SetMaximumFrameLatency, but most games just intentionally stall by waiting on a dummy occlusion query that they issued one frame ago.
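    Roughly, the two approaches look like this (a D3D11 sketch with error handling omitted; the query-based version shows the general idea, not any particular game's code):
    Code:
    #include <d3d11.h>
    #include <dxgi.h>

    // 1. Ask DXGI directly: caps how many frames the CPU may queue ahead.
    void CapFrameLatency(ID3D11Device* device)
    {
        IDXGIDevice1* dxgiDevice = nullptr;
        device->QueryInterface(__uuidof(IDXGIDevice1),
                               reinterpret_cast<void**>(&dxgiDevice));
        dxgiDevice->SetMaximumFrameLatency(1);   // default is 3
        dxgiDevice->Release();
    }

    // 2. The manual trick: issue an event query at the end of each frame,
    // and before building frame N, spin until the query from frame N-1
    // completes -- guaranteeing the GPU is at most one frame behind.
    void WaitForPreviousFrame(ID3D11DeviceContext* ctx,
                              ID3D11Query* prevFrameQuery)
    {
        BOOL done = FALSE;
        while (ctx->GetData(prevFrameQuery, &done, sizeof(done), 0) != S_OK)
            ;  // busy-wait; real code might yield or do other work here
    }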
     
  10. Davros

    Does the "max frames rendered ahead" setting in the NV control panel (or RadeonPro) do the same thing?
     
  11. ECH

    If you change the flip queue size from its default of 3 to 1, what would the end result be? All I've seen are spikes in the graph when FQS was left at its default of 3; there have been no comparisons with it changed to 1 or 5.
     
  12. Andrew Lauritzen

    Probably, yes.

    You'd get the same big spike, but basically a smaller number of "fast" frames after it, i.e. a larger chance that you actually stall the GPU, depending on the magnitude of the spike. Either situation will still visually look like a jitter in most/all games (for the reasons that I explained), but the fewer frames you buffer, the less chance of riding over the spike.
     
  13. ECH

    I'm not so sure it's that cut and dried, though. You have the FQS, which is influenced by the drivers, and then you have the frames rendered ahead, which is dictated by the game itself. The only way to know for sure is to test it and see, but make sure that both the game and the drivers are using the same frames-rendered-ahead value. For example, BF3 exposes a cvar for this called:
    RenderDevice.ForceRenderAheadLimit
    which can be adjusted without going through the drivers. I can change the FQS in the drivers and BF3 will still show the exact same value for RenderDevice.ForceRenderAheadLimit, indicating no change to the cvar. To actually see a change in frames rendered ahead, I have to change the cvar.

    Why am I bringing this up? Let's say driver A for video card A has a profile that changes frames rendered ahead, while driver B for video card B sticks with the traditional default of 3. In that case you would have to make sure A and B both use the appropriate default in order to compare them.

    I need to see that:
    -both the game and the drivers are using the same frames rendered ahead
    -both drivers are using the exact same frames rendered ahead (it's assumed they're left at default, but that's just a guess; it should be verified)
    -whether the drivers can influence frames rendered ahead for that game
    -whether changes to frames rendered ahead impact results, and if so, how
    -whether any fluctuations found in the graph can actually be seen in game (for me they can't)

    IMO, there's simply not enough testing on this.
     
    #93 ECH, Jan 10, 2013
    Last edited by a moderator: Jan 10, 2013
  14. XTF

    Can't this be zero (for minimum latency)?
     
  15. Andrew Lauritzen

    Like I said, I'm pretty sure BF3 is one of the games that uses an occlusion query to basically stall the rendering thread if the GPU is too far behind.

    You can't even notice the issue in the high-speed camera video that Scott posted at TR? For me it's painfully obvious even at regular speed in person, but even for people who can't consciously notice it there, almost everyone I've asked can see it just fine in the slowed-down video.

    Not totally clear what "zero" means; it depends on how you count. Command buffers are always submitted with a fair chunk of work at once; there's no getting around that. There's a fairly large overhead to the kernel-mode transition/driver work involved in submitting a command buffer, so you can't just stream commands one by one to the GPU. Thus, in practice, you are always going to see roughly one frame of latency minimum. Integrated CPUs/GPUs can conceptually do better if/when we can cut out the kernel-mode driver entirely (practically, this requires a unified address space and some other features; basically the GPU command streamer has to get smarter in a few ways).

    And yes, how many frames you buffer can definitely affect average FPS measurements, which is why frames are buffered at all. Forcing only one buffered frame will usually lower FPS and make spikes more serious, but it will decrease input latency, which is often more important in fast-paced games.
     
  16. OpenGL guy

    Sure, if you want much worse performance. Let's say your rendering loop looks like this:
    Code:
    Sample timer
    Compute physics
    Render stuff
    
    Now let's assume it takes ~20 ms to compute all the physics and generate the draw calls, and ~20 ms for the GPU to render the frame. That means a total frame time of 40 ms, or 25 fps. Now, if you overlap the previous frame's rendering with the current frame's computations, you double your frame rate without increasing latency at all: each frame still takes 40 ms from timer sample to display, but a frame completes every 20 ms, i.e. 50 fps.
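    A sketch of that overlapped loop (the function names are placeholders for the steps above):
    Code:
    // Serial:    [CPU 20 ms][GPU 20 ms] = 40 ms/frame = 25 fps.
    // Pipelined: the CPU builds frame N+1 while the GPU renders frame N,
    // so a frame completes every 20 ms (50 fps) while each frame still
    // reaches the display ~40 ms after its timer sample.
    while (running) {
        double dt = SampleTimer();   // CPU work for frame N+1 begins
        ComputePhysics(dt);          // ~20 ms on the CPU
        RenderStuff();               // draw calls are only queued here...
        Present();                   // ...and returns immediately while the
                                     // driver holds buffered frames and the
                                     // GPU is still rendering frame N
    }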

    You are always going to see at least one frame behind where the game is computing the next frame to be; disallowing any buffering at all won't avoid that.
     
  17. ECH

    BF3 isn't the only game, though. Which is why more information/testing is needed.


    Both of them stuttered. But I saw no correlation between what the graph showed and what was seen on screen.
     
    #97 ECH, Jan 11, 2013
    Last edited by a moderator: Jan 11, 2013
  18. Andrew Lauritzen

    No one is ever going to say that more information is a bad thing :) But that doesn't affect my argument in the original post.

    Ah, ok. To me the difference is very obvious, especially if you watch the trees near the edge of the screen or something with higher contrast. The video compression hurts the perception a little, but it's still quite clear from the comments that most people are able to see the difference. Note that, to my knowledge, the frame-time graphs in Scott's article are not from the same run as the high-speed video, but the pattern is quite similar (a jump due to a long frame, sped-up motion due to short frames, then back to normal speed, etc.). You're looking for those spikes, not for the more subtle stuff that is the same in both videos and probably relates more to LCD refresh, video compression and so on than to the game.

    If you're arguing that "yes, the spikes are there in FRAPS but I can't see them", then I envy you :) I can see them, and when they are present they affect my ability to get ridiculous scores in BF3 ;)
     
    #98 Andrew Lauritzen, Jan 11, 2013
    Last edited by a moderator: Jan 11, 2013
  19. ECH

    What I've provided is just the tip of the iceberg, and more research is needed to make sure there aren't other factors influencing the results. Another thing I've found puzzling is that there is no defined litmus test for how much deviation in the graph equals something definitely visible. Because of that, it's very hard to point out where stutter starts or stops in the graph, other than suggesting it's whatever you see when the video isn't smooth. So, IMO, there has to be a more scientific representation of what degree of variation is needed before stutter can be observed, and of how the graph shows the stutter being observed. It appears that not all games are the same. You can have all sorts of anomalies:
    -hitching
    -skipping
    -pulsating
    -warping
    -stutter
    -etc.
    Also needed: verification, identification and correlation with what the graph is showing; something I'm just not seeing. Now, back to the video. One card appears to show a form of pulsating while the other shows more of a stutter, neither of which is clearly visible in the graphs, even though both runs aren't smooth. So again, it must be clearly defined at what part of the graph the anomaly starts depending on what variation is found, how big the variation must be for that game, and how often the variation has to occur in order to see it. Again, tip of the iceberg. Because there isn't enough information, I'm not seeing the results that you are.

    But I'm not seeing what you're seeing. Yes, they both aren't smooth; however, there is no indication to me that it relates to the graph. This is where we disagree.

    Edit:
    Here's something I've found interesting in all this. Both cards visually show an uneven fluidity in the demo/run-through, yet the graphs for the two video cards clearly show contrasting results. Looking at the graphs alone would suggest that one is smoother than the other, but visually it's simply not the case. So how reliable are the graphs in this case? Apparently, reducing the spikes and valleys isn't enough to produce a smooth, fluid run-through. So what will it take to get that result for that game?

    In a nutshell, more information/testing is needed.
     
    #99 ECH, Jan 12, 2013
    Last edited by a moderator: Jan 12, 2013
  20. digitalwanderer

    Like Andy said, more info/testing is never a bad thing and is always welcome. :)
     