Draw Calls

Discussion in 'Beginners Zone' started by DavidGraham, Feb 18, 2012.

  1. DavidGraham

    Veteran

    Joined:
    Dec 22, 2009
    Messages:
    1,288
    Why are draw calls on the PC more expensive than on the Xbox 360, for example? Developers can issue more of them on the Xbox than on the PC, which sounds pretty strange considering that PCs have far more GPU and CPU horsepower.
     
  2. sebbbi

    Veteran

    Joined:
    Nov 14, 2007
    Messages:
    2,183
    Location:
    Helsinki, Finland
    On consoles you can directly write commands to the GPU ring buffer, and you write the commands directly in a format that the GPU hardware understands. It takes just a few lines of code to add a single draw call.

    The PC has both user-space and kernel-space drivers that process the draw calls. More than one program can be adding GPU commands simultaneously, and the driver must synchronize and store/restore the GPU state accordingly (a single mutex lock is already over 1000 cycles). The GPU commands must be translated by the driver to a format understood by the GPU (many different manufacturers and GPU families). The commands and modified data must be sent over a standardized external bus to the GPU. On the Xbox, for example, the GPU and CPU share the same memory, and nothing needs to be sent over a relatively slow bus.

    On consoles you can also edit GPU resources without locking them if you are sure that the GPU is not using them currently. On PC everything must be properly synchronized and all commands and resource references must be validated (software cannot be allowed to crash the GPU or modify/access data of other programs). PC drivers also automatically manage GPU memory allocation (moving in/out resources based on usage). Depending on allocator/cache algorithms used this can also be relatively expensive.
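
    A minimal sketch of the contrast described above, in Python purely for illustration. The names `RingBuffer`, `console_draw`, and `PcDriver` are hypothetical and do not correspond to any real console SDK or driver API; real rings hold packed binary packets, and a real driver does far more work than this.

```python
import threading


class RingBuffer:
    """Toy fixed-size command ring (real rings hold packed binary packets)."""

    def __init__(self, size):
        self.slots = [None] * size
        self.write_ptr = 0

    def push(self, packet):
        # Wrap around when the end of the ring is reached.
        self.slots[self.write_ptr % len(self.slots)] = packet
        self.write_ptr += 1


def console_draw(ring, first_vertex, vertex_count):
    # Console-style path: the application writes the packet itself,
    # with no validation, translation, or locking in between.
    ring.push(("DRAW", first_vertex, vertex_count))


class PcDriver:
    """Toy stand-in for the user-space plus kernel-space driver stack."""

    def __init__(self, ring, vertex_buffer_size):
        self.ring = ring
        self.vertex_buffer_size = vertex_buffer_size
        self.lock = threading.Lock()  # several programs may submit at once

    def draw(self, first_vertex, vertex_count):
        # 1. Validate: a program must not read outside its own resources.
        if first_vertex < 0 or vertex_count <= 0:
            raise ValueError("invalid draw range")
        if first_vertex + vertex_count > self.vertex_buffer_size:
            raise ValueError("draw reads past the end of the vertex buffer")
        # 2. Translate the API-level call into a hardware-specific packet.
        packet = ("DRAW", first_vertex, vertex_count)
        # 3. Synchronize with other submitters before touching the shared ring.
        with self.lock:
            self.ring.push(packet)
```

    Both paths end with the same packet in the ring; the difference is how much CPU work happens before it gets there.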
     
  3. 3dcgi

    Veteran Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    2,340
  4. swaaye

    swaaye Entirely Suboptimal
    Legend

    Joined:
    Mar 15, 2003
    Messages:
    7,906
    Location:
    WI, USA
    sebbbi, that sure sounds like a mess. Oh, the uglies of a general-purpose, expandable system.
     
  5. hoho

    Veteran

    Joined:
    Aug 21, 2007
    Messages:
    1,218
    Location:
    Estonia
    At least until DX11, OpenGL had far smaller overhead for draw calls than DX. Not sure how things are now.

    Also, having to feed the GPU a command stream in a specific format isn't all that fun any more once you have to deal with more than a couple of different GPUs, or even versions of the same GPU core. Consoles can allow that kind of "ugliness" as they use fixed hardware for years and have no problems with incompatibility.
     
  6. pcchen

    Moderator Veteran

    Joined:
    Feb 6, 2002
    Messages:
    2,645
    Location:
    Taiwan
    If GPUs become more mature, it may be possible to have a somewhat fixed hardware "instruction set" for them, and then a much leaner driver stack on PC.

    There are, of course, some problems. The obvious one is who gets to design this "instruction set." Most hardware standards developed from a single product by a single company, which became very popular and was then used as a de facto standard (and may later have become a real industry standard). It's really hard to make a new standard out of nothing. And design by committee doesn't work. Microsoft is another possible candidate, but they probably don't understand enough about the underlying hardware architecture to make a good design.

    Another way is to design an "intermediate code" which is translated by software into hardware commands. But then this is not very different from a command buffer, and is probably not going to bring much of a performance advantage.

    There are other problems too. For example, since the driver would not be able to do the safekeeping work, the hardware would have to. Basically you'd want the GPU to be like a CPU, with all the security modes and controls. Personally I think this is a good thing: with GPUs getting more flexible and with GPGPU, there will be more security-related problems, so it's probably better done in hardware anyway.
     
  7. ERP

    ERP
    Moderator Veteran

    Joined:
    Feb 11, 2002
    Messages:
    3,669
    Location:
    Redmond, WA
    Modern GPU hardware is already a pretty close copy of the API.
    Translating isn't really the issue.

    The predominant problem is having to deal with multiple processes sharing the GPU. This will not change, short of adding hardware to the GPU to enable fast context switches which at some point may be justifiable.

    DX11 and the Win7 driver model remove a lot of the superfluous driver overhead that existed. Plus you finally get command buffers, and state isn't global in the same way.
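
    A rough sketch of what "command buffers" and "state isn't global" can mean in practice, in Python purely for illustration. `CommandList` and `execute` are hypothetical names, not the D3D11 API; the assumption made here is that each recorded list starts from default state, so nothing set in one list leaks into the next.

```python
class CommandList:
    """Records commands locally instead of mutating one global device state."""

    def __init__(self):
        self.commands = []

    def set_shader(self, name):
        self.commands.append(("SET_SHADER", name))

    def draw(self, vertex_count):
        self.commands.append(("DRAW", vertex_count))


def execute(command_lists):
    """Replay recorded lists in submission order.

    Returns one (shader, vertex_count) pair per draw that was executed.
    """
    executed = []
    for command_list in command_lists:
        shader = None  # assumed: every list starts from default state
        for op, arg in command_list.commands:
            if op == "SET_SHADER":
                shader = arg
            elif op == "DRAW":
                executed.append((shader, arg))
    return executed
```

    Because each list carries its own state, lists can be recorded independently (for example on different threads) and submitted later in a single pass.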

    You still have the stupid stuff in the PC drivers, fixes/workarounds for poorly optimized or broken game code, which eats a lot of CPU since it involves analyzing everything going to the GPU. Of course devs are forced to work around these, which the driver writers then have to detect and fix...

    And of course new PC GPUs are optimized to run last year's best tech.
     
  8. pcchen

    Moderator Veteran

    Joined:
    Feb 6, 2002
    Messages:
    2,645
    Location:
    Taiwan
    My point is, yes, you'll want the GPU to handle these things, including memory protection and others. That probably needs a small CPU on the GPU to do some task management work.

    Then we can standardize these commands so applications will be able to send commands to the GPU directly, without any extra overhead. You can handle older applications with drivers, and newer applications will be able to access GPU more directly.

    Of course, on a normal desktop OS, you probably still can't let applications access GPU directly, as it involves memory mapped I/O and that needs to be in kernel mode. However, its overhead should be much less than what we have now.
     
  9. Dominik D

    Regular

    Joined:
    Mar 23, 2007
    Messages:
    780
    Location:
    Wroclaw, Poland
    This is not going to happen. 1. HW from different vendors is too dissimilar for a common ISA. Not to mention that instructions are not everything GPUs process: there's some state involved, which is entirely HW-specific. 2. Part of what goes into the buffer is so tied to the hardware that it may be covered by patents (not that I would know anything about patents, just assuming this may very well be the case). 3. Even for the actual code there's a huge variation in what you want encoded in the command buffer and how, depending on the HW you're feeding.
     
  10. pcchen

    Moderator Veteran

    Joined:
    Feb 6, 2002
    Messages:
    2,645
    Location:
    Taiwan
    Well, it's not likely in the immediate future, but never say never ;)
    HW from different vendors is probably going to be a moot point, as the number of important GPU vendors in the x86 space is now only three, and they probably all have cross-licensing deals, so patents are not a serious problem. Advances in GPGPU also bring GPUs from different vendors closer together. It's probably not going to happen for a few years, but at least it's not technically impossible, and if there's enough incentive they may want to do it.

    But that brings us to the main point: is there enough incentive for IHVs to do that? Right now I don't see that happening, as there is really no strong demand for very high performance desktop graphics.
     
  11. darkblu

    Veteran

    Joined:
    Feb 7, 2002
    Messages:
    2,642
    While I agree with most of the points you bring up, I'm not sure the (in)efficiency of today's draw call on the desktop is that much related to the GPU ISA per se. State models, buffer sync/lock mechanisms and such are much more relevant to the subject than what the compiler outputs. With the advent of binary-program APIs you could even consider the compiler stage as (near) zero-cost these days, and that would not change the cost of the draw call much. I mean, draw calls were a factor to reckon with long before GPU ISAs were a topic of conversation.
     
  12. 3dcgi

    Veteran Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    2,340
    I assumed pcchen wasn't referring to the shader ISA, rather a theoretical command buffer "ISA".
     
  13. darkblu

    Veteran

    Joined:
    Feb 7, 2002
    Messages:
    2,642
    On a second read, I think that must be the case. My bad.
     
  14. homerdog

    homerdog donator of the year
    Legend Veteran Subscriber

    Joined:
    Jul 25, 2008
    Messages:
    5,559
    Location:
    still camping with a mauler
    So what can be done to reduce the cost of draw calls on PC? I know instancing is used to reduce the number of draw calls needed, but is there a way to actually reduce the amount of CPU time needed per call?
    Keep in mind my understanding of these things could not even be called "beginner". More like "ignorant spectator". :)
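
    To illustrate the instancing idea mentioned above: a minimal Python sketch that counts how many draw calls a scene needs with and without grouping objects by mesh. The function names and the (mesh, transform) tuples are hypothetical. Note that this only reduces the number of calls; it does not make an individual call cheaper.

```python
from collections import defaultdict


def draw_calls_without_instancing(objects):
    # One draw call per object, each carrying a single transform.
    return [(mesh, [transform]) for mesh, transform in objects]


def draw_calls_with_instancing(objects):
    # One draw call per distinct mesh, carrying all of its transforms.
    batches = defaultdict(list)
    for mesh, transform in objects:
        batches[mesh].append(transform)
    return list(batches.items())
```

    For a scene with three trees and one rock, the first function yields four calls and the second yields two.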
     
  15. Arwin

    Arwin Now Officially a Top 10 Poster
    Moderator Legend

    Joined:
    May 17, 2006
    Messages:
    17,102
    Location:
    Maastricht, The Netherlands
    All I know is that Crytek had pointed out the main weaknesses they considered to still be in DirectX11, and that they were working with Microsoft to sort them out. I don't know what has become of that actually, would have expected to have heard something about that, but maybe I just missed it.
     
  16. Dominik D

    Regular

    Joined:
    Mar 23, 2007
    Messages:
    780
    Location:
    Wroclaw, Poland
    Draw call cost depends on the HW. Some things have to be translated for a given card, and this imposes extra CPU cost per call. One could imagine that modern hardware may not support certain topologies (triangle fans would be something I guess most cards don't support directly; perhaps some support just plain TRIs or just lists). But it's not really the draw call that kills you, it's the (unnecessary) state changes between draw calls and the stuff that has to be translated. Pretty much all modern HW out there emulates the fixed pipeline in the driver, so that's extra CPU cost for you. Weird texture formats may require some processing. There's a lot happening beyond draw calls. And there are lots of things you can do to minimize CPU usage.
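
    One common application-side way to cut state changes between draw calls is to sort draws by the state they need. A minimal Python sketch, where each draw is a hypothetical (state, mesh) pair and "state" stands in for whatever is expensive to switch (shader, textures, blend mode). It assumes draw order does not affect the final image, which generally holds for opaque geometry but not for blended geometry.

```python
def count_state_changes(draws):
    """Count how often the required state differs from the previous draw."""
    changes = 0
    current = None
    for state, _mesh in draws:
        if state != current:
            changes += 1
            current = state
    return changes


def sort_by_state(draws):
    # sorted() is stable, so draws sharing a state keep their relative order.
    return sorted(draws, key=lambda draw: draw[0])
```

    With draws alternating between two shaders (A, B, A, B), the unsorted order needs four state changes and the sorted order needs two.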
     
  17. Rodéric

    Rodéric a.k.a. Ingenu
    Moderator Veteran

    Joined:
    Feb 6, 2002
    Messages:
    3,472
    Location:
    Planet Earth.
    Make GPUs standard just like CPUs, thank you.

    I haven't investigated the cost of a draw call in a while. A lot happens under the hood for sure, but we have been used to minimizing them since D3D9...
    A draw call basically gets all the states and checks their validity/consistency before filling the command stream.

    Can anyone working on drivers explain how that works in D3D10/11?
    (I know the runtime does a lot because of NV :p, but I'd be curious to see how much is left to the driver. Could "just" be transcoding the D3D command stream into a GPU specific one.)
     
  18. darkblu

    Veteran

    Joined:
    Feb 7, 2002
    Messages:
    2,642
    CPUs are not standard either ; ) Moreover, a proper level of abstraction beats standardized hw most of the time (e.g. HL programming languages vs assembly, etc)

    I can't speak of D3D, but I have observations from GLES - the driver there has two (or two-and-a-half) major functions:

    1. state tracking
    2. shader compilation (normally depending on both client shaders and active state)
    3. interfacing with the kernel mem allocators for buffer objects management and related fences/syncs.

    The last one of those does not really belong in there, as it can be taken out of the driver and into a bog standard "GPU buffer API", or if you wish, a "DMA-coherent buffer API", perhaps even in flavors based on whether the device is MMU-equipped (so it can "comprehend" page tables) or not.

    That said, we can optimize drawcalls all we want, but they will never be 'free' - they'll always cost CPU cycles, whether in housekeeping or in CPU/GPU rendezvous mechanisms.
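
    As an illustration of the state-tracking function in the list above, a minimal Python sketch of a tracker that drops redundant state sets before they reach the command stream. `StateTracker` and the state keys are hypothetical and not taken from any real GLES driver.

```python
class StateTracker:
    """Remembers the last value set per state key and filters redundant sets."""

    def __init__(self):
        self.current = {}
        self.emitted = []  # commands that would actually reach the hardware

    def set_state(self, key, value):
        # Skip the command if the hardware already holds this value.
        if key in self.current and self.current[key] == value:
            return False
        self.current[key] = value
        self.emitted.append((key, value))
        return True
```

    The bookkeeping itself is exactly the kind of housekeeping that costs CPU cycles on every call, even when it ends up emitting nothing.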
     
  19. Dominik D

    Regular

    Joined:
    Mar 23, 2007
    Messages:
    780
    Location:
    Wroclaw, Poland
    Sure. Who's going to create the de facto standard the way Intel's x86 is? I vote for PowerVR to lead in this space. :>
     
  20. 3dcgi

    Veteran Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    2,340
    AMD hardware supports triangle fans, though I would like to see future APIs drop support for any primitive type that isn't a list.
     

