Does the DirectX Raytracing API support multi-GPU rendering?

Discussion in 'Rendering Technology and APIs' started by Bipul Mohanto, Sep 15, 2021.

  1. Bipul Mohanto

    Joined:
    Aug 24, 2021
    Messages:
    3
    Likes Received:
    0
    Hi!

    I am a beginner in the DirectX world. The raytracing API is very impressive. However, I have a question: does it support multi-GPU rendering? If so, how?

    If DirectX Raytracing supports multiple RTX GPUs, real-time rendering could be even faster. I would like to have your suggestions.
     
  2. iroboto

    iroboto Daft Funk
    Legend Regular Subscriber

    Joined:
    Mar 6, 2014
    Messages:
    13,774
    Likes Received:
    17,075
    Location:
    The North
    I believe you can assign DXR work to different mGPU instances, so I believe the basic answer is yes. With DX12 you are in nearly full command of the GPUs, so you will ultimately decide how to put the image back together once the GPUs are done.
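
One common way to "put the image back together" is split-frame rendering: each GPU traces a horizontal band of the frame and the bands are stitched afterwards. The sketch below is only an illustration in plain Python, not DX12 code. `split_rows`, `render_band` and `composite` are hypothetical names, and the per-GPU weights are assumed to come from some earlier benchmark.

```python
from typing import List, Tuple


def split_rows(height: int, weights: List[float]) -> List[Tuple[int, int]]:
    """Return one half-open (start_row, end_row) band per GPU, sized by weight."""
    total = sum(weights)
    bands = []
    start = 0
    acc = 0.0
    for i, w in enumerate(weights):
        acc += w
        # The last band always ends at the bottom row so rounding loses nothing.
        end = height if i == len(weights) - 1 else round(height * acc / total)
        bands.append((start, end))
        start = end
    return bands


def render_band(gpu_index: int, band: Tuple[int, int], width: int) -> List[List[int]]:
    """Stand-in for a per-GPU ray tracing pass: fills the band with the GPU index."""
    start, end = band
    return [[gpu_index] * width for _ in range(end - start)]


def composite(parts: List[List[List[int]]]) -> List[List[int]]:
    """Stitch the bands back together in top-to-bottom order."""
    frame: List[List[int]] = []
    for part in parts:
        frame.extend(part)
    return frame


# Two equal GPUs sharing a tiny 8x10 frame.
bands = split_rows(10, [1.0, 1.0])
frame = composite([render_band(i, b, 8) for i, b in enumerate(bands)])
print(bands)       # [(0, 5), (5, 10)]
print(len(frame))  # 10
```

With unequal GPUs the weights shift the split, e.g. `split_rows(1080, [3.0, 1.0])` gives the faster GPU the top 810 rows.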
     
    Bipul Mohanto likes this.
  3. Davros

    Legend

    Joined:
    Jun 7, 2004
    Messages:
    17,185
    Likes Received:
    4,574
    Is this question just "Does raytracing work with SLI?" or is it something different?
     
  4. Albuquerque

    Albuquerque Red-headed step child
    Veteran

    Joined:
    Jun 17, 2004
    Messages:
    4,034
    Likes Received:
    612
    Location:
    35.1415,-90.056
    Yeah, Davros is on the same train of thought I am... Are we talking about a homogeneous GPU platform (essentially multiples of the same card) or a heterogeneous platform (an Intel Xe plus a Radeon 6700 plus an NV 3070, all at the same time)?

    iroboto's post above makes me think a heterogeneous platform might actually be feasible in DX12, if an application developer wanted to make it work. Is that true? Crazy if so...
     
  5. iroboto

    iroboto Daft Funk
    Legend Regular Subscriber

    Joined:
    Mar 6, 2014
    Messages:
    13,774
    Likes Received:
    17,075
    Location:
    The North
    Yep, possible I think. I was there in person when Max McMullen presented it, IIRC.

    https://www.pcgamer.com/directx-12-will-be-able-to-use-your-integrated-gpu-to-improve-performance/
    I guess there are different ways to cut it up, but it's a lot of work for a developer to take on. Seems to only make sense if you know everyone has the same configuration.
     
    BRiT likes this.
  6. Albuquerque

    Albuquerque Red-headed step child
    Veteran

    Joined:
    Jun 17, 2004
    Messages:
    4,034
    Likes Received:
    612
    Location:
    35.1415,-90.056
    I believe I've seen either that article, or one very similar to it, a few years back. Specifically focused around using the iGPU which comes with most entry and midrange CPUs these days. Agree with your summary: looks like you can do almost as much as you feel like you want to bite off...
     
    BRiT likes this.
  7. JoeJ

    Veteran Newcomer

    Joined:
    Apr 1, 2018
    Messages:
    1,511
    Likes Received:
    1,748
    My guess is it won't work well enough to be a win in most cases.
    We could generate BVH on GPU1, copy it to GPU2, so both do not need to access VRAM of the other while tracing. Sounds practical if we can generate all BVH at level load, but not so much if we have open world and constantly generate new geometry, causing unsteady transfers. (idk if compatible or equal GPUs with some link can do such transfers faster)
    We could also generate BVH on both GPUs. But then we duplicate both memory and processing time.
    I'd rather try to keep RT on the stronger GPU entirely, and use the second for other work which can run independently. E.g. using iGPU for physics acceleration, audio, eventually postprocessing, shadow maps, etc.
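
To put rough numbers on the first two options, here is a back-of-the-envelope cost model. It is a sketch only: it assumes two equal GPUs, a BVH of known size and a fixed link bandwidth, and every figure fed into it is made up for illustration. `bvh_strategy_costs` is a hypothetical helper, not part of any API.

```python
def bvh_strategy_costs(build_ms: float, bvh_bytes: float, link_gb_per_s: float) -> dict:
    """Compare the per-rebuild cost of two ways to give both GPUs a BVH.

    build_then_copy: GPU1 builds, then the result is copied to GPU2.
    build_on_both:   each GPU builds its own copy at the same time.
    """
    copy_ms = bvh_bytes / (link_gb_per_s * 1e9) * 1e3
    return {
        "build_then_copy": {
            "latency_ms": build_ms + copy_ms,  # GPU2 waits for build plus transfer
            "gpu_busy_ms": build_ms,           # only GPU1 spends build time
            "link_bytes": bvh_bytes,
        },
        "build_on_both": {
            "latency_ms": build_ms,            # builds run concurrently
            "gpu_busy_ms": 2 * build_ms,       # build work is duplicated
            "link_bytes": 0.0,
        },
    }


# Assumed numbers: 2 ms build, 200 MB BVH, 25 GB/s link.
costs = bvh_strategy_costs(build_ms=2.0, bvh_bytes=200e6, link_gb_per_s=25.0)
for name, c in costs.items():
    print(name, c)
```

In this toy case the copy (about 8 ms) dwarfs the build, which is why constant rebuilds in an open world make the transfer option unattractive.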

    The question feels a bit rhetorical, now that we have to be happy if people can afford a single dGPU. But on the other hand, afaik AMD plans to put an iGPU on most future CPUs like Intel does. So dGPU+iGPU might become a standard configuration we can rely on.
     
    Remij and iroboto like this.
  8. Osamar

    Newcomer

    Joined:
    Sep 19, 2006
    Messages:
    217
    Likes Received:
    39
    Location:
    40,00ºN - 00,00ºE
    I don't know much about it, but could it be useful and practical to denoise on the iGPU?
    Too much data movement?
     
  9. JoeJ

    Veteran Newcomer

    Joined:
    Apr 1, 2018
    Messages:
    1,511
    Likes Received:
    1,748
    Usually you hide the movement behind latency. So you lag one frame behind with display, then you get 16ms to transfer stuff 'for free'.
    Combining denoising with other post processing tasks makes sense as well ofc.
    But I lack personal experience. Currently debug transfers take most of my frame time, but I have not tried to fix this yet.
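
A quick way to check whether a buffer fits into that one-frame window is to compare its transfer time with the frame budget. The sketch below is plain arithmetic under assumed numbers (a 1080p buffer at 8 bytes per pixel, a 15 GB/s link, 60 fps). The helper names are hypothetical.

```python
def transfer_ms(width: int, height: int, bytes_per_pixel: int, link_gb_per_s: float) -> float:
    """Time to move one full-screen buffer across the link, in milliseconds."""
    return width * height * bytes_per_pixel / (link_gb_per_s * 1e9) * 1e3


def fits_in_frame(transfer_time_ms: float, fps: float = 60.0) -> bool:
    """True if the transfer finishes within one frame interval."""
    return transfer_time_ms <= 1000.0 / fps


def frame_time_ms(render_ms: float, copy_ms: float, overlapped: bool) -> float:
    """Overlapped: the copy of frame N-1 runs while frame N renders.
    Serial: the copy is paid on top of the render time."""
    return max(render_ms, copy_ms) if overlapped else render_ms + copy_ms


copy = transfer_ms(1920, 1080, 8, 15.0)
print(round(copy, 3))                                # 1.106
print(fits_in_frame(copy))                           # True
print(frame_time_ms(12.0, copy, overlapped=True))    # 12.0
```

The price of the overlapped case is the extra frame of display latency mentioned above.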
     
    iroboto likes this.
  10. Bipul Mohanto

    Joined:
    Aug 24, 2021
    Messages:
    3
    Likes Received:
    0
    For this, let's consider two cases: (1) I have two RTX 3090 cards in two different computers with different motherboard configurations, and (2) I have one machine with two GPUs (again RTX 3090) connected by an NVLink bridge.
     
  11. Ext3h

    Regular Newcomer

    Joined:
    Sep 4, 2015
    Messages:
    423
    Likes Received:
    491
    GPUs with a direct bridge can do sufficiently fast transfers, at 15-30GB/s (or 60GB/s latest gen). But the key feature is not the bandwidth; the key feature is not screwing up latency for everything that depends on CPU communication. Because even though PCIe is full-duplex, you still have no chance of getting the scheduling right to *achieve* that (very poor design choices on the driver side, regarding how they interpreted copy queues!), and you often run into the situation that you risk getting stalled by a bulk transfer while trying to issue work to unrelated units...

    Bonus points for something like full NVLink, which allows an almost transparent unified memory space, at a somewhat negligible overhead for *not* hitting the local memory.
     
    pharma, Krteq, JoeJ and 1 other person like this.
  12. Malo

    Malo Yak Mechanicum
    Legend Veteran Subscriber

    Joined:
    Feb 9, 2002
    Messages:
    8,520
    Likes Received:
    5,001
    Location:
    Pennsylvania
    Rendering apps will certainly utilize multiple GPUs, taking full advantage of the RT cores via OptiX-compatible renderers.
     
    Krteq and iroboto like this.
  13. JoeJ

    Veteran Newcomer

    Joined:
    Apr 1, 2018
    Messages:
    1,511
    Likes Received:
    1,748
    Do you know if Vulkan does better here than DX12?
     
  14. Dictator

    Regular Newcomer

    Joined:
    Feb 11, 2011
    Messages:
    580
    Likes Received:
    3,394
    Quake2RTX and Shadow of the Tomb Raider support mGPU that works with ray tracing. I never tried it, though.
     
  15. Davros

    Legend

    Joined:
    Jun 7, 2004
    Messages:
    17,185
    Likes Received:
    4,574
    We can wait ;)
     
    milk and Krteq like this.
  16. Ext3h

    Regular Newcomer

    Joined:
    Sep 4, 2015
    Messages:
    423
    Likes Received:
    491
    As far as I'm aware, no. I discussed that in a different thread before, but so far the only API for which the drivers got transfers about right was CUDA on NVIDIA GPUs. And that is mostly because the stream syntax over there is explicitly only a frontend to true dependency-graph-based scheduling, acknowledging that the developer would do a horrible job at properly grouping command buffers. (Don't get me wrong, command buffers have their uses if we talk about batching micro-grid kernel launches, as the performance for recording the buffer is everything for that use case. But everything not matching that label is a bad fit, API-wise.)

    And there is a clever driver-internal optimization which reserves the DMA engines exclusively per direction and peer, sorting every single transfer to the correct engine, rather than dropping everything into a single resource pool. It's like the difference between half- and full-duplex Ethernet. Little difference in raw numbers, but one just works so much better that it's inconceivable to ever go back.
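
The effect of sorting transfers by direction and peer can be shown with a toy queue model. This is only a sketch and says nothing about how any real driver is implemented. It assumes that transfers on different (source, destination) routes do not compete for bandwidth, and all names and durations are invented.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Transfer = Tuple[str, str, float]  # (source, destination, duration in ms)


def shared_queue_completion(transfers: List[Transfer]) -> List[float]:
    """Single FIFO: each transfer waits for everything submitted before it."""
    clock = 0.0
    done = []
    for _src, _dst, duration in transfers:
        clock += duration
        done.append(clock)
    return done


def per_route_completion(transfers: List[Transfer]) -> List[float]:
    """One FIFO per (source, destination): only same-route transfers queue up."""
    clocks: Dict[Tuple[str, str], float] = defaultdict(float)
    done = []
    for src, dst, duration in transfers:
        clocks[(src, dst)] += duration
        done.append(clocks[(src, dst)])
    return done


work = [
    ("gpu0", "gpu1", 8.0),  # bulk peer copy
    ("cpu", "gpu0", 0.5),   # small upload that should not be delayed
    ("gpu1", "gpu0", 8.0),  # bulk copy in the opposite direction
]
print(shared_queue_completion(work))  # [8.0, 8.5, 16.5]
print(per_route_completion(work))     # [8.0, 0.5, 8.0]
```

The small upload finishes after 0.5 ms instead of 8.5 ms, which is the latency difference described above rather than a gain in raw bandwidth.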
     
    JoeJ, pharma and iroboto like this.
  • About Us

    Beyond3D has been around for over a decade and prides itself on being the best place on the web for in-depth, technically-driven discussion and analysis of 3D graphics hardware. If you love pixels and transistors, you've come to the right place!

    Beyond3D is proudly published by GPU Tools Ltd.