DX11 vs DX12

Discussion in '3D Hardware, Software & Output Devices' started by iroboto, Jan 15, 2015.

  1. silent_guy

    Veteran Subscriber

    Joined:
    Mar 7, 2006
    Messages:
    3,754
    Likes Received:
    1,380
    I get the "lots of unique objects" part, but not the "little overdraw" part. Is that because, with overdraw, the CPU could cull the objects instead of sending them down the GPU pipe?
     
  2. Andrew Lauritzen

    Andrew Lauritzen Moderator
    Moderator Veteran

    Joined:
    May 21, 2004
    Messages:
    2,553
    Likes Received:
    633
    Location:
    British Columbia, Canada
    I just mean low overdraw lightens the GPU load. RTS tends to be relatively heavier on the CPU and relatively lighter on the GPU vs. other genres.
     
  3. ToTTenTranz

    Legend Veteran

    Joined:
    Jul 7, 2008
    Messages:
    12,236
    Likes Received:
    7,192
  4. Alexko

    Veteran Subscriber

    Joined:
    Aug 31, 2009
    Messages:
    4,532
    Likes Received:
    957
    That sounds like a recipe for disaster.
     
  5. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,556
    Likes Received:
    4,729
    Location:
    Well within 3d
    You make it sound like having a multi-GPU setup's performance limited to whichever vendor's card is slowest for a given scene, featureset limited to the intersection of card features both could support reliably, on drivers never meant to work in concert with their competitors, with a PC installed with two proprietary driver packages with separate update methods and schedules, drivers that frequently have problems with conflicts with older versions of themselves, with different .NET or other environmental requirements, on hardware that was not designed/developed/tested in the presence of a competitor that frequently has problems with differing hardware from the same vendor, with low-level differences in execution behavior, little history of cooperative implementations, multiple render paths in an engine that were never designed/coded/tested to run simultaneously, in a platform that has at best problematically supported switching between one GPU or the other in mobile, between vendors who have every interest and an ongoing history of getting in each other's way could lead to undesirable outcomes.
     
    Kej, 3dcgi, Malo and 15 others like this.
  6. Rys

    Rys Graphics @ AMD
    Moderator Veteran Alpha

    Joined:
    Oct 9, 2003
    Messages:
    4,174
    Likes Received:
    1,545
    Location:
    Beyond3D HQ
    I think that's my favourite forum post in the history of the Internet.
     
    psolord likes this.
  7. liquidboy

    Regular Newcomer

    Joined:
    Jan 16, 2013
    Messages:
    416
    Likes Received:
    77
    So it sounds like you're against devs being able to program against a heterogeneous HW setup (like multiple GPUs) ...

    It makes sense to me that we should be able to do this; it's just that the APIs aren't there yet. Yes, there are lots of engineering challenges to overcome.

    I have a vision in my head of how I would like to be able to program against multiple devices, and spin up graphics/compute/DMA contexts and use them in parallel ... I'm hoping the DX12 APIs make this logically simple, as it should be.
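    A rough sketch of that shape against the public D3D12 headers (just an illustration, not anything confirmed about the final API; it assumes the Windows 10 SDK, the default adapter, and omits error handling):

    // Link with d3d12.lib. One device, three independent hardware queues.
    #include <d3d12.h>
    #include <wrl/client.h>
    using Microsoft::WRL::ComPtr;

    int main()
    {
        // One device on the default adapter; a multi-adapter version would
        // enumerate adapters through DXGI and create a device per adapter.
        ComPtr<ID3D12Device> device;
        D3D12CreateDevice(nullptr, D3D_FEATURE_LEVEL_11_0, IID_PPV_ARGS(&device));

        auto makeQueue = [&](D3D12_COMMAND_LIST_TYPE type) {
            D3D12_COMMAND_QUEUE_DESC desc = {};
            desc.Type = type;
            ComPtr<ID3D12CommandQueue> queue;
            device->CreateCommandQueue(&desc, IID_PPV_ARGS(&queue));
            return queue;
        };

        // Three independent queues: graphics, async compute, and copy (DMA).
        auto graphicsQueue = makeQueue(D3D12_COMMAND_LIST_TYPE_DIRECT);
        auto computeQueue  = makeQueue(D3D12_COMMAND_LIST_TYPE_COMPUTE);
        auto copyQueue     = makeQueue(D3D12_COMMAND_LIST_TYPE_COPY);

        // Command lists recorded on separate CPU threads can be submitted to
        // each queue independently; fences coordinate them where they meet.
        return 0;
    }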
     
  8. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,556
    Likes Received:
    4,729
    Location:
    Well within 3d
    I think that interpretation is putting too much on the API.
    An API ideally allows you to create code that can run on different implementations, but it doesn't mean they have to run it the same way, perform the same internal steps, or provide bit-identical outputs.
    It wouldn't be an abstraction if it dictates how everything below it must work.

    At best, I would have doubts about getting something consistent enough out of such a setup. The vendors like to slot their products between one another in the price/performance continuum, and we already have problems with asymmetric multi-GPU from one vendor. The way they treat their inputs and outputs is consistent within their own realm (well, within the same IP level or family name, or it used to be) in the best case, but they have not been coordinated with each other.

    Since the software and hardware implementations are so complex and interact with low-level parts of the system, I am skeptical they are going to know how to handle one another quickly and without doing something that is going to tick the OS off.
    Lucid's Hydra sort of touted such a possibility, but that was in a different time, and the dedicated hardware and a high-level graphics command intercept methodology wouldn't fly here. There were also inconsistent behaviors between the portions that went to different chips, which could possibly be massaged with a nice thick latency-padding driver layer.

    A more segregated method, such as running otherwise independent contexts (physics on one GPU and graphics on the other), could happen if the drivers don't shut it down.
    There's also the range of architectures from Fermi to Maxwell to GCN 1.0 to GCN 1.whateverthehellitis over years of implementations. The claim that DX12 extends a futuristic unification of architectures that spread out involves too much time travel for me.
     
  9. lanek

    Veteran

    Joined:
    Mar 7, 2012
    Messages:
    2,469
    Likes Received:
    315
    Location:
    Switzerland
    Possible in theory, but not practical, or it would require really complicated work from developers.
    And when I look at past history, I especially doubt that GPU vendors will even allow it (for cross-brand multi-GPU, that is).

    (Of course I'm not talking about the SFR method with GPUs of the same brand.)

    But who knows, there's even a GDC presentation on March 2 (next Monday), "Advanced Graphic Features in DX12", co-presented by AMD and NVIDIA (Dave Oldcorn, Software Engineering Fellow, AMD; Evan Hart, Principal Engineer, NVIDIA).
     
    #69 lanek, Feb 25, 2015
    Last edited: Feb 25, 2015
  10. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,556
    Likes Received:
    4,729
    Location:
    Well within 3d
    Wouldn't the endgame of any such hybrid scenario be that each IHV would best be served by sabotaging interoperability and blaming the other vendor's driver?
    They'd have plausible deniability by claiming that this involves an incredible level of complexity, because it would.
    Disgust would force a person with the setup to give up and spend money on two cards from the same vendor. In the case of a 50/50 split, that's a break-even situation where the same number of cards are sold as the hybrid scenario, except now nobody needs to explode the size of their validation and development budget to match the breadth of their competitor's product range.
    If one vendor were dominant in market share, they would still have the incentive to lock down the hybrid mode, and the minority player could do nothing about it.

    Without that rare setup being a possible development target, developers wouldn't need to scale up their render path development efforts, and Microsoft wouldn't need to expand what it needs to do for certified drivers.
    Fans of any particular vendor would be so certain the other brand was being evil that it would give them something to validate their emotional investment.
     
  11. DmitryKo

    Regular

    Joined:
    Feb 26, 2002
    Messages:
    918
    Likes Received:
    1,122
    Location:
    55°38′33″ N, 37°28′37″ E
    So they are suggesting that each GPU can independently render separate parts of a frame to a single unified framebuffer (and the programmers don't really have to duplicate all the resources in each GPU's local memory). How in hell is that supposed to work?

    The rule of thumb for the last 20 years has been to keep the framebuffer and the texture memory in the GPU's local RAM, and that RAM has to be fast even with lots of caches.
    If you have two separate framebuffers/depth buffers, one on each GPU, you would have to synchronize them after each draw call over the relatively slow PCIe bus; that's a lot of read-and-write madness.
    If you place the frame/depth buffers in system memory, you will have to semaphore each GPU's access to the buffer, so they will stall each other waiting for writes to complete.
    hUMA (heterogeneous Uniform Memory Access) with cache coherent views could work around it, but it's meant to work with on-die CPU/GPU cores, not over PCIe bus.
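    As a rough illustration (not anything AMD or Nvidia have shown; it assumes deviceA/queueA and deviceB/queueB were created on two different adapters, and that both drivers accept cross-adapter shared fences), the "semaphore each GPU's access" part might look like this with explicit multi-adapter D3D12:

    #include <d3d12.h>
    #include <wrl/client.h>
    using Microsoft::WRL::ComPtr;

    void SerializeSharedBufferAccess(ID3D12Device* deviceA, ID3D12CommandQueue* queueA,
                                     ID3D12Device* deviceB, ID3D12CommandQueue* queueB)
    {
        // Fence created on GPU A, marked shareable across adapters.
        ComPtr<ID3D12Fence> fenceA;
        deviceA->CreateFence(0,
            D3D12_FENCE_FLAG_SHARED | D3D12_FENCE_FLAG_SHARED_CROSS_ADAPTER,
            IID_PPV_ARGS(&fenceA));

        // Export the fence from device A and import it into device B.
        HANDLE handle = nullptr;
        deviceA->CreateSharedHandle(fenceA.Get(), nullptr, GENERIC_ALL, nullptr, &handle);
        ComPtr<ID3D12Fence> fenceB;
        deviceB->OpenSharedHandle(handle, IID_PPV_ARGS(&fenceB));
        CloseHandle(handle);

        // GPU A signals once its writes to the shared buffer are done...
        queueA->Signal(fenceA.Get(), 1);
        // ...and GPU B stalls here until that signal arrives: the
        // "waiting for writes to complete" cost, paid across the PCIe bus.
        queueB->Wait(fenceB.Get(), 1);
    }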


    Maybe they are talking about MRTs for things like portals, mirrors, or shadows, where some rendering threads will process render targets drawn on a secondary GPU, while other threads render on the main GPU?
     
  12. TheAlSpark

    TheAlSpark Moderator
    Moderator Legend

    Joined:
    Feb 29, 2004
    Messages:
    22,146
    Likes Received:
    8,533
    Location:
    ಠ_ಠ
    mm... was just about to ask about shadows in particular - perhaps for multiple shadow casters?

    Particles à la PhysX?
     
  13. liquidboy

    Regular Newcomer

    Joined:
    Jan 16, 2013
    Messages:
    416
    Likes Received:
    77
    By the way ... Civ: Beyond Earth already implements "split frame rendering" using Mantle and multiple AMD GPUs. Is it so difficult to believe that something similar could be accomplished using different vendors' GPUs with another low-level API that is more cross-platform (DX12)?

    Won't be long till we find out, GDC 2015 ?!
     
  14. DmitryKo

    Regular

    Joined:
    Feb 26, 2002
    Messages:
    918
    Likes Received:
    1,122
    Location:
    55°38′33″ N, 37°28′37″ E
    Yes, it is, because the article you cite starts with
     
  15. ToTTenTranz

    Legend Veteran

    Joined:
    Jul 7, 2008
    Messages:
    12,236
    Likes Received:
    7,192
    Crossfire's latest implementation physically links two GPUs through the PCI-Express bus, which was already the approach taken by Lucid's Hydra.



    Everyone keeps saying heterogeneous multi-gpu isn't practical, it's a disaster, etc.
    Here's my take on the subject as a consumer:
    - I don't care how painful/complicated/impractical it is to implement. If they do it and it works, I'll enjoy it for sure. There are obvious advantages to getting a graphics card from either vendor, and if I get the chance to use the best of both worlds without having to choose, I will.
     
  16. DmitryKo

    Regular

    Joined:
    Feb 26, 2002
    Messages:
    918
    Likes Received:
    1,122
    Location:
    55°38′33″ N, 37°28′37″ E
    Hmmm.
    http://community.amd.com/community/.../01/03/modernizing-multi-gpu-gaming-with-xdma

    No, Crossfire Direct Memory Access (XDMA) is not a dedicated physical link, though it's quite similar to (h)UMA over the PCIe bus. The current implementation is probably connected with the shared 64-bit CPU virtual address space and tier 2 tiled resources, hence the GCN 1.1 requirement.
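    (As an aside, the tiled-resources tier is at least something an application can query per adapter. A small sketch, purely illustrative, assuming a D3D12 device created on the adapter in question; the XDMA mechanism itself isn't visible through this query:)

    #include <d3d12.h>

    bool SupportsTier2TiledResources(ID3D12Device* device)
    {
        // Query the optional-feature block and check the reported tier.
        D3D12_FEATURE_DATA_D3D12_OPTIONS options = {};
        if (FAILED(device->CheckFeatureSupport(D3D12_FEATURE_D3D12_OPTIONS,
                                               &options, sizeof(options))))
            return false;
        return options.TiledResourcesTier >= D3D12_TILED_RESOURCES_TIER_2;
    }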

    The performance increase seems to be around 60-70% on average, but some titles actually suffer a slowdown with the current driver-controlled automatic mode...

    http://www.guru3d.com/articles_pages/radeon_r9_290_crossfire_review_benchmarks,2.html
    http://wccftech.com/amd-r9-290x-xdma-crossfire-benchmarks-revealed-good-bridge-based-cf/
    etc.

    Lucid's Hydra, on the other hand, was a software-only solution characterized by "poor game support, small if any performance gain over a single video card, graphical artifacts, unstable gameplay".
     
  17. ToTTenTranz

    Legend Veteran

    Joined:
    Jul 7, 2008
    Messages:
    12,236
    Likes Received:
    7,192
    This discussion makes no sense.

    The Crossfire-in-Mantle option that you quoted yourself works with the Hawaii cards, which don't have a dedicated physical connection. The Radeon 290/290X cards use the PCI-Express bus and nothing else, and they work with SFR in Mantle in Civilization: Beyond Earth.
    What other proof do you need that this has nothing to do with dedicated connections between GPUs?
     
  18. DmitryKo

    Regular

    Joined:
    Feb 26, 2002
    Messages:
    918
    Likes Received:
    1,122
    Location:
    55°38′33″ N, 37°28′37″ E
    Huh? I explicitly said in my post above that hardware (h)UMA/DMA should work, unlike pure software solutions.

    Thus GCN 1.1 parts like the R9 290/290X (Hawaii), which implement a hardware DMA engine and support Crossfire DMA, and the other vendor's parts that use a similar DMA approach to access each other's video memory over the PCIe virtual address space, should be OK for multi-vendor multi-GPU if the disparity in capabilities can be worked out.

    Older GCN 1.0 parts and current Nvidia cards use a dedicated proprietary physical link instead of PCIe, so no luck.

    Only your last reply.
     
  19. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,556
    Likes Received:
    4,729
    Location:
    Well within 3d
    I'd say it would be theoretically possible, and possibly hackable to some degree unless there are implementation or platform issues at a low level that could flat-out nuke the setup. Said nukes would probably come and go based on every factor we could imagine that could perturb the system environment and game code.

    All that aside, however, SFR in particular is troublesome because different sections of the same frame are going to behave differently.
    One side or the other is going to be the slowest, and unless we're discussing dropping half the screen every couple of ticks, that means performance will trend toward whichever vendor does worse.
    Feature support, in terms of checkboxes ticked off or the performance/quality of the support, would differ across the split, or keeping things consistent would mean losing more features with both cards than with just one. Certain features that could change the behavior of residency or the paging of data, like variations of partially resident textures, might need to be turned off if there is not full equivalence.
    Other elements of how the driver and hardware stacks implement the API may not agree below the abstraction. Proprietary internal optimizations/problems would not carry over, so each side's tuning might actually become a mis-optimization for the other.

    AFR wouldn't suffer from some of that, although the vendors do not seem to agree on how they handle frame pacing, since Nvidia claims to have had hardware features to help with that problem for longer than AMD was aware of said problem.
    If you don't mind every other frame not rendering the same way, and possibly requiring some sort of conversion step for data reused between frames, and losing a significant chunk of the dispatch latency improvements, maybe.
    It would look like one of those IQ comparison articles with the mouse-over flip between a scene rendered with different filtering/AA/feature/gamma/brilinear/math settings.
    The dev could call it a "cinematic" filter for the type of movie filmed with a broken camera by a director who hates epileptics, or maybe AAA devs can take IQ degradation one step further than cinematic and go for "zoetrope" rendering.

    One could try to write the rendering pipeline to split various stages inside a frame between architectures. That would subject the game to intra-frame sync points, and trips back through two drivers. The possibly different implementations for those sync operations and the probable need for validating or massaging the data could be pain points.
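    To make the cost concrete, such a hand-off would presumably go through something like D3D12's cross-adapter heaps: the secondary GPU renders into a shared, row-major surface that the primary GPU reads after a fence wait. A hedged sketch (illustration only; assumes the adapter reports cross-adapter row-major texture support, with sizes and error handling simplified):

    #include <d3d12.h>
    #include <wrl/client.h>
    using Microsoft::WRL::ComPtr;

    struct CrossAdapterTarget {
        ComPtr<ID3D12Heap>     heap;     // must outlive the placed resource
        ComPtr<ID3D12Resource> texture;
    };

    CrossAdapterTarget CreateCrossAdapterTarget(ID3D12Device* device, UINT width, UINT height)
    {
        CrossAdapterTarget target;

        // Heap that can be opened by another adapter's device.
        D3D12_HEAP_DESC heapDesc = {};
        heapDesc.SizeInBytes     = 64u * 1024u * 1024u;   // illustrative size
        heapDesc.Properties.Type = D3D12_HEAP_TYPE_DEFAULT;
        heapDesc.Flags = D3D12_HEAP_FLAG_SHARED | D3D12_HEAP_FLAG_SHARED_CROSS_ADAPTER;
        device->CreateHeap(&heapDesc, IID_PPV_ARGS(&target.heap));

        // Cross-adapter textures must use a row-major layout.
        D3D12_RESOURCE_DESC texDesc = {};
        texDesc.Dimension        = D3D12_RESOURCE_DIMENSION_TEXTURE2D;
        texDesc.Width            = width;
        texDesc.Height           = height;
        texDesc.DepthOrArraySize = 1;
        texDesc.MipLevels        = 1;
        texDesc.Format           = DXGI_FORMAT_R8G8B8A8_UNORM;
        texDesc.SampleDesc.Count = 1;
        texDesc.Layout           = D3D12_TEXTURE_LAYOUT_ROW_MAJOR;
        texDesc.Flags            = D3D12_RESOURCE_FLAG_ALLOW_CROSS_ADAPTER;
        device->CreatePlacedResource(target.heap.Get(), 0, &texDesc,
                                     D3D12_RESOURCE_STATE_COMMON, nullptr,
                                     IID_PPV_ARGS(&target.texture));

        // The heap is then exported with CreateSharedHandle, opened on the
        // other device, and every frame pays a copy plus a fence round trip.
        return target;
    }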

    Then there's something like the existing hacking of PhysX when an AMD card is used as the rendering card. That is two separate contexts whose interactions are rarer and already go through some decoupling.
    Maybe that could happen, again, if everyone in this long chain of corporate interests and development overheads finds a niche of a niche worth fighting for.

    Then there are other significant obligations this places on developers, Microsoft, the vendors, and the user.
     
  20. silent_guy

    Veteran Subscriber

    Joined:
    Mar 7, 2006
    Messages:
    3,754
    Likes Received:
    1,380
    I think AFR would be more troublesome for mixed SLI than SFR, exactly because of the frame pacing issues. The only way to make that smooth with different GPUs is to underuse one of the two. And then what's the point?
     