DirectX 12: The future of it within the console gaming space (specifically the XB1)

Discussion in 'Console Technology' started by Shortbread, Mar 7, 2014.

  1. DmitryKo

    Regular

    Joined:
    Feb 26, 2002
    Messages:
    967
    Likes Received:
    1,223
    Location:
    55°38′33″ N, 37°28′37″ E
It depends on the level of optimization in Direct3D 11.X, but fundamentally only one thread is allowed to issue draw calls.

    On the PC, Direct3D 11 allows free thread-safe multithreading for resource creation, but draw calls are not thread-safe: only one single rendering thread can directly interact with a D3D11 device (i.e. the kernel-mode driver) in the so-called "immediate context". However, Direct3D 11 also allows additional rendering threads to run in a "deferred context": these threads can issue draw calls and state changes and record them into "display lists", but in the end these "deferred" commands have to be executed in the "immediate" rendering context, which is not thread-safe. There is no parallel processing at this final stage, and the lists have to be serialized.
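
    The pattern can be sketched with plain std::thread code. This is not actual D3D11 API usage; the "commands", "display lists" and function names below are made-up stand-ins, chosen only to show why the final playback stage serializes:

```cpp
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical stand-ins for D3D11 concepts: a "command" is just an integer
// payload, a "display list" is a per-thread vector of recorded commands.
using Command = std::uint32_t;
using DisplayList = std::vector<Command>;

// Each worker thread records into its own list. No locking is needed while
// recording, just like a D3D11 deferred context.
DisplayList record_deferred(int thread_id, int draw_calls) {
    DisplayList list;
    list.reserve(draw_calls);
    for (int i = 0; i < draw_calls; ++i)
        list.push_back(static_cast<Command>(thread_id * 1000 + i));
    return list;
}

// The immediate context is the bottleneck: every recorded list must be
// played back here, one after another, on a single thread.
std::size_t execute_immediate(const std::vector<DisplayList>& lists) {
    std::size_t executed = 0;
    for (const auto& list : lists)   // serialized: no parallelism here
        executed += list.size();     // stands in for "ExecuteCommandList"
    return executed;
}

std::size_t render_frame_d3d11_style(int workers, int draws_per_worker) {
    std::vector<DisplayList> lists(workers);
    std::vector<std::thread> threads;
    for (int t = 0; t < workers; ++t)
        threads.emplace_back([&lists, t, draws_per_worker] {
            lists[t] = record_deferred(t, draws_per_worker);  // parallel
        });
    for (auto& th : threads) th.join();
    return execute_immediate(lists);  // single-threaded final stage
}
```

    Recording scales with core count, but the frame still ends in `execute_immediate`, which is why adding more recording threads in D3D11 yields diminishing returns.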

    There is a good explanatory article here http://code4k.blogspot.ru/2011/11/direct3d11-multithreading-micro.html

    This was pretty much the same on XBox 360 - you had a 3-core 6-threaded CPU and a typical game would have a single rendering thread, a thread for game state updates and AI, then audio streaming, file decompression, procedural textures and geometry, etc. - whatever the developers could implement with simple per-frame sync, without using heavy-weight inter-process synchronization techniques which cause more problems than they resolve.


    Direct3D 12, on the other hand, allows multiple rendering threads, since [post=1836199]all rendering work is performed in the user-mode driver[/post] and only final presentation is handled by the main rendering thread, which talks to the OS kernel. This is possible because draw calls and state changes are reorganized to be immutable (i.e. read-only), so they are inherently thread-safe and there is no need for mutexes or locks. And any resource management is explicitly performed by the application, not by the kernel-mode driver that talks to the actual hardware, so there is no need to sync the device state between multiple rendering threads either.
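
    The immutability point can be illustrated with a minimal std::thread sketch (again, not real D3D12 calls; `RecordedList` and `parallel_validate` are hypothetical names). Because a recorded list is read-only after creation, any number of threads can consume it concurrently, and the only shared write needed is an atomic counter rather than a mutex:

```cpp
#include <atomic>
#include <cstddef>
#include <thread>
#include <vector>

// A D3D12-style command list is immutable once recorded, so any number of
// threads may read it concurrently without locks.
struct RecordedList {
    const std::vector<int> commands;  // read-only after construction
};

// Several threads walk the same immutable list in parallel; since nothing
// mutates shared state, a relaxed atomic counter is all the sync required.
std::size_t parallel_validate(const RecordedList& list, int threads) {
    std::atomic<std::size_t> checked{0};
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t)
        pool.emplace_back([&] {
            for (int cmd : list.commands)
                if (cmd >= 0)
                    checked.fetch_add(1, std::memory_order_relaxed);
        });
    for (auto& th : pool) th.join();
    return checked.load();  // threads * commands, no mutex ever taken
}
```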

    [image: per-thread CPU frame-time chart; data tabulated below]

    Code:
    Times, ms        Total              GFX-only
                  D3D11   D3D12       D3D11   D3D12
    Thread 0      7.88    3.80        5.73    1.17
    Thread 1      3.08    2.50        0.35    0.81
    Thread 2      2.84    2.46        0.34    0.69
    Thread 3      2.63    2.45        0.23    0.65
    Total        16.42   11.21        6.65    3.32
     
    #361 DmitryKo, Apr 16, 2014
    Last edited by a moderator: Apr 19, 2014
  2. forumaccount

    Newcomer

    Joined:
    Jan 30, 2009
    Messages:
    140
    Likes Received:
    86
    DX12 is still early days. Anyone with access to it or more in-depth knowledge is surely NDA'd. Even talking about it publicly via tweets is a little risky.

    However, my understanding is that Brad Wardell is a businessman rather than an engineer, and that article fully reinforces my understanding that he is not an engineer.
     
  3. dobwal

    Legend

    Joined:
    Oct 26, 2005
    Messages:
    5,955
    Likes Received:
    2,325
    More likely he just over-embellishes, as he is an engineer, given his degree.
     
  4. liquidboy

    Regular

    Joined:
    Jan 16, 2013
    Messages:
    416
    Likes Received:
    77
    ntoskrnl (the NT kernel) has been shown to be liftable from ring 0 into ring 3, user mode. It's only research, BUT I strongly suspect something like this tech is being used in the Game OS, and it may be productized in Windows Threshold (Windows and Azure, if not already in Azure):

    http://research.microsoft.com/en-us/projects/drawbridge/default.aspx
     
  5. liquidboy

    Regular

    Joined:
    Jan 16, 2013
    Messages:
    416
    Likes Received:
    77
    He's also under NDA with MS, as am I for certain technologies.. Doesn't stop me from making high-level (without detail) remarks about said NDA'd techs..

    Brad seems very knowledgeable, his tech demo is very interesting, and the Oxide engine devs have a very impressive next-gen engine ...
     
  6. Pixel

    Veteran

    Joined:
    Sep 16, 2013
    Messages:
    1,008
    Likes Received:
    477
    You are not looking hard enough.

    A Twitter conversation involving Treyarch software engineer Dan Olson, Codemasters programmer Rob Jones and Unreal Engine 4 programmer Keith Judge:

    source: https://twitter.com/statuses/453477224985788416
     
  7. Pixel

    Veteran

    Joined:
    Sep 16, 2013
    Messages:
    1,008
    Likes Received:
    477
  8. zupallinere

    Regular Subscriber

    Joined:
    Sep 8, 2006
    Messages:
    768
    Likes Received:
    109
    ..Neat chart inserted here.

    Whew, thanks for that response. Single-threaded behavior on a PC, sure: with all of the different hardware and driver interactions, it makes sense to keep things safe rather than speedy. But having multiple threads locked out on a fixed hardware platform seemed a bit non-obvious.

    The bolded part was particularly apt. I wonder if the amount of control afforded by the addition of hUMA, or whatever it's called, will allow for a more predictable approach to inter-process syncing, or are we just gonna have to wait for a language more amenable to such things ... at least one that doesn't run on the Java VM ;-) Haskell on the console !! :lol:
     
  9. pMax

    Regular

    Joined:
    May 14, 2013
    Messages:
    327
    Likes Received:
    22
    Location:
    out of the games
    ...you'd have to make Windows able to run nicely and stably in a different ring, as you don't want a kernel exploit in Windows to ruin your business. A VM is terribly much easier, then.

    A thin hypervisor can take very few resources; you can write one in a few hundred lines (well, a skeleton one). What kills your VM performance is mostly (besides paging) the VM enter/exit.
    That happens, if I recall correctly, mainly on purpose (VM calls), on non-reflected interrupts, and on privileged instruction emulation (but the latter would only be needed on the Windows OS partition).

    I would be very surprised if such overhead were measured in two digits.
     
  10. DmitryKo

    Regular

    Joined:
    Feb 26, 2002
    Messages:
    967
    Likes Received:
    1,223
    Location:
    55°38′33″ N, 37°28′37″ E
    Yep, it's because the programming model remains the same as with D3D11.2 on the PC.

    As we already know, [post=1840169]there are command bundles and reduced resource creation overhead[/post] on the Xbox One; however, porting these improvements to the PC would require WDDM 2.0, the driver model beneath D3D12.

    These are two separate issues: the first is avoiding inter-process synchronization as much as possible by separating the algorithm into several independent chunks that can run in parallel; the second is doing the still-necessary bits of synchronization as efficiently as possible.

    Direct3D 12 provides the solution to the first problem, that is streamlining the API for parallel CPU processing as much as possible.


    Providing effective multi-processor (or multi-core) access to the shared memory is the second part of the equation.

    NUMA is not really about inter-process synchronization on desktop computers; it was designed for large multiprocessor or cloud-based computing, where the same algorithm has to run on many independent computing nodes with a different data set on each node, then sync the results to other nodes, and most of these nodes are non-local, i.e. running on another computer or cluster connected by a high-speed network.

    On the desktop/console, it's much more efficient to provide a wide high-speed memory connection and an L3/L4 cache to connect the L1/L2 caches on each core.
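
    The first point, splitting a frame into independent chunks so the only synchronization left is a cheap per-frame join, can be sketched like this (a minimal illustration; `process_frame` and the workload are made up, not from any engine):

```cpp
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Hypothetical per-frame workload: each worker gets an independent slice of
// the data, so no locking is needed while the frame is being processed.
long long process_frame(const std::vector<int>& data, int workers) {
    std::vector<long long> partial(workers, 0);
    std::vector<std::thread> threads;
    const std::size_t chunk = data.size() / workers;
    for (int t = 0; t < workers; ++t) {
        threads.emplace_back([&, t] {
            const std::size_t begin = t * chunk;
            const std::size_t end =
                (t == workers - 1) ? data.size() : begin + chunk;
            // Independent chunk: reads shared data, writes only its own slot,
            // so there is no contention and no heavyweight sync primitive.
            partial[t] = std::accumulate(data.begin() + begin,
                                         data.begin() + end, 0LL);
        });
    }
    // The only synchronization point is the per-frame join (a barrier).
    for (auto& th : threads) th.join();
    return std::accumulate(partial.begin(), partial.end(), 0LL);
}
```

    Writing each result into a thread-private slot (rather than a shared accumulator) is exactly the "independent chunks" structure: the cores never fight over a cache line during the frame.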
     
  11. ramr

    Newcomer

    Joined:
    Jan 19, 2013
    Messages:
    169
    Likes Received:
    32
    The above couldn't be further from the truth. Brad programmed one of the top-5 all-time games by himself, the original GalCiv. He also programmed innumerable productivity apps. Given his AI work, I tend to believe his comments on CPU utilization. On the GPU side, well, Stardock has never been known for cutting-edge graphics. Since he is focusing on CPU optimization as it relates to improved GPU utilization, I will take his word for it. None of the so-called experts in the links above said anything to contradict what he said, so I am not sure why people are pointing to that as proof he is kooky.
     
  12. Pixel

    Veteran

    Joined:
    Sep 16, 2013
    Messages:
    1,008
    Likes Received:
    477
    The more you read about Mantle, the more we see it achieves many of its optimizations in a similar fashion to DX12. They are very similar in that they have superior multithreaded scaling compared to DX11, split command buffers between multiple cores, use descriptor tables, etc. to reduce per-draw-call overhead.
    From the results we've seen with Brad's Star Swarm demo, it's the only game that sees a massive performance boost on Mantle (outside of improved SLI/CrossFire implementations).
    Brad develops large strategy/simulation games, and Star Swarm involves huge simulations rendering thousands of objects with multiple materials per object, with thousands of variables that need to be updated every frame. The draw calls normally swamp the CPU.
    All these other games, from Ryse/Forza/BF4/CoD, will hit a GPU bottleneck somewhere in the graphics pipeline, and likely a DDR3 bandwidth bottleneck (ESRAM may be fine though), before they can allow for a 50% fps increase.
     
    #372 Pixel, Apr 17, 2014
    Last edited by a moderator: Apr 17, 2014
  13. Allandor

    Regular

    Joined:
    Oct 6, 2013
    Messages:
    842
    Likes Received:
    879
    Well, yes, I don't think we'll get a 50% fps increase in a title like Ryse (we could get a little increase right now from newer drivers and some of that 10% that was not yet available, if the game were patched against those newer drivers), but games like Ryse should hold 30 fps with those optimizations (which could amount to 50% in some situations).
     
  14. iroboto

    iroboto Daft Funk
    Legend Subscriber

    Joined:
    Mar 6, 2014
    Messages:
    14,834
    Likes Received:
    18,634
    Location:
    The North
    A quote on CPU limitations, on the topic of Titanfall:

    http://www.gamespot.com/articles/th...le-war-says-titanfall-developer/1100-6419057/
     
    #374 iroboto, Apr 17, 2014
    Last edited by a moderator: Apr 17, 2014
  15. TheAlSpark

    TheAlSpark Moderator
    Moderator Legend

    Joined:
    Feb 29, 2004
    Messages:
    22,146
    Likes Received:
    8,533
    Location:
    ಠ_ಠ
    Quite. The 360 interview did allude to the CPU being a problem as well. Especially in Last Titan Standing, the spikes seem to indicate some inherent issue there.

    Kind of curious as to why the titans themselves would be so problematic.
     
  16. Malo

    Malo Yak Mechanicum
    Legend Subscriber

    Joined:
    Feb 9, 2002
    Messages:
    8,931
    Likes Received:
    5,533
    Location:
    Pennsylvania
    I don't understand how Microsoft, having just released the Xbox One, is now implementing a new DX in the Xbone architecture that developers have to redesign for. All that says to me is confirmation that MS scrambled to get DX12 onto the scene much faster than previously intended because of Mantle, as I really can't believe that they would release the Xbone and then move to a newer API immediately.
     
  17. DSoup

    DSoup Series Soup
    Legend Subscriber

    Joined:
    Nov 23, 2007
    Messages:
    16,785
    Likes Received:
    12,697
    Location:
    London, UK
    The alternative was delaying the Xbox One until DirectX 12 was ready for prime time on Xbox and PC :nope:
     
  18. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,579
    Likes Received:
    4,799
    Location:
    Well within 3d
    Why stop at 50%?
    The claim for bringing over DX12 was 100%.
     
  19. DmitryKo

    Regular

    Joined:
    Feb 26, 2002
    Messages:
    967
    Likes Received:
    1,223
    Location:
    55°38′33″ N, 37°28′37″ E
    Who said they have to redesign anything?

    If developers are fine with Direct3D 11.2/11.X, that's OK; nobody is deprecating Direct3D 11, and performance should improve with new Xbox One SDK releases (and new Windows releases as well, when they [post=1840198]port D3D11.X features from Xbox One[/post] and move to the [post=1841448]lightweight WDDM 2.0 driver model[/post]).

    But if you absolutely have to squeeze additional bits of performance from the existing hardware, now you can do that with Direct3D 12.
     
  20. Pixel

    Veteran

    Joined:
    Sep 16, 2013
    Messages:
    1,008
    Likes Received:
    477
    How can you say that?

    Sure they were. Anyway, one of those was Keith Judge, a programmer on Unreal Engine 4, which will support DirectX 12, and he is baffled by Wardell's statement.

    Wardell has contradicted himself in recent days.
    Responding to questions on Twitter, he's now suggesting that DirectX 12 won't close the gap between the PS4 and the Xbox One. That contradicts his statement that it would give the Xbox One 2x the performance for most games.
     

  • About Us

    Beyond3D has been around for over a decade and prides itself on being the best place on the web for in-depth, technically-driven discussion and analysis of 3D graphics hardware. If you love pixels and transistors, you've come to the right place!

    Beyond3D is proudly published by GPU Tools Ltd.