No DX12 Software is Suitable for Benchmarking *spawn*

Discussion in 'Architecture and Products' started by trinibwoy, Jun 3, 2016.

  1. Clukos

    Clukos Bloodborne 2 when?
    Veteran Newcomer

    Joined:
    Jun 25, 2014
    Messages:
    4,462
    Likes Received:
    3,793
    I think AMD covered that too, specifically noting the difference between DX11 and DX12:

    [image]
     
  2. Silent_Buddha

    Legend

    Joined:
    Mar 13, 2007
    Messages:
    16,139
    Likes Received:
    5,074
    Yeah, I was hoping to also see Nvidia focus on the benefits of DX12 for multiple-GPU rendering. Unfortunately, it appears they've done the stupid thing and doubled down on AFR through the driver, along with reinforcing the need for physical bridges.

    But since that doesn't lock you into their ecosystem, it's not something they'd want to promote.

    Regards,
    SB
     
  3. Razor1

    Veteran

    Joined:
    Jul 24, 2004
    Messages:
    4,232
    Likes Received:
    749
    Location:
    NY, NY
    That's not really a difference; Crossfire and SLI had options for SFR in older DX versions too, going all the way back to DX9 ;)
     
  4. Silent_Buddha

    Legend

    Joined:
    Mar 13, 2007
    Messages:
    16,139
    Likes Received:
    5,074
    It's actually a huge difference. SFR was attempted (as well as other methods, like AMD's alternating-tiles mode), but it didn't work because there was no way for the driver to intelligently split the workload across two devices. A lot of work got duplicated, since you can't just arbitrarily cut a scene in half (what if triangles overlap, as an extremely simple example?). So performance at best saw minor speedups and at worst was actually lower than with no SFR at all. SFR would have been the ideal solution if they could have gotten it to work well, but neither Nvidia nor AMD could. And the situation only got worse for driver-level SFR once games stopped using pure forward rendering.

    DX12 gives the developer the ability to intelligently split the workload across multiple GPUs. This allows scaling regardless of the complexity of the rendering engine, and it doesn't even have to follow the SFR model of multi-GPU: the developer is free to implement any method they want for splitting the workload across GPUs. Just hopefully they aren't stupid enough or lazy enough to do AFR, as that is by far the worst method possible.
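    A minimal sketch (not from this thread) of where explicit multi-adapter starts in D3D12: enumerate the adapters through DXGI and create an independent device per GPU, after which the application decides how to split the work between them. Error handling is omitted and the function name is purely illustrative.

```cpp
// Minimal sketch: enumerate all hardware adapters and create one D3D12 device per GPU,
// the starting point for explicit multi-adapter rendering (error handling omitted).
#include <d3d12.h>
#include <dxgi1_4.h>
#include <wrl/client.h>
#include <vector>

using Microsoft::WRL::ComPtr;

std::vector<ComPtr<ID3D12Device>> CreateDevicesForAllAdapters()
{
    ComPtr<IDXGIFactory4> factory;
    CreateDXGIFactory1(IID_PPV_ARGS(&factory));

    std::vector<ComPtr<ID3D12Device>> devices;
    ComPtr<IDXGIAdapter1> adapter;
    for (UINT i = 0; factory->EnumAdapters1(i, &adapter) != DXGI_ERROR_NOT_FOUND; ++i)
    {
        DXGI_ADAPTER_DESC1 desc;
        adapter->GetDesc1(&desc);
        if (desc.Flags & DXGI_ADAPTER_FLAG_SOFTWARE)
            continue; // skip WARP / software adapters

        ComPtr<ID3D12Device> device;
        if (SUCCEEDED(D3D12CreateDevice(adapter.Get(), D3D_FEATURE_LEVEL_11_0,
                                        IID_PPV_ARGS(&device))))
            devices.push_back(device);
    }
    return devices; // each device owns its own queues, allocators and command lists
}
```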

    Hell, I initially tried Crossfire back with the Radeon X850 XT just because their tiled M-GPU method looked like it had the best potential to do a balanced split of a scene between cards. Yeah, that didn't go well, and I ended up only using AFR, which is just horrible.

    Regards,
    SB
     
    #64 Silent_Buddha, Jun 5, 2016
    Last edited: Jun 5, 2016
    Razor1 likes this.
  5. Razor1

    Veteran

    Joined:
    Jul 24, 2004
    Messages:
    4,232
    Likes Received:
    749
    Location:
    NY, NY
    That is true!
     
  6. CarstenS

    Veteran Subscriber

    Joined:
    May 31, 2002
    Messages:
    4,798
    Likes Received:
    2,056
    Location:
    Germany
    So we need to hope that some developers out there are not rushed by their publishers and show more dedication to a 5-7% market share than they do for the whole PC gaming community with their usual console ports - allegedly optimized for PC, yet sometimes you still get "Press Triangle/Circle/Box to continue"? Hm, hope dies last.
     
  7. Silent_Buddha

    Legend

    Joined:
    Mar 13, 2007
    Messages:
    16,139
    Likes Received:
    5,074
    Well it does happen from time to time. I'm actually interested to see if the next Civilization game does M-GPU in the engine again like they did with Civ V.

    But yeah, the other option is for reusable game engines (UE, Unity, Frostbite, Chrome Engine, etc.) to offer some basic level of M-GPU, which may or may not be optimal for the developer's game, but would offer either some benefits or a base that the developer can expand and customize to their game's requirements.

    And that 5-7% could potentially grow if something other than AFR were used. I know I and quite a few others would give M-GPU a try again if it happened.

    Regards,
    SB
     
  8. Kaotik

    Kaotik Drunk Member
    Legend

    Joined:
    Apr 16, 2003
    Messages:
    8,183
    Likes Received:
    1,840
    Location:
    Finland
    If the Civ VI engine is a continuation of the previous one, I'm quite sure they will; they even supported SFR already in Civ: BE (the Mantle version).
     
  9. CarstenS

    Veteran Subscriber

    Joined:
    May 31, 2002
    Messages:
    4,798
    Likes Received:
    2,056
    Location:
    Germany
    From time to time, agreed. But for mGPU to have any merit, it needs to happen more often than not. At least.

    It's just that, in day-to-day business, I see vastly more games where the developers, for whatever reasons (and there may be good reasons, mind you), do not focus much on advanced techniques in any area.
     
  10. Ryan Smith

    Regular

    Joined:
    Mar 26, 2010
    Messages:
    611
    Likes Received:
    1,052
    Location:
    PCIe x16_1
    You can totally do explicit mGPU on NVIDIA cards. We've seen that first-hand with AotS, and by and large it even works well.:p

    NVIDIA's position is basically that implicit mGPU probably isn't going away any time soon, and that it's handy to be able to keep mGPU traffic off of the PCIe bus. But the use of SLI bridges is not mutually exclusive with supporting explicit mGPU.
     
    pharma, Razor1 and CarstenS like this.
  11. Silent_Buddha

    Legend

    Joined:
    Mar 13, 2007
    Messages:
    16,139
    Likes Received:
    5,074
    Oh I realize that they can support it, but they won't promote it.

    Regards,
    SB
     
  12. MDolenc

    Regular

    Joined:
    May 26, 2002
    Messages:
    690
    Likes Received:
    425
    Location:
    Slovenia
    There's a question about explicit multi-GPU that has been stuck in my mind for a while now, but I haven't gotten around to asking anyone yet... Why, it seems, has no one experimented with this on DX11?
    I mean, you can fire up two D3D11 devices just the same as D3D12 devices. I realize performance might not be quite up there with DX12, but is the hit really so big that no one even bothered to try?
     
  13. gamervivek

    Regular Newcomer

    Joined:
    Sep 13, 2008
    Messages:
    715
    Likes Received:
    220
    Location:
    india
    overclock3d (where that image comes from) did test it at 4K and found the Fury X faster by double digits.

    http://www.overclock3d.net/reviews/...son_16_4_2_-_directx_12_performance_boosted/5

    Though the problem with such benchmarks is that the game can favor different cards in different conditions. For instance, I looked at a benchmark of Tomb Raider pitting the Fury X against the Titan X, and while the Titan X was mostly faster, the video still had parts where the Fury X was consistently around 10% faster.
     
    Razor1 likes this.
  14. sebbbi

    Veteran

    Joined:
    Nov 14, 2007
    Messages:
    2,924
    Likes Received:
    5,288
    Location:
    Helsinki, Finland
    It's worth noting that custom SFR doesn't have to be a 50/50 split. You can dynamically adjust the split based on GPU performance differences. That works much better in situations where you've upgraded your GPU (GTX 970 + GTX 1080, for example), or with a fast iGPU + entry-level dGPU.
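    A rough sketch of the kind of feedback loop described above, adjusting the screen split each frame from measured GPU times; the function name, smoothing constant and clamping bounds are made-up illustration, not anything from the post.

```cpp
// Hypothetical sketch of a dynamic SFR split: nudge the split each frame toward the
// point where both GPUs take equally long (constants and names are illustrative).
float UpdateSplitRatio(float splitRatio, float gpu0FrameMs, float gpu1FrameMs)
{
    // Estimated throughput of each GPU: share of the frame rendered per millisecond.
    float throughput0 = splitRatio / gpu0FrameMs;
    float throughput1 = (1.0f - splitRatio) / gpu1FrameMs;

    // Split that would make both GPUs finish at the same time.
    float target = throughput0 / (throughput0 + throughput1);

    // Move only part of the way toward the target to avoid oscillation.
    const float k = 0.25f;
    splitRatio += k * (target - splitRatio);

    // Keep the split within sane bounds.
    if (splitRatio < 0.1f) splitRatio = 0.1f;
    if (splitRatio > 0.9f) splitRatio = 0.9f;
    return splitRatio;
}
```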

    You could also tile the scene instead of splitting it. If you have a precise culling system, splitting the rendering into, for example, 128x128 or 256x256 pixel tiles is practical. If the tile fits completely inside the ROP caches (or L2 on Nvidia, or L3 on Intel), there is zero memory bandwidth cost for overdraw. Blending to an HDR render target gets the biggest wins (I have measured a 2x+ performance gain in a simple, huge-overdraw particle blending case). GPU caches are getting bigger, meaning you could use bigger tiles -> less vertex overhead (sub-object culling is not pixel precise).
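    Whether a tile stays on chip is simple arithmetic. The following illustrative helper (with an assumed FP16 HDR color target plus 32-bit depth, and an assumed 2 MiB cache) shows why tiles in the 128x128 to 256x256 range line up with a few megabytes of on-chip storage.

```cpp
// Illustrative check: does a render-target tile fit in an on-chip cache?
// Assumes an FP16 HDR color target (8 bytes/pixel) plus 32-bit depth (4 bytes/pixel).
#include <cstdio>

bool TileFitsInCache(unsigned tileSize, unsigned bytesPerPixel, size_t cacheBytes)
{
    return size_t(tileSize) * tileSize * bytesPerPixel <= cacheBytes;
}

int main()
{
    const unsigned bytesPerPixel = 8 + 4;            // RGBA16F color + D32 depth
    const size_t   cacheBytes    = 2u * 1024 * 1024; // assumed 2 MiB GPU cache

    for (unsigned tile = 64; tile <= 512; tile *= 2)
        std::printf("%ux%u tile: %zu KiB, fits: %s\n",
                    tile, tile,
                    size_t(tile) * tile * bytesPerPixel / 1024,
                    TileFitsInCache(tile, bytesPerPixel, cacheBytes) ? "yes" : "no");
}
// 128x128 -> 192 KiB (fits), 256x256 -> 768 KiB (fits), 512x512 -> 3072 KiB (does not)
```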

    Tiled rendering also means that you have the final depth + g-buffer of some regions ready very early in the frame. You can start lighting and processing these regions using asynchronous compute while you are rendering more tiles with the ROPs + geometry pipe. This way you can utilize the fixed-function hardware for a much longer period of time during a frame (get more out of the triangle rate and fill rate), while simultaneously filling the CUs with compute shader work. This is just one way to do it (with its own drawbacks, of course). Explicit multi-adapter and asynchronous compute give some nice new ways to implement things efficiently. But big changes to engine architecture are unfortunately needed to take full advantage of these new features. Maybe we need to wait a few years until the big engines catch up.
     
    Kej, Kaarlisk, Ethatron and 8 others like this.
  15. sebbbi

    Veteran

    Joined:
    Nov 14, 2007
    Messages:
    2,924
    Likes Received:
    5,288
    Location:
    Helsinki, Finland
    DX11 shared resources (between two devices) have huge limitations:
    https://msdn.microsoft.com/en-us/library/windows/desktop/ff476531(v=vs.85).aspx

    I haven't tried two devices + shared resources, but I would expect that performance is not that great, as this feature is not designed for multi-GPU use (you need to flush manually to see the results, most likely reducing the parallelism).
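    For anyone who wants to experiment despite that, here is a rough sketch (mine, not sebbbi's; error handling omitted) of the mechanism in question: create a texture with the shared misc flag on one device, fetch its shared handle, flush, and open it on the second device. The manual Flush is exactly the part that is likely to cost parallelism.

```cpp
// Rough sketch: sharing a texture between two D3D11 devices via a shared handle
// (error handling omitted; this is the legacy mechanism discussed above).
#include <d3d11.h>
#include <dxgi.h>
#include <wrl/client.h>

using Microsoft::WRL::ComPtr;

void ShareTextureAcrossDevices(ID3D11Device* deviceA, ID3D11DeviceContext* contextA,
                               ID3D11Device* deviceB)
{
    // Create a shareable texture on device A.
    D3D11_TEXTURE2D_DESC desc = {};
    desc.Width = 1920; desc.Height = 1080;
    desc.MipLevels = 1; desc.ArraySize = 1;
    desc.Format = DXGI_FORMAT_R8G8B8A8_UNORM;
    desc.SampleDesc.Count = 1;
    desc.Usage = D3D11_USAGE_DEFAULT;
    desc.BindFlags = D3D11_BIND_RENDER_TARGET | D3D11_BIND_SHADER_RESOURCE;
    desc.MiscFlags = D3D11_RESOURCE_MISC_SHARED;

    ComPtr<ID3D11Texture2D> texA;
    deviceA->CreateTexture2D(&desc, nullptr, &texA);

    // Get the shared handle through the DXGI interface of the resource.
    ComPtr<IDXGIResource> dxgiRes;
    texA.As(&dxgiRes);
    HANDLE sharedHandle = nullptr;
    dxgiRes->GetSharedHandle(&sharedHandle);

    // ... render into texA on device A ...
    contextA->Flush(); // required before device B can reliably see the results

    // Open the same resource on device B.
    ComPtr<ID3D11Texture2D> texB;
    deviceB->OpenSharedResource(sharedHandle, IID_PPV_ARGS(&texB));
    // texB can now be used as a shader resource on device B.
}
```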
     
  16. DavidGraham

    Veteran

    Joined:
    Dec 22, 2009
    Messages:
    2,773
    Likes Received:
    2,560
  17. CSI PC

    Veteran Newcomer

    Joined:
    Sep 2, 2015
    Messages:
    2,050
    Likes Received:
    844
    And also 4K (I appreciate that they are limited in their testing), which is not really the best resolution to test these at, IMO.
    Cheers
     
  18. gamervivek

    Regular Newcomer

    Joined:
    Sep 13, 2008
    Messages:
    715
    Likes Received:
    220
    Location:
    india
    PCPer's review of the 1080 uses High too, though it does show the 980 Ti in the lead at 1440p while being equal at 4K.

    As for the HBAO+ option, I'm not sure if it's there, but it giving Nvidia the lead wouldn't be a surprise.

    Surely 4K High is better than 4K Ultra if you're using that reasoning.
     
  19. CSI PC

    Veteran Newcomer

    Joined:
    Sep 2, 2015
    Messages:
    2,050
    Likes Received:
    844
    I am saying it is better to use 1440p with uncapped frame rates, along with showing 4K.
    I would say not even the 1080 FE is truly designed, in terms of the HW spec/technology implemented, as a single-card solution for 4K when playing games with enthusiast settings - all of the games they benchmarked show sub-optimal fps at 4K (let alone what a frame-time analysis would show).
    Sure, you can make it work, but you really need to see a broad range of resolutions, and ideally 1440p.
    I assume they used 4K to overcome forced VSYNC?
    Although I thought that was now resolved.
    Cheers
     
  20. Razor1

    Veteran

    Joined:
    Jul 24, 2004
    Messages:
    4,232
    Likes Received:
    749
    Location:
    NY, NY

    Yep, that is true. Different frames are going to give us different results, which is why I always say every benchmark is valid; you just have to average them out to get the overall result. Yeah, we can have outliers favoring one of the IHVs, but they will, or at least should, even themselves out.
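    As an aside on the averaging itself, a tiny illustrative sketch (the numbers are made up): combine the per-game performance ratios with a geometric mean, so a single outlier can't dominate the summary.

```cpp
// Illustrative: averaging per-game performance ratios (card A fps / card B fps)
// with a geometric mean, so no single outlier dominates the summary.
#include <cmath>
#include <cstdio>
#include <vector>

double GeometricMean(const std::vector<double>& ratios)
{
    double logSum = 0.0;
    for (double r : ratios)
        logSum += std::log(r);
    return std::exp(logSum / ratios.size());
}

int main()
{
    // Hypothetical per-game ratios: >1 means card A is faster in that game.
    std::vector<double> ratios = { 1.10, 0.95, 1.02, 1.25, 0.90 };
    std::printf("Overall: card A is %.1f%% faster on average\n",
                (GeometricMean(ratios) - 1.0) * 100.0);
}
```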
     