DX12 Performance Discussion And Analysis Thread

Discussion in 'Rendering Technology and APIs' started by A1xLLcqAgt0qc2RyMz0y, Jul 29, 2015.

  1. Silent_Buddha

    Legend

    Joined:
    Mar 13, 2007
    Messages:
    19,423
    Likes Received:
    10,316
    And how exactly are AMD going to help them if they can't see the source code?

    Regards,
    SB
     
  2. Unknown Soldier

    Veteran

    Joined:
    Jul 28, 2002
    Messages:
    4,047
    Likes Received:
    1,669
    He is talking about the source code of GameWorks, not the game's own source code. The developer should have access to their own source code, don't you think? ;)
     
    Razor1 likes this.
  3. 3dcgi

    Veteran Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    2,493
    Likes Received:
    474
    Source code access is great, but IHVs writing code isn't the only way to provide devrel assistance. AAA game studios have some amazing programmers so teaching them how your hardware works is sometimes the best approach. Then these programmers teach others via conferences, etc.
     
  4. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    11,714
    Likes Received:
    2,135
    Location:
    London
    You'll have to name/quote them to be taken seriously round here with that statement.
     
  5. CarstenS

    Legend Subscriber

    Joined:
    May 31, 2002
    Messages:
    5,800
    Likes Received:
    3,920
    Location:
    Germany
    Generally, I obviously agree with Jawed here, but:
    While that's of course true as well, it does not mention explicitly that both of these choices are design decisions by the hardware engineers. The more technical people will of course grasp that part implicitly, but for regular users like me, its implications should be spelled out more clearly:

    AMD chose to have more FLOPS/mm² at the cost of needing special software love to reach high utilization.
    Nvidia chose to have fewer FLOPS/mm² in favor of being able to use a higher percentage of them more often.

    Asynchronous compute queues and concurrent execution are not as black and white as marketing would like you to believe, and as many people even here on B3D constantly repeat as if they were getting something out of it. Neither is tessellation. It's all about balance. With the adoption of concurrent execution, this balance now shifts from favoring Nvidia's approach to a better trade-off between high FLOPS density and high utilization in a broader range of cases.
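To put that balance argument in toy numbers (all figures below are invented purely for illustration, not real specs of either vendor), effective throughput per area is density times utilization:

```python
# Invented figures, for illustration only: effective throughput per
# area = peak FLOPS density x percentage of peak actually utilized.
amd    = {"flops_per_mm2": 100, "baseline_util_pct": 70, "async_util_pct": 90}
nvidia = {"flops_per_mm2":  80, "baseline_util_pct": 90, "async_util_pct": 90}

def effective(gpu, util_key):
    """Effective FLOPS/mm2 at the given utilization level."""
    return gpu["flops_per_mm2"] * gpu[util_key] // 100

# Without concurrent execution, the denser design sits idle more often
# and the leaner, better-utilized design comes out slightly ahead.
print(effective(amd, "baseline_util_pct"), effective(nvidia, "baseline_util_pct"))  # 70 72
# With concurrent execution filling the idle slots, the extra density pays off.
print(effective(amd, "async_util_pct"), effective(nvidia, "async_util_pct"))        # 90 72
```

The crossover is the whole point: which design wins depends on how much of the peak the software can actually keep busy.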

    Yes, they are definitely catching up!
     
  6. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    11,714
    Likes Received:
    2,135
    Location:
    London
    PC graphics is just playing catchup with console graphics. CPU and GPU compute on consoles has kept PC looking like an afterthought.
     
    egoless likes this.
  7. Razor1

    Veteran

    Joined:
    Jul 24, 2004
    Messages:
    4,232
    Likes Received:
    749
    Location:
    NY, NY

    How about myself? :)

    I have worked on teams that received support from AMD on console games, but the support is minimal: a question here, a question there. The only time AMD spends a good deal of time is when something goes horribly wrong, and then only at the request of the developers. I have seen that happen just once, and it was actually an issue with an engine that wasn't developed by the team.
     
  8. CSI PC

    Veteran

    Joined:
    Sep 2, 2015
    Messages:
    2,050
    Likes Received:
    844
    Remember, I keep saying my context is current-generation consoles. FFXIV came way before the Xbox One and PS4, and it is also unusual in being an MMOG; in the beginning the PC port was nowhere near as good as the console versions. But again, it is outside my context anyway, apart from my point that Square Enix/EA and some other AAA multi-platform studios thought console gaming was dying before the current generation of consoles, which affected their approach at the time. Ubisoft focused on consoles for the development of Watch Dogs 2, which as already mentioned is highly optimised for AMD and sounds like it will use low-level features from GPUOpen, unlike the original Watch Dogs, which was developed primarily for PC because the PS4 and Xbox One were not yet around.
    Ah yes, you're right, Nixxes is not used by all studios within the Square Enix umbrella, thanks.
    Anyway regarding the normal games.
    Look up the Luminous Studio 1.5 engine.
    It is specifically being used to develop FFXV on consoles, to be ported to PC later. They used Nvidia hardware (multiple Titan X cards were needed) for the grunt to demo the tech, and it relies on brute force for now; even Square Enix says there is no optimisation yet, as all development is for consoles, and there is not even a release date for the PC version built on this multi-platform engine that the demo came from.

    Can you provide an actual recent example where Nvidia is heavily involved with optimisation in the early stages, with Square Enix on record mentioning it like they do with AMD? Maybe in the Nvidia thread, as I'm happy to discuss it there.
    Regarding previous FF games, sorry, but they are not exactly known for the quality of their PC ports from console; in fact the ports can be pretty dire due to the budget, resources and priorities involved.
    GameWorks is used in many instances because it provides a quicker and easier way to deliver a game to PC as a port, and some ports are crap because of the lesser importance placed on them; but again, the core technology and development are all derived from the focus on consoles, as the engines are primarily console-first with PC second: Hitman, RoTR (some core visual quality features on console/AMD GPUs and not Nvidia), FFXV, Deus Ex: Mankind Divided (the latest game), etc. Just to emphasize, my context is the period since the latest consoles, which AMD controls.
    Anyway, GameWorks is a red herring IMO, as it can be turned off. CD Projekt has had a strong relationship with Nvidia going back years, and yet The Witcher 3 runs very well on AMD; GTA V uses technology from both and runs well; RoTR, as already mentioned, has visual quality features specific to console/AMD, and the latest DX12 async patch greatly improves AMD performance in the game.
    The only games I can think of that will always impede AMD are the Fallout releases from Bethesda, and that is because Nvidia worked closely with them to create a technology solution more deeply integrated with the engine. This goes beyond GameWorks and is something they have done for years as a technical partnership going back to The Elder Scrolls III.
    What no article has ever focused on is that GameWorks is used these days more as a cheap way to bolt on features for a PC port than as something actually core to the game, and the resulting poor port quality can be seen more often IMO; Batman: Arkham Knight is a classic example of this (and its problems go beyond GameWorks).
    But maybe that subject should also be taken to the Nvidia thread, where I'm happy to talk about it.
    Thanks
     
    #1588 CSI PC, Jul 18, 2016
    Last edited: Jul 18, 2016
    Ext3h likes this.
  9. Silent_Buddha

    Legend

    Joined:
    Mar 13, 2007
    Messages:
    19,423
    Likes Received:
    10,316
    FFXIV was started on PC. A console version didn't appear until approximately 1 year after official launch. That launch also featured a reboot of the game (FFXIV: V2.0) which included a complete engine rewrite, as the current live version of the game could not be made to run on PS3. That was PS3/X360 generation.

    For the XBO/PS4 generation they created a new engine (FFXIV: Heavensward) as well as radically changing the engine on PC. On PC it was pretty much exclusively Nvidia that helped them with the Dx11 version (the Dx11 effects were not ported to PS4) introduced with Heavensward. That goes part of the way to explaining the massive performance advantage Nvidia has in the modern FFXIV engine. They have been investigating porting some of the effects from the Dx11 version over to the PS4, but haven't released anything yet. It's been over a year now since it released, so I'm guessing it's been difficult getting the Nvidia code/effects ported to the AMD hardware in the PS4.

    FFXIV is their biggest money maker currently, so they don't skimp on development budget or teams for it.

    Their Japan studios don't release much onto PC outside of Final Fantasy, so it's difficult to say whether the situation remains the same. I guess we'll see whenever FFXV comes to PC. I know a lot about FFXIV due to listening/reading all their developer livestreams which happen every few months.

    Regards,
    SB
     
    #1589 Silent_Buddha, Jul 18, 2016
    Last edited: Jul 18, 2016
  10. CSI PC

    Veteran

    Joined:
    Sep 2, 2015
    Messages:
    2,050
    Likes Received:
    844
    The original FFXIV Online was a disaster; nearly every article is scathing about it.
    Yes, there was the reboot, which also helped resolve the original's problems on PC and is better in some ways on PC, because the PS3 is too limited to run such an MMOG.
    But the one really worth discussing is, as you say, the XBO/PS4 version that came out in 2014.
    http://www.eurogamer.net/articles/digitalfoundry-final-fantasy-14-face-off
    So where do you want to draw the line with FFXIV: the original, the reboot where it was also improved for PC, or the finished product on PS4/Xbox One? Either way, it is still an MMOG.
    Is this subject really worth distracting the thread from what my posts were about, when I am focusing on current consoles and recent game development?
    I used to play a fair amount of MMOGs with others, and the preference for FFXIV seemed to favour the PS4.
    Cheers
     
    #1590 CSI PC, Jul 18, 2016
    Last edited: Jul 18, 2016
  11. Ext3h

    Regular

    Joined:
    Sep 4, 2015
    Messages:
    428
    Likes Received:
    497
    To bring up the old topic of Nvidia, Maxwell and Async Compute again.

    I believe we have been looking in the wrong spot all the time.

    What happens when we request multiple queues from the OS on Maxwell hardware?
    • The OS attempts to hand down the request to the driver, requesting a fresh queue. (Simplified)
    • If the driver fails to deliver, the OS creates an emulated software queue on an existing one.
    • The OS handles the scheduling, based on events it receives from the driver on the hardware queues allocated.
    So, when we start scheduling otherwise identical command buffers to multiple queues instead of a single one, what can possibly happen?
    1. The OS coincidentally produces the very same execution schedule which the developer had hand tuned with async off.
    2. The OS produces a different execution schedule.
    In the first case, we are not going to see any difference between the use of a dedicated compute queue or not.

    In the second case, results can hugely vary:
    • Depending on the application, the order of execution has no impact on performance, as the pipeline states are either compatible, or all barriers are unavoidable either way.
    • The OS might find a better schedule than the developer did. As e.g. observed with the Fable Legends demo.
    • The schedule found by the OS induces additional stalls which did not occur in the hand tuned execution schedule.
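The fallback and rescheduling behaviour described above can be sketched as a toy simulation (the class, names and limits are all invented for illustration; this is not the actual WDDM/DX12 machinery):

```python
# Toy model of the behaviour described above: queue creation never
# fails -- past the hardware limit the runtime hands back an emulated
# software queue -- and a software scheduler then interleaves work
# across queues, possibly diverging from the hand-tuned order.

class ToyRuntime:
    def __init__(self, hw_queue_limit):
        self.hw_queue_limit = hw_queue_limit
        self.queues = []  # each queue is (kind, list of pending work)

    def create_queue(self):
        kind = "hardware" if len(self.queues) < self.hw_queue_limit else "software"
        queue = (kind, [])
        self.queues.append(queue)
        return queue

    def submit(self, queue, work):
        queue[1].append(work)

    def execute(self):
        # Crude round-robin stand-in for the OS software scheduler.
        order = []
        while any(pending for _, pending in self.queues):
            for _, pending in self.queues:
                if pending:
                    order.append(pending.pop(0))
        return order

rt = ToyRuntime(hw_queue_limit=1)
gfx = rt.create_queue()      # gets the one hardware queue
compute = rt.create_queue()  # over the limit, but no failure: emulated
rt.submit(gfx, "draw_0")
rt.submit(gfx, "draw_1")
rt.submit(compute, "dispatch_0")
print(compute[0])    # software
print(rt.execute())  # ['draw_0', 'dispatch_0', 'draw_1']
```

With a hardware limit of one, the second queue silently becomes a software queue, and the round-robin "OS" schedule interleaves the compute dispatch between the two draws rather than preserving the single-queue submission order, which is exactly the kind of reordering cases 1 and 2 above distinguish.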

    Either way, this isn't even up to the driver. The corresponding scheduler is part of Windows 10 and the DX12 runtime environment.

    Nvidia most likely never lied when they said they didn't activate Async in the driver. They didn't. That was MS enabling the emulation layer, and hence also the software scheduling.

    I suspect this is also the reason why Nvidia can't fix the performance penalty in AotS and the like - it's simply not in their domain.

    It also explains why cross-testing with older driver versions couldn't replicate past results / performance problems and/or bugs.
    It's not the driver which makes a difference, but the updates to Windows 10.


    (There are quite a lot of assumptions in this post, but if it holds true, it would mean that we have wrongly accused Nvidia for the past year. At least for the technical problems, not for the lack of communication.)
     
    Malo and CSI PC like this.
  12. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    11,714
    Likes Received:
    2,135
    Location:
    London
    A cross platform engine?
     
  13. Razor1

    Veteran

    Joined:
    Jul 24, 2004
    Messages:
    4,232
    Likes Received:
    749
    Location:
    NY, NY
    Yeah, can't talk about it, still under NDA, but man, the engine blows :/ And the company we got it from (they were bought out) gave us crap tech support.
     
  14. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    11,714
    Likes Received:
    2,135
    Location:
    London
    Such as?
     
  15. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,579
    Likes Received:
    4,799
    Location:
    Well within 3d
    What is falling under the category of OS in this, a Windows-level program? The kernel driver?

    In the case of the Fibonacci program in this thread, didn't an earlier version that tried to spawn off many compute queues demonstrate that attempts to allocate a new queue would continue until it exceeded whatever implementation limit there was and crash?
    If even the initial compute queue setup being submitted to the driver cannot succeed, what obligation does the OS have to step in for the failure?

    If the driver and the kernel operations for allocating a queue outside of kernel space succeed, does the OS care what happens in that queue after that?
    My impression is that beyond the initial allocation and portions related to final submission, the idea is that the OS or privileged operations are kept to minimum.
     
  16. Silent_Buddha

    Legend

    Joined:
    Mar 13, 2007
    Messages:
    19,423
    Likes Received:
    10,316
    Oh yes, I'm quite aware of the disaster that FFXIV v1.0 was. I was a BETA tester for them after all. :) It was a console-centric design that nonetheless couldn't run on a console.

    The reboot with FFXIV 2.0 was a welcome change for the most part. The UI was made far more PC friendly while also implementing a different UI for the console or when using a console controller on PC.

    However, with that reboot came a rather large graphical downgrade on PC as everything was redone (art, assets, game mechanics, etc.) such that it could be used on the PS3 version of the game. For example, while in parties you could display health and mana pools for players but not stamina bars. That was a concession to the limited memory of the PS3. Another example is the relatively small levels. The more detailed the level the smaller it had to be. They had to be very careful with not doing anything to exceed the memory they had to work with on the PS3.

    With FFXIV, I've been mainly referring to the PS4 and PC versions and excluding the current PS3 client, as it's relatively irrelevant to the conversation. That said, the PS3 still imposes constraints on FFXIV game design, since it's a supported platform: the previously missing stamina bar has finally been implemented across all platforms, but they had to remove another feature from all platforms to make it fit into their memory budget for the PS3. The PC version features a lot of Dx11 effects that do not exist in the PS4 version, and virtually all of them were implemented with help from Nvidia. In either their last live letter or the one before it, a PS4 user asked whether those graphics features would ever make it to the PS4, and all they could say was that they were attempting to port those features over and wanted to get them onto PS4, but had nothing to announce yet.

    Regards,
    SB
     
  17. Ext3h

    Regular

    Joined:
    Sep 4, 2015
    Messages:
    428
    Likes Received:
    497
    To be honest: I'm not sure. I'm not familiar enough with how the software stack is structured to give a proper reasoning.
    All I did understand from the explanation given to me, is that the scheduler is in fact part of the OS.
    Kernel mode or part of the user-space runtime? No clue, though kernel mode appears likely, since it's also responsible for scheduling concurrent execution of multiple 3D-accelerated applications. Definitely not part of the driver, or in any way exposed to it.
    On hardware not supporting multiple queues of any of the 3 types, it performs a transparent mapping, both from the perspective of the application and the driver.

    I couldn't find the contract which defines any of this behavior. And yet something in the stack voluntarily provides these emulated queues.

    Going by the API specs, queue allocation should have been able to fail in case of over allocation. It doesn't.

    That part with the Fibonacci program?
    Yet another case where the behavior isn't replicable any more. It used to fail (apparently in the edge case where the hardware could provide a number of dedicated queues first, and the OS didn't reserve some for software scheduling?), but now it continues to scale beyond the hardware limit as well, and the hardware limits only show up as a step function in the performance profile.

    Btw: fences are apparently not even remotely as low-level as they ought to be either, even when the hardware supports them and could effectively provide zero-latency synchronization.
    They are handled by the same scheduler, whether you want it or not. (Not sure about this one either, though: for synchronization points placed on an exclusively allocated queue, e.g. a compute queue on GCN hardware, the latency appears to be much smaller than when synchronizing on a shared queue. So there might still be some kind of fast track.)
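A toy model of that suspected scheduler-mediated path (the tick period and numbers are invented): if the waiter is only woken on the scheduler's next tick, fence latency is quantized, while a direct hardware signal would add none.

```python
# Toy model: a fence wait serviced by a software scheduler only
# resumes on the scheduler's next tick, while a hypothetical direct
# hardware path would resume at the signal time itself.  The 4-unit
# tick period is an invented number for illustration.

SCHEDULER_PERIOD = 4

def wakeup_time(signal_time, via_scheduler):
    if not via_scheduler:
        return signal_time  # hypothetical zero-latency hardware signal
    # Round up to the next scheduler tick (ceiling division).
    ticks = -(-signal_time // SCHEDULER_PERIOD)
    return ticks * SCHEDULER_PERIOD

print(wakeup_time(5, via_scheduler=False))  # 5: direct path, no added latency
print(wakeup_time(5, via_scheduler=True))   # 8: waits for the next tick
```

The speculated "fast track" for exclusively allocated queues would correspond to the `via_scheduler=False` path here.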
     
    ieldra likes this.
  18. CSI PC

    Veteran

    Joined:
    Sep 2, 2015
    Messages:
    2,050
    Likes Received:
    844
    I would say mainly because it is still a historical engine, used in A Realm Reborn before they had fully come to grips with the current consoles; in any case GameWorks is more of a 'bolt-on' to the optimised engine/core features. They specifically developed Luminous 1.5/2.0 for the current consoles and for use on PC with the latest FF game, although there is talk they will switch to an external multi-platform engine for remakes. Either engine is designed for multi-platform low-level API development, although we will have to see whether the remakes are designed/optimised for DX12.
    Just curious what Nvidia did on FFXIV: A Realm Reborn, though; can you provide any references for it?
    I cannot find any details myself, although I do know they worked closely with some other MMOGs.

    One interesting consideration: how long before MMOGs start to look at DX12 and the ways it can help performance in massive-scale battles/raids? Although I appreciate this will not help the backend.
    Thanks
     
    #1598 CSI PC, Jul 19, 2016
    Last edited: Jul 19, 2016
  19. ieldra

    Newcomer

    Joined:
    Feb 27, 2016
    Messages:
    149
    Likes Received:
    116

    Guild Wars 2 is so CPU-bottlenecked that my i7 920 system (3800 MHz) couldn't render the entirety of the Crown Pavilion in Divinity's Reach. For the longest time I thought it had been bombed or something in the story of the game; imagine my surprise after upgrading to Haswell to find that it actually looks pretty decent, lol.


    World of Warcraft is the only MMO I've played that scales fairly well in scenes with many units; all the others I've played have been bad. I remember Age of Conan managing large battles fairly well, but it's been many years since I played, so I could be remembering wrong.
     
    CSI PC likes this.
  20. Silent_Buddha

    Legend

    Joined:
    Mar 13, 2007
    Messages:
    19,423
    Likes Received:
    10,316
    If I had the time I could try tracking it down. It was in one of the Japanese developer livestreams. But each one is 2-3 hours long and filled mostly with non-technical information. Unfortunately, I just don't have time to go through all the livestreams to find where it was mentioned. Sorry.

    It's one of those things where it'd be greatly beneficial to MMORPGs (typically CPU limited) but wouldn't be financially feasible. MMOs generally need to support as large a player base as possible. With rare exceptions I don't expect Dx9 to be abandoned by the majority of MMOs any time soon, and once it is abandoned, Dx10/11 will become what MMO engines are based on. So it'll be a long, long time before we see an MMO designed with Dx12/Vulkan in mind.

    What we may see is partial support, where the game is designed for Dx10/11 but has a Dx12 path to take partial advantage of it, similar to all the AAA games that currently support Dx12/Vulkan. Even the recent Doom doesn't feature a full Vulkan rendering engine. I'd expect we might see something from Blizzard in a year or two, possibly. It's quite likely they've already started to look at it, but they won't go all-in until the install base is much higher. Currently it appears only Pascal and GCN based cards can take full advantage of Dx12/Vulkan.

    Regards,
    SB
     
    ieldra likes this.