DX12 Performance Discussion And Analysis Thread

Can you tell me which legally binding agreements those are? Because I haven't seen a single one that says a team can't work with AMD if they are using GameWorks. Yes, they can't share GameWorks code with AMD if the team has purchased the source code, but outside of that, there is nothing in there.

And how exactly are AMD going to help them if they can't see the source code?

Regards,
SB
 
Source code access is great, but IHVs writing code isn't the only way to provide devrel assistance. AAA game studios have some amazing programmers, so teaching them how your hardware works is sometimes the best approach. Then these programmers teach others via conferences, etc.
 
Guys, it depends on the company, the IHV's preference to work with the publisher, and whether it's a marquee title for them. AMD does work with companies that develop exclusively for consoles as well.
You'll have to name/quote them to be taken seriously round here with that statement.
 
Generally, I obviously agree with Jawed here, but:
GCN has been substantially more limited in its geometry throughput when compared with competing NVidia cards, so during these passes (which most games with high-end graphics have) the proportion of frame time for these passes is much greater than on NVidia.
[…]
Having, generally, more FLOPS than the competing NVidia cards, this has been a double-hurt for GCN: longer time spent with more ALU capability sat idle.
While that's of course also true, it does not explicitly mention that both of these traits are design decisions by the hardware engineers. The more technical people will of course implicitly grasp that part, but for regular users like me, the implications should be spelled out more clearly:

AMD chose to have more FLOPS/mm² over having higher utilization without special software love.
Nvidia chose to have fewer FLOPS/mm² in favor of being able to use a higher percentage of them more often.

Asynchronous compute queues and concurrent execution are not as black and white as marketing would like you to believe, and as many people even here on B3D constantly repeat as if they were getting something out of it. Neither is tessellation. It's all about balance. With the adoption of concurrent execution, this balance now shifts from favoring Nvidia's approach toward a better trade-off between high FLOPS density and high utilization in a broader range of cases.
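To make the pattern concrete, here is a minimal sketch of the standard D3D12 way to set this up (my own illustration, assuming a valid ID3D12Device and pre-recorded command lists of matching types; not anything from AMD or Nvidia documentation): a geometry-limited pass on the direct queue with an ALU-heavy kernel on a separate compute queue, which the hardware may or may not actually run concurrently.

```cpp
#include <d3d12.h>
#include <wrl/client.h>
using Microsoft::WRL::ComPtr;

// Submit a geometry-limited pass and an ALU-heavy compute kernel on separate
// queues. Two queues only *permit* overlap; whether they actually execute
// concurrently is up to the hardware and driver - the balance discussed above.
// Error handling omitted for brevity; the parameter names are hypothetical.
void SubmitOverlapped(ID3D12Device* device,
                      ID3D12CommandList* const* shadowPassLists, // geometry-limited work
                      ID3D12CommandList* const* computeLists)    // ALU-heavy work
{
    D3D12_COMMAND_QUEUE_DESC directDesc = {};
    directDesc.Type = D3D12_COMMAND_LIST_TYPE_DIRECT;
    ComPtr<ID3D12CommandQueue> directQueue;
    device->CreateCommandQueue(&directDesc, IID_PPV_ARGS(&directQueue));

    D3D12_COMMAND_QUEUE_DESC computeDesc = {};
    computeDesc.Type = D3D12_COMMAND_LIST_TYPE_COMPUTE;
    ComPtr<ID3D12CommandQueue> computeQueue;
    device->CreateCommandQueue(&computeDesc, IID_PPV_ARGS(&computeQueue));

    directQueue->ExecuteCommandLists(1, shadowPassLists);
    computeQueue->ExecuteCommandLists(1, computeLists);
}
```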

So when these passes are accompanied by compute kernels, the difference with asynchronous compute on AMD can be quite large.
Yes, they are definitely catching up!
 
PC graphics is just playing catch-up with console graphics. CPU and GPU compute on consoles has kept the PC looking like an afterthought.
 
You'll have to name/quote them to be taken seriously round here with that statement.


How about myself? :)

I have worked on teams with AMD giving support to console games, but the support is minimal: a question here, a question there. The only time AMD has to spend a good deal of time is if something goes horribly wrong, and that is at the request of the developers; I have only seen that happen once. It was actually an issue with an engine that wasn't developed by the team.
 
Only for their Eidos studio ports AFAIK.

Their older FF ports (the recent FFIX PC release, for example) use a studio based out of Singapore, and FFXIV is done in-house with assistance from Nvidia. It's not a surprise that the past few FF games have been partnered with Nvidia on both PC and console, as prior to this generation the PS3 was the lead platform.



It also means they can't work closely with AMD, as AMD would not be able to help them with anything involving or interfacing with GameWorks; not for lack of desire on AMD's part, but due to the legally binding agreements you have to make with Nvidia.

Regards,
SB
Remember, I keep saying my context is around current-generation consoles. FFXIV was way before the Xbox One and PS4, and FFXIV is unusual in being an MMOG; in the beginning the PC port was nowhere seen to be as good as the console versions, but again, it is outside my context anyway. Remember also that I said Square Enix/EA and some other AAA multi-platform studios thought console gaming was dying before the current generation of consoles, and this affected their approach before the current gen. Ubisoft focused on consoles for the development of Watch Dogs 2, which as already mentioned is highly optimised for AMD and sounds like it will have low-level features from GPUOpen, unlike Watch Dogs, which was developed primarily for PC because the PS4 and Xbox One were not around yet.
Ah yeah, you're right, Nixxes is not used by all studios under the Square Enix umbrella, thanks.
Anyway, regarding the normal games:
Look up the Luminous Studio 1.5 engine.
It is specifically being used to develop FFXV on consoles, with a PC port coming later. They used Nvidia hardware (multiple Titan Xs were needed) for the grunt to demo the tech, and it relies on brute force for now; even Square Enix says there is no current optimisation, as all development is for consoles. There is not even a release date for the PC version using this multi-platform engine that the demo came from.

Can you provide an actual recent example where Nvidia is heavily involved with the optimisation in the early stages, with Square Enix on record mentioning this like they do with AMD? Maybe in the Nvidia thread, as I'm happy to discuss it there.
Regarding previous FF games, sorry, but they are not exactly known for the quality of their console-to-PC ports; in fact they can be pretty dire due to the budget, resources and priorities involved.
GameWorks is used in many instances because it provides a quicker and easier way to deliver a game to PC in terms of the port, and that is also why some ports are crap: because of the lesser importance placed on them. But again, the core technology development is all derived from the focus on consoles, as the engine is primarily console-first with PC on top: Hitman, RoTR (some core visual quality features on console/AMD GPUs and not Nvidia), FFXV, Deus Ex: Mankind Divided (the latest game), etc. Just to emphasize, my context is the period since we have had the latest consoles, which AMD controls.
Anyway, GameWorks is a red herring IMO, as it can be turned off. CD Projekt has a strong relationship with Nvidia going back years, and yet Witcher 3 runs very well on AMD; GTA V uses technology from both and runs well; RoTR, as already mentioned, has visual quality features specific to console/AMD, and the latest DX12 async patch greatly improves AMD performance in the game.
The only games I can think of that will always impede AMD are the Fallout releases from Bethesda, and that is because Nvidia worked closely with them to create a technology solution more deeply integrated with the engine. This goes beyond GameWorks and is something they have done for years as a technical partnership going back to Elder Scrolls 3.
What no article has ever focused on is that GameWorks is used these days more as a cheap way to bolt on features for the PC port than as something actually core to the game, and the poor quality of the port development can be seen more often IMO; Batman: Arkham Knight is a classic example of this (and it goes beyond using GameWorks).
But maybe that subject should also be taken to the Nvidia thread, where I'm happy to talk about it.
Thanks
 
Remember, I keep saying my context is around current-generation consoles. FFXIV was way before the Xbox One and PS4, and FFXIV is unusual in being an MMOG; in the beginning the PC port was nowhere seen to be as good as the console versions, but again, it is outside my context anyway. Remember also that I said Square Enix/EA and some other AAA multi-platform studios thought console gaming was dying before the current generation of consoles, and this affected their approach before the current gen. Ubisoft focused on consoles for the development of Watch Dogs 2, which as already mentioned is highly optimised for AMD and sounds like it will have low-level features from GPUOpen, unlike Watch Dogs, which was developed primarily for PC because the PS4 and Xbox One were not around yet.

FFXIV was started on PC. A console version didn't appear until approximately 1 year after official launch. That launch also featured a reboot of the game (FFXIV: V2.0) which included a complete engine rewrite, as the current live version of the game could not be made to run on PS3. That was PS3/X360 generation.

For the XBO/PS4 generation they created a new engine (FFXIV: Heavensward) as well as radically changing the engine on PC. On PC it was pretty much exclusively Nvidia that helped them with the Dx11 version (whose Dx11 effects were not ported to PS4) introduced with Heavensward. That goes part of the way toward explaining the massive performance advantage Nvidia has in the modern FFXIV engine. They have been investigating porting some of the effects from the Dx11 version over to the PS4, but haven't released anything yet. It's been over a year now since it released, so I'm guessing it's been difficult getting the Nvidia code/effects ported to the AMD hardware in the PS4.

FFXIV is their biggest money maker currently, so they don't skimp on development budget or teams for it.

Their Japan studios don't release much onto PC outside of Final Fantasy, so it's difficult to say whether the situation remains the same. I guess we'll see whenever FFXV comes to PC. I know a lot about FFXIV due to listening/reading all their developer livestreams which happen every few months.

Regards,
SB
 
FFXIV was started on PC. A console version didn't appear until approximately 1 year after official launch. That launch also featured a reboot of the game (FFXIV: V2.0) which included a complete engine rewrite, as the current live version of the game could not be made to run on PS3. That was PS3/X360 generation.

For the XBO/PS4 generation they created a new engine (FFXIV: Heavensward) as well as radically changing the engine on PC. On PC it was pretty much exclusively Nvidia that helped them with the Dx11 version (whose Dx11 effects were not ported to PS4) introduced with Heavensward. That goes part of the way toward explaining the massive performance advantage Nvidia has in the modern FFXIV engine. They have been investigating porting some of the effects from the Dx11 version over to the PS4, but haven't released anything yet. It's been over a year now since it released, so I'm guessing it's been difficult getting the Nvidia code/effects ported to the AMD hardware in the PS4.

FFXIV is their biggest money maker currently, so they don't skimp on development budget or teams for it.

Their Japan studios don't release much onto PC outside of Final Fantasy, so it's difficult to say whether the situation remains the same. I guess we'll see whenever FFXV comes to PC. I know a lot about FFXIV due to listening/reading all their developer livestreams which happen every few months.

Regards,
SB
The original FFXIV Online was a disaster; nearly every article is scathing about it.
Yeah, there is the reboot, which also helped to resolve the original's problems on PC and is better in some ways on PC, because the PS3 is too limited to run such an MMOG.
But the one really worth it is, as you say, the XBO/PS4 version that came out in 2014.
http://www.eurogamer.net/articles/digitalfoundry-final-fantasy-14-face-off
Final Fantasy 14's initial launch was something of a disaster for Square Enix, temporarily tarnishing both a much-loved brand and the reputation of its developers with a product that felt unfocused and incomplete. Rushed out after an unusually short beta period where only a tiny part of the game was available for players to test (the promised PlayStation 3 beta test never materialised at all), the final release was plagued by game-breaking bugs, a convoluted menu system unsuited to modern MMORPGs, questionable design choices that made the experience frustrating to play, and an unoptimised graphics engine that led to poor performance on a wide range of PCs.



But despite these glaring criticisms Square Enix still hoped that the game would be successful based on the Final Fantasy name, and that fans would accept some of these issues while patches were developed to address them. But this wasn't the case, with a ferocious backlash against the game that saw players deserting it in their droves. This saw the eventual decision to discontinue the title and rework it into a brand new game, acting as a reboot and sequel to the events of Final Fantasy 14.

Known internally as Final Fantasy 14 Version 2.0, this mammoth undertaking saw a new game engine developed for the title, with reworked graphical sub-systems, more traditional MMO gameplay mechanics, an improved user interface, and better thought-out quests making up the bulk of the changes.
The end result of this extensive redesign is an experience that is far more polished and enjoyable to play than the original release, with none of the glaring issues which plagued the previous game.
So where do you want to draw the line with FFXIV: the original, the reboot where it was also improved for PC, or the finished product on PS4/Xbox One? Either way, it is still an MMOG.
Is this subject really worth distracting the thread from what my posts were about, when I am focusing on current consoles and recent game developments?
I used to play a fair amount of MMOGs with others, and the preference for what to play FFXIV on seemed to favour the PS4.
Cheers
 
To bring up the old topic of Nvidia, Maxwell and Async Compute again.

I believe we have been looking in the wrong spot all the time.

What happens when we request multiple queues from the OS on Maxwell hardware?
  • The OS attempts to hand down the request to the driver, requesting a fresh queue. (Simplified)
  • If the driver fails to deliver, the OS creates an emulated software queue on an existing one.
  • The OS handles the scheduling, based on events it receives from the driver on the hardware queues allocated.
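For what it's worth, from the application's side this is completely opaque. A minimal sketch (standard D3D12, my own illustration, assuming a valid ID3D12Device* device and the usual d3d12.h/ComPtr includes): the call below looks identical whether the driver backs it with a hardware queue or the OS substitutes an emulated, software-scheduled one.

```cpp
D3D12_COMMAND_QUEUE_DESC desc = {};
desc.Type = D3D12_COMMAND_LIST_TYPE_COMPUTE;

ComPtr<ID3D12CommandQueue> computeQueue;
HRESULT hr = device->CreateCommandQueue(&desc, IID_PPV_ARGS(&computeQueue));
// hr reports out-of-memory or device-removed conditions, not
// "no hardware queue available" - nothing distinguishes the two cases.
```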
So, when we start scheduling otherwise identical command buffers to multiple queues instead of a single one, what can possibly happen?
  1. The OS coincidentally produces the very same execution schedule which the developer had hand-tuned with async off.
  2. The OS produces a different execution schedule.
In the first case, we are not going to see any difference between using a dedicated compute queue or not.

In the second case, results can hugely vary:
  • Depending on the application, the order of execution has no impact on performance, as the pipeline states are either compatible, or all barriers are unavoidable either way.
  • The OS might find a better schedule than the developer did. As e.g. observed with the Fable Legends demo.
  • The schedule found by the OS induces additional stalls which did not occur in the hand-tuned execution schedule.

Either way, this isn't even up to the driver. The corresponding scheduler is part of Windows 10 and the DX12 runtime environment.

Nvidia most likely never lied when they said they didn't activate Async in the driver. They didn't. That was MS enabling the emulation layer, and hence also the software scheduling.

I suspect this is also the reason why Nvidia can't fix the performance penalty in AotS and the like; it's simply not in their domain.

It also explains why cross-testing with older driver versions couldn't replicate past results / performance problems and/or bugs.
It's not the driver which makes a difference, but the updates to Windows 10.


(There are quite a lot of assumptions in this post, but if it holds true, it would mean that we wrongly accused Nvidia for the past year. At least for the technical problems, not for the lack of communication.)
 
How about myself? :)

I have worked on teams with AMD giving support to console games, but the support is minimal: a question here, a question there. The only time AMD has to spend a good deal of time is if something goes horribly wrong, and that is at the request of the developers; I have only seen that happen once. It was actually an issue with an engine that wasn't developed by the team.
A cross-platform engine?
 
Yeah, can't talk about it, still under NDA, but man, the engine blows :/ And the company we got it from (since bought out) gave us crap tech support.
 
To bring up the old topic of Nvidia, Maxwell and Async Compute again.

I believe we have been looking in the wrong spot all the time.

What happens when we request multiple queues from the OS on Maxwell hardware?
  • The OS attempts to hand down the request to the driver, requesting a fresh queue. (Simplified)
  • If the driver fails to deliver, the OS creates an emulated software queue on an existing one.
  • The OS handles the scheduling, based on events it receives from the driver on the hardware queues allocated.

What falls under the category of the OS in this: a Windows-level program? The kernel driver?

In the case of the Fibonacci program in this thread, didn't an earlier version that tried to spawn off many compute queues demonstrate that attempts to allocate a new queue would continue until it exceeded whatever implementation limit there was and crash?
If even the initial compute queue setup being submitted to the driver cannot succeed, what obligation does the OS have to step in for the failure?

If the driver and the kernel operations for allocating a queue outside of kernel space succeed, does the OS care what happens in that queue after that?
My impression is that beyond the initial allocation and the portions related to final submission, the idea is that OS involvement and privileged operations are kept to a minimum.
 
The original FFXIV Online was a disaster; nearly every article is scathing about it.
Yeah, there is the reboot, which also helped to resolve the original's problems on PC and is better in some ways on PC, because the PS3 is too limited to run such an MMOG.
But the one really worth it is, as you say, the XBO/PS4 version that came out in 2014.

Oh yes, I'm quite aware of the disaster that FFXIV v1.0 was. I was a beta tester for them, after all. :) It was a console-centric design without the ability to run on a console.

The reboot with FFXIV 2.0 was a welcome change for the most part. The UI was made far more PC friendly while also implementing a different UI for the console or when using a console controller on PC.

However, with that reboot came a rather large graphical downgrade on PC, as everything was redone (art, assets, game mechanics, etc.) so that it could be used in the PS3 version of the game. For example, while in parties you could display health and mana pools for players, but not stamina bars; that was a concession to the limited memory of the PS3. Another example is the relatively small levels: the more detailed the level, the smaller it had to be. They had to be very careful not to do anything that would exceed the memory they had to work with on the PS3.

So where do you want to draw the line with FFXIV: the original, the reboot where it was also improved for PC, or the finished product on PS4/Xbox One? Either way, it is still an MMOG.
Is this subject really worth distracting the thread from what my posts were about, when I am focusing on current consoles and recent game developments?
I used to play a fair amount of MMOGs with others, and the preference for what to play FFXIV on seemed to favour the PS4.
Cheers

With FFXIV, I've been mainly referring to the PS4 and PC versions, excluding the current PS3 client as it's relatively irrelevant to the conversation, although it still imposes constraints on FFXIV game design since it's still a supported platform (the previously mentioned stamina bar has finally been implemented across all platforms, but they had to remove another feature from all platforms to make it fit into their memory budget for the PS3). The PC version features a lot of Dx11 effects that do not exist in the PS4 version. Virtually all of them were implemented with help from Nvidia. It was either their last live letter or the one before it where a PS4 user asked if those graphics features would ever make it to the PS4, and all they could state was that they were attempting to port those features over to the PS4 and wanted to get them there, but had nothing to announce yet.

Regards,
SB
 
What falls under the category of the OS in this: a Windows-level program? The kernel driver?

In the case of the Fibonacci program in this thread, didn't an earlier version that tried to spawn off many compute queues demonstrate that attempts to allocate a new queue would continue until it exceeded whatever implementation limit there was and crash?
If even the initial compute queue setup being submitted to the driver cannot succeed, what obligation does the OS have to step in for the failure?

If the driver and the kernel operations for allocating a queue outside of kernel space succeed, does the OS care what happens in that queue after that?
My impression is that beyond the initial allocation and the portions related to final submission, the idea is that OS involvement and privileged operations are kept to a minimum.
To be honest: I'm not sure. I'm not familiar enough with how the software stack is structured to give proper reasoning.
All I did understand from the explanation given to me is that the scheduler is in fact part of the OS.
Kernel mode or part of the user-space runtime? No clue, even though kernel mode appears likely, since it's also responsible for scheduling concurrent execution of multiple 3D-accelerated applications. Definitely not part of the driver, or in any way exposed to it.
On hardware not supporting multiple queues of any of the 3 types, it performs a transparent mapping, both from the perspective of the application and the driver.

I couldn't find the contract which defines any of this behavior. And yet something in the stack voluntarily provides these emulated queues.

Going by the API specs, queue allocation should have been able to fail in case of over-allocation. It doesn't.

That part with the Fibonacci program?
Yet another case where the behavior isn't replicable any more. It used to fail (apparently in the edge case where the hardware could provide a number of dedicated queues first, and the OS didn't reserve some for software scheduling?), but now it continues to scale beyond the hardware limit as well, and the hardware limits are only exposed by a step function in the performance profile.
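A minimal sketch of that kind of probe (my own illustration, not the actual Fibonacci test from this thread; the loop bound of 256 is arbitrary): keep creating compute queues far past any plausible hardware limit, and creation itself never fails; the limit only shows up as a step in throughput, never as an error.

```cpp
#include <d3d12.h>
#include <vector>
#include <wrl/client.h>
using Microsoft::WRL::ComPtr;

// Over-allocate compute queues and observe that CreateCommandQueue keeps
// succeeding even past the hardware queue count; the surplus queues are
// presumably software-scheduled onto existing hardware queues.
std::vector<ComPtr<ID3D12CommandQueue>> OverAllocateQueues(ID3D12Device* device)
{
    std::vector<ComPtr<ID3D12CommandQueue>> queues;
    D3D12_COMMAND_QUEUE_DESC desc = {};
    desc.Type = D3D12_COMMAND_LIST_TYPE_COMPUTE;

    for (int i = 0; i < 256; ++i)  // far beyond any hardware queue count
    {
        ComPtr<ID3D12CommandQueue> q;
        // Per the API this could fail on over-allocation;
        // in practice it doesn't.
        if (FAILED(device->CreateCommandQueue(&desc, IID_PPV_ARGS(&q))))
            break;
        queues.push_back(q);
    }
    return queues;
}
```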

Btw: fences are apparently not even remotely as low-level as they ought to be either, even if the hardware had support for them and could effectively provide zero-latency synchronization.
They are handled by the same scheduler, whether you want it or not. (Not sure about this one either, though, because for a synchronization point placed on an exclusively allocated queue, e.g. a compute queue on GCN hardware, the latency appears to be much smaller than when synchronizing a shared queue. So there might still be some type of fast track.)
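For reference, a minimal sketch of the cross-queue fence pattern in question (standard D3D12 API, my own illustration; the queues and command lists are assumed to already exist). How much latency the Signal/Wait pair costs, and whether it round-trips through the OS scheduler or takes a hardware fast path, is exactly the part that is opaque to the application.

```cpp
#include <d3d12.h>
#include <wrl/client.h>
using Microsoft::WRL::ComPtr;

// Make the direct queue wait (GPU-side) for work submitted on the compute
// queue, using a fence as the synchronization point.
void SyncComputeToGraphics(ID3D12Device* device,
                           ID3D12CommandQueue* computeQueue,
                           ID3D12CommandQueue* directQueue,
                           ID3D12CommandList* const* computeLists,
                           ID3D12CommandList* const* graphicsLists)
{
    ComPtr<ID3D12Fence> fence;
    device->CreateFence(0, D3D12_FENCE_FLAG_NONE, IID_PPV_ARGS(&fence));
    const UINT64 fenceValue = 1;

    // Compute queue runs its command lists, then signals completion...
    computeQueue->ExecuteCommandLists(1, computeLists);
    computeQueue->Signal(fence.Get(), fenceValue);

    // ...and the direct queue stalls until that signal arrives, before
    // consuming the compute results.
    directQueue->Wait(fence.Get(), fenceValue);
    directQueue->ExecuteCommandLists(1, graphicsLists);
}
```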
 
..... The PC version features a lot of Dx11 effects that do not exist in the PS4 version. Virtually all of them were implemented with help from Nvidia. It was either their last live letter or the one before it where a PS4 user asked if those graphics features would ever make it to the PS4, and all they could state was that they were attempting to port those features over to the PS4 and wanted to get them there, but had nothing to announce yet.

Regards,
SB
I would say mainly because it is still a historical engine, used in A Realm Reborn before they fully came to grips with the current consoles; anyway, GameWorks is more of a 'bolt-on' to the optimised engine/core features. They specifically developed Luminous 1.5/2.0 for the current consoles and to be used on PC for the latest FF game, although there is talk they will switch to an external multi-platform engine for the remakes. Either engine is designed for multi-platform, low-level API development, although we will have to see if the remakes are DX12-designed/optimised.
Just curious what Nvidia did on FFXIV: A Realm Reborn, though; can you provide any references for this?
I cannot find any details myself, although I do know they worked closely with some other MMOGs.

One interesting consideration: how long before MMOGs start to look at DX12 and the ways it can help performance in massive-scale battles/raids, although I appreciate this will not help the backend.
Thanks
 
I would say mainly because it is still a historical engine, used in A Realm Reborn before they fully came to grips with the current consoles. They specifically developed Luminous 1.5/2.0 for the current consoles and to be used on PC for the latest FF game, although there is talk they will switch to an external multi-platform engine for the remakes. Either engine is designed for multi-platform, low-level API development, although we will have to see if the remakes are DX12-designed/optimised.
Just curious what Nvidia did on FFXIV: A Realm Reborn, though; can you provide any references for this?
I cannot find any details myself, although I do know they worked closely with some other MMOGs.

One interesting consideration: how long before MMOGs start to look at DX12 and the ways it can help performance in massive-scale battles/raids, although I appreciate this will not help the backend.
Thanks


Guild Wars 2 is so CPU-bottlenecked that on my i7 920 system (3800 MHz) it wasn't able to render the entirety of the Crown Pavilion in Divinity's Reach. For the longest time I thought it had been bombed or something in the story of the game; imagine my surprise after upgrading to Haswell to find that it actually looks pretty decent lol

[Image: Guild Wars 2, Queen's Jubilee - The Crown Pavilion]

World of Warcraft is the only MMO I've played that scales fairly well in scenes with many units; all the others I've played have been bad. I remember Age of Conan managing large battles fairly well, but it's been many years since I played, so I could be remembering wrong.
 
Just curious what Nvidia did on FFXIV: A Realm Reborn, though; can you provide any references for this?
I cannot find any details myself, although I do know they worked closely with some other MMOGs.

If I had the time I could try tracking it down. It was in one of the Japanese developer livestreams. But each one is 2-3 hours long and filled mostly with non-technical information. Unfortunately, I just don't have time to go through all the livestreams to find where it was mentioned. Sorry.

One interesting consideration: how long before MMOGs start to look at DX12 and the ways it can help performance in massive-scale battles/raids, although I appreciate this will not help the backend.
Thanks

It's one of those things where it'd be greatly beneficial to MMORPGs (typically CPU-limited) but wouldn't be financially feasible. MMOs generally need to support as large a player base as possible. With rare exceptions, I don't expect Dx9 to be abandoned by the majority of MMOs any time soon, and once it is abandoned, Dx10/11 will become what MMO engines are based on. So it'll be a long, long time before we see an MMO designed with Dx12/Vulkan in mind.

What we may see is partial support, where the game is designed for Dx10/11 but has a Dx12 path to take partial advantage of it, similar to all current AAA games that support Dx12/Vulkan. Even the recent Doom doesn't feature a full Vulkan rendering engine. I'd expect that we might see something from Blizzard in a year or two, possibly. It's quite likely they've already started to look at it, but they won't go all in until the install base is much higher. Currently it appears only Pascal- and GCN-based cards can take full advantage of Dx12/Vulkan.

Regards,
SB
 