> Yes, but it's achieving the same results as the DX11 version.

But is it achieving the same results?
And secondly, does occupancy mean more CPU saturation?
> CPU occupancy is higher for DX12, drawing a lot more watts in the first video. Notably more RAM used too on both vids with stats.

Perhaps, but we're stretching the utility of those measurements. Things like power use are easy to swing wildly with voltage or turbo tweaks, among other things. Stuff like OS yields vs. busy waits can make a large difference to power use, but can also negatively affect latency.
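To make that yield-vs-busy-wait tradeoff concrete, here's a minimal, purely illustrative C++ sketch (not taken from either game) of two frame limiters: the sleeping one lets the OS idle the core and saves power at the cost of coarse wake-up timing, while the spinning one keeps a core pegged at 100% but hits its deadline much more precisely.

```cpp
#include <chrono>
#include <thread>

using Clock = std::chrono::steady_clock;

// Power-friendly: yield to the OS scheduler. The core can drop into a low-power
// state, but wake-up granularity (often ~1ms or worse) can add frame latency.
void wait_until_sleeping(Clock::time_point deadline) {
    std::this_thread::sleep_until(deadline);
}

// Latency-friendly: spin on the clock. Timing is as precise as the clock allows,
// but the core stays at 100% utilization and keeps drawing power.
void wait_until_spinning(Clock::time_point deadline) {
    while (Clock::now() < deadline) {
        // busy wait
    }
}

int main() {
    const auto frame_time = std::chrono::microseconds(16667); // ~60 fps target
    auto next_frame = Clock::now() + frame_time;

    for (int frame = 0; frame < 600; ++frame) {
        // ... simulate / render ...
        wait_until_sleeping(next_frame); // swap for wait_until_spinning() to trade power for latency
        next_frame += frame_time;
    }
}
```

Engines often blend the two (sleep until close to the deadline, then spin the remainder), which is part of why raw package-power numbers say so little on their own.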
> But is it achieving the same results?

Ah, I missed an important metric, which was framerate. DX12 is running 30% faster.
> Perhaps, but we're stretching the utility of those measurements. Things like power use are easy to swing wildly with voltage or turbo tweaks, among other things. Stuff like OS yields vs. busy waits can make a large difference to power use, but can also negatively affect latency.

I assume the comparison is performed fairly on the same system, not cranking up the voltage for one test versus another. Just from your first video, the voltages are the same.
> That's not to say that the delivered power efficiency is not worse here, just that it's a fickle measurement and not something that really generalizes to a measure of "goodness" of a game's path. Similarly, GPU power is not a good metric for whether a game is "utilizing the GPU well". There are a lot of layers of subtlety and complexity between these measurements and the goals of a given game/OS in terms of delivered performance.

As an end result, I think it fair to say power draw for what's on screen is a reasonable measure. If I offer you two consoles producing what look like nigh-identical on-screen results, where one uses 100W to produce those visuals and the other 200W, then regardless of how they are achieving that, the one producing the result with less power is more efficient. And if that's the only difference between them, then the lower-power option is the 'better' one for that game. I can appreciate that down the line the 200W console may do more, but that doesn't invalidate the data point on efficiency.
> Ah, I missed an important metric, which was framerate. DX12 is running 30% faster.

Oh. Well, not always. I was trying to get at that the results indicated it would sometimes be way ahead, sometimes similar, and sometimes ahead or behind. But the latency spikes were different, and the performance profiles appeared to be different based on the hardware benchmarked.
> As an end result, I think it fair to say power draw for what's on screen is a reasonable measure. If I offer you two consoles producing what look like nigh-identical on-screen results, where one uses 100W to produce those visuals and the other 200W, then regardless of how they are achieving that, the one producing the result with less power is more efficient. And if that's the only difference between them, then the lower-power option is the 'better' one for that game. I can appreciate that down the line the 200W console may do more, but that doesn't invalidate the data point on efficiency.

Right, but my point is that while that conclusion is valid for your specific game running on your specific console on that specific day, it just doesn't really generalize to anything useful. The issue is when people say "therefore console/API/whatever is more power efficient" in general, which is almost always the implication. But if these cases aren't even considering power efficiency as an optimization target, how is that result meaningful at all? If my game runs all the cores at 100% with busy waits to reduce latency in your twitch shooter to minimal levels, and burns power doing it, then CPUs with higher clocks and more cores will look way worse. Is that because they are fundamentally "less efficient" at producing the "same" result? Yes, in that one case, but that says literally nothing about that CPU's efficiency vs. another in any other task.
> This is counter to expectations of DX12 based on announcements for its release. It was supposed to speed up the PC, with things like zillions of asteroids being enabled. Something seems to have happened between the vision and the reality. What are we looking at for real with DX12? More features and better quality eventually, but at lower performance? That is, given a game DX11 can do, DX11 will do it faster with less energy, and DX12 only comes into its own when doing something DX11 can't do?

DX12 in particular focused primarily on speed: it promised less CPU overhead (so less chance of the game becoming single-threaded), more multi-core utilization, and a vastly higher draw call count than ever before (so drawing more stuff on screen without tremendously hurting performance). So far, none of that has materialized to any good capacity. If not coded for carefully, DX12 actually decreases fps, increases CPU overhead, introduces frequent Pipeline State Object (PSO) stuttering, has VRAM management issues, and lengthens loading times due to PSO compilation.
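To illustrate why the new model trades mid-game hitches for longer loads, here's a hypothetical C++ sketch (PipelineDesc, Pipeline and compile_pipeline() are stand-ins, not any real API): every known pipeline permutation is compiled on worker threads during the loading screen, instead of the first time a material is drawn.

```cpp
#include <future>
#include <vector>

struct PipelineDesc { /* shaders, blend/depth/raster state, render target formats, ... */ };
struct Pipeline     { /* compiled state object */ };

// Stand-in for the expensive driver compile that D3D12's CreateGraphicsPipelineState
// or Vulkan's vkCreateGraphicsPipelines would perform.
Pipeline compile_pipeline(const PipelineDesc&) { return Pipeline{}; }

// Compile every permutation we know about up front, in parallel, so the cost lands
// on the loading screen instead of causing a hitch on first use during gameplay.
std::vector<Pipeline> precompile_all(const std::vector<PipelineDesc>& descs) {
    std::vector<std::future<Pipeline>> jobs;
    jobs.reserve(descs.size());
    for (const auto& desc : descs)
        jobs.push_back(std::async(std::launch::async, compile_pipeline, desc));

    std::vector<Pipeline> pipelines;
    pipelines.reserve(jobs.size());
    for (auto& job : jobs)
        pipelines.push_back(job.get()); // blocking here is what lengthens load times
    return pipelines;
}
```

The catch, of course, is knowing the full set of permutations up front, which is exactly where many titles fall down.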
> Many of these assumptions have since proven to be unrealistic.
> On the application side, many developers considering or implementing Vulkan and similar APIs found them unable to efficiently support important use cases which were easily supportable in earlier APIs. This has not been simply a matter of developers being stuck in an old way of thinking or unwilling to "rein in" an unnecessarily large number of state combinations, but a reflection of the reality that the natural design patterns of the most demanding class of applications which use graphics APIs — video games — are inherently and deeply dependent on the very "dynamism" that pipelines set out to constrain.
> As a result, renderers with a choice of API have largely chosen to avoid Vulkan and its "pipelined" contemporaries, while those without a choice have largely just devised workarounds to make these new APIs behave like the old ones — usually in the form of the now nearly ubiquitous hash-n-cache pattern. These applications set various pieces of "pipeline" state independently, then hash it all at draw time and use the hash as a key into an application-managed pipeline cache, reusing an existing pipeline if it exists or creating and caching a new one if it does not. In effect, the messy and inefficient parts of GL drivers that pipelines sought to eliminate have simply moved into applications, except without the benefits of implementation specific knowledge which might have reduced their complexity or improved their performance.
> On the driver side, pipelines have provided some of their desired benefits for some implementations, but for others they have largely just shifted draw time overhead to pipeline bind time (while in some cases still not entirely eliminating the draw time overhead in the first place). Implementations where nearly all "pipeline" state is internally dynamic are forced to either redundantly re-bind all of this state each time a pipeline is bound, or to track what state has changed from pipeline to pipeline — either of which creates considerable overhead on CPU-constrained platforms.
> For certain implementations, the pipeline abstraction has also locked away a significant amount of the flexibility supported by their hardware, thereby paradoxically leaving many of their capabilities inaccessible in the newer and ostensibly "low level" API, though still accessible through older, high level ones. In effect, this is a return to the old problem of the graphics API artificially constraining applications from accessing the full capabilities of the GPU, only on a different axis.
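For anyone who hasn't seen it, the hash-n-cache pattern described a couple of paragraphs up looks roughly like this. A stripped-down C++ sketch where PipelineKey, Pipeline and create_pipeline() are illustrative stand-ins rather than any particular API:

```cpp
#include <cstdint>
#include <functional>
#include <unordered_map>

// A tiny stand-in for the full set of "pipeline" state a real renderer tracks.
struct PipelineKey {
    uint64_t vertex_shader = 0;
    uint64_t pixel_shader  = 0;
    uint32_t blend_state   = 0;
    uint32_t depth_state   = 0;
    uint32_t raster_state  = 0;

    bool operator==(const PipelineKey& o) const {
        return vertex_shader == o.vertex_shader && pixel_shader == o.pixel_shader &&
               blend_state == o.blend_state && depth_state == o.depth_state &&
               raster_state == o.raster_state;
    }
};

struct PipelineKeyHash {
    size_t operator()(const PipelineKey& k) const {
        // Simple hash combine; real engines usually use something stronger.
        size_t h = std::hash<uint64_t>{}(k.vertex_shader);
        auto mix = [&h](uint64_t v) {
            h ^= std::hash<uint64_t>{}(v) + 0x9e3779b97f4a7c15ull + (h << 6) + (h >> 2);
        };
        mix(k.pixel_shader); mix(k.blend_state); mix(k.depth_state); mix(k.raster_state);
        return h;
    }
};

struct Pipeline { /* driver-created pipeline object */ };
Pipeline create_pipeline(const PipelineKey&) { return Pipeline{}; } // placeholder for the expensive API call

// Application-managed cache: reuse if we've seen this state combination before,
// otherwise pay the compile cost right here, at draw time.
std::unordered_map<PipelineKey, Pipeline, PipelineKeyHash> g_cache;

const Pipeline& get_or_create_pipeline(const PipelineKey& key) {
    auto it = g_cache.find(key);
    if (it == g_cache.end())
        it = g_cache.emplace(key, create_pipeline(key)).first; // potential hitch
    return it->second;
}
```

Note that the create_pipeline() call on a cache miss happens at draw time, which is exactly where the compile hitches come from when the cache is cold.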
> They also admit what we've all suspected from the beginning: DX12/Vulkan DO NOT reduce CPU overhead, but actually directly INCREASE it.

This is not what's being said here. They're discussing significant overhead in pipeline binding, which brings challenges and could certainly cause a game to run slower, but that is absolutely not the same as the API not providing options to reduce CPU overhead.
> This entire forum is one continuous argument about dx12.

It probably needs its own thread at this point, TBH... no new information is being presented, and it's increasingly becoming an irrelevant discussion outside of people just wanting to air various grievances.
> This entire forum is one continuous argument about dx12.

As it should be.
> They also admit what we've all suspected from the beginning: DX12/Vulkan DO NOT reduce CPU overhead, but actually directly INCREASE it.

Note that this entire extension discussion you pulled is talking *specifically about PSOs*. I don't think the fact that PSOs have added overhead is particularly contentious at this point, but similarly, other areas of the API (submission and draw calls) are clearly better. Sadly, in many games the negatives on the PSO side have of course outweighed the other benefits of the API, but that doesn't mean it's fair to make a statement like the above without qualification. And frankly, no statement that broad will ever be true in general for all games; it depends a lot on the specifics of the content and implementation.
> IMO I like DX12 for changing things kind of, making devs be more considerate, and giving us great things like RT, but from a user experience perspective it has been a net detriment even when one considers the amount of good DX12 titles.

You'd still call it a net detriment even when considering RT and similar features? i.e. you'd prefer to have games use DX11 even if it means not having the RT path? As I've argued, I don't think it's realistic or fair to separate the "bad stuff that is related to new APIs" from the good stuff that they enable, even if it makes the conclusions more nuanced.
> Note that this entire extension discussion you pulled is talking *specifically about PSOs*. I don't think the fact that PSOs have added overhead is particularly contentious at this point, but similarly, other areas of the API (submission and draw calls) are clearly better. Sadly, in many games the negatives on the PSO side have of course outweighed the other benefits of the API, but that doesn't mean it's fair to make a statement like the above without qualification. And frankly, no statement that broad will ever be true in general for all games; it depends a lot on the specifics of the content and implementation.

PSOs aren't a completely bad idea either, since they're designed to give IHVs the opportunity to remove more special graphics state and its associated fixed-function hardware without having the driver implement complex runtime patching schemes. PSOs exist to avoid unnecessarily punishing hardware designs moving in a more "general purpose" direction where more state is implemented in software. PSOs aren't even all that controversial in the context of compute pipelines, since there are far fewer special states than in graphics pipelines.
Most of the stuff in that discussion is true, but you sort of have to be pretty familiar with the technical details of GPU drivers and hardware to understand the cases they are talking about. I caution against broadening the conclusions there too much if you are not.
> PSOs aren't a completely bad idea either, since they're designed to give IHVs the opportunity to remove more special graphics state and its associated fixed-function hardware without having the driver implement complex runtime patching schemes. PSOs exist to avoid unnecessarily punishing hardware designs moving in a more "general purpose" direction where more state is implemented in software. PSOs aren't even all that controversial in the context of compute pipelines, since there are far fewer special states than in graphics pipelines.

Indeed, and as the aforementioned Fortnite benchmarks show, there are definitely cases in which compile stutter can be reduced/better in the new APIs. While I broadly agree that the issue got worse in the new APIs for most users, for many of the reasons discussed (the problem was legitimately simpler on the driver side; applications do not even have the data they would need to do it optimally on a given piece of hardware), it's still kind of annoying when people pretend there was never an issue in previous APIs. There have always been shader compilation stutter issues, and they have gotten worse as games use more shaders.
> We really need some hardware movement to address the fundamental problem, which by and large is a consequence of the way GPUs do static resource/register allocation and occupancy. It's definitely a bigger problem in graphics than compute, but it's still a problem in both. Ex. permutations are still an issue when Nanite moves to base pass compute materials.

I still somewhat disagree with the idea of giving graphics programmers the ability to do generalized indirect shader dispatches, since that will just encourage spilling. Shader compilers don't just exist to do static register allocation to minimize register pressure and increase occupancy; they're also there to prevent spilling as much as possible. Apple are able to more easily get away with what they can because they've effectively siloed themselves off from the rest of the industry. If anyone else had attempted to pull off a similar move in a competitive environment such as the desktop or mobile graphics space, they'd either sink (hardware complexity/unsatisfactory performance) or swim (apps start taking advantage of the feature). Even Apple's dynamic register caching solution has limits: there's a *specific threshold* past which just enough spilling will start cratering their performance.
> That said, it's not unreasonable to say that in a lot of ways PSOs have been a net negative on the user experience. Given no shortage of unreasonable statements, I don't think it's really worth arguing about that one.

While PSOs aren't good for the user experience, there have been undeniable benefits from a hardware/driver design perspective in how they implement (hardware or software) specific states ...
IMO I like DX12 for changing things kind of, making devs be more considerate, and giving us great things like RT, but from a user experience perspective it has been a net detriment even when one considers the amount of good DX12 titles. There are so many bad DX12 titles that it is overwhelming sometimes to think about.
There is a reason 2022 was one of my least favourite years of DF coverage...
> I don't think supporting GPU's that were not released as DX12 GPU's helped things. The HD7000 series is 'fully' DX12 compatible despite being a DX11 GPU and having no support for any of the new DX12 features such as RT, mesh shaders and other features. DX12 should have represented a hard line of compatibility like DX8, 9, 10 and 11 before it.

Technically that's not the case with D3D11. You can use DX9 hardware (feature level 9.x) with D3D11 ...
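Concretely, that's just a matter of which feature levels you pass at device creation; a minimal sketch of the real D3D11 call, with error handling mostly omitted:

```cpp
#include <d3d11.h>
#pragma comment(lib, "d3d11.lib")

int main() {
    // Ask for the best level the hardware supports, all through the one D3D11 API.
    const D3D_FEATURE_LEVEL requested[] = {
        D3D_FEATURE_LEVEL_11_0, D3D_FEATURE_LEVEL_10_1, D3D_FEATURE_LEVEL_10_0,
        D3D_FEATURE_LEVEL_9_3,  D3D_FEATURE_LEVEL_9_2,  D3D_FEATURE_LEVEL_9_1,
    };

    ID3D11Device*        device  = nullptr;
    ID3D11DeviceContext* context = nullptr;
    D3D_FEATURE_LEVEL    granted = {};

    HRESULT hr = D3D11CreateDevice(
        nullptr, D3D_DRIVER_TYPE_HARDWARE, nullptr, 0,
        requested, static_cast<UINT>(sizeof(requested) / sizeof(requested[0])),
        D3D11_SDK_VERSION, &device, &granted, &context);

    // On DX9-class hardware, 'granted' comes back as one of the 9_x levels: the
    // D3D11 API works, but features like tessellation or compute shaders don't.
    if (SUCCEEDED(hr)) {
        context->Release();
        device->Release();
    }
    return 0;
}
```

In other words, D3D11 didn't draw a hard line either; the line is per feature level, not per API.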
> Which DX9 GPU's had tessellation hardware that was fast enough to actually use for DX11?

Doesn't really matter, since your incorrect example wasn't all that helpful as justification for hard resets and compatibility breaks. Developers could very much use D3D11 without all the new features on DX9 hardware. Hardware tessellation is a bit ironic, since it's largely irrelevant these days and the older ways have aged better, so it could just as easily serve as a case against obsoleting perfectly functional older hardware, especially when nobody knows whether new feature xyz will stand the test of time and remain relevant ...