DirectX 12: The future of it within the console gaming space (specifically the XB1)

Well, I think there is a question missing here: how does the ability to give the GPU a huge load of draw calls affect GPU performance? We see that the CPU is now more or less jobless (relative to the draw call counts of current games). As far as I know, developers try to reduce draw calls as much as possible so they don't get slowdowns because of the CPU. So what is the difference for the GPU between getting many draw calls instead of fewer? Does this also give it the ability to work on a draw call with fewer resources used on the GPU (e.g. because of smaller draw calls, less preparation needed, fewer dependencies, ...)?
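To make the CPU-vs-GPU distinction concrete, here is a minimal D3D11-flavoured sketch of the two submission patterns the thread keeps contrasting: one draw per object versus one instanced draw for the whole batch. It assumes an already-initialized device context with shaders and buffers bound; the function and parameter names are hypothetical, purely for illustration.

```cpp
// Minimal sketch (not a full app): assumes a working D3D11 device context
// with input layout, shaders, vertex/index buffers already bound.
#include <d3d11.h>

// The "classic" pattern: one draw call per object. Each call goes through
// runtime + driver validation on the CPU, so the cost scales with the number
// of objects even if the GPU work per call is tiny.
void DrawPerObject(ID3D11DeviceContext* ctx, unsigned objectCount, unsigned indicesPerObject)
{
    for (unsigned i = 0; i < objectCount; ++i)
    {
        // In a real renderer you would also update per-object constants here,
        // adding further CPU-side cost per call.
        ctx->DrawIndexed(indicesPerObject, 0, 0);
    }
}

// Batched alternative: the same geometry drawn as instances in a single call.
// The GPU still processes the same triangles, but the CPU submits one call
// instead of objectCount calls.
void DrawBatched(ID3D11DeviceContext* ctx, unsigned objectCount, unsigned indicesPerObject)
{
    ctx->DrawIndexedInstanced(indicesPerObject, objectCount, 0, 0, 0);
}
```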


Funny thing about this... I found a really, really old NVIDIA presentation:
http://www.nvidia.com/docs/IO/8228/BatchBatchBatch.pdf

We may need Sebbbi to kindly shed some light on that. My experience was only with XNA on the 360, and it was easy to get slowdowns from too many draw calls (granted, I was doing that on purpose). However, with a project that was rather heavy on particles, I did have to make optimizations to avoid some hiccups during the gorier sections with lots of AI fighting you. I need to finish that game or port it to Unity. LOL

And Edit - Thanks Sebbbi!
 
But -- Max already posts here. Max... come talk to us first ;)
I'm sure we can ask the right questions and get the right tools together for an interview.
 
However, it is pretty naïve to assume that AMD hacking around with Mantle is similar to Microsoft committing to a new version of DirectX

DX10 was out in 2006. It was obvious then that DX10 was not the solution, and AFAIK a lot of developers were begging for the feature set now known as DX12 (mostly reducing CPU load, draw call overhead, etc.) to in fact be in DX10, and then for the same to happen in DX11. And only now, in 2015, for some strange reason, do these features suddenly appear. My understanding would be that MSFT "committed to DX12" solely because a certain single-vendor solution was being heavily developed.
More than that, I've seen numerous developers trying to persuade MSFT that the change was needed, but it didn't happen up until now.
More than that, I think you can even find some of my posts from 8-9 years ago on this very forum about how many problems DX9 and DX10 have with CPU load.
 
because the GPUs cannot render an unlimited number of parallel draw calls with different state

Obviously they can. At a low level, the GPU has a "draw call per triangle strip" model. The "state change" is an emulated feature that does not exist in hardware. The hardware has no state and doesn't care about any state. Constants and "registers" (in their D3D sense) are totally artificial constructs that do not exist in modern hardware. At a low level, your granularity is limited solely by wavefronts and similar code running per-vertex and per-pixel in each ALU when a particular thread/context is running.

by doing a single big multi-draw indirect call

Which is the best solution under current circumstances. But AFAIK it still has quite a lot of problems in specific drivers.
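For readers who haven't seen it, here is a rough sketch of what a multi-draw indirect submission looks like, in OpenGL 4.3 terms (D3D12 exposes the same idea via ExecuteIndirect). The buffer and function names are made up for the example; only the GL entry points and the command layout come from the spec.

```cpp
// Rough sketch: assumes a VAO, shader program and index buffer are already
// bound, and that 'indirectBuffer' already holds the packed commands.
#include <glad/glad.h>  // or any other GL 4.3 loader

// Layout mandated by the GL spec for indexed indirect draws.
struct DrawElementsIndirectCommand
{
    GLuint count;          // indices in this draw
    GLuint instanceCount;  // usually 1
    GLuint firstIndex;     // offset into the index buffer
    GLuint baseVertex;     // offset added to each index
    GLuint baseInstance;   // lets the shader pick per-draw data
};

void SubmitIndirect(GLuint indirectBuffer, GLsizei drawCount)
{
    // The command structs live in a GPU buffer, so a compute shader (or the
    // CPU) can fill them in, e.g. after culling, without per-draw API calls.
    glBindBuffer(GL_DRAW_INDIRECT_BUFFER, indirectBuffer);
    glMultiDrawElementsIndirect(GL_TRIANGLES, GL_UNSIGNED_INT,
                                nullptr,    // read commands from offset 0
                                drawCount,  // number of packed commands
                                0);         // 0 = tightly packed commands
}
```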
 
I feel quite sorry for AMD.

They have just about everything against them ... Intel's money, process advantage, and multi-billion incentive scheme to not use AMD products like Kaveri; Nvidia's R&D budget; high turnover of management; etc., etc. ... yet they still manage to drive forward a much-needed initiative like Mantle, and then everyone else kicks them in the shins, takes the project and runs off with it.

Up next: Nvidia and Intel take Adaptive-Sync via DisplayPort and run off with it.
 
Since we are on the topic of draw calls, I read a page from the Mantle thread saying that many small draw calls will perform the same as batched ones - state validation only occurs if the state is changed; otherwise it leaves things as is. Is this correct?

It also brings up culling, as using more draw calls to build something allows for finer-grained culling. I didn't even know! I always thought the GPU would just discard the triangles it can't see; I didn't know it was the whole mesh or none of the mesh.

How do batches and culling work? If you batch 100 trees in the area but 50 are covered by a mountain, are you still forced to draw all 100?

I'm going to research lol, maybe someone will beat me to it.

edit: so you can split a mesh if an object is partially/fully occluded, then gather up the remaining pieces to render. Man, this sounds like a lot of work
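A minimal sketch of the idea in the post above, shown for the simplest case of CPU-side frustum culling (occlusion against a mountain works the same way conceptually, just with a different visibility test, e.g. against a depth pyramid): split the batch into sub-meshes with their own bounding spheres, keep only the visible ones, and rebuild the draw list from those. All struct and function names here are invented for the example.

```cpp
// Toy illustration: instead of drawing all 100 trees or none, test each
// tree's bounding sphere and only submit the survivors (e.g. as entries
// in an indirect-draw buffer).
#include <vector>

struct Plane  { float nx, ny, nz, d; };    // plane equation: n.p + d = 0
struct Sphere { float x, y, z, radius; };  // bounding sphere per sub-mesh

// A sphere is invisible if it is fully behind any of the 6 frustum planes.
bool IsVisible(const Sphere& s, const Plane (&frustum)[6])
{
    for (const Plane& p : frustum)
    {
        float dist = p.nx * s.x + p.ny * s.y + p.nz * s.z + p.d;
        if (dist < -s.radius)
            return false;
    }
    return true;
}

// Returns the indices of the sub-meshes that survive culling; these are
// what you would then batch into a single (multi-)draw.
std::vector<size_t> CullSubMeshes(const std::vector<Sphere>& bounds,
                                  const Plane (&frustum)[6])
{
    std::vector<size_t> visible;
    for (size_t i = 0; i < bounds.size(); ++i)
        if (IsVisible(bounds[i], frustum))
            visible.push_back(i);
    return visible;
}
```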
 
I feel quite sorry for AMD.

They have just about everything against them ... Intel's money, process advantage, and multi-billion incentive scheme to not use AMD products like Kaveri; Nvidia's R&D budget; high turnover of management; etc., etc. ... yet they still manage to drive forward a much-needed initiative like Mantle, and then everyone else kicks them in the shins, takes the project and runs off with it.

Up next: Nvidia and Intel take Adaptive-Sync via DisplayPort and run off with it.

I thought DX12 was long in development, before Mantle, and that AMD was able to develop Mantle in the interim using some of the DX12 features they already knew about from being partners?
 
I thought DX12 was long in development, before Mantle, and that AMD was able to develop Mantle in the interim using some of the DX12 features they already knew about from being partners?
The only indication of that was some NVIDIA rep claiming as much, which tbh sounded like they had to give the impression that "we were doing it before AMD did".
 
I thought DX12 was long in development, before Mantle, and that AMD was able to develop Mantle in the interim using some of the DX12 features they already knew about from being partners?

From what we know, it's more that Mantle's features and design have ended up in DX12...
 
Obviously they can. At a low level, the GPU has a "draw call per triangle strip" model. The "state change" is an emulated feature that does not exist in hardware. The hardware has no state and doesn't care about any state. Constants and "registers" (in their D3D sense) are totally artificial constructs that do not exist in modern hardware. At a low level, your granularity is limited solely by wavefronts and similar code running per-vertex and per-pixel in each ALU when a particular thread/context is running.
You need to think of the entire GPU. Shader state is emulated, but parts of the fixed-function pipeline have state limits. This is probably what sebbbi was referring to.
 
From what we know, it's more that Mantle's features and design have ended up in DX12...
A bit OT for this, but glNext is Mantle (or a fork of it), so it will be interesting to watch all three and see how close the performance is on console, assuming glNext is used on the other console. Interesting stuff for sure; I look forward to what it brings.
 
A bit OT for this, but glNext is Mantle (or a fork of it), so it will be interesting to watch all three and see how close the performance is on console, assuming glNext is used on the other console. Interesting stuff for sure; I look forward to what it brings.
If glNext is a fork, that would explain a lot of the behaviour from AMD. I'm not sure if it's going to be implemented on PS4 alongside their GNM/GNMX, or if it'll impact their own development, but they do support OpenGL today, so it would make sense to.
 
Obviously they can. At a low level, the GPU has a "draw call per triangle strip" model. The "state change" is an emulated feature that does not exist in hardware. The hardware has no state and doesn't care about any state. Constants and "registers" (in their D3D sense) are totally artificial constructs that do not exist in modern hardware. At a low level, your granularity is limited solely by wavefronts and similar code running per-vertex and per-pixel in each ALU when a particular thread/context is running.
This is true for compute shaders on GCN. Compute is fully bindless. Resource descriptors are loaded to scalar registers of the CU. Each wave could be running a different compute shader. However this is not true for all the rasterization state. Try for example to change your scissor rectangle between every draw call and check the CU occupancy (hint: it's not going to look pretty). GCN is fully bindless, but it is not stateless.
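To make the scissor example concrete, here is what that anti-pattern looks like in D3D12 terms. This is a rough sketch assuming a command list that is already recording with its pipeline state and root signature set; it illustrates the pattern sebbbi warns against, not something you'd want to ship.

```cpp
// Sketch of per-draw fixed-function state churn: toggling the scissor rect
// between every small draw.
#include <windows.h>
#include <d3d12.h>

void DrawWithPerDrawScissor(ID3D12GraphicsCommandList* cmd,
                            const D3D12_RECT* scissors,
                            unsigned drawCount,
                            unsigned indicesPerDraw)
{
    for (unsigned i = 0; i < drawCount; ++i)
    {
        // Each scissor change is a fixed-function state update; per the post
        // above, doing this between every draw hurts CU occupancy on GCN,
        // because the rasterizer state is not stateless the way shader
        // resources are.
        cmd->RSSetScissorRects(1, &scissors[i]);
        cmd->DrawIndexedInstanced(indicesPerDraw, 1, 0, 0, 0);
    }
}
```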
 
Well, we should have more information really soon, as Valve and AMD will present "glNext" at GDC next month.
 
I thought DX12 was long in development, before Mantle, and that AMD was able to develop Mantle in the interim using some of the DX12 features they already knew about from being partners?

So much goes on behind the scenes of API development that I'm not sure we will ever know the full story, or whether, given the number of players and viewpoints, we can.
It could very well be that there was some kind of iteration on DX being worked on by stakeholders. There would have been multiple visions of where it should be taken, and we have a long history of sub-versioning of DX and rumored late-stage regressions in feature sets that hint at the difficulty of resolving conflicts when different vendors either lack hardware capability now (and you can ill afford to abandon significant players) or are at odds over what direction the future should take.
A lot of things can be done to shift things one way or another, and each party can cite their fraction of the overall story to justify their desired narrative.
 
I'm not sure if it's going to be implemented on PS4 alongside their GNM/GNMX, or if it'll impact their own development, but they do support OpenGL today, so it would make sense to.
I'd be surprised if an additional API found its way onto PS4. I don't even believe the PS4 supports OpenGL ES (unlike the PS3), and the only reason Sony would add a third API is if it were widely used already. It'll probably take a while to gain traction, and unlike with the PS3, I've read zero complaints about the PS4 (or specifically the GNM/GNMX APIs) being difficult to work with or adapt to.
 
Try for example to change your scissor rectangle between every draw call and check the CU occupancy (hint: it's not going to look pretty).

It depends on whether it's because of driver behaviour or the rasterizer itself. I would still bet on the driver here.
Although it could also be the rasterizer; we do know the rasterizer does some pretty non-obvious stuff there (cache misses because of scan order, anyone?).
But OK, I concur, the FFP parts are still a problem; let's hope they are removed soon.
 
A bit OT for this, but glNext is Mantle (or a fork of it), so it will be interesting to watch all three and see how close the performance is on console, assuming glNext is used on the other console. Interesting stuff for sure; I look forward to what it brings.
Is there an actual source for this?
 
A couple of people have mentioned feeling sorry for AMD because of DX12. But I wonder if all of this actually plays into their hands and was a calculated risk. I've read that AMD's weaker CPU cores perform closer to Intel's when programs are designed for more parallel computing. If Windows 10, as well as DX12, does more to take advantage of that, wouldn't that make AMD APUs a very good purchase once they get more tablet and ultrabook design wins? Especially since they should be able to get Windows free, based on price, under Microsoft's new program. Add the new single-motherboard design and they could have some really cheap machines that would really push Intel, if Carrizo or something later really hits good marks.
 