Mantle began life as an effort to bring a console-style low-level API, patterned after libgcm/GNM, to the PC space. It is PC and Xbox that are catching up to the new paradigm that exists on PS4, not the other way around. @Rikimaru
I've thought about this; it was one of the counterpoints I raised against my own arguments earlier. As of this moment, the state of GNM/GNMX isn't publicly known. That's fine, Sony doesn't owe anyone an explanation. Our understanding of how it behaves on the hardware, and of what characteristics and features the API has, is actually quite limited.
That being said, if Mantle/PC and Xbox are 'catching up' to the programming paradigm of the PS4, I would need to find proof of it. This one comes to mind as _not being proof of it_. I pulled these slides from Ubisoft's GDC presentation on compute performance. Before they eventually got performance to where they wanted it, they attempted to run massive compute using a lot of compute shaders.
And you can see from the results that they are immediately CPU bottlenecked by the number of calls into the system. From what we know so far, this behaviour is not characteristic of Mantle/Vulkan and DX12; it is more in line with how we view today's APIs. Low overhead will only get you so far; you also need multithreaded submission to make it work. So in my opinion, GNM today does not support multithreaded submission the way Vulkan and DX12 do. So already, one major paradigm is missing. The second major paradigm is of course fine-grained async compute performance. But that builds on paradigm 1, so as you can see here, it would fail at paradigm 2 as well. That leaves paradigm 3, and still no one knows what that is.
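To make the multithreaded submission point a bit more concrete, here is a toy sketch in Python. It is not GNM, Vulkan, or DX12 code; `record_command_list`, `submit`, and the worker count are hypothetical names I made up for illustration. It only shows the structure: several threads each record their own command list, and the lists are then submitted in a fixed order, so the result matches single-threaded recording.

```python
from concurrent.futures import ThreadPoolExecutor


def record_command_list(dispatches):
    """Record one command list on the calling thread (hypothetical stand-in)."""
    return [("dispatch", shader, groups) for shader, groups in dispatches]


def submit(command_lists):
    """One ordered submission: the queue consumes the lists in the order given."""
    queue = []
    for command_list in command_lists:
        queue.extend(command_list)
    return queue


def record_single_threaded(dispatches):
    # One thread pays the recording cost for every dispatch.
    return submit([record_command_list(dispatches)])


def record_multi_threaded(dispatches, workers=4):
    # Split the work into contiguous chunks, roughly one per worker.
    chunk = max(1, -(-len(dispatches) // workers))  # ceiling division
    chunks = [dispatches[i:i + chunk] for i in range(0, len(dispatches), chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in input order, so submission order is preserved.
        command_lists = list(pool.map(record_command_list, chunks))
    return submit(command_lists)


work = [(f"cs_{i}", (i % 8) + 1) for i in range(1000)]
assert record_single_threaded(work) == record_multi_threaded(work)
```

Python threads won't show a real speedup for this kind of work, so treat it purely as a picture of the recording/submission split, not as a benchmark.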
That being said, with or without them, the PS4 is likely still a good console. I'd like to see it adopt those 3 pillars in the future, but they don't necessarily have to; at least for a while, there's no rush for them to do it.
edit: I also think that if they had DX12, they could have written an ExecuteIndirect version of this shader. I believe that would have removed the need to insert sync points in their extremely long shader.
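A rough way to picture the ExecuteIndirect idea, again as a toy Python model and not real D3D12 code: `ToyDevice`, `dispatch`, and `execute_indirect` are hypothetical names. The real call takes an argument buffer and a maximum command count, and the GPU reads the arguments itself. The sketch only counts CPU-side calls; it does not model sync points or what Ubisoft's shader actually does.

```python
class ToyDevice:
    """Counts CPU-side API calls; a hypothetical stand-in, not a real graphics API."""

    def __init__(self):
        self.cpu_calls = 0
        self.executed = []

    def dispatch(self, groups):
        # One CPU call per dispatch.
        self.cpu_calls += 1
        self.executed.append(groups)

    def execute_indirect(self, arg_buffer, count):
        # One CPU call; the device reads `count` argument records itself.
        self.cpu_calls += 1
        self.executed.extend(arg_buffer[:count])


def per_dispatch_path(device, work_sizes):
    for groups in work_sizes:
        device.dispatch(groups)


def indirect_path(device, work_sizes):
    # In a real API an earlier GPU pass could write this buffer;
    # here it is simply filled on the CPU for illustration.
    arg_buffer = list(work_sizes)
    device.execute_indirect(arg_buffer, len(arg_buffer))


sizes = [(8, 8, 1)] * 500
a, b = ToyDevice(), ToyDevice()
per_dispatch_path(a, sizes)
indirect_path(b, sizes)
print(a.cpu_calls, b.cpu_calls)  # 500 1
```

Both paths run the same 500 dispatches in the same order; only the number of CPU calls differs.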