Vulkan/OpenGL Next Generation Initiative: unified API for mobile and non-mobile devices.

http://anandtech.com/show/9038/next-generation-opengl-becomes-vulkan-additional-details-released
In fact Khronos has confirmed that AMD has contributed Mantle towards the development of Vulkan, and though we need to be clear that Vulkan is not Mantle, Mantle was used to bootstrap the process and speed its development, making Vulkan a derivation of sorts of Mantle (think Unix family tree). What has changed from Mantle is that Khronos has gone through a period of refinement, keeping what worked in Vulkan and throwing out the portions of Mantle that didn’t work well – particularly HLSL and anything that would prevent the API from being cross-vendor – and replacing them with the necessary/better functionality.
 
OpenCL 2.1 with SPIR-V:
http://anandtech.com/show/9039/khronos-announces-opencl-21-c-comes-to-opencl

C++14 subset kernel language (lambdas, templates, etc.). Subgroup functionality improvements. Meaning that OpenCL will finally match CUDA in productivity (and performance).

Vulkan shares the SPIR-V backend with OpenCL 2.1. It shouldn't be a big task to add the C++14 shading language and other OpenCL 2.1 goodies, such as subgroup functionality and device side enqueue, to Vulkan. If this happens, Vulkan becomes a very interesting API (very good productivity and GPU side performance). A robust shading language (with generics support) is becoming highly important, since many graphics engines are at least partly transitioning from CPU to GPU. Writing and maintaining good generic libraries for shader code is a time consuming task. A C++14 subset language makes this much easier (CUDA is a good example of this).
 
A shared SPIR-V backend would make it trivial for Vulkan's compute shaders and OpenCL 2.1 to share the same shader feature set.

This would mean the following things:
1. We finally have a PC graphics API that supports device side enqueue (dynamic parallelism) :)
2. We finally have a PC graphics API that supports subgroup (wave/warp) and cross lane operations :)

Both features are 100% AWESOME and result in big GPU side performance gains. Both also make shader programming easier by reducing code bloat. Subgroup operations make it almost trivial to write complex parallel constructs such as prefix sums and reductions. These constructs allow the compiler to emit optimal vendor specific GPU op-codes (lane swizzles instead of manual LDS communication). Complex compute shaders get both simpler and faster (fewer instructions, no additional LDS usage, reduced GPR usage). Device side enqueue simplifies both CPU code and GPU code (no need to bounce data and control flow between CPU<->GPU multiple times) and makes it possible to implement a wide variety of traditional divide-and-conquer algorithms on the GPU without frequent CPU round trips. Performance of these kinds of algorithms on the GPU is awful without device side enqueue.

I just hope that Microsoft follows the lead and starts to improve DirectCompute sooner rather than later. The DirectCompute feature set has been unchanged since launch (DirectX 11.0). DirectX 11.1 and 11.2 added only graphics features, with no improvements for compute at all. The published DirectX 11.3 feature list (http://www.anandtech.com/show/8544/microsoft-details-direct3d-113-12-new-features) likewise includes only graphics features.

DirectCompute is starting to feel ancient, as CUDA and OpenCL have improved rapidly over the past few years. Our engine is ~90% compute shaders. Other developers are moving rapidly towards compute as well. I would kill to have all these new features and productivity improvements in DirectX.
 
In Vulkan, the application has direct control over the operation of the GPU via a simplified driver. “The GPU has been laid bare for you. You can screw it up, but in the right hands this gives you maximum system performance, and gives the developer lots of flexibility,” Nvidia’s Neil Trevett, president of the Khronos Group, told El Reg.

In OpenGL, the shader language compiler is part of the driver, whereas Vulkan consumes SPIR-V. This also means that developers will no longer need to ship shader source code. Vulkan is also better for multi-core programming, with multiple command buffers that execute in parallel. “In the traditional OpenGL architecture, creating command buffers and submitting them to the GPU was difficult to split up into multiple threads, and often OpenGL applications are CPU limited – the GPU can’t be fed fast enough. Now you can create as many buffers as you want in parallel. This was the one thing we couldn’t fix in OpenGL,” Trevett says.

In OpenCL 2.1, aside from SPIR-V support, the big change is that it now supports C++ as a kernel language, executing on the GPU. “It’s the big ask from the developer community,” Trevett told The Reg. The OpenCL C++ kernel language is a subset of C++14 and includes lambda functions, classes, templates and operator overloading, among other features. C++ support will make it easier to write or port code for high performance computing to execute on the GPU or other accelerator boards that implement the standard.

Vulkan will be a single API that will span desktop, games console, mobile and embedded platforms. It is “a ground-up redesign, we’re not backwards compatible with OpenGL,” says Trevett. The actual specification of Vulkan is not yet released, but Khronos is expecting initial specifications and implementations later in 2015.

Support for Vulkan is widespread, including from big names such as Intel, Apple, ARM, Nvidia, AMD, and chipset vendors such as Qualcomm, Imagination and Mediatek. Khronos also has active participation from games engine developers such as Epic, Valve, Unity and Blizzard, who are well placed to take advantage of what Vulkan offers.

OpenCL 2.1 is a provisional specification, and Vulkan is not yet released, so you will not be playing Vulkan-powered games in the near future. As such things go, though, the standards are progressing swiftly. It is a major change, promising better performance and easier programming through an industry standard cross-platform API. ®

http://www.theregister.co.uk/2015/0...e_next_generation_of_the_opengl_graphics_api/
 
http://www.ustream.tv/recorded/59559306/theater

Khronos (Vulkan) stuff from GDC
They also say it outright: there are tons of companies involved, but they thank only one by name – AMD – for giving them Mantle to build on.
[Slide: GDC 2015 – Vulkan thanks AMD]



edit:
Slide decks: https://www.khronos.org/developers/library/2015-gdc
 
Some of us have seen this before. The old routine goes like this:

"OpenGL is old!"
Diagrams, lots of diagrams.
On this slide, we have an impressive wall of contributor logos.
Some code snippets. Naming conventions look familiar, so it must be the real thing.
"This time we're serious!"
Awkward silence from nVidia...

I'm feeling young again. What about you, guys?
 
Not sure what you mean by awkward silence; NVIDIA sent the lead Vulkan driver developer to the panel and did present a running Vulkan demo, as did Imagination and Intel (via Valve's driver), though AMD did not, afaik.
 
There is Apple's logo on the Khronos webpage; does that mean "Metal" is dead? It does not make much sense at this point.
 
As I covered here, when AMD officials began to publicly urge developers to steer away from its Mantle API in favor of DirectX12 and the next iteration of OpenGL called Vulkan, it didn’t take long to declare Mantle dead.

But now we know why. AMD's Robert Hallock confirmed on a blog post that Mantle had, for the most part, been turned into the Khronos Group’s Vulkan API that would supersede OpenGL.

“The cross-vendor Khronos Group has chosen the best and brightest parts of Mantle to serve as the foundation for 'Vulkan,' the exciting next version of the storied OpenGL API,” Hallock wrote. “Vulkan combines and extensively iterates on (Mantle’s) characteristics as one new and uniquely powerful graphics API. And as the product of an incredible collaboration between many industry hardware and software vendors, Vulkan paves the way for a renaissance in cross-platform and cross-vendor PC games with exceptional performance, image quality and features.”

Although AMD has had unexpected success with developers adopting Mantle, it was unlikely to become a standard without the support of its chief competitor Nvidia. When Microsoft announced DirectX 12, which adopted features of Mantle, AMD's API looked to be a technological dead end.

But with Vulkan being largely based on Mantle, it could potentially give AMD an advantage over Nvidia and Intel since it knows where the proverbial bodies are buried.

Jon Peddie, a graphics analyst with Jon Peddie Research, said an advantage for AMD is there, but it's pretty remote.

“It might, but I don’t think so. Vulkan (which isn’t even fully spec’ed yet) is an open Khronos-wide API, and so any of the Khronos members have access to it and can exploit it as they wish,” Peddie said. “However, one thing about Vulkan is that its power and efficiency come with the burden of very low-level (i.e., hard) programming. You really have to know how to manage memory and draw calls to get it to sit up and sing for you, so smaller firms with limited engineering skills and staff won’t be able to get as much out of it as quickly as larger firms.”

Neil Trevett, the president of the Khronos Group, and an executive with Nvidia, also downplayed any advantage any company had at this point.

“Many companies have contributed to Vulkan and SPIR-V – Mantle gave us a tremendous head start – but Vulkan is definitely a working group design now,” Trevett said. “Multiple hardware companies demonstrated early Vulkan drivers at GDC – including Imagination, Intel, and Nvidia – ARM also reported on their driver performance.”

Trevett also added that although developers writing for Vulkan will initially need more resources than with OpenGL, that will change as libraries and tools are added to it.

One thing's for sure: while Nvidia ignored Mantle, it can't ignore Vulkan, which will become the de facto cross-platform alternative to DirectX.
http://www.pcworld.com/article/2894...ises-from-the-ashes-as-opengls-successor.html
 

Of course it will be stripped down to make sure all IHVs (including mobile) are able to support it. But nonetheless, AMD seems to be the biggest contributor towards Vulkan. Having a common, efficient API across all platforms, with suitable vendor specific extensions, would be great for gaming in general.
 
This single post by Promit is an absolute must-read:

http://www.gamedev.net/topic/666419-what-are-your-opinions-on-dx12vulkanmantle/#entry5215019

A couple of excerpts:

Ultimately, the new APIs are designed to cure all four of these problems.

* Why are games broken? Because the APIs are complex, and validation varies from decent (D3D 11) to poor (D3D 9) to catastrophic (OpenGL). There are lots of ways to hit slow paths without knowing anything has gone awry, and often the driver writers already know what mistakes you're going to make and are dynamically patching in workarounds for the common cases.

* Maintaining the drivers with the current wide surface area is tricky. Although AMD and NV have the resources to do it, the smaller IHVs (Intel, PowerVR, Qualcomm, etc.) simply cannot keep up with the necessary investment. More importantly, explaining to devs the correct way to write their render pipelines has become borderline impossible. There are too many failure cases. It's been understood for quite a few years now that you cannot max out the performance of any given GPU without having someone from NVIDIA or AMD physically grab your game source code, load it on a dev driver, and do a hands-on analysis. These are the vanishingly few people who have actually seen the source to a game, the driver it's running on, the Windows kernel it's running on, and the full specs for the hardware. Nobody else has that kind of access or engineering ability.

* Threading is just a catastrophe and is being rethought from the ground up. This requires a lot of the abstractions to be stripped away or retooled, because the old ones required too much driver intervention to be properly threadable in the first place.

* Multi-GPU is becoming explicit. For the last ten years, it has been AMD and NV's goal to make multi-GPU setups completely transparent to everybody, and it's become clear that for some subset of developers, this is just making our jobs harder. The driver has to apply imperfect heuristics to guess what the game is doing, and the game in turn has to do peculiar things in order to trigger the right heuristics. Again, for the big games somebody sits down and matches the two manually.

The last piece of the puzzle is that we ran out of new user-facing hardware features many years ago. Ignoring raw speed, what exactly is the user-visible or dev-visible difference between a GTX 480 and a GTX 980? A few limitations have been lifted (notably in compute) but essentially they're the same thing. MS, for all practical purposes, concluded that DX was a mature, stable technology that required only minor work and mostly disbanded the teams involved. Many of the revisions to GL have been little more than API repairs. (A GTX 480 runs full featured OpenGL 4.5, by the way.) So the reason we're seeing new APIs at all stems fundamentally from Andersson hassling the IHVs until AMD woke up, smelled competitive advantage, and started paying attention. There was essentially a three-year lag from when we got the hardware to the point that compute could be directly integrated into the core of a render pipeline, which is considered normal today but was bluntly revolutionary at production scale in 2012. It's a lot of small things adding up to a sea change, with key people pushing on the right people for the right things.
I highly recommend that post to everyone; it's well written and should make sense to casual readers of this forum.
 
GLAVE open source debugger, which is a joint venture of LunarG and Valve:


This tool was shown briefly during the Khronos presentation on Vulkan.

It's possible to replay captured "time demos" and turn on validation, with break-on-error, as well as to single-step the replay.
 

I didn't realize that even the best WDDM drivers on this planet devote a significant portion of their source code to resolving app compatibility issues...

The first lesson is: Nearly every game ships broken. We're talking major AAA titles from vendors who are everyday names in the industry. In some cases, we're talking about blatant violations of API rules – one D3D9 game never even called BeginScene/EndScene. Some are mistakes or oversights – one shipped bad shaders that heavily impacted performance on NV drivers. These things were day-to-day occurrences that went into a bug tracker. Then somebody would go in, find out what the game screwed up, and patch the driver to deal with it. There are lots of optional patches already in the driver that are simply toggled on or off as per-game settings, and then hacks that are more specific to games – up to and including total replacement of the shipping shaders with custom versions by the driver team. Ever wondered why nearly every major game release is accompanied by a matching driver release from AMD and/or NVIDIA? There you go.


The second lesson: The driver is gigantic. Think 1-2 million lines of code dealing with the hardware abstraction layers, plus another million per API supported. The backing function for Clear in D3D 9 was close to a thousand lines of just logic dealing with how exactly to respond to the command. It'd then call out to the correct function to actually modify the buffer in question. The level of complexity internally is enormous and winding, and even inside the driver code it can be tricky to work out how exactly you get to the fast-path behaviors. Additionally the APIs don't do a great job of matching the hardware, which means that even in the best cases the driver is covering up for a LOT of things you don't know about. There are many, many shadow operations and shadow copies of things down there.

It's just sad.

If one day Microsoft decides to completely shut off WDDM 1.x and re-implement the D3D9/10/11 runtimes on top of WDDM 2.x, they will probably have to introduce a whole new Direct3D compatibility system similar to the App Compat layer for Win32 apps, complete with per-application "shims" for feature-level and capability lying, shader code replacement, custom DLLs and functions... all that funny stuff.

https://technet.microsoft.com/en-us/windows/jj863250.aspx
http://blogs.msdn.com/b/oldnewthing/archive/2003/12/24/45779.aspx
http://blogs.technet.com/b/askperf/...-your-old-stuff-work-with-your-new-stuff.aspx
etc.
 