Vulkan is a GCN low-level construct?

From the same review, it looks like in a GPU-limited scenario the AMD driver works just as well:

[Image: benchmark chart showing a GPU-limited scenario]



Though this isn't Vulkan, so unless we're also trying to make DX12 into a GCN construct, the point is a bit moot.

I actually saw some benchmarks of DOOM running in Vulkan on different older CPUs comparing a 1060 and a 480.
[Image: GTX 1060 vs RX 480 DOOM Vulkan benchmark chart]

http://www.hardwareunboxed.com/gtx-1060-vs-rx-480-in-6-year-old-amd-and-intel-computers/


^This is before 372.54

372.54 brought driver support for Vulkan runtime version 1.0.0.17 and with it reduced CPU render times by a factor of 4 in one benchmark I saw, but obviously the effects of that need to be seen using higher-end graphics cards.

[Image: DOOM Vulkan benchmark chart]

http://www.sweclockers.com/test/22533-snabbtest-doom-med-vulkan
 
372.54 brought driver support for Vulkan runtime version 1.0.0.17 and with it reduced CPU render times by a factor of 4 in one benchmark I saw, but obviously the effects of that need to be seen using higher-end graphics cards.
How did they know it reduced CPU render time? These new results didn't show any significant change from before.
 
Have you tried Quake 4 on a GTX 1080? I have and it's atrocious.
Is it the bug that causes the game to only load low resolution textures? The Steam version of the game has some kind of bug that causes this. It can be fixed with a few config variables.
 
How did they know it reduced CPU render time? These new results didn't show any significant change from before.

The performance gains from Vulkan are a lot bigger than what I measured when the Vulkan update first hit the game.

CPU render times can be seen using the in-game performance monitor; there's a video on YouTube of DOOM's Vulkan renderer running on two different drivers:

 
The performance gains from Vulkan are a lot bigger than what I measured when the Vulkan update first hit the game.

CPU render times can be seen using the in-game performance monitor; there's a video on YouTube of DOOM's Vulkan renderer running on two different drivers:


Whoa... 3 ms CPU time, down from 9?
 
Both Maxwell (980 Ti) and Pascal (1080) flagship GPUs get a huge +30% performance boost from Vulkan. This pretty much proves that Vulkan is not a "GCN construct". Slower Maxwell and Pascal cards show smaller gains simply because the game seems to be getting close to 100% GPU bound on those GPUs (there's free CPU cycles to spare). Kepler (780 Ti) seems to be an anomaly. Could be that Kepler drivers are not yet fully optimized for Vulkan. Kepler is likely not a high priority to Nvidia anymore. Fermi doesn't even have any Vulkan or DX12 drivers.
 
Kepler (780 Ti) seems to be an anomaly. Could be that Kepler drivers are not yet fully optimized for Vulkan. Kepler is likely not a high priority to Nvidia anymore. Fermi doesn't even have any Vulkan or DX12 drivers.
Kepler behaves like any 3GB card running Vulkan would: a massive reduction in fps for no apparent reason. It's probably a behavior remnant of the old Mantle structure where performance dropped on cards with average memory sizes. Here we see the 3GB 1060 taking a 30% hit under Vulkan compared to the 6GB 1060, despite running exactly the same as the 6GB under OpenGL.

[Image: DOOM benchmark chart, GTX 1060 3GB vs 6GB]

http://www.techspot.com/review/1237-msi-geforce-gtx-1060-3gb/page2.html
 
Whoa... 3 ms CPU time, down from 9?

Yeah, serious business eh?

Both Maxwell (980 Ti) and Pascal (1080) flagship GPUs get a huge +30% performance boost from Vulkan. This pretty much proves that Vulkan is not a "GCN construct". Slower Maxwell and Pascal cards show smaller gains simply because the game seems to be getting close to 100% GPU bound on those GPUs (there's free CPU cycles to spare). Kepler (780 Ti) seems to be an anomaly. Could be that Kepler drivers are not yet fully optimized for Vulkan. Kepler is likely not a high priority to Nvidia anymore. Fermi doesn't even have any Vulkan or DX12 drivers.

Meh, I feel like it's becoming far too common for people to just dismiss a whole API because of one title/benchmark. GCN cards gained up to 40% when the Vulkan update for DOOM first hit, and all of a sudden Vulkan is Mantle (people confused GCN intrinsics with some kind of inherent low-level GCN optimization in Vulkan) and blah blah blah.

Seems to me like when the GPU is the bottleneck they perform almost identically (whereas for GCN there seems to be a slight Vulkan advantage, possibly due to intrinsics being used).

Some recent games have had me really confused. Deus Ex Mankind Divided performs really strangely. In the built-in benchmark the Fury X is something like 25% faster (seems to track compute throughput), yet in the game a reference (1200 MHz) 980 Ti outperforms it by 5-10% when the FPS is low, and the Fury X appears to lead when the FPS is high. CPU overhead seems higher on the Nvidia side as well, and this is DX11. Weird stuff.

I usually expect NV cards to be on top when the FPS is high; higher geometry throughput and less CPU overhead, generally speaking. Deus Ex is opposite day.

Doesn't Doom have a frame limit at 200?

Yeah it does

Kepler behaves like any 3GB card running Vulkan would: a massive reduction in fps for no apparent reason. It's probably a behavior remnant of the old Mantle structure where performance dropped on cards with average memory sizes. Here we see the 3GB 1060 taking a 30% hit under Vulkan compared to the 6GB 1060, despite running exactly the same as the 6GB under OpenGL.

[Image: DOOM benchmark chart, GTX 1060 3GB vs 6GB]

http://www.techspot.com/review/1237-msi-geforce-gtx-1060-3gb/page2.html


I've noticed games that support both DX11 and DX12 have less efficient memory management using DX12: lots more VRAM used, and the potential for stutter increases. Annoying. Any good reason why that is?

Wow, those results are very different from the sweclockers ones >_>
 
Kepler behaves like any 3GB card running Vulkan would: a massive reduction in fps for no apparent reason. It's probably a behavior remnant of the old Mantle structure where performance dropped on cards with average memory sizes. Here we see the 3GB 1060 taking a 30% hit under Vulkan compared to the 6GB 1060, despite running exactly the same as the 6GB under OpenGL.
3GB memory explains this perfectly.

High level APIs, such as DirectX and OpenGL, track the residency of each resource separately and move them automatically/repeatedly in/out of GPU memory based on accesses. Residency tracking is one of the big performance hogs of the high level APIs. It also causes random frame drops, as the information about a missing resource arrives very late (bind resource -> draw). If a resource is not resident, it must be immediately loaded to GPU memory (and old resources must be kicked out according to an LRU caching policy). If enough resources are missing on the same frame, there will be a stall -> frame drop.

In DirectX 11 and OpenGL you could over-commit GPU memory without big problems. All you got was some random single frame drops. The driver automatically handled resource switching. However, in DX12 and Vulkan the developer needs to manually implement resource management. It is easiest just to allocate a big chunk of GPU memory and load all the common level assets there permanently. Of course there's some residency tracking for high-detail (close-up) streamed resources, but generally there's much more "pinned" GPU data in DX12/Vulkan applications compared to DX11/OpenGL. This is similar to console resource management (the game has a guaranteed memory amount). This is however problematic on PC if the developer has designed the game (ultra settings in this case) around a larger memory budget (such as 4GB). Doom's Vulkan version doesn't even start on 2 GB graphics cards, while the OpenGL version runs just fine (albeit with some frame rate issues).
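To make that concrete, here's a minimal C++/Vulkan sketch of the "allocate one big block and pin the common assets" pattern described above. The device handle, memory type index and budget size are placeholder assumptions for illustration; this is the general technique, not id Software's actual allocator:

```cpp
// Minimal sketch (assumptions: 'device' is a valid VkDevice and 'memoryTypeIndex'
// refers to a DEVICE_LOCAL memory type). Illustrates pinning one large block,
// not any specific engine's code.
#include <vulkan/vulkan.h>
#include <cstdint>

struct PinnedPool {
    VkDeviceMemory memory   = VK_NULL_HANDLE;
    VkDeviceSize   capacity = 0;
    VkDeviceSize   head     = 0;   // simple linear sub-allocator
};

// Allocate one large device-local block up front; common level assets live here
// permanently instead of being paged in/out by the driver.
bool createPinnedPool(VkDevice device, uint32_t memoryTypeIndex,
                      VkDeviceSize bytes, PinnedPool& pool)
{
    VkMemoryAllocateInfo info{};
    info.sType           = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO;
    info.allocationSize  = bytes;            // e.g. most of the card's VRAM budget
    info.memoryTypeIndex = memoryTypeIndex;  // a DEVICE_LOCAL type

    if (vkAllocateMemory(device, &info, nullptr, &pool.memory) != VK_SUCCESS)
        return false;                        // over-commit fails hard here,
                                             // unlike DX11/OpenGL paging
    pool.capacity = bytes;
    return true;
}

// Sub-allocate a buffer's backing store from the pinned block.
bool bindFromPool(VkDevice device, PinnedPool& pool, VkBuffer buffer)
{
    VkMemoryRequirements req{};
    vkGetBufferMemoryRequirements(device, buffer, &req);

    // Round the linear allocator head up to the required alignment.
    VkDeviceSize offset = (pool.head + req.alignment - 1) & ~(req.alignment - 1);
    if (offset + req.size > pool.capacity)
        return false;                        // budget exceeded: the app must handle it

    if (vkBindBufferMemory(device, buffer, pool.memory, offset) != VK_SUCCESS)
        return false;
    pool.head = offset + req.size;
    return true;
}
```

The key difference from DX11/OpenGL shows up in the failure paths: when the pinned budget is exceeded, the application has to handle it explicitly instead of the driver silently paging resources in and out.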
 
Seems to me like when the GPU is the bottleneck they perform almost identically (whereas for GCN there seems to be a slight Vulkan advantage, possibly due to intrinsics being used).
Async compute gives GCN a slight GPU performance edge over Nvidia in Vulkan and DX12. Both Nvidia and AMD offer various intrinsics (such as wave/warp operations) as Vulkan extensions. id Software hasn't mentioned using IHV-specific intrinsics, but they have talked highly about async compute. This would single-handedly explain why AMD gains around 10% more performance from Vulkan than Nvidia.
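As a rough illustration of the async compute part (a hypothetical helper in C++ against the Vulkan API; nothing here is taken from id Software's code), an engine would typically look for a compute-only queue family so compute work can overlap the graphics queue:

```cpp
// Minimal sketch (assumption: 'physicalDevice' is a valid VkPhysicalDevice).
// Finds a compute-capable queue family that does not support graphics, so
// compute dispatches submitted there can run alongside the graphics queue.
#include <vulkan/vulkan.h>
#include <cstdint>
#include <vector>

// Returns the index of a compute-only queue family, or UINT32_MAX if the
// device exposes no dedicated async compute queue family.
uint32_t findAsyncComputeQueueFamily(VkPhysicalDevice physicalDevice)
{
    uint32_t count = 0;
    vkGetPhysicalDeviceQueueFamilyProperties(physicalDevice, &count, nullptr);

    std::vector<VkQueueFamilyProperties> families(count);
    vkGetPhysicalDeviceQueueFamilyProperties(physicalDevice, &count, families.data());

    for (uint32_t i = 0; i < count; ++i) {
        const VkQueueFlags flags = families[i].queueFlags;
        if ((flags & VK_QUEUE_COMPUTE_BIT) && !(flags & VK_QUEUE_GRAPHICS_BIT))
            return i;   // dedicated compute family: work here can overlap graphics
    }
    return UINT32_MAX;  // fall back to the graphics queue (no async overlap)
}
```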
 
I've noticed games that support both DX11 and DX12 have less efficient memory management using DX12: lots more VRAM used, and the potential for stutter increases. Annoying. Any good reason why that is?
I think the key issue there is BOTH DX11/12. The resource models are different, so DX12 gets tacked on. For an exclusive DX12 environment there are other strategies that could be used to address the concerns.

Meh, I feel like it's becoming far too common for people to just dismiss a whole API because of one title/benchmark. GCN cards gained up to 40% when the Vulkan update for DOOM first hit, and all of a sudden Vulkan is Mantle (people confused GCN intrinsics with some kind of inherent low-level GCN optimization in Vulkan) and blah blah blah.
Definitely agree with this. It's easier to just assume AMD had superior driver support coming from Mantle, which shouldn't come as a surprise. A lot of Nvidia's optimizations for DX11 would need to be reworked, but once complete I'd expect solid results. Results that now look to be arriving in Doom.

id Software hasn't mentioned using IHV-specific intrinsics, but they have talked highly about async compute.
I thought one of their developers stated they did use some intrinsics for TSSAA. In this case it was passing data between lanes to reduce bandwidth while compositing. Do the SSAA and pass the result to the neighbors or something.
 
Regarding the thread topic: I think yeah, DirectX 12 and Vulkan are GCN low-level constructs, but in a very positive way. They took what they could from AMD's pioneering work with Mantle, which of course was tailored to GCN. As a result, many concepts in DX12 and Vulkan seem to have similar effects on GCN cards as the use of Mantle would have had. So, the dollars AMD spent on Mantle (i.e. flying Johan from Stockholm back and forth... j/k) were pretty well invested.

The only caveat I have is that from the outside you could be tempted to think that, with the effort for Mantle and the prospect of an "automatic" performance uplift with DX12/Vulkan, AMD terribly neglected their basic DX11 driver architecture, just doing basic bug fixing for newer titles as necessary. That's the impression you could also get when viewing the dramatic performance uplifts going from DX11 to 12 in games like Ashes for AMD cards.

The fact that more than a single one of the new DX12 and Vulkan titles had birthing problems did not really help to convey the positive side of the message AMD was trying to instill.
 
I don't think they neglected their DX11 driver (I know that is not what you're saying, Carsten; rather, I am just arguing against the concept in general). It's more likely that it had too much... I'm not sure what the correct term here is, code rot? Baggage?

It's not like this is a new thing at AMD's graphics division. Recall the much-vaunted 100% OpenGL rewrite from quite a few years back that wound up being slowly phased in instead and ultimately didn't change that much, compared to the amount of hype AMD both generated themselves and allowed to build? I think a similar thing likely happened there: too much to do, so the effort ultimately fell short.
 
Yeah, serious business eh?

Meh, I feel like it's becoming far too common for people to just dismiss a whole API because of one title/benchmark. GCN cards gained up to 40% when the Vulkan update for DOOM first hit, and all of a sudden Vulkan is Mantle (people confused GCN intrinsics with some kind of inherent low-level GCN optimization in Vulkan) and blah blah blah.

Seems to me like when the GPU is the bottleneck they perform almost identically (whereas for GCN there seems to be a slight Vulkan advantage, possibly due to intrinsics being used).

Some recent games have had me really confused. Deus Ex Mankind Divided performs really strangely. In the built-in benchmark the Fury X is something like 25% faster (seems to track compute throughput), yet in the game a reference (1200 MHz) 980 Ti outperforms it by 5-10% when the FPS is low, and the Fury X appears to lead when the FPS is high. CPU overhead seems higher on the Nvidia side as well, and this is DX11. Weird stuff.

I usually expect NV cards to be on top when the FPS is high; higher geometry throughput and less CPU overhead, generally speaking. Deus Ex is opposite day.

CPU overhead is higher on Nvidia; AMD's problem is that they have a single-threaded driver that gets overwhelmed on CPUs with low IPC, AMD's own and i3s.

AMD cards usually use less CPU than the corresponding Nvidia cards in the comparisons here.

 
Doom does use AMD GCN intrinsics, actually. AMD was quite open and proud of Doom being the first title to make use of them, and rightfully so.
https://community.bethesda.net/thread/59229?start=0&tstart=0

Why would anyone not want to use the resources that are present if they could?
DX9 and DX11 also had lots of IHV-specific extensions (API backdoors) and OpenGL has IHV-specific extensions (official, well-documented ones). Most developers didn't use these IHV-specific extensions, because you'd need to write, maintain, and test multiple different code paths. There are no guarantees either that an extension works properly with future hardware from the same IHV. Once you start writing hardware-specific code, your maintenance cost will increase. Consoles are an exception to this rule, since console hardware stays unchanged for many years. We see a new PC GPU generation at least every 2 years.

You only want to write hardware-specific code on PC if that gives you big gains and if it helps with more than one GPU. Fortunately the AMD GCN architecture has been used for a long time. There's a big user base available. Also, with these intrinsics you can port your console GCN code to PC, reducing the cost of writing the hardware-specific code.
I thought one of their developers stated they did use some intrinsics for TSSAA. In this case it was passing data between lanes to reduce bandwidth while compositing. Do the SSAA and pass the result to the neighbors or something.
Now that you mentioned it, I remember it as well. Most likely they just ported their cross-lane-optimized console TSSAA code to PC GCN. Nvidia has similar cross-lane intrinsics available now (https://developer.nvidia.com/reading-between-threads-shader-intrinsics).
CPU overhead is higher on Nvidia; AMD's problem is that they have a single-threaded driver that gets overwhelmed on CPUs with low IPC
It's true that the total DX11 CPU overhead might be slightly higher on Nvidia, but as long as you have a 4-core CPU or more (and/or hyperthreading), the driver distributes the workload nicely. AMD's DX11 driver taxes the single (application) render thread heavily. The render thread tends to be the bottleneck even without the driver taking part of the execution time.

Both approaches however are highly wasteful. There's just too much resource bookkeeping and translation going on. Vulkan and DX12 reduce this overhead to a minimum and allow you to either use one render thread or split work across multiple threads yourself (depending on which suits your application best). Extra driver worker threads are not needed.
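For illustration, a hedged C++/Vulkan sketch of the "split work across multiple threads yourself" option: one command pool per worker thread, parallel recording, a single submit from the main thread. The function name and parameters are invented for the example; cleanup and synchronization with the rest of the frame are omitted:

```cpp
// Minimal sketch (assumptions: 'device' is a valid VkDevice, 'queue' a graphics
// VkQueue, 'queueFamilyIndex' its family). Command pools are externally
// synchronized, so one pool per worker thread lets all workers record command
// buffers in parallel without locks and without hidden driver worker threads.
#include <vulkan/vulkan.h>
#include <thread>
#include <vector>

void recordFrameInParallel(VkDevice device, VkQueue queue,
                           uint32_t queueFamilyIndex, uint32_t workerCount)
{
    std::vector<VkCommandPool>   pools(workerCount);
    std::vector<VkCommandBuffer> cmds(workerCount);
    std::vector<std::thread>     workers;

    for (uint32_t i = 0; i < workerCount; ++i) {
        // Per-thread command pool.
        VkCommandPoolCreateInfo poolInfo{};
        poolInfo.sType            = VK_STRUCTURE_TYPE_COMMAND_POOL_CREATE_INFO;
        poolInfo.queueFamilyIndex = queueFamilyIndex;
        vkCreateCommandPool(device, &poolInfo, nullptr, &pools[i]);

        // One primary command buffer per worker for this frame slice.
        VkCommandBufferAllocateInfo allocInfo{};
        allocInfo.sType              = VK_STRUCTURE_TYPE_COMMAND_BUFFER_ALLOCATE_INFO;
        allocInfo.commandPool        = pools[i];
        allocInfo.level              = VK_COMMAND_BUFFER_LEVEL_PRIMARY;
        allocInfo.commandBufferCount = 1;
        vkAllocateCommandBuffers(device, &allocInfo, &cmds[i]);

        workers.emplace_back([&, i] {
            VkCommandBufferBeginInfo begin{};
            begin.sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_BEGIN_INFO;
            vkBeginCommandBuffer(cmds[i], &begin);
            // ... record this worker's share of the frame's draw/dispatch calls ...
            vkEndCommandBuffer(cmds[i]);
        });
    }
    for (auto& t : workers) t.join();

    // A single submit from the main thread; queues are also externally synchronized.
    VkSubmitInfo submit{};
    submit.sType              = VK_STRUCTURE_TYPE_SUBMIT_INFO;
    submit.commandBufferCount = workerCount;
    submit.pCommandBuffers    = cmds.data();
    vkQueueSubmit(queue, 1, &submit, VK_NULL_HANDLE);
}
```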
 
CPU overhead is higher on Nvidia; AMD's problem is that they have a single-threaded driver that gets overwhelmed on CPUs with low IPC, AMD's own and i3s.
AMD's DX11 overhead has been found to be significantly higher in synthetic tests as well, even on high-end overclocked CPUs.
AMD cards usually use less CPU than the corresponding Nvidia cards in the comparisons here.
That remains a single comparison; see others from the same channel for opposite results.
 