DmitryKo: Actually, MSDN has been updated with a detailed description of the GpuMmu virtual addressing model in WDDM 2.0...
> OK, GS bypass is still not supported. Move along AMD, I want to defenestrate the need for geometry shaders wherever possible.
I actually had to google that word to make sure it was real, thank you for my new word of the day.
> OK, GS bypass is still not supported. Move along AMD, I want to defenestrate the need for geometry shaders wherever possible.
AMD certainly supports GS bypass. An OpenGL extension has been available for a long time already. AMD has slower geometry shaders than Intel and NVIDIA, increasing the importance of this feature for them. My measurements (on a Radeon HD 7970) show a 2.7x performance drop just from enabling the GS stage (simple GS: output = input).
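For context, a pass-through geometry shader of the kind described above ("output = input") might look roughly like the sketch below; the vertex layout and the D3DCompile plumbing are just illustrative assumptions. The point of GS bypass is that the vertex shader can write SV_ViewportArrayIndex / SV_RenderTargetArrayIndex itself, so even this do-nothing stage can be dropped.

```cpp
// Hypothetical sketch of a do-nothing pass-through geometry shader, embedded
// as an HLSL string the way it might be handed to D3DCompile. Merely having
// this stage bound is what costs the ~2.7x in the GCN1 measurement above.
#include <d3dcompiler.h>
#pragma comment(lib, "d3dcompiler.lib")

static const char kPassthroughGS[] = R"(
struct V2G {
    float4 pos : SV_Position;
    float2 uv  : TEXCOORD0;
};

[maxvertexcount(3)]
void main(triangle V2G input[3], inout TriangleStream<V2G> stream)
{
    // output = input: forward the triangle unchanged.
    [unroll]
    for (int i = 0; i < 3; ++i)
        stream.Append(input[i]);
    stream.RestartStrip();
}
)";

HRESULT CompilePassthroughGS(ID3DBlob** blob)
{
    return D3DCompile(kPassthroughGS, sizeof(kPassthroughGS) - 1,
                      nullptr, nullptr, nullptr,
                      "main", "gs_5_0", 0, 0, blob, nullptr);
}
```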
> Last I checked the caps it was on for NVIDIA Maxwell 2 but off for Maxwell 1. I don't know if there are hardware differences there or whether they just have yet to implement it across all the architectures, though. I also didn't test whether it actually works yet, etc.
There are differences. Maxwell 2 supports an RT bitmask instead of an RT index. This allows Maxwell 2 to replicate the triangle to N viewports if needed. This is super nice for some algorithms.
> Should be working fine on all Intel parts, although obviously bugs are always possible with new features, so let me know if you run into any.
Intel's geometry shaders are so fast that they are actually usable. Yet another proof of alien technology.
> What is GS bypass? Haven't come across that yet.
DX12 calls it VPAndRTArrayIndexFromAnyShaderFeedingRasterizerSupportedWithoutGSEmulation.
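If anyone wants to test for it at runtime, the cap sits in the D3D12 options struct; a minimal sketch, assuming device is an already-created ID3D12Device*:

```cpp
// Minimal sketch: query the GS-bypass cap through CheckFeatureSupport.
// Assumes 'device' is an already-created ID3D12Device*.
#include <d3d12.h>

bool SupportsGSBypass(ID3D12Device* device)
{
    D3D12_FEATURE_DATA_D3D12_OPTIONS options = {};
    if (FAILED(device->CheckFeatureSupport(D3D12_FEATURE_D3D12_OPTIONS,
                                           &options, sizeof(options))))
        return false;

    // TRUE means SV_ViewportArrayIndex / SV_RenderTargetArrayIndex can be
    // written by the shader stage feeding the rasterizer (e.g. the vertex
    // shader) without the runtime falling back to a geometry shader.
    return options.VPAndRTArrayIndexFromAnyShaderFeedingRasterizerSupportedWithoutGSEmulation != FALSE;
}
```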
> Choose yourself whether you use the OpenGL extension name or the most awesome ever DX12 name. But choose wisely.
I don't think I'm ready for such responsibility yet, sensei. But that is indeed awesome nameage.
> AMD certainly supports GS bypass. An OpenGL extension has been available for a long time already. AMD has slower geometry shaders than Intel and NVIDIA, increasing the importance of this feature for them. My measurements (on a Radeon HD 7970) show a 2.7x performance drop just from enabling the GS stage (simple GS: output = input).
Have you tried Tonga or Fiji? There will still be a drop, but in a lot of cases they will perform better.
> Have you tried Tonga or Fiji? There will still be a drop, but in a lot of cases they will perform better.
I have only tested with GCN gen1 (1.0) and gen2 (1.1). Both behave similarly. Our development computers have high-end cards, so no Tongas. That test was done before Fiji (Fury X) existed, and the Fury X is still impossible to obtain in Finland.
How much does Intel drop in your test?
> Are there any features that Maxwell 1 supports that Kepler doesn't?
Typed UAV Loads. Kepler should not support them.
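For reference, typed UAV loads are exposed as a cap rather than being tied to a feature level; a rough sketch of how one might check them per format (the three R32 formats are always supported, everything else needs the additional-formats cap plus a per-format check):

```cpp
// Sketch: check typed UAV load support for a given format.
#include <d3d12.h>

bool SupportsTypedUAVLoad(ID3D12Device* device, DXGI_FORMAT format)
{
    // The three 32-bit single-channel formats always support typed UAV loads.
    if (format == DXGI_FORMAT_R32_FLOAT || format == DXGI_FORMAT_R32_UINT ||
        format == DXGI_FORMAT_R32_SINT)
        return true;

    // Anything else needs the TypedUAVLoadAdditionalFormats cap...
    D3D12_FEATURE_DATA_D3D12_OPTIONS options = {};
    if (FAILED(device->CheckFeatureSupport(D3D12_FEATURE_D3D12_OPTIONS,
                                           &options, sizeof(options))) ||
        !options.TypedUAVLoadAdditionalFormats)
        return false;

    // ...plus a per-format confirmation through Support2.
    D3D12_FEATURE_DATA_FORMAT_SUPPORT fmt = { format };
    if (FAILED(device->CheckFeatureSupport(D3D12_FEATURE_FORMAT_SUPPORT,
                                           &fmt, sizeof(fmt))))
        return false;

    return (fmt.Support2 & D3D12_FORMAT_SUPPORT2_UAV_TYPED_LOAD) != 0;
}
```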
> Just... wow
'Cause ViewportAndRenderTargetArrayIndexFromAnyShaderFeedingRasterizerSupportedWithoutGeometryShaderEmulation was too verbose... u.u
> There are differences. Maxwell 2 supports an RT bitmask instead of an RT index. This allows Maxwell 2 to replicate the triangle to N viewports if needed. This is super nice for some algorithms.
Yes, Maxwell 2 has stuff that goes above and beyond, like the ability to "add" some attributes in the GS but pass others through, and, as you note, the ability to multicast to several viewports. But I still thought that most NVIDIA hardware should be able to support the DX12 feature as-is, not just Maxwell 2, right? Guess it's probably just a driver thing.
> Intel's geometry shaders are so fast that they are actually usable. Yet another proof of alien technology.
Pro tip: if you want something to be fast on Intel hardware, get it into a popular benchmark application.
I am pretty sure that all GPUs now should be able to use UAVs across all shader stages, even on FL 11.0 (Tier 1 of resource binding requires at least 8 UAV slots across all shader stages). I am not sure why this is not allowed in D3D11 (e.g. through a cap bit).
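As a quick illustration of where that tier figure comes from, it can be read from the same options struct; a minimal sketch, again assuming an existing ID3D12Device*:

```cpp
// Sketch: query the resource binding tier. Tier 1 (the FL 11.0 baseline
// mentioned above) guarantees 8 UAV slots shared across all shader stages;
// higher tiers raise the limits.
#include <d3d12.h>

D3D12_RESOURCE_BINDING_TIER QueryBindingTier(ID3D12Device* device)
{
    D3D12_FEATURE_DATA_D3D12_OPTIONS options = {};
    device->CheckFeatureSupport(D3D12_FEATURE_D3D12_OPTIONS,
                                &options, sizeof(options));
    return options.ResourceBindingTier;
}
```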
This would be quite a non-trivial change, since feature level 11_0 in Direct3D 12 would be a superset of the same level in Direct3D 11... which brings us back to the question of whether "LEVEL_A" etc. would be a better naming scheme for Direct3D 12.