Nvidia Turing Architecture [2018]

Digital Foundry's coverage of Metro's RTX:

- Currently the game runs at 1080p and 55-60 fps on average
- More optimizations are to be made to increase fps
- They are not using Tensor cores for denoising
- Foliage and tree leaves are excluded from the BVH structure

 
Sebbi's comment on NVIDIA's Mesh Shaders and Texture Space Shaders:

I am glad that our SIGGRAPH 2015 presentation inspired people. Turing has hardware support for mesh shaders and texture-space shading. Also multiview rendering support has been improved (good for virtual shadow mapping).

The combination of mesh shaders and texture space shading is actually very powerful. Texture space shading could prove to be better than existing alternatives such as hardware PRT (tiled resources) or bindless textures. Fun times ahead.

You could likely implement virtual shadow mapping using texture-space shading too. Same for virtual textured terrain (no backing data store, generated fully from decals / material blends).
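
To make the decoupling concrete, here is a minimal CUDA sketch of the texture-space idea (the kernel names and the Lambert-ish stand-in are my own illustration, not anything from Turing itself): lighting is evaluated once per atlas texel at its own rate, and the screen-rate pass just looks cached results up by UV. The resolve kernel is included only to show the shape of the screen pass.

#include <cuda_runtime.h>

// Shade every texel of a surface atlas once, decoupled from screen
// resolution and frame rate (the "texture space" pass).
__global__ void shadeAtlas(float* atlas, int w, int h)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= w || y >= h) return;
    float u = (x + 0.5f) / w;
    float v = (y + 0.5f) / h;
    // Stand-in for a real BRDF/lighting evaluation at this texel.
    atlas[y * w + x] = fmaxf(0.0f, 0.6f * u + 0.4f * v);
}

// Screen pass: each pixel maps to a UV and just reads the cached
// shading result; no lighting math runs at screen rate.
__global__ void resolveScreen(const float* atlas, int aw, int ah,
                              const float2* uvs, float* frame, int npix)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= npix) return;
    int tx = min((int)(uvs[i].x * aw), aw - 1);
    int ty = min((int)(uvs[i].y * ah), ah - 1);
    frame[i] = atlas[ty * aw + tx];
}

int main()
{
    const int AW = 256, AH = 256;
    float* atlas = nullptr;
    cudaMalloc((void**)&atlas, AW * AH * sizeof(float));
    dim3 block(16, 16), grid((AW + 15) / 16, (AH + 15) / 16);
    shadeAtlas<<<grid, block>>>(atlas, AW, AH); // runs at the atlas rate
    cudaDeviceSynchronize();
    cudaFree(atlas);
    return 0;
}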

 
Digital Foundry's coverage of Metro's RTX:

- Currently the game runs at 1080p and 55-60 fps on average
- More optimizations are to be made to increase fps
- They are not using Tensor cores for denoising
- Foliage and tree leaves are excluded from the BVH structure


It's exciting that this is the start of the revolution in real-time lighting. It's something that is (IMO) absolutely needed to push IQ forward, as screen-space effects have kind of reached a plateau, and probes/bakes for lighting are really limiting for environments, especially where indirect lighting is concerned. The performance is good enough to be playable at modest resolutions, and although RT in interactive applications has a long, long way to go, at least it's finally here.
 
I won't be spending $1300+ on a GPU... ever :) So by the time RT hardware is commonplace and powerful enough for sub-$500 GPUs to run these games at high settings, 1440p, 60 Hz+, it will probably be 3-5 years.

Your previous quote of 3-5 years did not have that qualifier.

The demo states February 22, 2019, so it will be available then for those who buy an RTX 2080 Ti.
 
Digital Foundry's coverage of Metro's RTX:
- Currently the game runs at 1080p and 55-60 fps on average
- More optimizations are to be made to increase fps

I basically think these first RTX games are equivalent to console launch games. In other words, not really indicative of what the hardware is capable of. The 1080p, sub-60 fps performance does not bother me (yet).

I think this space will require an immense amount of research and experimentation to see what can be done and how well the games can look.
 
Mesh shaders are actually a very big deal. It looks like Nvidia is throwing out the entire traditional pipeline and replacing it with Task Shader -> Mesh Shader -> Pixel Shader.

https://devblogs.nvidia.com/introduction-turing-mesh-shaders/

I'm not even sure that this can be integrated with current DirectX 12 or Vulkan 1 without a huge mess. Is it time for DirectX 13 and Vulkan 2?
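
To give a sense of how the dispatch changes, here is a hedged host-side sketch against the VK_NV_mesh_shader Vulkan extension, assuming a pipeline already built with task/mesh/fragment stages and a command buffer in the recording state:

#include <vulkan/vulkan.h>

// Record one mesh-shading draw. Instead of vkCmdDraw pushing vertices
// through a vertex shader, we launch task-shader workgroups that expand
// into mesh-shader workgroups emitting meshlets to the rasterizer.
void drawMeshTasks(VkDevice device, VkCommandBuffer cmd,
                   VkPipeline meshPipeline, uint32_t taskCount)
{
    // Extension entry points are fetched at runtime.
    auto pfnDrawMeshTasks = (PFN_vkCmdDrawMeshTasksNV)
        vkGetDeviceProcAddr(device, "vkCmdDrawMeshTasksNV");

    vkCmdBindPipeline(cmd, VK_PIPELINE_BIND_POINT_GRAPHICS, meshPipeline);
    pfnDrawMeshTasks(cmd, taskCount, /*firstTask=*/0);
}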

As an addendum that the forum very kindly DELETED when it told me the 10 minute edit window was up:

Adding in Task Shaders, this represents nothing less than a complete overhaul of the entire graphics API. Adding in malloc and free (CUDA already allows you to call these from the device with no restrictions), you get nothing less than an entirely GPU-driven API: draw calls, resource management, you name it. The big question now is what AMD and Intel are doing. This sort of change doesn't happen in a vacuum.
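
To illustrate the CUDA side of that claim, a minimal device-side malloc/free sketch (the kernel and sizes are arbitrary examples):

#include <cuda_runtime.h>
#include <cstdio>

// Each thread grabs per-thread scratch from the device heap with plain
// malloc/free, no host round trip involved.
__global__ void scratchKernel(int n)
{
    float* tmp = (float*)malloc(n * sizeof(float));
    if (tmp == nullptr) return; // device heap exhausted
    for (int i = 0; i < n; ++i)
        tmp[i] = (float)(threadIdx.x + i);
    free(tmp);
}

int main()
{
    // The device heap is small by default; raise it before launching.
    cudaDeviceSetLimit(cudaLimitMallocHeapSize, 64u << 20);
    scratchKernel<<<4, 128>>>(32);
    cudaDeviceSynchronize();
    printf("done: %s\n", cudaGetErrorString(cudaGetLastError()));
    return 0;
}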
 
The big question now is what AMD and Intel are doing. This sort of change doesn't happen in a vacuum.
It certainly looks quite similar to AMD's Primitive Shaders, although AMD doesn't have an API for it...
I'm not entirely sure of the differences; by the looks of it, Nvidia may not actually use the fixed-function tessellator at all? I can't quite tell, since Nvidia has a "mesh generation" fixed-function part in the diagrams, which may still be pretty much that. But apart from that, AMD also has essentially one shader stage pre-tessellation and one post-tessellation, which from a quick glance really looks similar to the "Task Shader" and "Mesh Shader", although AMD doesn't have catchy names for those (well, I suppose "Primitive Shader" is catchy enough, but AMD never really said exactly what it meant by that).
So even if there might be significant differences between Nvidia's and AMD's approaches, it looks to me like they are moving in the same direction at least. No idea about Intel: Gen9 is ancient by now, Gen10 is probably irrelevant (Cannonlake is much delayed and unmanufacturable), and maybe Gen11 (Icelake) will incorporate similar ideas, although I didn't see anything obvious in the open-source driver yet (but I might easily have missed it...). And of course there's Intel discrete, but that's a 2020 product where nobody outside Intel has much of an idea what it looks like...
 
Mesh shaders are actually a very big deal. It looks like Nvidia is throwing out the entire traditional pipeline and replacing it with Task Shader -> Mesh Shader -> Pixel Shader.

https://devblogs.nvidia.com/introduction-turing-mesh-shaders/

I'm not even sure that this can be integrated with current DirectX 12 or Vulkan 1 without a huge mess. Is it time for DirectX 13 and Vulkan 2?

As an addendum that the forum very kindly DELETED when it told me the 10 minute edit window was up:

Adding in Task Shaders, this represents nothing less than a complete overhaul of the entire graphics API. Adding in malloc and free (CUDA already allows you to call these from the device with no restrictions), you get nothing less than an entirely GPU-driven API: draw calls, resource management, you name it. The big question now is what AMD and Intel are doing. This sort of change doesn't happen in a vacuum.

It's not throwing out the old stuff; it's a new alternative. The old stuff still works well in certain cases, just like the new stuff targets new scenarios.

I think you have the malloc/free a bit wrong, because developers don't allocate the meshlets themselves like that; they are given memory implicitly, just like a geometry shader has implicit access to a primitive's vertices. So overall the changes to the APIs are not that huge. It's just a bit of different re-routing (since I did most of the Vulkan spec writing for this feature, I have first-hand experience ;) ).
Fundamentally, tessellation/geometry shaders already had some of these concepts; they just had their own set of problems ;)
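
A concrete way to picture the implicit-memory point: the input to a mesh shader workgroup is just a small fixed-capacity record, and the outputs are fixed-size arrays declared up front. Something along the lines of the 64-vertex / 126-triangle defaults from the devblog; this particular struct layout is my illustration, not the spec's:

#include <cstdint>

// One meshlet: a small, fixed-capacity cluster of a larger mesh. A mesh
// shader workgroup consumes one of these and writes vertices/triangles
// into implicitly provided output arrays; nothing is heap-allocated.
struct Meshlet {
    uint32_t vertexIndices[64];       // indices into the vertex buffer
    uint8_t  localTriangles[126 * 3]; // per-triangle indices into vertexIndices
    uint8_t  vertexCount;             // <= 64 used entries
    uint8_t  triangleCount;           // <= 126 used entries
};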
 
It's not throwing out the old stuff; it's a new alternative. The old stuff still works well in certain cases, just like the new stuff targets new scenarios.

I think you have the malloc/free a bit wrong, because developers don't allocate the meshlets themselves like that; they are given memory implicitly, just like a geometry shader has implicit access to a primitive's vertices. So overall the changes to the APIs are not that huge. It's just a bit of different re-routing (since I did most of the Vulkan spec writing for this feature, I have first-hand experience ;) ).
Fundamentally, tessellation/geometry shaders already had some of these concepts; they just had their own set of problems ;)

It looks to me that what's happening is that we're getting a much closer-to-the-metal view of the pipeline. Up until now, the implementation of the pipeline, how vertices are scheduled through the various shader stages until triangle setup, has been opaque and possibly fixed-function. It would not surprise me if conventional vertex and geometry shaders are converted into a mesh shader in the drivers. Possibly tessellation shaders too, assuming the mesh shader stage is able to properly access the fixed-function tessellation hardware.

Concerning malloc/free in the mesh and task shaders, I was talking about those functions being available from device-resident code. Right now, cudaMalloc and cudaFree, as well as the cudaMemcpyAsync variants, are legal to call from kernels. I don't see it as far-fetched to expose these sorts of things from, say, task shaders, though I don't think they are quite there yet. The big problem is that this sort of thing ends up breaking all sorts of assumptions built into the DirectX and Vulkan APIs.
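
For reference, this is the kind of device-resident usage I mean. With CUDA dynamic parallelism (compile with nvcc -rdc=true -lcudadevrt), a kernel can allocate, copy, and launch child kernels without any host involvement; the kernels below are toy examples:

#include <cuda_runtime.h>

__global__ void child(float* data, int n)
{
    int i = threadIdx.x;
    if (i < n) data[i] *= 2.0f;
}

// Parent kernel drives allocation, copy, and a child launch entirely
// from device code via the device-side runtime API.
__global__ void parent(float* src, int n)
{
    if (threadIdx.x != 0) return;
    float* tmp = nullptr;
    cudaMalloc((void**)&tmp, n * sizeof(float));
    cudaMemcpyAsync(tmp, src, n * sizeof(float),
                    cudaMemcpyDeviceToDevice);
    child<<<1, n>>>(tmp, n); // device-side kernel launch
    cudaDeviceSynchronize(); // wait for the child to finish
    cudaFree(tmp);
}

int main()
{
    const int n = 32;
    float* buf = nullptr;
    cudaMalloc((void**)&buf, n * sizeof(float));
    parent<<<1, 1>>>(buf, n);
    cudaDeviceSynchronize();
    cudaFree(buf);
    return 0;
}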
 
Wow!! A 2080 can almost manage 24 fps.

Edited: that's 4K with DLSS!!
 

Attachments: dlss1.jpg, dlss2.jpg
Here you can really see the difference between DLSS and native 4K in the reflections off the silver stormtrooper in the elevator, assuming the much sharper appearance is intended and not due to some missing effect.
 
Has anyone seen any tensor core benchmarks on the 2080 Ti? All I've seen so far is that AnandTech did one HGEMM test, but without tensor cores.
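
Failing published numbers, it is straightforward to time it yourself with cuBLAS; a rough sketch of a tensor-core HGEMM timing run (the matrix size and FP16 compute type are arbitrary choices):

#include <cuda_runtime.h>
#include <cublas_v2.h>
#include <cuda_fp16.h>
#include <cstdio>

int main()
{
    const int n = 4096;
    __half *A, *B, *C; // uninitialized data is fine for a timing run
    cudaMalloc((void**)&A, (size_t)n * n * sizeof(__half));
    cudaMalloc((void**)&B, (size_t)n * n * sizeof(__half));
    cudaMalloc((void**)&C, (size_t)n * n * sizeof(__half));

    cublasHandle_t h;
    cublasCreate(&h);
    cublasSetMathMode(h, CUBLAS_TENSOR_OP_MATH); // opt in to tensor cores

    const __half alpha = __float2half(1.0f), beta = __float2half(0.0f);
    cudaEvent_t t0, t1;
    cudaEventCreate(&t0);
    cudaEventCreate(&t1);

    cudaEventRecord(t0);
    cublasGemmEx(h, CUBLAS_OP_N, CUBLAS_OP_N, n, n, n,
                 &alpha, A, CUDA_R_16F, n,
                         B, CUDA_R_16F, n,
                 &beta,  C, CUDA_R_16F, n,
                 CUDA_R_16F, CUBLAS_GEMM_DEFAULT_TENSOR_OP);
    cudaEventRecord(t1);
    cudaEventSynchronize(t1);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, t0, t1);
    printf("HGEMM: %.1f TFLOPS\n", 2.0 * n * n * n / (ms * 1e-3) / 1e12);

    cublasDestroy(h);
    cudaFree(A); cudaFree(B); cudaFree(C);
    return 0;
}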
 