I think it’s increasingly clear that Ray Tracing, or DXR, was poorly marketed, and that the understanding of how it works is incomplete in most circles.
In this post I will attempt to reset the baseline of understanding for our discussions here, starting with Machine Learning, then discussing DXR, and finally how the solution comes together. Although I may (**will**, but not willfully) post some inaccurate statements that will be corrected by more knowledgeable members, the big picture of Ray Tracing in graphics is the message I want to get across. Hopefully I’m able to change your opinion of what to expect from Ray Tracing, and to shed light on why graphics hardware and solutions will undergo a dramatic change in direction from our standard rasterization path.
To begin, before we talk about Ray Tracing, we really need to start at Machine Learning and Neural Networks. Without getting into a deep discussion, Neural Networks and ML are solutions that absolutely excel at solving computationally exhausting problems through accurate approximation and trained modelling. TLDR; they can solve problems that would take an enormous or impossible amount of computation with a fraction of the resources and time. To put this into perspective, the ideal problems for this kind of AI are ones that are computationally impossible to brute force. Google’s DeepMind challenged themselves to create the best possible AI for the game Go. What makes Go special is that there are far more board positions than atoms in the known universe: roughly 2.08 × 10^170 legal positions against an estimated 10^80 atoms, and the number of possible games is larger still. So the idea of brute forcing an AI is out. Using neural networks (a policy network and a value network guiding a tree search), DeepMind built AlphaGo, which went on to defeat one of the world’s top Go champions 4 games to 1. (There is a documentary on this called AlphaGo on Netflix, worth watching.) A successor to that system, AlphaZero, would then go up against Stockfish 8, one of the strongest chess engines. After roughly four hours of training itself to play chess, it beat Stockfish 8 over a 100-game match (28 wins, 72 draws, and zero losses).
What’s of importance in this comparison is the following:
* AlphaZero evaluates roughly 80,000 positions per second
* Stockfish 8 evaluates roughly 70 million positions per second
With each bout that AlphaZero plays against Stockfish 8, the gap between wins and draws widens. Stockfish 8 has yet to beat AlphaZero despite evaluating orders of magnitude more positions per second; a toy sketch of why a learned evaluation changes that math follows below.
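To make that concrete, here is a toy sketch in Python (nothing to do with AlphaZero's real implementation; every function and constant here is made up for illustration) of why a search that trusts a cheap learned evaluation can look at far fewer positions than one that has to brute force its way deep enough to judge a position:

```python
# Toy illustration (not AlphaZero's actual algorithm): a search that trusts a
# cheap learned evaluation can stop early and examine far fewer positions than
# a brute-force search that must go deep to judge a position.
import random

BRANCHING, DEEP, SHALLOW = 5, 6, 2   # a made-up game with 5 moves per position

def true_value(state):
    """Stand-in for playing a line all the way out to its real outcome."""
    return random.Random(state).uniform(-1, 1)

def learned_eval(state):
    """Stand-in for a trained value network: a cheap, slightly noisy estimate."""
    return true_value(state) + random.Random(state ^ 0xBEEF).uniform(-0.1, 0.1)

def minimax(state, depth, maximizing, evaluate, counter):
    counter[0] += 1                              # count every position we look at
    if depth == 0:
        return evaluate(state)
    children = [state * BRANCHING + i for i in range(BRANCHING)]
    scores = [minimax(c, depth - 1, not maximizing, evaluate, counter) for c in children]
    return max(scores) if maximizing else min(scores)

brute, guided = [0], [0]
minimax(1, DEEP, True, true_value, brute)        # deep search with exact leaf values
minimax(1, SHALLOW, True, learned_eval, guided)  # shallow search with learned estimates
print("brute-force positions visited:", brute[0])    # 19,531 at depth 6
print("guided-search positions visited:", guided[0])  # 31 at depth 2
```

The point of the toy is only the node counts: a learned evaluation lets the searcher cut off lines early instead of grinding every branch to the end.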
There are all sorts of other big success stories of neural networks and AI, most notably computer vision and self-driving cars, but we’ll leave those for you to research. One thing to recognize is that the majority of these ML solutions have been run on GPUs and CPUs. Self-driving on Tesla vehicles, for instance, has been done on Tegra K1/X1 class hardware.
The history here is important to note because when we discuss things that are computationally impossible to calculate, neural networks naturally become a solution to the problem. When we look at things like 4K resolution, 8K resolution and Ray Tracing, all of these share the fundamental problem of being computationally intensive to the point of inefficiency, or of being currently out of reach. For instance, a completely path traced image at 4K is just not feasible in real time. But there are also some other items that have been out of our reach, in particular fluid dynamics, tessellation, cloth physics, destruction, etc. All of these are problems that can be solved by using neural networks to approximate the output, as the sketch below illustrates.
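As a rough illustration of that approximation idea, here is a minimal sketch (plain numpy, and the "expensive" target function is a made-up stand-in, not a real fluid or cloth solver) of training a tiny network offline and then answering queries at runtime with a couple of cheap matrix multiplies:

```python
# Minimal sketch: train a tiny network to approximate an "expensive" function,
# then answer queries with a handful of multiplies instead of the full computation.
import numpy as np

rng = np.random.default_rng(0)

def expensive_simulation(x):
    # Stand-in for something like a fluid or lighting solver that would
    # normally cost far more per evaluation than this toy function does.
    return np.sin(3 * x) * np.exp(-x**2)

# Training data: sample the expensive function offline.
x = rng.uniform(-2, 2, size=(2048, 1))
y = expensive_simulation(x)

# One hidden layer with tanh activation, trained with plain gradient descent.
W1, b1 = rng.normal(0, 0.5, (1, 64)), np.zeros(64)
W2, b2 = rng.normal(0, 0.5, (64, 1)), np.zeros(1)
lr = 0.05

for step in range(3000):
    h = np.tanh(x @ W1 + b1)              # forward pass
    pred = h @ W2 + b2
    err = pred - y                        # mean-squared-error gradient
    dW2 = h.T @ err / len(x); db2 = err.mean(0)
    dh = (err @ W2.T) * (1 - h**2)
    dW1 = x.T @ dh / len(x); db1 = dh.mean(0)
    W2 -= lr * dW2; b2 -= lr * db2
    W1 -= lr * dW1; b1 -= lr * db1

# At runtime, the approximation is just two small matrix multiplies per query.
test = np.array([[0.5]])
approx = np.tanh(test @ W1 + b1) @ W2 + b2
print(approx.item(), expensive_simulation(test).item())
```

The heavy lifting happens offline during training; at runtime you only pay for the small forward pass, which is exactly the trade these game-facing ML features are chasing.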
So if we integrate Machine Learning into video games, then naturally we should be able to solve all those computational problems with accurate approximation. Though you should be asking yourself: if we have this compute today, why hasn’t Machine Learning been part of video games for the last 5 years or so? Well, there are problems with machine learning, namely that the libraries and APIs that support it are frankly way too slow and way too black boxed to be used in real-time operations.
Enter Direct ML.
Direct ML is developed specifically for real-time ML applications, and is marketed as the DX12 of machine learning APIs. With the full release of Direct ML in spring 2019, we can finally begin to see ML applications in games. But that isn’t to say developers haven’t been working on it already. There are currently 3 main marketed uses of ML in games today: denoising, anti-aliasing, and AI resolution upscaling. Two we’ve seen live in NVIDIA presentations; the third can be found here:
http://on-demand.gputechconf.com/si...-gpu-inferencing-directml-and-directx-12.html
(Fast-forward to the 15 minute mark for the live demo.)
In this presentation of Direct ML, the goal is to get a very normal computer with a normal GPU to upscale Forza Horizon 3 from 1080p with low quality AF to 4K with higher AF settings. And it does this fairly well if you watch the video. It’s important to note that they were able to move to 4x the resolution with a relatively normal GPU; I’m going to assume "normal" does not mean a 1070 or better. This was all completed on ordinary compute hardware, which makes it impressive, but is there a way to make this go faster for less? (First, a rough sketch below of what such an upscaling pass conceptually does.)
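This is a hedged numpy sketch, not the demo's actual code or the DirectML API; the 3x3 kernel is only a placeholder for whatever weights a trained model would supply:

```python
# Rough shape of an ML upscaling pass (not the demo's actual DirectML code):
# cheaply enlarge the frame, then add a correction predicted by a trained model.
import numpy as np

def nearest_upscale(frame, factor=2):
    """Cheap 2x enlargement by repeating pixels (stand-in for bilinear/bicubic)."""
    return frame.repeat(factor, axis=0).repeat(factor, axis=1)

def predict_residual(frame, weights):
    """Stand-in for the trained network: here just one 3x3 convolution.
    In a shipping title this is where a DirectML-executed model would run."""
    h, w = frame.shape
    padded = np.pad(frame, 1, mode="edge")
    out = np.zeros_like(frame)
    for dy in range(3):
        for dx in range(3):
            out += weights[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out

low_res = np.random.default_rng(0).random((1080, 1920))   # a 1080p luma plane
# Placeholder high-pass kernel; a trained model supplies the real weights.
kernel = np.array([[0.0, -0.05, 0.0], [-0.05, 0.2, -0.05], [0.0, -0.05, 0.0]])
upscaled = nearest_upscale(low_res)
high_res = upscaled + predict_residual(upscaled, kernel)
print(low_res.shape, "->", high_res.shape)    # (1080, 1920) -> (2160, 3840)
```

The interesting part is that the correction step is just a pile of multiply-accumulates, which is exactly the kind of math the next section's hardware is built for.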
Enter Tensor Cores.
Before the invention of 3D accelerators (GPUs), CPUs had the role of generating graphics. GPUs came into the industry as simpler processors that could do large amounts of math in parallel. As GPUs continued to evolve, they became more and more capable at exactly that type of work, to the point that machine learning solutions became practical on them. Until recently our latest professional GPUs, with their high amounts of compute power, were the best hardware available for neural networks; then Tensor cores were created. Tensor cores are to neural networks what GPUs are to 3D graphics: even simpler cores whose purpose is to accelerate the computation of neural networks. So while a Titan V is capable of roughly 25-30 TF of general half precision math, its Tensor cores push mixed-precision matrix throughput to over 100 TF on the same die (not a great comparison due to silicon size, but w/e). Tensor cores are quickly becoming the new rage in the ML field, and we get to see them enter the consumer market with the introduction of Turing, the RTX line of NVIDIA GPUs. The core operation they accelerate is sketched below.
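For reference, the fundamental operation a tensor core accelerates is a small matrix multiply-accumulate with half-precision inputs and single-precision accumulation; here it is emulated in numpy (the 4x4 tile size is the Volta/Turing case):

```python
# The operation a single tensor core performs, emulated in numpy: a small
# matrix multiply-accumulate, D = A @ B + C, with half-precision inputs and
# single-precision accumulation. On Volta/Turing the tile is 4x4; a large
# neural-network layer is just enormous numbers of these tiles back to back.
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4)).astype(np.float16)   # FP16 input tile
B = rng.standard_normal((4, 4)).astype(np.float16)   # FP16 input tile
C = rng.standard_normal((4, 4)).astype(np.float32)   # FP32 accumulator tile

# FP16 values going in, products summed into an FP32 accumulator.
D = A.astype(np.float32) @ B.astype(np.float32) + C
print(D.dtype, D.shape)   # float32 (4, 4)
```

Because the unit only ever has to do this one narrow job, it can do it at a throughput general-purpose shader cores can't match.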
Now that we have established the roles and purpose of ML in games, we can make the statement that it is Machine Learning that is being used to make Ray Tracing viable, not the Ray Tracing cores alone. As ray tracing computation continues to increase, there is no hardware that can solve that problem by brute force in real time. Thus we use denoising: developers cast far fewer rays per pixel and leverage Machine Learning to fill in the rest. And that is where the RT cores/hardware come into play. Neural networks need inputs to work from before they can generate their approximations, so the RT cores are there to assist in building a bare-bones, noisy ray-traced image for the approximation to take over, as sketched below.
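Here is a minimal sketch of that few-rays-then-denoise idea in plain numpy; the "true" lighting function is made up, and a plain box filter stands in for the trained ML denoiser:

```python
# Sketch of the "few rays per pixel, then fill in" idea. Each pixel averages only
# a couple of (fake) ray samples, so the image is noisy; a denoiser reconstructs
# a clean frame. A box filter stands in here for the trained ML denoiser.
import numpy as np

rng = np.random.default_rng(0)
H, W, RAYS_PER_PIXEL = 128, 128, 2

ys, xs = np.mgrid[0:H, 0:W]
truth = 0.5 + 0.5 * np.sin(xs / 10.0) * np.cos(ys / 10.0)   # made-up "true" lighting

# Tracing only a couple of rays per pixel leaves a lot of Monte Carlo noise behind.
samples = truth[None] + rng.normal(0, 0.3, size=(RAYS_PER_PIXEL, H, W))
noisy = samples.mean(axis=0)

def denoise(img, radius=2):
    """Very crude spatial average; in practice this is where the ML model goes."""
    padded = np.pad(img, radius, mode="edge")
    k = 2 * radius + 1
    out = np.zeros_like(img)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + H, dx:dx + W]
    return out / (k * k)

clean = denoise(noisy)
rms = lambda a: np.sqrt(np.mean((a - truth) ** 2))
print("RMS error, 2 rays/pixel:", round(rms(noisy), 3))    # ~0.21
print("RMS error, after denoise:", round(rms(clean), 3))   # noticeably lower
```

A real denoiser is far smarter than a box filter (it uses the normal, depth and albedo buffers as guides), but the division of labour is the same: the ray tracer provides a sparse, noisy estimate and the reconstruction step fills in the rest.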
The RT cores alone are insufficient to generate the type of images the consumer wants for next generation graphics, but with the assistance of ML this problem is solvable. But what are RT cores? Today we know them as BVH accelerators: add-ons to the compute engine whose purpose is to traverse the bounding volume hierarchy and run ray intersection tests, so that every ray does not have to be tested against the whole scene each frame. That being said, my understanding is that outside of BVH traversal and intersection, the GPU’s general compute is still processing the rest of the ray tracing work (shading, for example). A simplified sketch of what BVH traversal does is below.
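This is a simplified CPU-side sketch of what BVH traversal buys you; the class and function names are invented for illustration, and it says nothing about how the fixed-function hardware is actually laid out:

```python
# Simplified sketch of BVH traversal: instead of testing a ray against every
# triangle in the scene, walk a tree of axis-aligned boxes and only descend
# into boxes the ray actually hits. RT cores do this walk (and the triangle
# tests) in fixed-function hardware; this is just the idea, in Python.
import numpy as np

class BVHNode:
    def __init__(self, lo, hi, children=None, triangles=None):
        self.lo, self.hi = np.asarray(lo, float), np.asarray(hi, float)  # box corners
        self.children = children or []      # inner node: child boxes
        self.triangles = triangles or []    # leaf node: the actual geometry

def ray_hits_box(origin, direction, lo, hi):
    """Standard slab test: does the ray pass through this axis-aligned box?"""
    with np.errstate(divide="ignore"):      # axis-aligned rays give infinite slabs, fine
        inv = 1.0 / direction
    t1, t2 = (lo - origin) * inv, (hi - origin) * inv
    tmin = np.minimum(t1, t2).max()
    tmax = np.maximum(t1, t2).min()
    return tmax >= max(tmin, 0.0)

def traverse(node, origin, direction, hits):
    if not ray_hits_box(origin, direction, node.lo, node.hi):
        return                               # whole subtree skipped with one box test
    if node.triangles:
        hits.extend(node.triangles)          # leaf: hand triangles to intersection testing
    for child in node.children:
        traverse(child, origin, direction, hits)

# Two leaf boxes under one root; a ray along +X only reaches the first leaf.
leaf_a = BVHNode([1, -1, -1], [2, 1, 1], triangles=["tri0", "tri1"])
leaf_b = BVHNode([1, 5, -1], [2, 7, 1], triangles=["tri2"])
root = BVHNode([1, -1, -1], [2, 7, 1], children=[leaf_a, leaf_b])

hits = []
traverse(root, origin=np.zeros(3), direction=np.array([1.0, 0.0, 0.0]), hits=hits)
print(hits)   # ['tri0', 'tri1'] -- leaf_b was culled without touching its triangles
```

Multiply that culling by millions of rays per frame and you can see why dedicating silicon to the box and triangle tests is worthwhile, while everything around it (building the BVH, shading the hits, denoising) stays on the regular compute units.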
pt 2 below.