Raytracing vs Rasterizer (next gen)

Though your post is correct in principle, this is inaccurate:
One is fast 2D drawing of a 3D approximation of what things should look like, the other mimics physical light transport.
RT in and of itself doesn't mimic light transport. It simply determines whether a triangle or other geometric representation lies on a straight line. This calculation can be used for many things, including inaccurate RT rendering. The earliest RT images didn't consider light transport and just used shading models like Lambertian to calculate lighting.
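To make that concrete, here's roughly what such a bare intersection test looks like, with no lighting anywhere in sight (a Möller-Trumbore sketch; Vec3 and its helpers are assumptions, not anyone's actual engine code):

#include <optional>

struct Vec3 { float x, y, z; };

Vec3 operator-(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
Vec3 cross(Vec3 a, Vec3 b) { return {a.y*b.z - a.z*b.y, a.z*b.x - a.x*b.z, a.x*b.y - a.y*b.x}; }
float dot(Vec3 a, Vec3 b) { return a.x*b.x + a.y*b.y + a.z*b.z; }

// Returns the distance t along the ray if it hits triangle (v0, v1, v2).
// Pure geometry: whether this becomes shadows, AO or audio is up to the caller.
std::optional<float> intersect(Vec3 orig, Vec3 dir, Vec3 v0, Vec3 v1, Vec3 v2)
{
    Vec3 e1 = v1 - v0, e2 = v2 - v0;
    Vec3 p = cross(dir, e2);
    float det = dot(e1, p);
    if (det > -1e-8f && det < 1e-8f) return {};  // ray parallel to triangle
    float inv = 1.0f / det;
    Vec3 s = orig - v0;
    float u = dot(s, p) * inv;
    if (u < 0.0f || u > 1.0f) return {};         // outside the first edge
    Vec3 q = cross(s, e1);
    float v = dot(dir, q) * inv;
    if (v < 0.0f || u + v > 1.0f) return {};     // outside the other edges
    float t = dot(e2, q) * inv;
    return t >= 0.0f ? std::optional<float>(t) : std::nullopt;
}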

RTRT won't be fast enough to compute full light-propagation models for lighting games completely. Instead, devs will use the RTRT capabilities together with other representations and algorithms to get better approximations from the data structures available, e.g. using SDF light volumes and RTed occlusion maps to prevent bleed-through.

We're not trying to solve RT, but to solve realistic lighting and shading. For this we need both compute power and RT performance. RT performance will never be enough to path-trace games completely. Compute-only has too many shortcomings. The two combined will make the most of the compute-capable modelling, but we won't see the real benefit of these gains until well into next-gen.
 
Truthfully, ray tracing and rasterizing can be roughly seen as two loops in reversed order:

for each object in the scene
    for each pixel it touches
        do stuff

for each pixel on screen
    for each object in the scene
        do stuff
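As a rough concrete sketch of the same inversion (Scene, Object, Pixel and friends are hypothetical illustrative types, not any real API):

// Rasterization: outer loop over objects, inner loop over the pixels
// each object covers (plus a depth test in practice).
for (const Object& obj : scene.objects)
    for (Pixel px : obj.coveredPixels(camera))
        framebuffer[px] = shade(obj, px);

// Ray tracing: outer loop over pixels, inner loop over objects
// (an acceleration structure replaces the brute-force inner loop,
// and in practice only the nearest hit is kept).
for (Pixel px : framebuffer.pixels())
    for (const Object& obj : scene.objects)
        if (auto hit = obj.intersect(camera.rayThrough(px)))
            framebuffer[px] = shade(obj, *hit);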

Path tracing, bidirectional path tracing, photon mapping and the like are about light transport.
Fair enough; I always like to take it to path tracing. But you're right that RT by itself isn't light transport.
 
Though your post is correct in principle, this is inaccurate:
RT in and of itself doesn't mimic light transport. It simply determines whether a triangle or other geometric representation lies on a straight line. [...]
I guess when I wrote it, I was thinking about ROPs and the fixed-function pipeline: how effective it was at just producing an image, and the humble beginnings of our FF pipeline, starting with vertex units and ending with unified shader units. Recognize how far the 3D pipeline has come, now augmented by compute shaders. With RT we've only added RT intersection hardware so far (Nvidia a bit further, with some hardware that can speed up denoising), but it's like judging rasterization by vertex units alone.

When developers start figuring out how to put the two together, they may find other areas to accelerate. But developers have to be given the hardware, however basic it may be, to experiment and play with to obtain new results.
 
I think the hardware has to be at least of a rudimentary level to be worth including, otherwise it'll go completely overlooked, as has happened in the past. But any intersect acceleration is going to be beneficial, because games are already tracing rays to calculate stuff that makes them look better, and speeding up that process will help. The only reason it wouldn't be is if silicon given over to RTRT calcs takes away from the amount of compute that could be used in its stead. E.g. if it's a choice between 1 TF compute or 100,000 ray intersect tests a second, 1 TF compute is better. If the choice is between 1 TF compute or 1,000,000,000 ray intersect tests a second, the latter is likely better (or whatever the numbers need to be to make the argument ;)).

In the grand RTRT debate, we didn't know what performance we'd get and how much compute would have to be sacrificed. It seems like the gains are reasonable and the costs not too great, meaning a better balanced system overall.

As for raytracing hardware only just starting to evolve, I don't really think there's a lot that can be done with it. ;) All the FF units for rasterisation don't have a place in RT because the algorithm is so dirt simple. We might get better evolved intersect options and data-structure modelling options, but eventually I can see GPUs ditching all the FF units and having just compute and ray hardware, with rays used to determine what pixels to shade and compute used to shade them and create/process drawing tasks. Somewhere in there will be data-structure creation and maintenance, whether run on compute or some dedicated RT HW.

Rasterisation exists only because the modelling was restricted to triangles. Compute + ray tracing takes us to software renderers and complete freedom. The issues aren't really 'drawing' but sampling and processing, where sampling becomes the bottleneck: it potentially needs massive memory throughput, is not latency tolerant, and is best solved with better representations and structures.

We've recently learned Epic's Nanite is using texture-representations for geometry. I wonder if we'll be able to represent 3D models in 2D spaces for better data analysis? And from there, how long until someone solves 3D intersects as a bunch of 2D texture samples and maths?!
 
I wonder if we'll be able to represent 3D models in 2D spaces for better data analysis? And from there, how long until someone solves 3D intersects as a bunch of 2D texture samples and maths?!
The problem is not the number of dimensions but divergent access. Projecting the problem to something 2D does not really help. (I remember this old paper as an example - tracing geometry images: https://www.cs.cmu.edu/~kmcrane/Projects/RayTracingGeometryImages/paper.pdf)
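The core trick in that paper, roughly: the mesh becomes a regular 2D grid of 3D positions, and each texel quad reconstructs two triangles (positions is a hypothetical 2D array sampled from the surface, not the paper's code):

// A geometry image stores the mesh as a 2D grid of 3D points;
// adjacent texels are adjacent on the surface:
Vec3 p00 = positions[v][u],     p10 = positions[v][u + 1];
Vec3 p01 = positions[v + 1][u], p11 = positions[v + 1][u + 1];
// intersect triangles (p00, p10, p11) and (p00, p11, p01) as usual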
We could even argue that stack-free traversal algorithms project the problem to the single dimension of a space-filling curve, so that's nothing new either. Each ray still skips individual segments of the curve to be work efficient; the problem of divergence always remains, because it's a result of avoiding unnecessary work, which is our primary goal.
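For illustration, the usual 3D-to-1D projection is a Morton (Z-order) code that interleaves coordinate bits, so spatially nearby cells get nearby curve indices (a textbook sketch, nothing engine-specific):

#include <cstdint>

// Spread the low 10 bits of x so there are two zero bits between each.
static uint32_t spreadBits(uint32_t x)
{
    x &= 0x3FF;
    x = (x | (x << 16)) & 0x030000FF;
    x = (x | (x <<  8)) & 0x0300F00F;
    x = (x | (x <<  4)) & 0x030C30C3;
    x = (x | (x <<  2)) & 0x09249249;
    return x;
}

// 30-bit Morton code of a 3D cell: its 1D index along the space-filling curve.
uint32_t mortonEncode(uint32_t x, uint32_t y, uint32_t z)
{
    return spreadBits(x) | (spreadBits(y) << 1) | (spreadBits(z) << 2);
}

Each ray still walks its own set of segments of this curve, which is exactly the divergence described above.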

The only option I see to improve this remains spatial grouping of both rays and geometry. Because that's a very generic problem, we would know it already if there were some magic data structure or algorithm, I'm afraid.
We'll surely get this in HW, but at the moment even coherent rays are demanding, so I think it's too early. (Jensen might think differently... :D )
 
The problem is not the number of dimensions but divergent access. Projecting the problem to something 2D does not really help.
What if the 2D textures can then be scrunched up into a ball? You'd be able to jump from any point to any other point. At least, that's how 'time travel as a string' explains it.
 
What if the 2D textures can then be scrunched up into a ball? You'd be able to jump from any point to any other point. At least, that's how 'time travel as a string' explains it.
Nope.
If you jump between 1955, London, becoming a girl, 2050 and Peking, it's the same problem of jumping as it is with just jumping between London, Peking, Vienna and Istanbul. Damn jumping remains. :)
 
How about we take the 2D texture of the 3D model and origami it into a different 3D model?
:) hehe, ok then...
In theory we could reuse the same origami UV patch layout for all models that have the same genus. So, the same cross-shaped cube map for all balls, bunnies, Armadillos and so forth.
And our goal would be: sharing this layout, we always access the same data, so it's always in cache. At least some of the data. (Though, what about the difference between Armadillo and ball then?)

This has been done, e.g., in the early DAG voxel compression papers. IIRC the first was about a high-res voxelization of the Epic Citadel scene, and they used it to trace shadows in compute. And it was realtime, so not bad.
The idea would be the same: they use a dictionary of 4x4x4 voxel blocks. Whenever the same shape (or a larger branch of the octree) appears, we trace the same data, and IIRC this helped not only with compression but also with bandwidth during tracing.
On the other side, it also causes larger jumps. The usual cache-friendly Morton memory layout of an octree breaks, because we chase pointers to data at random memory locations. In the end, tracing perf was a bit worse than a regular octree, IIRC (their goal was mainly compression, ofc).
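A toy version of that dictionary idea, assuming 4x4x4 occupancy bricks (all names are illustrative, not the paper's actual code):

#include <array>
#include <cstdint>
#include <unordered_map>
#include <vector>

using Brick = std::array<uint8_t, 64>;  // 4x4x4 occupancy, one byte per voxel

struct BrickHash {
    size_t operator()(const Brick& b) const {
        uint64_t h = 0xcbf29ce484222325ull;              // FNV-1a
        for (uint8_t v : b) { h ^= v; h *= 0x100000001b3ull; }
        return (size_t)h;
    }
};

struct BrickDictionary {
    std::vector<Brick> bricks;                           // unique bricks only
    std::unordered_map<Brick, uint32_t, BrickHash> ids;  // brick -> index

    // Returns the id of an equal existing brick, or stores a new one.
    // Identical shapes end up traced from the same memory.
    uint32_t intern(const Brick& b) {
        auto [it, inserted] = ids.try_emplace(b, (uint32_t)bricks.size());
        if (inserted) bricks.push_back(b);
        return it->second;
    }
};

The tree then stores small ids instead of raw bricks, which is where both the compression and the pointer-chasing jumps come from.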

Slipped from dimension reduction to data sharing, I admit.
What you want is likely the performance of SS tracing for the entire world, I guess?
But we already have the property that stuff which is close spatially is also close in memory. We already have this for rasterization purposes, and we profit from it with coherent rays in RT too.
We have to accept there is additional cost for the additional functionality of reaching the whole scene, with no information being missing or obscured. But who cares, it's Jensen's problem now :D
 
If it's a choice between 1 TF compute or 100,000 ray intersect tests a second, 1 TF compute is better. If the choice is between 1 TF compute or 1,000,000,000 ray intersect tests a second, the latter is likely better (or whatever the numbers need to be to make the argument ;)).

You have to break this up conceptually.

There are different stages of evaluation:
a) I have a bunch of rays, I have a bunch of geometry, I need to know which rays hit which geometry; that's it
b) doing something with the hit-point; this is completely free-style: it can be physics interaction, sound, shading, really anything

That's the pipeline in principle. Raytracing and rasterization as terms/algorithms describe a), not b).

In a) there is a parity region where the two have identical performance, for some N rays and M triangles. Worded differently: the 3D graphs of performance measurements of the two intersect, and form a contour line in the plot where they have identical speed.

Real-world situations are very diverse; within a frame you can have multiple different N vs. M situations (gbuffer, shadows, etc. pp.). You might find yourself in different local optima for raytracing vs. rasterization.

The normal graphics pipeline, as we know it, makes an inline shade call b) when the hit positions in a) are determined. In a deferred pipeline, you have a) and b) cleanly separated. The same occurs with raytracing: there are raytracers which make an inline call to shade on hit, and there are deferred raytracers that shade after all rays have been resolved.
A rasterizer cannot easily calculate the front-most hit; a raytracer cannot easily resolve neighbouring rays. For both deficiencies there are optimizations available (there always are). The fastest rasterizers are tiled, binned ones working in high-speed on-chip storage, and per tile they can get the front-most hit rather well. Some fast raytracers use ray bundles, which makes them suitable for SIMD. This is a rather broad topic; there are many, many permutations of the two, and they have different ideal spots in the performance graph, beating the others.
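The inline vs. deferred split in sketch form (traverse, shade and Hit are hypothetical; assume Hit converts to bool):

#include <vector>

// Inline: shade at each hit as soon as it is found (a) calls b) directly).
for (Ray& ray : rays)
    if (Hit hit = traverse(scene, ray))  // stage a)
        shade(hit);                      // stage b)

// Deferred: resolve all hits first, then shade in a second pass,
// which allows sorting hits by material or locality before stage b).
std::vector<Hit> hits;
for (Ray& ray : rays)
    if (Hit hit = traverse(scene, ray))
        hits.push_back(hit);
sortByMaterial(hits);                    // improves coherence for b)
for (const Hit& hit : hits)
    shade(hit);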

Now, in b) you might want further information, so you send out more hit-requests, and we get a recursive algorithm. Whether the new hits can be resolved faster with rasterization or raytracing is not obvious (e.g. planar reflections through rasterization, etc.); you'd have to look at the performance graph, and maybe trust your instincts.

Pathtracing (uni- and bi-directional) is raytracing in a very specific context of b); you can make the equivalent setup with rasterization.
It is more useful to evaluate the part that's happening in b) than to talk about the a) part. Technically that means one has to use lighting-equation terminology to precisely state what's going on there: importance sampling (which?), Monte Carlo, BRDF (which?), and so on. The performance difference between those implementations can easily be several orders of magnitude larger than the difference between using rasterization or raytracing (think of hard shadows vs. spectral BTFs).
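As one concrete b)-side example, here's the standard cosine-weighted hemisphere sample used to importance-sample a Lambertian BRDF; a textbook sketch, independent of whether the hit points came from rays or rasterization:

#include <cmath>

struct Vec3 { float x, y, z; };

// Cosine-weighted direction in the local frame (+Z = surface normal).
// u1, u2 are uniform random numbers in [0, 1). pdf = cos(theta) / pi.
Vec3 sampleCosineHemisphere(float u1, float u2)
{
    float r   = std::sqrt(u1);            // radius on the unit disk
    float phi = 2.0f * 3.14159265f * u2;
    return { r * std::cos(phi),
             r * std::sin(phi),
             std::sqrt(1.0f - u1) };      // lift the disk point onto the hemisphere
}

Swapping this for naive uniform sampling changes nothing in a), but can change convergence dramatically, which is the point: the b) choices dominate.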

If all of your geometry is permeable (to light, to sound, etc.), there might be a third algorithm beating raytracing and rasterization in a large number of cases. (I would say that'd be voxel-grid based techniques, and at very large scales probably Voronoi cell quantization, which uses parametric 3D cells instead of regular boxes.)
You might think this is some exotic oddity, but most volumetric calculations require such a form of resolve, as air and other gases and liquids are permeable, and increasingly contribute to image quality.
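A toy example of that kind of volumetric resolve: marching a ray through a density grid and accumulating transmittance, with no single front-most hit to find (DensityGrid is a hypothetical lookup, not any real API):

#include <cmath>

struct Vec3 { float x, y, z; };
struct DensityGrid { float density(Vec3 p) const; };  // hypothetical: sigma_t at p

// Beer-Lambert along the ray: every cell contributes a little.
float transmittanceAlongRay(const DensityGrid& grid, Vec3 origin, Vec3 dir,
                            float maxDist, float step)
{
    float opticalDepth = 0.0f;
    for (float t = 0.5f * step; t < maxDist; t += step)
    {
        Vec3 p = { origin.x + t * dir.x,
                   origin.y + t * dir.y,
                   origin.z + t * dir.z };
        opticalDepth += grid.density(p) * step;  // integrate density along the ray
    }
    return std::exp(-opticalDepth);              // fraction of light that gets through
}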
 