Is the new Ruby demo really ray tracing on the 4800 series in real time? I have heard GPUs (as well as CPUs) suck at ray tracing.
That's like saying toasters suck at making soup - you can do it, it's just not as efficient.
Davros wants to see zsouthboy's toaster making soup.
I think toasters have a lot to offer. They are the unsung hero of modern times.
If it's not efficient, then how did the ATI demo team do it with good frame rates?
GPUs aren't really that bad at ray tracing... what they're bad at is building the acceleration structure, but once that's there, they're actually pretty efficient. They get hurt on ray divergence, but so does *every* architecture; that's just physics.
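To put a rough number on the divergence point, here's a toy back-of-the-envelope program - my own sketch, nothing to do with the actual demo, and the SIMD width and tree depth are made-up constants - that counts node visits for a 16-wide packet walking a binary tree, once with all rays agreeing at every branch and once with them fanning out:

```cpp
#include <algorithm>
#include <cstdio>

constexpr int WIDTH = 16;  // hypothetical SIMD width (made up)
constexpr int DEPTH = 20;  // depth of a toy binary BVH (made up)

int main() {
    // Coherent packet: all 16 rays agree at every branch, so the
    // packet visits exactly one node per level.
    long coherent = DEPTH;

    // Fully divergent packet: rays fan out until every lane walks its
    // own subtree, and the packet must visit every distinct node.
    long divergent = 0;
    int frontier = 1;  // distinct nodes the packet occupies at this level
    for (int level = 0; level < DEPTH; ++level) {
        divergent += frontier;
        frontier = std::min(frontier * 2, WIDTH);
    }

    std::printf("coherent:  %ld node visits\n", coherent);
    std::printf("divergent: %ld node visits (%.1fx, ceiling = SIMD width)\n",
                divergent, double(divergent) / double(coherent));
}
```

The divergent packet's cost climbs toward the SIMD width times the coherent one, which is the penalty being described.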
Cache coherency is painful
Would improved DB performance help here, or will real-time ray tracing (RT RT) on GPUs require a paradigm shift?

Well certainly improved DB performance would "help", but we're at the point where it's only "slow" when it diverges... so it's just a case of losing SIMD coherence. From a hardware point of view, we seem to be settling on SIMD widths in the 4-16 range, so it's probably just something that we're going to have to live with. Now this is where people typically start to think along the lines of "well, this is the only reason rasterization is faster... because we've designed SIMD hardware for it", but I'm not totally convinced. Then again, I'd be happy for the hardware people to go ahead and prove me wrong by making a MIMD design with similar throughput to a SIMD one.

Could a new algorithm solve the performance issues, or is it purely a hardware limitation?

Well, that's a hard question. Certainly any algorithm that recaptures coherence in some set of rays would improve things (on all architectures), and that's really where the only big improvements are going to come at this point. As far as tracing random, totally incoherent rays goes, I doubt we're going to do much better than we can now. The point being that the "general" ray tracing problem throws out all the information that would allow you to make it faster... kd-trees are pretty much optimal if you have *no* information about the rays.

That said, the interesting problem is to find good ways to group and organize large sets of ray queries so that they can be efficiently and coherently evaluated in parallel. Incidentally, one of those ways is called rasterization.
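As a concrete (and entirely hypothetical) illustration of "grouping ray queries", here's a minimal sketch that buckets rays by direction octant before tracing. Rays in the same octant tend to visit kd-tree/BVH nodes in a similar order, so each bucket makes a more coherent packet; none of this is from any shipping tracer:

```cpp
#include <array>
#include <cstdio>
#include <cstdlib>
#include <vector>

struct Ray { float ox, oy, oz, dx, dy, dz; };

// The three sign bits of the direction pick one of 8 octants.
int octant(const Ray& r) {
    return (r.dx < 0 ? 1 : 0) | (r.dy < 0 ? 2 : 0) | (r.dz < 0 ? 4 : 0);
}

int main() {
    // A pile of random, incoherent rays (stand-ins for real queries).
    std::vector<Ray> rays(1000);
    for (auto& r : rays)
        r = {0, 0, 0,
             float(std::rand()) / RAND_MAX - 0.5f,
             float(std::rand()) / RAND_MAX - 0.5f,
             float(std::rand()) / RAND_MAX - 0.5f};

    // The "grouping" step: one bucket per octant.
    std::array<std::vector<Ray>, 8> buckets;
    for (const Ray& r : rays) buckets[octant(r)].push_back(r);

    // Each bucket would then be traced as (more) coherent packets.
    for (int i = 0; i < 8; ++i)
        std::printf("octant %d: %zu rays\n", i, buckets[i].size());
}
```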
There is a specific format I have done some research on that I am starting to ramp back up on for some proof-of-concept work for next-generation technologies. It involves ray tracing into a sparse voxel octree, which is essentially a geometric evolution of the mega-texture technology that we're doing today for uniquely texturing entire worlds.
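For anyone curious what "ray tracing into a sparse voxel octree" might build on, here's a bare-bones guess at the data structure: a node with eight optional children, where absent children encode empty space. It's queried by point rather than by ray to keep it short, and it's my own sketch, not the actual format being researched:

```cpp
#include <array>
#include <cstdio>
#include <memory>

struct SvoNode {
    bool solid = false;                             // leaf payload
    std::array<std::unique_ptr<SvoNode>, 8> child;  // missing child = sparse
};

// Which octant of a cube centered at (cx,cy,cz) holds point (x,y,z)?
int childIndex(float x, float y, float z, float cx, float cy, float cz) {
    return (x > cx) | ((y > cy) << 1) | ((z > cz) << 2);
}

// Descend until we hit a leaf or an absent (empty-space) child.
bool query(const SvoNode* n, float x, float y, float z,
           float cx, float cy, float cz, float half) {
    while (n) {
        int i = childIndex(x, y, z, cx, cy, cz);
        if (!n->child[i]) return n->solid;  // sparse: stop early in empty space
        half *= 0.5f;
        cx += (i & 1 ? half : -half);
        cy += (i & 2 ? half : -half);
        cz += (i & 4 ? half : -half);
        n = n->child[i].get();
    }
    return false;
}

int main() {
    SvoNode root;
    root.child[7] = std::make_unique<SvoNode>();
    root.child[7]->solid = true;  // one solid voxel in the +++ octant
    std::printf("hit: %d\n", query(&root, 0.5f, 0.5f, 0.5f, 0, 0, 0, 1.0f));
}
```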
So is it a case of "too many branches" (for modern hardware) then?

Well, more that as soon as rays diverge in the data structure traversal (i.e. at triangle edges), the SIMD stuff effectively gets scalarized to the point where you wind up with 1/16 throughput. Still not terrible (and still generally faster than a lot of similarly-priced CPUs!), but definitely noticeable. So ironically, you have trouble with scenes with lots of tiny triangles and high-frequency data structures... remind you of any very similar rendering algorithm that you know of?
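A quick way to see where the 1/16 figure comes from, under the simple model that masked SIMD hardware replays one pass per unique control path - a toy model, not any real ISA:

```cpp
#include <cstdio>
#include <set>

int main() {
    const int width = 16;
    // Hypothetical next traversal step wanted by each of the 16 lanes;
    // all different = fully divergent packet.
    int wanted[width] = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15};

    // One masked pass per unique path, so useful work per pass shrinks:
    // with k unique paths, 16 useful ops take k * 16 lane-slots.
    std::set<int> uniquePaths(wanted, wanted + width);
    double throughput = 1.0 / double(uniquePaths.size());

    std::printf("%zu unique paths -> %.4f of peak throughput\n",
                uniquePaths.size(), throughput);  // prints 1/16 here
}
```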
Thanks for all the great info!

No issues - certainly a lot of this stuff will come to the forefront in the next few years, since we're definitely at the point where it's quite possible to implement an efficient ray tracer on consumer graphics hardware. Again, we're not quite at the point of generating the data structures there (ironically, the best data structure scatter/sort unit on GPUs is currently the rasterizer), but that will improve with time too.
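For what it's worth, the "rasterizer as a scatter/sort unit" remark can be mimicked on the CPU: rendering one point primitive per record drops each record into whatever bin (pixel/tile) you compute for it. A toy software version of that scatter, purely illustrative - a real GPU version would issue point draws into a render target instead:

```cpp
#include <cstdio>
#include <vector>

struct Record { float x, y; int payload; };  // hypothetical data to be binned

int main() {
    const int GRID = 4;  // a 4x4 "render target" of bins
    std::vector<Record> records = {
        {0.1f, 0.1f, 1}, {0.9f, 0.2f, 2}, {0.6f, 0.7f, 3}, {0.12f, 0.08f, 4}};

    std::vector<std::vector<int>> bins(GRID * GRID);
    for (const Record& r : records) {
        int bx = int(r.x * GRID), by = int(r.y * GRID);  // "viewport transform"
        bins[by * GRID + bx].push_back(r.payload);       // the scatter step
    }

    for (int i = 0; i < GRID * GRID; ++i)
        if (!bins[i].empty())
            std::printf("bin %d holds %zu records\n", i, bins[i].size());
}
```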