Ray Tracing Versus Rasterization, And Why Billions Of Dollars Is At Stake

Brimstone

Dean Takahashi has updated his blog with an article on combined CPU+GPU chips and ray tracing.

I'll just post the first few paragraphs; the rest of the article can be read at the link.

Not everybody may care about how they get their eye-popping graphics. But how it gets delivered to you will be determined by the results of a multi-billion dollar chess game between the chip industry's giants. Can you imagine, for instance, a future where Nvidia doesn't exist? Where there's no Intel? The survivor in the PC chip business may be the company that combines a graphics chip and a microprocessor on a single chip.

In the graphics chip industry, everyone remembers how Intel came into the market and landed with a thud. After acquiring Lockheed’s Real3D division and building up its graphics engineering team, Intel launched the i740 graphics chip in 1998 and it crashed and burned. The company went on to use the i740 as the core of its integrated graphics chip sets, which combined the graphics chip with the chip set, which controls input-output functions in the PC. Intel took the dominant share of graphics as the industry moved to integrated, low-cost chip sets, according to Jon Peddie Associates. But the company never gave up on its ambition of breaking into graphics. Intel has a big team of graphics engineers in Folsom, Calif., to work on its integrated graphics chip sets. And it recently acquired graphics engineers from 3Dlabs. The Inquirer.net has been writing about rumors that Intel has a stand-alone graphics chip cooking. That may have been one of the factors that pushed Advanced Micro Devices into its $5.4 billion acquisition of graphics chip maker ATI Technologies. Because of that deal, the PC landscape has changed forever. Now there is an imbalance as Intel, Nvidia, and AMD-ATI try to find the center of the future of computing.

http://blogs.mercurynews.com/aei/2006/08/the_coming_comb.html#more
 
Kind of funny to see how Jen H's views differ rather markedly from those of almost everyone else interviewed in that article.

And oddly, I'd have to agree with him in some areas. As nice as having a GPU and CPU on the same chip sounds from a romanticized point of view, it just wouldn't be feasible for anything but low-end parts... meaning you won't be running all this fancy ray-tracing stuff on there anyway. In the short term (~5 years), a high-end GPU+CPU combination would require double or more the die size of anything that can be manufactured at reasonable yields, unless we get some amazing breakthroughs in fabrication. It sounds like a bunch of dreams from developers and industry types that gloss over the harsh reality of the situation. Some guys from ATI/AMD did mention that CPU/GPU integration would be for the low-end markets, but a lot of the people interviewed were running with the idea that we'd be ray-tracing, rasterizing, and all that jazz on a single chip, in the newest games, in only a few years' time.

The ray-tracing comments always seem funny to me... wake me up when ray tracing actually becomes feasible in real time at scene complexity comparable to what we can pull off with brute-force rasterization -- it's been "almost here" for the last 10+ years, it seems. I do agree with some of the comments that the future of graphics will likely lead to a hybrid model where some parts are ray-traced alongside normal rasterization methods... It just won't be for a while.
 
To me it really sounds like no one has a clue. Everyone is playing a guessing game, trying to establish what the requirements for the future are actually going to be.

IMO there are going to be some big losers out of this; we are certainly on the cusp of change.

The main issue for the forward-looking industry is not "how will graphics be handled?" but rather "how will different pieces of functionality be combined and made to work together in the most efficient configuration?"

The company or companies that figure that out first will be the ones with a future going forward.

Those that choose a bad direction or guess incorrectly at the requirements now are going to fail and go bust.

my thoughts.
 
While I do expect raytracing to become "big" somewhere down the road, I do not believe that a "good" raytracing architecture is going to resemble a classical CPU or a single-chip-CPU+GPU hybrid. If ray-tracing is going to be the big thing, I very much expect specialized architectures like SaarCor/RPU to provide a LOT more bang for the buck transistor-wise than, say, slapping together a bunch of Conroe-cores or Cell-SPEs. Integrating SaarCor-like technology with an existing pixel shader architecture also sounds much, much easier than integrating it with any existing CPU architecture.

Just because raytracing has traditionally been solved with CPU arrays does NOT mean that the optimal processing model for mass-market raytracing is that of a CPU array. A CPU array is something you use when you have a computationally intensive problem and you aren't really sure of the best way to solve it (or when you ARE sure of the best way to solve it, but the market for your solution is not large enough to justify the expense of making a dedicated chip).
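
To make that contrast concrete, here is a toy sketch (mine, not anything from the SaarCor/RPU papers; all the names and the 2x2 packet size are just illustrative) of the kind of coherent packet traversal that dedicated ray-tracing hardware is built around. The interesting property is that four nearly identical rays share one traversal decision and one stream of node fetches, which is exactly what a wide fixed-function pipeline exploits and what a loose pile of general-purpose cores mostly wastes:

```cpp
// Toy sketch of coherent "packet" traversal: four nearly identical rays are
// tested against a bounding box in lockstep, so one traversal decision (and
// one node fetch) is shared by the whole packet. All names here are made up.
#include <algorithm>
#include <array>

struct Ray {
    float ox, oy, oz;   // origin
    float dx, dy, dz;   // direction (assumed non-zero for simplicity)
};

struct AABB {
    float min[3];
    float max[3];
};

// Classic slab test for a single ray.
bool hitsBox(const Ray& r, const AABB& b) {
    float tmin = 0.0f, tmax = 1e30f;
    const float o[3] = { r.ox, r.oy, r.oz };
    const float d[3] = { r.dx, r.dy, r.dz };
    for (int axis = 0; axis < 3; ++axis) {
        float inv = 1.0f / d[axis];
        float t0  = (b.min[axis] - o[axis]) * inv;
        float t1  = (b.max[axis] - o[axis]) * inv;
        if (t0 > t1) std::swap(t0, t1);
        tmin = std::max(tmin, t0);
        tmax = std::min(tmax, t1);
    }
    return tmin <= tmax;
}

// Packet version: if ANY ray in the 2x2 packet hits the node, the whole
// packet descends into it, keeping the four rays marching in lockstep.
bool packetHitsBox(const std::array<Ray, 4>& packet, const AABB& b) {
    for (const Ray& r : packet)
        if (hitsBox(r, b)) return true;
    return false;
}
```

Scale that up to many packets in flight and it looks a lot more like a GPU pipeline than like an array of Conroes or SPEs.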
 
In interviews with its researchers, they are confident that graphics processing will naturally shift to the CPU from the GPU. That’s because they believe that the decades-old technique dubbed “ray tracing” will replace the technique of rasterization, or texture-mapping, that modern graphics chips have grown up with.

That's the first thing I've read that's actually helped snap various pieces into place in my own brain on the "wtf could Intel be thinking about to have a go at high-end graphics?" front.

Whether it turns out that way is a different question [please note, the preceding clause is the most important thing I'm going to say in this post ;) ], but at least it makes more sense now what Intel is thinking about and why it might seem reasonable to them to start ramping up to have a go at the high-end.

My question would be: how does the API figure into the equation, or doesn't it? Is the API agnostic to ray tracing?

I certainly take it seriously when Dean Takahashi starts saying it, at least given his access to the right people to talk to.
 
Agreed (arjan de lumens).


But no big established company wants to explore such radical new directions, given all of its prior investments.

That's why I have thought for a little while now that the rug could be pulled out from under their feet by an emerging, hip, young R&D group that's in the right place for the changing climate, leaving the dinosaurs to bake in the sun!

I love this. I think it's about time for a paradigm shift, albeit one that is still going to take a decade!

edit: typos
 
Another interesting take:

Henri Richard, chief sales and marketing officer, Advanced Micro Devices: “It’s more of a question of when than if. We will have a transistor budget at some point in time to combine the CPU and the GPU on one piece of silicon. In a multicore environment, one core will be the GPU.”
 
If ray-tracing is going to be the big thing, I very much expect specialized architectures like SaarCor/RPU to provide a LOT more bang for the buck transistor-wise than, say, slapping together a bunch of Conroe-cores or Cell-SPEs. Integrating SaarCor-like technology with an existing pixel shader architecture also sounds much, much easier than integrating it with any existing CPU architecture.
It's been a while since I read the RPU paper, but I remember it being very similar to a GPU shading architecture like R520. I think it had a batch size of 4 instead of 16, a kd-tree traversal unit, and a somewhat different approach to memory access. All in all, though, it wasn't drastically different from what I could tell. With the kd-tree traversal unit it has fixed-function hardware, just like a GPU. Ideally, though, the programmer could use whatever data structure is desired, and currently only a CPU gives that ultimate flexibility.
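
For reference, the traversal that the RPU's fixed-function unit handles is essentially the classic recursive kd-tree walk, something like the sketch below (my own illustration of the general algorithm, with made-up structure names, not code from the RPU paper):

```cpp
// Illustrative sketch of the recursive kd-tree traversal that a dedicated
// traversal unit accelerates. Structures and names are hypothetical.
#include <vector>

struct Ray {
    float o[3];   // origin
    float d[3];   // direction
};

struct KdNode {
    bool  leaf;
    int   axis;                 // split axis (interior nodes only)
    float split;                // split position (interior nodes only)
    const KdNode* child[2];     // child[0]: coord < split, child[1]: coord >= split
    std::vector<int> tris;      // triangle indices (leaf nodes only)
};

// Walk the tree over the parametric interval [tmin, tmax] of the ray and
// collect candidate triangles front-to-back.
void traverse(const KdNode* node, const Ray& ray,
              float tmin, float tmax, std::vector<int>& out) {
    if (node->leaf) {
        out.insert(out.end(), node->tris.begin(), node->tris.end());
        return;
    }
    const int   a   = node->axis;
    const float dir = ray.d[a];
    // Parametric distance to the splitting plane (a huge value stands in for
    // "never" when the ray is parallel to the plane).
    const float tSplit = (dir != 0.0f) ? (node->split - ray.o[a]) / dir : 1e30f;

    // The "near" child is the one on the same side of the plane as the origin.
    const int nearChild = (ray.o[a] < node->split) ? 0 : 1;
    const int farChild  = 1 - nearChild;

    if (tSplit > tmax || tSplit <= 0.0f) {
        traverse(node->child[nearChild], ray, tmin, tmax, out);    // near only
    } else if (tSplit < tmin) {
        traverse(node->child[farChild], ray, tmin, tmax, out);     // far only
    } else {
        traverse(node->child[nearChild], ray, tmin, tSplit, out);  // both, near first
        traverse(node->child[farChild],  ray, tSplit, tmax, out);
    }
}
```

The programmable part of the chip then shades the hit points in small SIMD batches, which is why the whole thing ends up looking so much like R520-style pixel shading with a traversal block bolted on.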
 
To me it really sounds like no one has a clue. Everyone is playing a guessing game, trying to establish what the requirements for the future are actually going to be. ...

Exactly--whenever you hear discussions about "future" technologies, you need look no further than the present to understand them. In this case, this is just marketing talk designed to heighten public interest in CPUs and to briefly shift the PR focus from the GPU to the CPU for graphics, because graphics advances have moved the markets for a long time. Intel has already paid for "studies", published recently, that accentuate the role of the CPU in "real-time ray tracing", ad nauseam. I see this as little more than the same warmed-over PR pablum we've heard for years, the purpose of which is to get people to think about CPUs for a change--today, though, instead of tomorrow.

Given the rapid pace of technology and the fierceness of the competition involved, attempting to predict the State of the Art 5-10 years out is ridiculous...;) Nobody can do that. Just two to three years ago, who could have predicted that Apple would go Intel (especially considering Jobs' rhetoric), that Dell would pick up AMD, that AMD would be doing as well as it currently is, or that AMD and ATi would merge, etc., ad infinitum. Truth is always far stranger than fiction, and as far as I'm concerned talk about "real-time ray tracing" becoming mainstream is certainly fiction. But then, talk about real-time ray tracing isn't really meant to predict the future in the first place--it's meant to influence thinking today. It's just a form of PR, imo.
 
Ray Tracing Animated Scenes using Coherent Grid Traversal

http://www.sci.utah.edu/~wald/Publications/

The primary motivation of this approach is to enable ray tracing of dynamically deforming models. Rebuilding an acceleration structure on each frame enables ray tracing these models without placing any constraints on the motion. As this update cost is — like rasterization — linear in the number of triangles, it introduces a natural limit for the size of models that can be rebuilt interactively. The rebuild cost is manageable for many applications such as visual simulation or games, where moderate scene sizes with several thousand to a few hundred thousand polygons are common.

Our technique may be very appropriate for special-purpose hardware architectures such as GPUs and the IBM Cell processor [Minor et al. 2005] that offer several times the computational power of our current hardware platform. Though kd-trees have been realized on both architectures, they are limited by the streaming programming model in those architectures. In contrast, a grid-based iteration scheme is a better match to these architectures, and may be able to achieve a higher fraction of their peak performance. The current method may also be appropriate for a hardware-based implementation, similar to Woop et al. [2005].
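
Just to make the "linear in the number of triangles" rebuild cost concrete, here's roughly what a per-frame uniform-grid rebuild boils down to (my own simplified sketch, not the paper's code; the structures and names are made up):

```cpp
// Rough sketch (not the paper's code) of a per-frame uniform-grid rebuild:
// one linear pass that drops every triangle into the cells its bounding box
// overlaps. Structures and names here are made-up simplifications.
#include <algorithm>
#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };
struct Triangle { Vec3 v[3]; };

struct Grid {
    Vec3 boundsMin, boundsMax;              // scene bounding box
    int  nx, ny, nz;                        // grid resolution
    std::vector<std::vector<int>> cells;    // triangle indices per cell

    int cellIndex(int ix, int iy, int iz) const { return (iz * ny + iy) * nx + ix; }

    // Map a world-space coordinate to a (clamped) cell index along one axis.
    int toCell(float p, float lo, float hi, int n) const {
        int c = static_cast<int>((p - lo) / (hi - lo) * n);
        return std::max(0, std::min(n - 1, c));
    }

    // O(#triangles) per frame, like a rasterization pass over the geometry.
    void rebuild(const std::vector<Triangle>& tris) {
        cells.assign(static_cast<std::size_t>(nx) * ny * nz, std::vector<int>());
        for (int t = 0; t < static_cast<int>(tris.size()); ++t) {
            // Triangle bounding box.
            Vec3 lo = tris[t].v[0], hi = tris[t].v[0];
            for (int k = 1; k < 3; ++k) {
                lo.x = std::min(lo.x, tris[t].v[k].x); hi.x = std::max(hi.x, tris[t].v[k].x);
                lo.y = std::min(lo.y, tris[t].v[k].y); hi.y = std::max(hi.y, tris[t].v[k].y);
                lo.z = std::min(lo.z, tris[t].v[k].z); hi.z = std::max(hi.z, tris[t].v[k].z);
            }
            // Insert the triangle into every cell its bounding box touches.
            const int x0 = toCell(lo.x, boundsMin.x, boundsMax.x, nx);
            const int x1 = toCell(hi.x, boundsMin.x, boundsMax.x, nx);
            const int y0 = toCell(lo.y, boundsMin.y, boundsMax.y, ny);
            const int y1 = toCell(hi.y, boundsMin.y, boundsMax.y, ny);
            const int z0 = toCell(lo.z, boundsMin.z, boundsMax.z, nz);
            const int z1 = toCell(hi.z, boundsMin.z, boundsMax.z, nz);
            for (int iz = z0; iz <= z1; ++iz)
                for (int iy = y0; iy <= y1; ++iy)
                    for (int ix = x0; ix <= x1; ++ix)
                        cells[cellIndex(ix, iy, iz)].push_back(t);
        }
    }
};
```

One pass over the triangles per frame, which is why the authors can compare the rebuild cost directly to rasterization's per-triangle cost.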
The caveat here is that the paper focuses on primary rays and shadow rays - reflections and refractions haven't been implemented.

The video is nice, though a rather slow download.

Jawed
 
Apart from actually fitting a GPU (rasterizer or raytracer or whatever) on a CPU, and dealing with the thermal and power draw issues, what about memory bandwidth?

No consumer CPU can hold a candle to the 50+ GB/s today's high-end graphics cards can deliver on paper. Even if ray tracing at 1600-wide screen resolutions only requires, say, a tenth of that bandwidth (probably unlikely), it still bites a sizeable chunk out of PC main memory bandwidth. Not to mention that dedicating half a gigabyte, if not more, to the video subsystem brings its own set of issues. High-performance memory sticks are NOT cheap, and cheap memory sticks are not high performance... ;)
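
To put a rough number on that, here's the back-of-the-envelope version (all numbers are my own guesses, purely illustrative): even a very optimistic ray budget at 1600x1200 lands within a factor of a few of that 50 GB/s figure, not a tenth of it.

```cpp
// Back-of-the-envelope bandwidth estimate. All the numbers below are my own
// guesses, purely illustrative, and deliberately optimistic (no bounces, no
// texture traffic, no acceleration-structure rebuild traffic).
#include <cstdio>

int main() {
    const double pixels       = 1600.0 * 1200.0;
    const double fps          = 60.0;
    const double raysPerPixel = 2.0;    // one primary ray + one shadow ray
    const double bytesPerRay  = 100.0;  // assumed node/triangle/framebuffer bytes touched per ray

    const double gbPerSecond = pixels * fps * raysPerPixel * bytesPerRay / 1e9;
    std::printf("~%.0f GB/s\n", gbPerSecond);   // ~23 GB/s with these guesses
    return 0;
}
```

So sharing PC main memory bandwidth with the CPU and everything else looks painful even before you add reflections, refractions, or texture traffic.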

Also, most people cringe at having to upgrade their CPU along with the GPU - now you'd HAVE to. Anyone who thinks that a CPU with an integrated GPU on it is going to be cheaper than just a CPU is a loonie...

There are a lot of pitfalls when it comes to combining the CPU and the GPU, and from where I stand, there's basically ZERO good coming out of doing such a thing, other than low-end systems might cost a little less upon purchase.
 
There are a lot of pitfalls when it comes to combining the CPU and the GPU, and from where I stand, there's basically ZERO good coming out of doing such a thing, other than low-end systems might cost a little less upon purchase.

I think it's entry and "low-mid" that AMD/ATI are after, for instance, with integrated solutions. The Vista requirements help that make sense too. Some of their statements seem to be pointed at customizing the system bus to help alleviate some of what you're pointing at memory-wise, but again I suspect only for what would be the low and low-mid discrete market of the day.

Intel, which seems to be pushing ray tracing the hardest, appears to be aiming higher up the discrete food chain, and there I'd have to agree with you: I don't see an obvious way for the price/performance of bandwidth not to be a serious limiter on how far up the discrete GPU food chain they can reach with an integrated part. Maybe separate memory slots on the mobo for high-speed graphics memory for that market? I dunno. But platformization seems to be gathering speed, not declining, so it could be. . .
 
In five years' time we could be looking at systems that use wireless buses. Suddenly the point-to-point restrictiveness of ultra-high-speed memory buses disappears.

Additionally, memory becomes almost infinitely variable in configuration: we get mildly excited by ATI's x32-bit bus architecture (instead of x64-bit), but with wireless buses channel widths can vary all over the place. That means algorithms have more choice about the ordering of data in memory (or different tiling arrangements, depending on access patterns), and individual PUs can choose to make low-latency or high-latency accesses depending on required bandwidth, say.

So you could argue that this will lead to a unified memory architecture, with extremely easy sharing of data (and easily maintainable coherence) amongst a set of PUs - whether they consist of a CPU with dozens of mini-cores or an array of CPUs or a range from CPU through PPU to GPU.

Jawed
 
Okay, my brain exploded on that one. :p It may be three years and a bunch of intermediate steps before I can grok that one to fullness. ;)
 
Another thing to consider is that x86 cores have fairly lousy floating-point performance because they're focused on correctness: IEEE, double precision, the works.

Surely it can't be long before Intel decides it's time to put in some single-precision pipelines and steal back some of the glory that Cell's taken? FP32 is good enough for an awful lot of computation.

I expect this is something ATI has figured is maybe 2-3 years away: once that cat is out of the bag, being a GPU provider is very much harder when GPU architectures have all gone unified and the fixed-function graphics gubbins that makes a data-parallel machine a GPU becomes a minor cost/complexity in the overall architecture. At the same time, ATI knows it is practically impossible to compete on Intel's home turf: double precision. That complexity may never come to GPUs; it's utter overkill and immensely costly in terms of transistors (well over 2x, I think).
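
To put the single- vs double-precision point in concrete terms: on today's SSE units a 128-bit register holds four floats but only two doubles, so per instruction you already get twice the FP32 throughput, and a dedicated wide FP32 pipeline would stretch that gap much further. A trivial, generic illustration (standard SSE2 intrinsics, nothing ATI- or Intel-specific):

```cpp
// 128-bit SSE register: four FP32 lanes vs two FP64 lanes, so each multiply
// instruction does twice the single-precision work. Generic SSE2 intrinsics,
// nothing vendor-specific; tail elements are ignored for brevity.
#include <emmintrin.h>

void scale_floats(float* data, float s, int n) {
    __m128 scale = _mm_set1_ps(s);
    for (int i = 0; i + 4 <= n; i += 4) {          // 4 lanes at a time
        __m128 v = _mm_loadu_ps(data + i);
        _mm_storeu_ps(data + i, _mm_mul_ps(v, scale));
    }
}

void scale_doubles(double* data, double s, int n) {
    __m128d scale = _mm_set1_pd(s);
    for (int i = 0; i + 2 <= n; i += 2) {          // only 2 lanes at a time
        __m128d v = _mm_loadu_pd(data + i);
        _mm_storeu_pd(data + i, _mm_mul_pd(v, scale));
    }
}
```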

In effect ATI is on the brink of building mini-cores (that come with graphics gubbins for when you want a picture). These mini-cores will be in direct competition with whatever Intel decides to do (it's on their roadmap!) and ATI has decided that the best thing to do is formalise the "GP" in GPGPU and attach themselves to AMD's future CPU direction.

Just because Intel has heretofore failed to build a decent GPU or failed to implement ultra-fast, ultra-wide FP32, doesn't mean it won't happen.

Jawed
 