Charlie should go hide in a hole, because his complete lack of understanding of the dynamics and technical factors at play is so pitiful that I'm not even sure where to start.
And I think he needs a serious reality check regarding the performance of that solution too:
http://www.idfun.de/temp/q4rt/benchmarks.html - notice that performance scales roughly linearly with the number of pixels, and that this is at 256x256. So, yeah, that's just 30-35x higher performance than it would get at 1920x1200... And that's for a laughably small number of secondary rays, I'd suspect!
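To put an actual number on that (purely back-of-the-envelope, and assuming frame cost scales with the number of primary rays, i.e. pixels):

[code]
# Back-of-the-envelope scaling check: raytracing cost per frame is roughly
# proportional to the number of primary rays, i.e. pixels.
low_res  = 256 * 256        #    65,536 pixels in that benchmark
high_res = 1920 * 1200      # 2,304,000 pixels at a "real" resolution

print(high_res / low_res)   # ~35.2x more work per frame, so whatever
                            # framerate they show at 256x256, divide by ~35
[/code]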
If you want to do raytracing, at least do things right. Like, you know, these guys have:
http://graphics.cs.uni-sb.de/Publications/2006/drpuasic_rt06_final.pdf
Probably the most important thing to notice there is the chip layout on Page 5; notice how much of the die the shader core takes up. Next, look at Page 7's performance numbers compared to Cell, and read the following paragraph:
"A comparison to a Cell implementation of ray tracing shows up to 2.5 times higher performance, despite the hardware complexity being similar (see Table 1), and the DRPU8 ASIC performing much more complex shading (including textures). This shows the efficiency of the DRPU architecture compared to general purpose designs."
As such, excluding the shader core, the perf/mm2 and perf/watt of that chip are quite impressive to say the least. That's fairly unsurprising, given that it's fixed-function, but it does highlight just how much that matters. There is nothing magical about CPUs that makes them amazingly good at raytracing, just as there is nothing inherent in GPUs that makes them horribly bad at it, as long as the implementation is designed with their architecture in mind.
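Just to illustrate that last point (this is my own toy sketch in Python/numpy, nothing to do with the DRPU paper or any actual GPU code), "made with their architecture in mind" basically means keeping the work wide and coherent so all the SIMD lanes stay busy, instead of chasing one ray at a time through branchy code:

[code]
# Toy sketch: ray/sphere intersection done batch-wise and branch-free, the
# way a wide SIMD/SIMT machine wants it. numpy stands in for the hardware.
import numpy as np

def intersect_batch(origins, dirs, center, radius):
    # origins, dirs: (N, 3) arrays -- one batch of rays processed in lockstep
    oc   = origins - center
    b    = np.einsum('ij,ij->i', oc, dirs)          # per-ray dot(oc, dir)
    c    = np.einsum('ij,ij->i', oc, oc) - radius * radius
    disc = b * b - c                                # per-ray discriminant
    t    = -b - np.sqrt(np.maximum(disc, 0.0))      # nearest intersection
    hit  = (disc >= 0.0) & (t > 0.0)                # boolean "lane mask"
    return hit, np.where(hit, t, np.inf)            # no per-ray branching

# 1024 rays from the origin in random directions, against one sphere
origins = np.zeros((1024, 3))
dirs    = np.random.randn(1024, 3)
dirs   /= np.linalg.norm(dirs, axis=1, keepdims=True)
hit, t  = intersect_batch(origins, dirs, np.array([0.0, 0.0, 5.0]), 1.0)
print(hit.sum(), "of 1024 rays hit the sphere")
[/code]

Secondary rays obviously make it harder to keep batches coherent, but that looks more like a scheduling and data-layout problem than something fundamental to the architecture.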
Ask yourself this question: if raytracing becomes important as an addition to rasterization for certain effects or, less likely, as a replacement for it... which chip is the most likely to integrate fixed-function units for raytracing? The CPU or the GPU? Of course, that might not matter so much if the two chips merge, but right now that isn't going to happen in the high-end for the next couple of years.
Also, Techno+, these two threads might be of some interest to you:
http://forum.beyond3d.com/showthread.php?t=40372
http://forum.beyond3d.com/showthread.php?t=36792
In the end, what you really want IMO are serial-optimized, throughput-optimized and latency-tolerant processors, along with special-function units, all on one chip. NVIDIA, AMD and Intel all seem to have long-term projects with at least three of these four things on the same chip, but I haven't seen any indication of anyone wanting all four at the same time. That would definitely be quite interesting, if the programming paradigm were right.
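To make that a bit more concrete, here's a purely hypothetical sketch of how one frame could map onto those four kinds of units; none of these names correspond to any real chip or API, it's just to show the division of labour I'm thinking of:

[code]
# Purely hypothetical division of labour for one frame -- the stubs just
# print what would run where, nothing here maps to real hardware or an API.

def serial_core(task):      print("serial-optimized core :", task)
def throughput_cores(task): print("throughput cores      :", task)
def fixed_function(task):   print("special-function unit :", task)

def render_frame():
    # Branchy, latency-sensitive work: serial-optimized core
    serial_core("scene management, culling, building acceleration structures")
    # Wide, regular data-parallel work: throughput cores, which tolerate
    # memory latency by keeping thousands of threads in flight
    throughput_cores("rasterization and shading of the visible geometry")
    # A handful of raytraced effects: fixed-function traversal/intersection
    # units, DRPU-style
    fixed_function("secondary-ray traversal for shadows/reflections")

render_frame()
[/code]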
Another factor to consider is that design and mask costs seem to imply that the best strategy, going forward, is to maximize your addressable market for every chip you create. Chips optimized for a specific niche might eventually stop making sense, and this is going to get even more pronounced as you get to the 32nm/22nm nodes and below. In that context, integration might become an economic negative, unless it is for an extremely large market in the first place, or unless redundancy allows you to extend your target market.
There are many potential solutions for maximizing RoI (Return on Investment) there while keeping integration levels high where it matters, but I'm already off-topic enough, so I'll just save that for another rant...