Hey, welcome to the board, Matt Pharr. Don't take the negative comments too seriously; the crowd gets a little wild when you insult their favorite coprocessor!
Anyway, I think you're spot on, but the presentation is severely off in terms of timeframes - although perhaps it's just that I'm not getting the right impression of when you think this is going to become important. Simply put, first of all, in the x86 space, I'm skeptical we will see this sudden zomg increase in FLOPS.
And secondly, I fail to see many algorithms using complex data structures that are likely to become highly relevant to actual GPU rendering, as opposed to GPGPU (where, of course, more CPU FLOPS and better interconnects would have huge advantages) - but then again, you're in a much better position to know about those than we are!
You mention quadtrees for shadowing, for example. I'm a tad skeptical you're going to get any decent advantage from those compared to, say, cascaded shadowing techniques. You might be thinking of something else than I am, of course. I'm very skeptical we'll have a good reason to do that kind of stuff for rendering in the next 5 years, but of course, I'd LOVE to be proved wrong by an innovative idea or technique. Is there anything specific you're thinking of that you could mention and explain in a tad more detail?
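For reference, part of why I like cascades is how trivially simple the CPU side stays. Here's a tiny, hypothetical C++ sketch (the function name and the lambda default are mine, not from any particular engine) of the usual split-distance computation, blending a logarithmic and a uniform split scheme - which is basically all the per-frame "data structure" work a cascade setup needs:

[code]
#include <cmath>
#include <vector>

// Compute cascade split distances along the view frustum, blending a
// logarithmic split scheme with a uniform one via 'lambda' (0..1).
// Returns numCascades + 1 depths, from nearPlane out to farPlane.
std::vector<float> computeCascadeSplits(int numCascades,
                                        float nearPlane, float farPlane,
                                        float lambda = 0.75f) {
    std::vector<float> splits(numCascades + 1);
    for (int i = 0; i <= numCascades; ++i) {
        const float t = static_cast<float>(i) / numCascades;
        const float logSplit = nearPlane * std::pow(farPlane / nearPlane, t);
        const float uniSplit = nearPlane + (farPlane - nearPlane) * t;
        splits[i] = lambda * logSplit + (1.0f - lambda) * uniSplit;
    }
    return splits;
}
[/code]

No tree to maintain, no per-object bookkeeping - just a handful of depths recomputed each frame.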
In the world of GPGPU, I agree there are huge potential rewards from a decent CPU you can interact with at good speed. Taking the example of raytracing for outdoor environments, a basic scheme would use a quadtree for the world and an octree for each model. Now, imagine you had an animated model and wanted to build an octree over all its triangles every frame, after the GPU animated it via R2VB... Oops, good luck!
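To make that concrete, here's a rough, hypothetical C++ sketch (names and constants are mine, purely for illustration) of what a from-scratch, pointer-based octree rebuild over a model's triangle centroids looks like - the thing you'd have to redo every single frame for that animated model, after somehow reading its vertices back from the GPU:

[code]
#include <array>
#include <cstddef>
#include <memory>
#include <vector>

struct Vec3 { float x, y, z; };
struct Triangle { Vec3 v[3]; };

struct AABB {
    Vec3 min, max;
    Vec3 center() const {
        return { (min.x + max.x) * 0.5f,
                 (min.y + max.y) * 0.5f,
                 (min.z + max.z) * 0.5f };
    }
};

// A very simple octree node: leaves store triangle indices,
// inner nodes store up to eight children.
struct OctreeNode {
    AABB bounds;
    std::vector<std::size_t> triIndices;          // only used by leaves
    std::array<std::unique_ptr<OctreeNode>, 8> children;
};

static Vec3 centroid(const Triangle& t) {
    return { (t.v[0].x + t.v[1].x + t.v[2].x) / 3.0f,
             (t.v[0].y + t.v[1].y + t.v[2].y) / 3.0f,
             (t.v[0].z + t.v[1].z + t.v[2].z) / 3.0f };
}

// Recursively split a node whose triangle list is too large, bucketing
// triangles by which octant their centroid falls into.
static void split(OctreeNode& node, const std::vector<Triangle>& tris,
                  std::size_t maxTrisPerLeaf, int depth, int maxDepth) {
    if (node.triIndices.size() <= maxTrisPerLeaf || depth >= maxDepth)
        return;                                   // stays a leaf

    const Vec3 c = node.bounds.center();
    for (std::size_t idx : node.triIndices) {
        const Vec3 p = centroid(tris[idx]);
        const int octant = (p.x > c.x ? 1 : 0) |
                           (p.y > c.y ? 2 : 0) |
                           (p.z > c.z ? 4 : 0);
        if (!node.children[octant]) {
            auto child = std::make_unique<OctreeNode>();
            child->bounds.min = { (octant & 1) ? c.x : node.bounds.min.x,
                                  (octant & 2) ? c.y : node.bounds.min.y,
                                  (octant & 4) ? c.z : node.bounds.min.z };
            child->bounds.max = { (octant & 1) ? node.bounds.max.x : c.x,
                                  (octant & 2) ? node.bounds.max.y : c.y,
                                  (octant & 4) ? node.bounds.max.z : c.z };
            node.children[octant] = std::move(child);
        }
        node.children[octant]->triIndices.push_back(idx);
    }
    node.triIndices.clear();                      // now an inner node
    for (auto& child : node.children)
        if (child) split(*child, tris, maxTrisPerLeaf, depth + 1, maxDepth);
}

// Rebuild the whole tree from scratch -- this is what you'd have to do
// every frame for a model the GPU just animated, after reading it back.
std::unique_ptr<OctreeNode> buildOctree(const std::vector<Triangle>& tris,
                                        const AABB& modelBounds) {
    auto root = std::make_unique<OctreeNode>();
    root->bounds = modelBounds;
    root->triIndices.resize(tris.size());
    for (std::size_t i = 0; i < tris.size(); ++i) root->triIndices[i] = i;
    split(*root, tris, /*maxTrisPerLeaf=*/16, /*depth=*/0, /*maxDepth=*/8);
    return root;
}
[/code]

Nothing exotic, but it's branchy, pointer-chasing, allocation-heavy work, which is exactly the kind of thing today's GPU pipeline gives you no good way to do next to the data it just animated.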
In D3D10, you could use the GS to hack around it to a certain extent, but in practice it doesn't make much difference. It still doesn't let you do absolutely everything, and much of the time, even when you can do something, the end result isn't pretty. The resource limitations are very real, and if the rumours are to be believed, the performance will be catastrophic for remotely complex usage scenarios. And even if it weren't, what you're fundamentally doing is using the GS to program the GPU as a serial processor, which guarantees not-so-great performance even if the hardware were efficient at it, imo. So, is there some potential there? Sure. Is it worth seriously thinking about? Not really.
So overall, as I said, I think you're right; it's just that your presentation seems to imply this is going to make sense really soon, and nearly does today, when in fact I'm a tad skeptical it does anywhere but on the PS3, where it makes "OK" sense and opens up a few interesting possibilities (although some of those still let you treat the graphics pipeline as a one-way thing; if your goal is proper GPU utilization via smart algorithms, the sky is the limit, definitely). But where things will really begin moving towards that paradigm is the next-next-gen console systems (PS4/XBoxC/etc.).
What's interesting is that those systems will most likely be based on desktop D3D11 GPUs, imo, if we think about it in terms of the necessary timeframes. So there might be a few twists yet to come that nobody has thought of, except perhaps the few architects already working on these products right now, as we speak and dream. I tend to believe we'll be very pleasantly surprised, but it's much too early for me to dare take a guess, no matter how much I'd like to!
Uttar