That 'dominance' is largely a myth nowadays: a) The double-attach rate is very high (discrete GPUs being added on top of IGPs), which means it's actually NV that has the largest overall 'used' GPU share. b) Much of Intel's IGP volume is in the enterprise market, where GPUs are mostly useless.
OK, that all seems reasonable to me. I'll concede that Intel's IGPs really aren't used by anyone who cares at all about GPU performance.
I don't think you understood my point. There is no magical reason for Intel to be able to compete in the mid-range but not in the high-end. You need to lose your CPU mindset: scaling up by replicating units is perfectly normal in the GPU world, whereas on a CPU you can't just double your IPC by adding twice as many execution units.
I agree that scaling a GPU from mid-range to high-end is easier than doing the same for a single-core CPU. Things are changing in the CPU world with the advent of multi-core chips, though. If multi-core really becomes successful (in that many applications use multiple cores), you could imagine that exact same scaling applying to CPUs, too. That is, CPUs are becoming like GPUs in this way (as well as in many others).
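(To put a rough number on why "many applications use multiple cores" is the crux: by Amdahl's law, speedup = 1 / ((1 - p) + p/n) for a parallel fraction p on n cores. A program that's 95% parallel gets about 9x from 16 cores and can never beat 20x no matter how many cores you add. Scaling by replicating units only pays off once applications are overwhelmingly parallel, which is exactly the regime GPU workloads already live in.)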
As for volume: what matters in the fabless world is gross profit, not volume. You amortize fabs via volume, but you amortize R&D via gross profits.
I agree that fabless design greatly reduces the volume needed. However, you still need to amortize other fixed costs (the hardware design team, driver development, etc.). I agree that GPU volumes are high enough that even if they dropped significantly, these fixed costs could still be amortized.
I'm kinda tired of CPU experts being completely unable to consider the graphics market dynamics properly. If you still don't agree with me, may I suggest rereading what I said until you do?
I think you're missing a key point: eventually (in, say, ten years), there won't be a "discrete GPU" market. Zero. Integration is *the* dominant trend in computing hardware. Either we'll have big GPUs with a little CPU in the corner (NVIDIA wins), or we'll have big multi-core CPUs with a little GPU logic (Intel wins). My prediction is that high-end systems will include multiple identical copies of these combined GPU/CPU chips, but we won't have today's "one CPU, one GPU" divide.
Think back to the last time we had such a processor/co-processor split. It wasn't GPUs; it was FPUs! The Intel 386 (and other processors of the day) couldn't fit a floating-point unit on the chip, so there were separate floating-point co-processors (FPUs). Then, a few generations later, an FPU fit on the CPU die, and discrete FPUs simply went away.
When I say that Intel will take away the mid-range, I'm not talking about the "discrete GPU mid-range". I'm talking about on-chip GPUs taking more and more market share away from discrete GPUs.
Consider the following situation. What if Intel started including 8 or 16 Larrabee cores, in addition to a few bigger x86 cores, on *every* chip it sells? Given a reasonable power and area budget, it seems like such a chip would have good-enough GPU performance; perhaps half or a quarter the performance of a discrete GPU. If a system needs more GPU performance, it could just include two of these chips (dual-socket motherboards are pretty common these days). ATI/AMD will likely follow a similar strategy. In such an environment, I think fewer PC purchasers will be willing to pay as much for a discrete GPU from NVIDIA. Certainly some hard-core gamers will, but not as many. I think such trends favor AMD/ATI and Intel.
Then again, perhaps the desktop market is irrelevant. Maybe the next war will be fought entirely in the mobile and game console space. In those spaces, x86 compatibility is mostly irrelevant. Perhaps some new upstart will come along and push Intel off its throne by taking over the embedded space. Perhaps this has already happened...
Of course, these markets are even more cost-conscious than desktop systems. In many of these devices there isn't even room for two chips (example: Apple's new MacBook Air, which uses a special Intel chip in a smaller package). To push down costs, game console makers will also be looking for more integrated solutions. This all goes double for mobile gaming (PSP, Game Boy).
Integration is an unstoppable force.
Also, I'm still not sure you fully understand what DirectX does. You DO realize it's just an abstraction layer to the hardware, and as far from a renderer as can be, right?
Umm... no, I really didn't know that. As you've castigated me for, I really am a multi-core CPU guy trying to learn something about GPUs. GPUs and CPUs seem to be on a collision course, and I'm trying to figure out how that might play out. I think CPU guys (such as myself) and GPU guys (such as those on this board) can learn a lot from each other.
As for fine-grain synchronization, certainly that won't work in DirectX - but depending on your definition, that's supported in CUDA in a perfectly good way.
I know a bit about CUDA, but my impression was that thread-to-thread synchronization (such as using locks) was still significantly more expensive than it would be on a multi-core CPU. Is that not the case? I do know that GPUs have support for fast barriers that coordinate many threads, which gives GPUs the edge on that sort of thing. But I was under the impression that even finer-grained synchronization was expensive (or even really hard to express). Of course, graphics hardware is changing quickly enough that this might no longer be the case.
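To make my question concrete, here's a minimal sketch of the two kinds of synchronization as I understand them (the kernel name, the results array, and the lock variable are hypothetical names of mine, not anything from the CUDA toolkit). The block-wide barrier is a single, cheap, hardware-supported call, while a lock between arbitrary threads has to be hand-rolled out of global-memory atomics:

    // Fast path: hardware barrier across all threads of one block.
    // Slow path: a spinlock built from global-memory atomics.
    __device__ int mutex = 0;              // 0 = unlocked (my own lock variable)

    __global__ void update(float *results)
    {
        // Cheap: every thread in the block waits here; essentially one instruction.
        __syncthreads();

        // Expensive: a critical section built from atomicCAS on global memory.
        // Only thread 0 of each block contends; letting every thread in a warp
        // spin on the lock can hang, since the warp runs in lockstep and the
        // winner never gets to proceed and release it.
        if (threadIdx.x == 0) {
            while (atomicCAS(&mutex, 0, 1) != 0)
                ;                          // spin until the lock is acquired
            results[blockIdx.x] += 1.0f;   // critical section
            atomicExch(&mutex, 0);         // release the lock
        }
    }

If a hand-rolled spin loop like that is still the state of the art for locks on GPUs, then that's exactly the expense (and the expressiveness problem) I was asking about.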