ROPs, I think, will go away eventually. I think both CS and OCL will eventually get support for the full texture sampler functionality, effectively killing SM/OGL X.X from that point forward (i.e., SM and OGL spec development will die).
I think SM and OGL might live longer than that. Their presence as long-time standards will take time to dissipate. The leading edge of developers might hop off, but the rest of the market will want to keep building on their established code bases.
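For context on the sampler point above: OCL already exposes the basic fixed-function sampler path (addressing modes, normalized coordinates, bilinear filtering), just not the full feature set. A minimal OpenCL C sketch of what is available today; the kernel and argument names are mine, not from any particular engine:

__constant sampler_t linear_sampler = CLK_NORMALIZED_COORDS_TRUE |
                                      CLK_ADDRESS_CLAMP_TO_EDGE |
                                      CLK_FILTER_LINEAR;

/* Reads a texture through the hardware sampler path and writes the
   filtered result to a plain buffer. */
__kernel void sample_texture(__read_only image2d_t src,
                             __global float4 *dst,
                             int width, int height)
{
    int x = get_global_id(0);
    int y = get_global_id(1);
    if (x >= width || y >= height)
        return;

    /* Normalized coordinates; the bilinear blend is done by the sampler,
       i.e. by the texture units, not by shader arithmetic. */
    float2 uv = (float2)((x + 0.5f) / width, (y + 0.5f) / height);
    dst[y * width + x] = read_imagef(src, linear_sampler, uv);
}

As far as I know, things like explicit LOD control and anisotropic filtering aren't exposed there yet, which is part of the gap that would have to close first.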
The engine per game era died a long time back to be replaced with the licensed engine era.
Given an existing CS/OCL/Metal engine and source, it is no harder to modify it than it is to roll your own engine using D3D X.X.
It's just as easy to modify an engine using a proprietary to-the-metal implementation as it is to modify an engine using a universally known and widely accepted standard?
I might not be able to find someone who has fiddled with EPIC Engine6, whereas there is a larger pool of people and likely much better documentation associated with DirectX13.
I recall there was a spate of problems with the initial roll-out of the UE, and questions about the engine-maker's possible conflict of interest in supporting its own game rollout versus supporting the customers it essentially competed with.
Also:
Better than that, it can still run code that ran on an 8088 and 8086! That's 25+ years!
I don't think Intel gambled on there being complete compatibility with chips one generation removed, ISA compatibility or not.
An i7 is so removed that there are likely hardware behaviors, bugs, and changed specifications in the way.
As to Larrabee:
Actually, I'd wager it does as well if not better on x86 code written a decade ago as any hardware that was available a decade ago.
We're talking about a best case of a 700 MHz K7 or a 600 MHz P3.
I just don't know about this one. It might be close in a few situations, and a loss in others.
Ten-year-old code is going to be single-threaded code.
One Larrabee core can't run any multimedia extensions, is in-order, has a much narrower issue width (probably 1/3), awful branch prediction, and far fewer execution resources.
Assuming no crashing on instructions introduced after the P54 or P55 cores, this might still be balanced in some cases by the 256 KiB on-die L2 cache, the on-die memory controller, and higher bandwidth.
And i7 runs code that used to run on a 386.
To be somewhat nitpicky, it might run code that used to run on a 386.
There are low-level implementation details that have not stayed constant over the years, and they may or may not scuttle the attempt, assuming everything else in the system besides the CPU doesn't mess something up first.
So we're going to leave it to Microsoft, AMD, Nvidia and Intel to define and implement every possible rendering technique that becomes feasible on future hardware? The reason it's doable now is that the set of feasible approaches is tiny. As that set grows it's highly improbable that people would be happy with whatever canned implementations those guys come up with.
One of the benefits I see in the current system is the amount of steering Nvidia and ATI/AMD had in determining which additions were feasible in hardware; that balanced nicely with the input software developers wanted and provided a sanity check on the evolution of computer graphics.
I'm not entirely impressed with Sweeney's last decade of prognostications about what hardware was supposed to be able to do.
What aspect of DX11 does Larrabee not support?
To my knowledge, it should support it fully, at least in software.
The texture units are something of an unknown, but any minor case they might miss could fall back to software.
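By "fall back to software" I mean doing the filter arithmetic in the kernel itself rather than in the texture units. A rough sketch of a manual bilinear filter, in plain C for clarity; the types and names are mine, and it's just the idea, not any particular implementation:

#include <math.h>

typedef struct { int w, h; const float *texels; } Texture;

/* Clamp-to-edge fetch of a single texel. */
static float fetch(const Texture *t, int x, int y)
{
    if (x < 0) x = 0;
    if (x >= t->w) x = t->w - 1;
    if (y < 0) y = 0;
    if (y >= t->h) y = t->h - 1;
    return t->texels[y * t->w + x];
}

/* Bilinear sample at normalized coordinates (u, v). */
float sample_bilinear(const Texture *t, float u, float v)
{
    /* Map normalized coords to texel space, centered on texel centers. */
    float fx = u * t->w - 0.5f;
    float fy = v * t->h - 0.5f;
    int x0 = (int)floorf(fx);
    int y0 = (int)floorf(fy);
    float ax = fx - x0;
    float ay = fy - y0;

    /* Weighted blend of the four nearest texels. */
    float s00 = fetch(t, x0,     y0);
    float s10 = fetch(t, x0 + 1, y0);
    float s01 = fetch(t, x0,     y0 + 1);
    float s11 = fetch(t, x0 + 1, y0 + 1);
    float top = s00 + (s10 - s00) * ax;
    float bot = s01 + (s11 - s01) * ax;
    return top + (bot - top) * ay;
}

It costs more instructions per sample than dedicated hardware, but on something as programmable as Larrabee that cost is only paid for the odd formats or filtering modes the texture units don't cover.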
An architecture like Larrabee won't have that problem. It combines the throughput of a GPU with the ultimate programmability of a CPU. What developers want isn't super-fast tessellation and everything else running like a snail. They want a wide variety of algorithms to run at predictable performance without having to understand the hardware details.
Mostly, sort of.
It appears that it will have 1/2 to 1/3 of the throughput of the top GPUs in 2010 with the more robust programmability of a CPU. We'll need to see how things sort out with the actual implementations for sale.
As for the expectation of predictable performance, that would be a revolution given how poorly the crystal ball works on current multicores.