I was primarily responding to those who believe x86 is the cause of Larrabee's delay.
i, for one, believe that choosing an ill-fitting ISA has not helped intel deliver this part. nothing more than that.
Replacing the x86 scalar cores with something using a leaner ISA would have given them at most 10% higher performance. Would either NVIDIA or ATI blow off the release of a new GPU because they'd only reached 90% of the performance they'd hoped for? Don't think so.
not the performance per se. abysmal power efficiency (i.e. performance per watt) is what most likely killed this project (or 'delayed' it, as you put it). and yes, the poor choice of an ISA in a seriously parallel GP part can have a devastating effect on power efficiency, all the more so on the turf of specialized parts - don't forget it's not a CISC-vs-RISC case we have here, it's CISC-vs-specialized. and nobody is bringing up the question of LRB's performance - i don't think a sane person expected LRB to take the GPU crown by storm; sane people expected it to be just in the ballpark, so a 10% performance drop is not worth discussing. 10% of the projected wattage, OTOH, is yet another nail in the coffin for a GP part trying to fight specialized parts - i.e. for a part that is likely already times worse than the specialized competition in perf/watt, where every extra watt only widens the gap.
They are facing a much more serious issue. And I strongly believe it's a software issue. I've seen my own software renderer reach 30% higher performance by changing only a few lines of code, several times in a row. But it took me months of research, experiments and analysis for each of those changes.
i strongly doubt it would've taken intel months to spot a sw performance anomaly on their own chip. IMO, performance has likely been roughly on par with the projections.
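for the record, i don't dispute the 'few lines, 30%' phenomenon - the classic example in a sw rasterizer is the perspective divide in the scanline inner loop. purely hypothetical sketch below (made-up names, nobody's actual renderer code):

```cpp
#include <algorithm>
#include <cstdint>

// hypothetical scanline loop for perspective-correct texturing; all the
// names (uOverW, oneOverW, ...) are invented for this sketch. the values
// that interpolate linearly in screen x are u/w, v/w and 1/w; recovering
// u and v needs a divide.
void drawSpan(uint32_t* dst, const uint32_t* texture, int texWidth,
              int x0, int x1,
              float uOverW, float vOverW, float oneOverW,
              float duOverW, float dvOverW, float dOneOverW)
{
    // the 'few lines changed': instead of one divide per pixel, divide once
    // per 8-pixel span and interpolate w linearly across it - visually
    // indistinguishable on short spans, and a big cut in divides per scanline.
    for (int x = x0; x < x1; x += 8) {
        int   len = std::min(x + 8, x1) - x;
        float w0  = 1.0f / oneOverW;
        float w1  = 1.0f / (oneOverW + len * dOneOverW);
        float w   = w0;
        float dw  = (w1 - w0) / len;
        for (int i = 0; i < len; ++i) {
            int u = (int)(uOverW * w);
            int v = (int)(vOverW * w);
            dst[x + i] = texture[v * texWidth + u];
            uOverW += duOverW;
            vOverW += dvOverW;
            w      += dw;
        }
        oneOverW += len * dOneOverW;
    }
}
```

that kind of tuning is renderer-side work, though - not the sort of thing that blindsides the silicon team months after the fact.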
Intel chose x86 because it saves them a lot of time and money.
ok, screw the dumb seaplane analogies: how does owning something that does not fit the job save you time and money on that job?
They own the ISA, they own the core IP, and they own lots of powerful development tools.
..where the majority of those tools were developed in the course of the LRB project? unless you'd argue intel meant to use LRB as a p5 farm.
Even though they're facing a delay, any other ISA would have cost them a lot more time and money. On top of that, every PC developer has software that will compile for Larrabee with few or no issues. Don't underestimate how awesome it is to have something running on day zero and be able to incrementally improve performance.
i really doubt the part about existing code just running well on it ('hey! my C2D-targeting code runs faster on a p5! woot!'). unless you meant 3d API code, in which case i don't see the basis for the improvement vs a GPU.
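to spell out why 'it compiles' is a long way from 'it runs well' on a 16-wide vector machine, here's a deliberately SSE-based sketch (i'm not going to guess at LRBni intrinsics), assuming a plain saxpy-style loop:

```cpp
#include <immintrin.h>  // SSE intrinsics - LRBni has its own, different set
#include <cstddef>

// plain scalar C++ - this is what 'compiles for Larrabee on day zero' looks
// like. it runs, but it exercises one lane of a 16-wide vector unit.
void saxpy_scalar(float* y, const float* x, float a, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        y[i] += a * x[i];
}

// the same loop hand-vectorized for 4-wide SSE - roughly what C2D-targeting
// binaries actually ship with. reportedly LRB's cores don't implement SSE at
// all, and even if they did, hitting the 512-bit unit would still require a
// third, LRBni-specific rewrite (or a very good vectorizing compiler).
void saxpy_sse(float* y, const float* x, float a, std::size_t n) {
    const __m128 va = _mm_set1_ps(a);
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        __m128 vx = _mm_loadu_ps(x + i);
        __m128 vy = _mm_loadu_ps(y + i);
        _mm_storeu_ps(y + i, _mm_add_ps(vy, _mm_mul_ps(va, vx)));
    }
    for (; i < n; ++i)  // scalar tail
        y[i] += a * x[i];
}
```

so the 'day zero' code is the slow path, and the code that matters has to be rewritten for LRBni anyway - which is exactly where the x86 advantage evaporates.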
There are also many binary libraries that won't require any recompile. Yes, they won't run efficiently, but that's not always necessary. You can have some very useful routines that perform complex tasks but that you only need to call a couple of times.
you totally lost me here. which x86 binary libraries would you want to run on an LRB, and why would you want to run them there rather than on your dual/quad-core monster CPU that was designed to eat old x86 code for breakfast?
So x86 allows them to create momentum much more quickly than anyone else. It still won't be easy, but the potential is huge and if they succeed it will be nearly impossible for the competition to do better with another ISA.
again, everything you've argued for so far has been based on the premise that x86 fits the task nicely. but we don't agree there. and lo and behold, the 'competition' has material, currently-sold-on-the-market parts - how's that for doing better?
You mean the ZMS-08? It only supports OpenGL ES as far as I know. The choppy 3D demo with the motorcycle also couldn't convince me that it has competitive performance compared to a similarly sized dedicated graphics chip.
zms-08 tapes out Q1/Q2 '10 (no LRB will be on the market by then). the 'choppy motorcycle' demo is pre-production code running on zms-05 - final code reportedly runs better. and i wonder how choppy LRB's first demos as a GPU would have been, alas that's something we have no means of knowing. FYI, zms-08 should decimate zms-05 judging from the paper specs (i promise to report on that once i get my hands on one). also, the part offers much more than GL ES - all in the form of APIs (Creative are actually opposed to the idea of giving application programmers direct access to the GPGPU part, at least not until they have a sound compiler targeting it).
They'll certainly have to provide highly optimized implementations of every version of Direct3D and OpenGL, but it's also intended to let developers have direct access to the hardware. That's the whole point of using x86. They want people to develop just as much software for it as for their CPUs. NVIDIA is trying to achieve the same thing with CUDA, but is having a lot of trouble creating momentum because it's hard to migrate existing code and distribute binaries. There simply is no software ecosystem for CUDA yet, and it will take a very long time to create and expand it. x86's ecosystem is already massive.
cool. and how big is the LRBni ecosystem?
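and to be fair, the CUDA migration friction is real: moving even a trivial loop means splitting it into host and device code, managing copies, and shipping vendor-specific binaries. rough sketch of what that looks like (standard CUDA runtime API, nothing tuned, names made up):

```cpp
#include <cuda_runtime.h>

// device kernel - the loop body from the x86 version, one element per thread
__global__ void saxpy_kernel(float* y, const float* x, float a, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] += a * x[i];
}

// host side: allocation, copies, launch, copy-back - none of which the
// original x86 loop needed
void saxpy_cuda(float* y_host, const float* x_host, float a, int n) {
    float *x_dev = nullptr, *y_dev = nullptr;
    cudaMalloc(&x_dev, n * sizeof(float));
    cudaMalloc(&y_dev, n * sizeof(float));
    cudaMemcpy(x_dev, x_host, n * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(y_dev, y_host, n * sizeof(float), cudaMemcpyHostToDevice);

    const int threads = 256;
    const int blocks  = (n + threads - 1) / threads;
    saxpy_kernel<<<blocks, threads>>>(y_dev, x_dev, a, n);

    cudaMemcpy(y_host, y_dev, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(x_dev);
    cudaFree(y_dev);
}
```

but code written against LRBni starts from exactly the same place - zero existing ecosystem - so i don't see how 'x86' buys intel anything for the code paths that actually matter.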
They don't have to optimize every piece of code. 90% of execution time is spent in 10% of the code. The top hotspots are even smaller.
sorry, i meant to say 'every piece of code that matters' - clearly optimising every last op would be nonsensical.
Last time I checked, netbooks with Atom processors were selling like hotcakes...
last time i checked, the market i was speaking about extended a tad beyond netbooks. basically, it works like this: the moment desktop windows becomes a non-factor, that same moment atom develops weak knees (or simply goes the way of desktop windows).