maybe. then again, there may be hw-posed issues for which you may never find a satisfactory solution in sw. as you said, since it's always the combination of hw and sw, a potential design issue discovered late in the hw (due to late sw, etc) could amplify the burden on the final sw by orders of magnitude. but since you brought up Abrash's role in the project - if Mike could not deliver a satisfactory rasterizer running on this hw, then maybe the hw was not meeting its GPU-domain targets?.. just speculating here, of course.
There's definitely a strong interaction between the software and hardware. Michael Abrash, Mike Sartain, Tom Forsyth and other people from the software team were
directly involved in the design of LRBni, and probably many other hardware aspects of Larrabee. It's quite possible the hardware didn't behave as expected, and this had big consequences for the software, causing much delay. Or possibly they misjudged some software aspects and this demanded a hardware redesign. But either way you can't just make hardware adjustments and expect the performance of the software to go through the roof. You still have to balance the use of all available resources and major software changes can be required for minor hardware changes. At least that's my personal experience hopping from one CPU generation to the next, and Larrabee is an entirely different animal.
Anyway, all I'm saying is that the software side of this is more complex than for any comparable computing device. Software schedules slip all the time, and considering the strong interaction with the hardware and the high targets I'm really not surprised Larrabee got delayed, despite some very bright minds working on it. It doesn't mean it's cancelled for good, and it doesn't mean x86 is the culprit for the delay.
i really don't understand your logic about ISA acceptance - ISAs today are mostly judged by how well they play with their compilers (from coders' perspective), and by how well they meet their price/performance/wattage envelopes (from EE/business perspectives) - not by their proximity to their 40th anniversaries. look around you - chances are you'll find more devices hosting 'young' ISAs than ones hosting x86 or older. so what reception, and by whom, are you concerned about?
PC developers. The ability to create libraries that can run on a CPU as well as on Larrabee is a major advantage. And being able to use familiar tools also lowers the barrier to entry. Also, selling CPU-compatible binaries creates an incentive to write not only full-blown applications but also libraries and tools for other developers: a software ecosystem.
dramatically - no. but then again LRB's scalar ISA did not have to copy any existing ISA verbatim - intel could have used that to their best advantage if they hadn't been so concerned with the central socket. heck, they could've used a modernized form of their dear x86 (say, x86-64) and shaved off the legacy, trimmed the pipeline, retooled the protection mechanism - any/all of the above, just to make LRB a better GPGPU core, while staying in familiar waters. but nooo - it had to be the word97-compliant GPU.
Being able to run the same libraries on a CPU and on Larrabee means they need a common base: the Pentium instruction set (with support for x86-64). Everything else is part of an extension so it's no different than developing for CPUs with support for different ISA extensions.
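To make that concrete, here's a minimal C++ sketch (hypothetical names, not from any real Larrabee SDK) of the dispatch pattern such a common baseline enables: one binary whose hot loops pick a vector path where the extension is present and fall back to plain Pentium-class scalar code everywhere else, exactly as is done today for SSE or AVX.

#include <cstddef>

// Scalar baseline: runs on any x86 core, CPU or Larrabee alike.
static void scale_scalar(float* dst, const float* src, float k, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        dst[i] = src[i] * k;
}

// Placeholder for a hand-vectorized path (SSE, AVX or LRBni intrinsics).
// It just forwards to the scalar version here so the sketch compiles.
static void scale_vector(float* dst, const float* src, float k, size_t n)
{
    scale_scalar(dst, src, k, n);
}

// Assumed feature probe; real code would query CPUID. Not an actual API.
static bool has_wide_vector_isa()
{
    return false;
}

void scale(float* dst, const float* src, float k, size_t n)
{
    if (has_wide_vector_isa())
        scale_vector(dst, src, k, n);   // extension path
    else
        scale_scalar(dst, src, k, n);   // common baseline
}

The application code calling scale() is identical either way; only this one dispatch point knows which ISA extension is underneath.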
Besides, I really doubt you can make a significantly better x86 core by shaving off legacy stuff. An i386 supports 99% of all Pentium instructions (minus x87) and weighs in at only 275,000 transistors. And in fact they did avoid the biggest overhead by sticking to in-order execution.
i agree, the task could become prohibitively complex if they were going for that head-shot at the GPU - 'behold, we're the new GPU masters!' - again, i don't think anybody (clinically sane) expected LRB to dethrone any GPU heavyweights - if intel themselves had such expectations then maybe they were not familiar with the problem domain they were getting involved in. again, not saying they had such expectations - just trying to carelessly speculate on the events we've witnessed lately.
It takes a certain level of confidence to enter the GPU market, and that confidence can quickly turn into cockiness. While I still believe they could eventually succeed at creating an interesting GPU, it's not unlikely that their original time schedule was too cramped.
i'd venture to guess ms' ref performance issues are mostly algorithmic, and only secondarily a matter of insufficient clock-counting. IOW, one cannot say that had they 'coded to the metal' of any ISA they'd have achieved much better performance. how does that prove the (non-)fitness-to-a-task of an ISA, though?
My point was that software performance varies by a huge amount, with the ISA choice being of only secondary importance. And Larrabee depends mostly on LRBni anyway, not the legacy x86 portion. So once again it's more likely it has been delayed due to software issues than ISA issues. I can't exclude other hardware issues, but it's very unlikely they'll touch x86, even if that was an option.
maybe i misunderstood you, but i believe you mentioned something about 'incremental performance improvements of present code' in your original post, ergo my performance reference. pardon me if i've erred.
Initially, code targeting the CPU will run slower on Larrabee, and it can then be improved incrementally to run faster.
so let me know if i got you right here: developers would really like to spare themselves a trivial recompile of their existing scalar code to a new scalar ISA, but they would eagerly face the challenges of a brand new VPU which, apropos, is only meant to carry the bulk of the workload? hmm..
Trivial recompile? Porting code is much more complicated than a recompile. For a lot of external libraries you might not even have the source code. Having to rewrite those really discourages a lot of developers. If instead you're able to get a prototype running on day zero it really boosts confidence. You might eventually end up rewriting those binary libraries anyway if it helps performance, but at least you'll be able to do that incrementally one function at a time, which is a massive advantage. Take it from a TransGamer.
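As a hypothetical illustration of that "one function at a time" approach (plain C++, nothing Larrabee-specific), the application can call its hot routines through a small table of function pointers, so each one can be swapped from the original recompiled version to a tuned replacement independently, without touching the rest of the code base:

#include <cstddef>

typedef void (*blend_fn)(unsigned char* dst, const unsigned char* src, size_t n);

// Day-zero version: the existing code, simply recompiled for the new target.
static void blend_reference(unsigned char* dst, const unsigned char* src, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        dst[i] = (unsigned char)((dst[i] + src[i]) / 2);
}

// Later: a tuned replacement (vector intrinsics would go here). Until it is
// actually written, it simply forwards to the reference version.
static void blend_tuned(unsigned char* dst, const unsigned char* src, size_t n)
{
    blend_reference(dst, src, n);   // stand-in; rewrite when profiling says so
}

// One entry per hot routine; flip entries as each port is finished.
static blend_fn g_blend = blend_reference;

void enable_tuned_paths()
{
    g_blend = blend_tuned;
}

void composite(unsigned char* dst, const unsigned char* src, size_t n)
{
    g_blend(dst, src, n);   // the caller never changes during the port
}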
you can open whatever compiler you like and start using the APIs these parts prudently offer. occasionally, you might have to resort to heresies like OCL, CUDA or native compilers *gasp*. regardless, it would still be a tad better than what you could do with a LRB today.
Sure, but it wouldn't be better than what you'll be able to do with Larrabee tomorrow.
Larrabee is a long-term investment in dominating the computing world. Anything that helps smooth the path is a significant advantage. Despite carrying some overhead, x86 is such an advantage.
what delay - the parts in question are on the market. the APIs are on the market. the native compilers are coming last, but you can ask intel how that generally goes. *wink*
The delay in creating momentum in software development. Sure, you can use the APIs to create certain applications, but you can't create anything these APIs don't support, even though the processor would actually be capable of it. And even for APIs with the same functionality, developers can have a certain preference. The more things you support, the more attractive the platform becomes to a wide range of developers. And instead of writing everything themselves, Intel can offer x86 compatibility and a couple of tools to enable developers to quickly port things over, use existing APIs, and create and trade new components.
I don't really get your 'wink'. As far as I know Intel has always updated its compiler with support for new ISA extensions very quickly. Besides, with x86 compatibility Larrabee doesn't even depend on Intel to deliver all the development tools. There are plenty of compilers and other tools for x86 that only need a minor update to support LRBni.
issues with what - putting their next SIMD into their next CPU? no - they just pushed it out the door - no issues. or do you mean issues for the developers - easy use of the new extension without having to (re)learn a new ISA? let me see - no auto-utilizing/auto-vectorizing compilers for generations upon generations of the ISA, instead some rudimentary in-house libraries, and clear delegation of the onus of making use of the new extension to the app developers. which, combined with the lack of basic capabilities commonly found in other SIMD ISAs, can only mean no issues for the app developers. *grin*
Despite all that, game developers and codec developers feast on new ISA extensions like vultures. Nobody wants to be left behind, and it only takes a few developers to abstract things into libraries. Given that LRBni offers a wide range of SIMD capabilities, it will have no trouble growing a software ecosystem from nothing. It's the legacy x86 portion for which the existing ecosystem really matters.
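A minimal sketch of the kind of abstraction layer a few developers typically build around a new ISA extension (the vec4 type and its operations here are hypothetical, not any shipping library): application code targets a small vector type, and only this one header ever needs an LRBni (or SSE/AVX) specialization.

struct vec4   // portable fallback; an intrinsics-backed version would replace it
{
    float v[4];
};

inline vec4 add(vec4 a, vec4 b)
{
    vec4 r;
    for (int i = 0; i < 4; ++i)
        r.v[i] = a.v[i] + b.v[i];   // a single vector add in the tuned build
    return r;
}

inline vec4 mul(vec4 a, vec4 b)
{
    vec4 r;
    for (int i = 0; i < 4; ++i)
        r.v[i] = a.v[i] * b.v[i];
    return r;
}

// Code written against vec4 never sees which ISA extension is underneath.
inline vec4 mad(vec4 a, vec4 b, vec4 c)
{
    return add(mul(a, b), c);
}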
let me ask you something: why, in your sincere opinion, did developers embrace the GPU shader architectures, particularly after the advent of the HL shading languages? and why do you think intel was so determined to come up with a GPGPU of their own? i mean, after all they had the, erm.. fine GMA line of GPUs (more than half of the pc market - that's what we call a developers' embrace, right?), and they held the key to the central socket. so why?
Developers embrace shaders... for graphics. It makes a lot of sense to describe the "shading" of millions of pixels as a function in a high level language. It all breaks down though when trying to create a different data flow. After CPUs became multi-core and considerably increased SSE performance, a lot of developers stopped bothering about GPGPU.
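As a toy illustration of where that data-flow limit shows up (plain C++, no real shading language): per-pixel shading maps naturally onto an independent function per element, while even something as simple as a running sum introduces a dependency between elements that the classic shader model has no direct way to express.

#include <cstddef>

// "Shader-like" data flow: each output depends only on its own input,
// so the loop body could be lifted straight into a pixel shader.
void brighten(float* pixels, size_t n, float gain)
{
    for (size_t i = 0; i < n; ++i)
        pixels[i] *= gain;
}

// A different data flow: each output depends on the previous one.
// Expressing this in a classic shader takes awkward multi-pass tricks.
void prefix_sum(float* values, size_t n)
{
    for (size_t i = 1; i < n; ++i)
        values[i] += values[i - 1];
}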
However, multi-core CPUs aren't the best answer to generic throughput computing either. So Intel realized it had to come up with something that combines the best of both worlds...
Of course GPU manufacturers haven't been sitting on their hands either. OpenCL and architectures like Fermi chase the same goals as Larrabee. But it's pretty clear that the classic concept of shaders is getting dated. And it remains to be seen whether OpenCL and Fermi really offer the level of freedom developers are looking for, or whether it's just another half-baked solution for a specific domain. Note that for Intel it doesn't really matter. Larrabee will support OpenCL really well; each 'shader unit' comes with its own fully programmable 'controller'. Either way Intel doesn't have to bet everything on one API. It can support anything developers want, building on an existing x86 ecosystem.
so, another Q from me: what, again in your sincere opinion, failed this project? maybe Abrash & co's inability to pull a half-decent D3D/OGL implementation for it?
Abrash & Co. are very capable of pulling it off. I just believe it's not finished yet. There was no room for errors, but with a project of this complexity that was unavoidable. Nobody can predict the full consequences of every tradeoff, so even minor miscalculations can result in major redesigns.
i mean, according to you, the performance/power ratio should've been ok (unless there was something deeply screwed up in LRBni, since the rest of the chip - namely x86 and caches - was just fine), the adoption of the programming model would've been fine (x86? - woot!). pretty much everything would've been roses. and yet, no LRB on the shelves after 3 years of focused effort (by some pretty smart individuals at that, where we totally agree). so why?
Three years is nothing. G80 took at least four years of development and was not nearly as ambitious. And since many of the things that delay GPU development have been turned into software on Larrabee, which had to be written from scratch, I'm betting the "why" is a number of software issues that simply need more time to get resolved.