Itanium?

Mendel

What future, if any, would you predict for Intel's Itanium processor? How would you view its success, or lack thereof, so far? Which architectural capabilities or innovations, if any, are likely to be used in future Intel processors? Is there anything to discuss about this, or should it really be considered just a total failure?
 
IIRC, the next iteration will be the one where the Alpha team headed. It'll be interesting to see what happens there.
 
The problem with it is that it's incompatible with x86, so even though it's much faster than the Opteron and overall much more robust, there is little support in the form of software. You would have to make a complete switchover. Of course its price doesn't help matters either.
 
Actually, they have compatibility and by that I mean it runs like ass.

But their software solution, which is basically a VM running on top of the EPIC ISA, is supposed to be very fast, though last I checked it's still not public. Not surprising: Transmeta seems to have done fairly well with this strategy, and so did Alpha with FX!32.
 
Saem said:
Actually, they have compatibility and by that I mean it runs like ass.

It supports it but as you said it runs slow because it's just an emulation. That and any IA-32 program that requires system dlls to run will not work if those dlls are IA-64, and vice versa.

But their software solution, which is basically a VM running on top of the EPIC ISA, is supposed to be very fast, though last I checked it's still not public. Not surprising: Transmeta seems to have done fairly well with this strategy, and so did Alpha with FX!32.

I have not heard about this. If it is indeed as fast as they claim, there may be a future after all, though I doubt it will be anytime soon now that the world seems to be backing x86-64 in force.
 
ANova said:
The problem with it is that it's incompatible with x86, so even though it's much faster than the Opteron
Actually, I disagree with that. IMHO, the REAL problem is that it is NOT much faster than Opteron/P4. You'd think that since it's a new design, Intel has thrown away all the legacy stuff, and it costs an arm and a leg, it should be faster, yes? But it just isn't, at least currently. Itanic 1 is so slow (especially in integer performance) it's pathetic; Itanic 2 is better, but its integer benchmarks still can't really beat Opteron/P4. Look at the SpecINT scores: it can't really beat Opterons, and that's with a large cache advantage. SpecFP is very good, but that's just because it has more FP execution units than an Opteron and, again, because of the large cache (you can extrapolate that from the Itanics with less cache). So, in the end, it's a new design that needs super-complicated optimizing compilers (which nobody so far can write), but it just doesn't achieve anything (since you can get the same performance with a "traditional x86" design).
I wouldn't say it's dead yet though. Maybe the next generation will really be good? If not (and I'm sceptical about that, because good compilers that keep the execution units busy are very hard or impossible to write for this thing; that's inherent in the design), I suspect Intel could be running out of time, and instead of the PA-RISC, Alpha -> IA64 switch people will rather do a PA-RISC, Alpha -> x86-64 switch (be it Intel or AMD) or maybe even switch to POWER5...

edit: just to illustrate ia64 performance, some spec numbers listed (fastest ones I've found, I've thrown in P4 and P4EE numbers to illustrate that cache DOES make a difference):
SpecINT base:
Itanium 1 (800MHz, 2MB): 314!!! (granted, old compiler, but even a PIII 733 beats it)
Itanium 2 (1500MHz, 6MB): 1408
Opteron (150, 2.4GHz, 1MB): 1566
P4E (3.4GHz, 1MB): 1432
P4 (3.4GHz, 512KB): 1342
P4EE (3.4GHz, 2MB): 1666

SpecFP base (same cpus as above)
Itanium 1: 645
Itanium 2: 2161
Opteron: 1591
P4E: 1481
P4: 1300
P4EE: 1548
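
To make the clock-for-clock comparison in these numbers easier to see, here is a rough Python sketch that divides each quoted score by the clock speed. The scores and clocks are the ones listed above; the dictionary layout and the `per_ghz` helper are just illustrative names, and score per GHz is only a crude yardstick since SPEC results also depend on cache, memory and compiler.

```python
# SPEC base scores as quoted in the post above: (clock in GHz, SpecINT, SpecFP).
scores = {
    "Itanium 1": (0.8, 314, 645),
    "Itanium 2": (1.5, 1408, 2161),
    "Opteron 150": (2.4, 1566, 1591),
    "P4E": (3.4, 1432, 1481),
    "P4": (3.4, 1342, 1300),
    "P4EE": (3.4, 1666, 1548),
}


def per_ghz(table):
    """Return {cpu: (SpecINT per GHz, SpecFP per GHz)} for every entry."""
    return {
        cpu: (spec_int / ghz, spec_fp / ghz)
        for cpu, (ghz, spec_int, spec_fp) in table.items()
    }


normalized = per_ghz(scores)
for cpu, (int_per_ghz, fp_per_ghz) in normalized.items():
    print(f"{cpu:12s} INT/GHz {int_per_ghz:7.1f}   FP/GHz {fp_per_ghz:7.1f}")
```

Per GHz the Itanium 2 comes out ahead of the Opteron on both suites, while on the absolute SpecINT score the Opteron is still in front, which is the point being argued in this thread.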
 
cristic said:
Does anyone know what the price difference is between a similar Itanium setup and an Opteron setup?

Well, you can pretty much buy a dual Opteron system (maybe not quite a quad) for the price of a single Itanium.
 
ANova said:
URL doesn't work.
The Opteron 150 you're using is AMD's latest. Wait for the 2.0 GHz Itanium 2 with 9 MB cache due out at the end of the year. IA-64 is better overall than the aging x86 platform; Intel just has to iron out the kinks.
Sorry, I don't buy that. You can't compare tomorrow's Itaniums with today's Opterons. Otherwise I'm going to say just wait for the dual core Opterons, these will blow any current Itaniums out of the water (OK, dual core Opterons won't be here by the end of the year, but slightly faster 90nm Opterons will). And apparently Intel's solution to the lacking performance? Add more cache - that doesn't make the architecture better...
And it's not like Itanium is new and has only just come out of the dark. We're talking about 3 years (?) of availability (granted, Itanic 1 was little more than a "public evaluation chip"), the overall design and compiler research are at least some years older, and Intel has invested some billions (!). Don't you think it's time IA-64 began to show what the architecture is capable of, if it is so good?
You might think I'm just bashing IA64, but actually I'm not. I think it does use very interesting concepts (predication et al.), and I also find it interesting that Intel took roughly the opposite approach to the P4 design (wide but short pipeline vs. narrow and long), but it just doesn't seem to work that well in practice. And I've said I'll give it another generation, but if that turns out to be disappointing too, I think Intel would be better off burying and forgetting it. You can't wait eternally for the magical next generation which magically solves all performance problems.
 
mczak said:
URL doesn't work.

It should work now.

Sorry, I don't buy that. You can't compare tomorrow's Itaniums with today's Opterons.

The document I linked to shows the Opteron 244 as AMD's top performer at the time. Since then AMD has released updated versions and Intel has not.

Otherwise I'm going to say just wait for the dual core Opterons, these will blow any current Itaniums out of the water (Ok, dual core Opterons won't be here by the end of the year, but slightly faster 90nm Opterons will).

Multi core Itaniums are due for launch around mid next year as well.

And apparently Intel's solution to the lacking performance? Add more cache - that doesn't make the architecture better...

Granted, they do have a habit of doing that, but the architecture is already in place. I'm sure they don't want to spend more on development. What you're failing to realize is that a 1.5 GHz machine is performing nearly on par with a 2.4 GHz machine, both of which capitalize on the high-IPC, low-frequency approach. And like I said, IA-64 overall is a much better architecture. The Itanium is a better design; it just suffers from incompatibility issues atm and from overpricing as a result of its development cost.
 
ANova said:
The Itanium is a better design; it just suffers from incompatibility issues atm and from overpricing as a result of its development cost.

If the Itanium was priced to recoup development costs, you would have to add a couple of digits to its selling price - no, it is priced to be deployed. And for its intended market, the selling price isn't such a major consideration.

But as for performance, the processor just doesn't make much sense. And now that there are other good (and comparatively cheap) options in the 64-bit race, it is hard to see the Itanium go anywhere. Making multiway Opterons with good performance scaling is just too easy and inexpensive. If the Itanium doesn't have a compelling software story, and as far as I can see it doesn't, it's game over. Its current market niche is such that it will take a long time to die completely, but it's difficult to envision any other final outcome.

For an oldtimer like me, the similarities to the 80860 are obvious. An unarguably "better" design, great FP performance, a dead duck in the marketplace, and eventually killed off after doing limited service as a coprocessor and in Intel's supercomputer projects.

IMHO the Itanium is doomed. It offers no compelling advantages to anyone. For a processor to be able to show such advantages in the future, I'd say it would have to be geared towards efficient on-chip parallel deployment, but even then I'm doubtful whether the application pressure will be there to drive such a design to commercial success as a general purpose processor.
 
ANova said:
Itanium workstation performance

The Opteron 150 you're using is AMD's latest. Wait for the 2.0 GHz Itanium 2 with 9 MB cache due out at the end of the year.

So you're comparing a 400mm^2 chip with 9MB cache that burns 130W to a 130mm^2 chip with 1MB cache that burns 90W?

Currently the highest ranking Opteron beats IPF in SpecINT by a hair and loses by 25% in SpecFP (1600-something vs 2200).

ANova said:
IA-64 is better overall than the aging x86 platform; Intel just has to iron out the kinks.

It's an architecture of kinks. It was conceived (early 90s) and built on the premise that Out-Of-Order scheduling would be impractical (because it would limit the critical speed path). So they made an in-order superscalar with kludged-up instruction scheduling.

The fact that it's an in-order superscalar means that it doesn't tolerate memory latency very well (and hence Intel/HP have to stick large amounts of cache on it). The instruction bundles should ease instruction issue/scheduling, which they do, but at the cost of a significantly higher number of execution units: in order to issue instructions at full rate you need a cluster of execution units that matches the worst case distribution of instruction bundles, and in doing so you set yourself up for the combinatorial explosion (well, only N^2) of the dependency checking/scoreboarding/bypass network.

So they moved complexity from the front of the pipeline to the back and in the process of doing so reduced memory latency tolerance and added execution units - with fsck all performance gain to show for it.

IPF has a few redeeming features. In particular the 128 floating point registers, and in its current incarnation it has dual MAC units (and large chunks of cache), again helping it a lot on HPC stuff.

But it doesn't look like a win to me at all in the long run.

Cheers
Gubbi
 
ANova said:
mczak said:
URL doesn't work.

It should work now.
Ok. This is pure marketing BS, with "estimated" performance numbers, and they use different systems in each benchmark - if you ask me they left out the systems which would have been faster in these benchmarks... sorry, that paper is not useful for performance evaluation, not even close. It is useful to sell HP Itanium systems though I guess.

The document I linked to shows the Opteron 244 as AMD's top performer at the time. Since then AMD has released updated versions and Intel has not.
Intel's problem. If they can't get faster versions out at the moment, that's not AMD's problem.

What you're failing to realize is that a 1.5 GHz machine is performing nearly on par with a 2.4 GHz machine, both of which capitalize on the high-IPC, low-frequency approach.
Old argument, but still wrong. What counts is what you can get out of the design at the end of the day, i.e. "IPC * max clock frequency". And the current Itanium just doesn't look like it's really a leader there.
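
The "IPC * max clock frequency" point can be put into a few lines of Python. This is only a sketch: the `performance` helper and the two sets of IPC and clock numbers are hypothetical values picked for illustration, not measurements of any real chip, and instruction counts are not directly comparable across different ISAs anyway.

```python
def performance(ipc, clock_ghz):
    """Instructions completed per nanosecond: IPC times clock in GHz."""
    return ipc * clock_ghz


# Hypothetical designs (assumed numbers, for illustration only).
wide_slow = performance(ipc=2.0, clock_ghz=1.5)    # high IPC, low clock
narrow_fast = performance(ipc=1.3, clock_ghz=2.4)  # lower IPC, higher clock

print(f"wide/slow:   {wide_slow:.2f} instructions per ns")
print(f"narrow/fast: {narrow_fast:.2f} instructions per ns")
```

With these made-up figures the design with the lower IPC still finishes more work per unit of time, because only the product of the two factors counts.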

And like I said, IA-64 overall is a much better architecture.
Well, if it is so good why isn't it fast then? You can't just say "it's a much better architecture". Sure it looks quite capable on paper, but the benchmarks speak another language.
 
Entropy said:
If the Itanium was priced to recoup development costs, you would have to add a couple of digits to its selling price - no, it is priced to be deployed. And for its intended market, the selling price isn't such a major consideration.

Obviously, they couldn't charge an extreme amount or they would doom the chip for good. I merely stated that because it is the reason for their current high pricing.

Gubbi said:
IPF has a few redeeming features. In particular the 128 floating point registers, and in its current incarnation it has dual MAC units (and large chunks of cache), again helping it a lot on HPC stuff.

But it doesn't look like a win to me at all in the long run.

I'm sure the Itanium will fail, especially since Intel is supporting IA-32e in its Xeon line. I just think IA-64 would be a better route to take in the long run than continuing down the x86 path with extensions.

mczak said:
Ok. This is pure marketing BS, with "estimated" performance numbers, and they use different systems in each benchmark - if you ask me they left out the systems which would have been faster in these benchmarks... sorry, that paper is not useful for performance evaluation, not even close. It is useful to sell HP Itanium systems though I guess.

Possibly, possibly not. The numbers they do show seem to be in order from what we've seen. I agree about the benchmarks that leave out other chips being suspicious though.

Old argument, but still wrong. What counts is what you can get out of the design at the end of the day, i.e. "IPC * max clock frequency". And the current Itanium just doesn't look like it's really a leader there.

Well, if it is so good why isn't it fast then? You can't just say "it's a much better architecture". Sure it looks quite capable on paper, but the benchmarks speak another language.

It's still a relatively new and immature design. I may be wrong, but I wouldn't be surprised to see some significant improvements if the industry did decide to back it.
 
I've had two Itanium 2s since it was released - a piece of crap compared to our dual Opterons. The only reason we got them is that they have 12 gigs of memory, and they were the only option back then.

Itanium is DEAD. It becomes a high priced toy for certain scientific projects but that's it. It's a very screwed up architecture, marching exactly against today's general computing needs.
 
Is Itanium/IA64 really that much better than x86? Some shortcomings of the IA64 compared to well-designed RISC ISAs or even the crufty old x86:
  • Only 1 addressing mode for load/store instructions (register indirect) forces the IA64 CPU to spend precious arithmetic instruction slots to compute addresses for many load/store operations and prevents HW optimizations such as sum-addressed memories from being used. Sure, it has post-increment modes, but using these forces artificial dependencies between otherwise independent load/store instructions, reducing IPC. In the meantime, people have figured out how to do all the complicated x86 addressing modes very, very fast.
  • The very large architectural register files make it difficult and expensive to add Out-Of-Order execution to the IA64 architecture - all IA64 implementations to date have been in-order, while most other processor families that matter in the high end have gone Out-Of-Order a long time ago. This also makes it harder for the architecture to mask/overlap cache misses.
  • The pseudo-VLIW instruction format gives poor code density (due to both large instructions and many NOPs), increasing the strain on the instruction cache. Also, there is enough trickery in the instruction set that it defeats the advantages of going VLIW in the first place (which are mainly extreme simplicity of instruction decoding and scheduling; the IA64 ISA specifies a lot of tricky virtual->physical register renaming and it does allow for instruction dependencies within the instruction word.)
  • Integer multiplies are done using the floating-point multiplier, meaning that in order to multiply two ints you need to move a lot of data back and forth to the FP register file, so the ISA itself slows down multiplies.
  • The FP fused multiply-add instructions act as if their intermediate result has infinite precision. This may be a good thing, but it breaks IEEE-754 compliance.
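
The last point, about fused multiply-add, can be demonstrated with a small Python sketch. It does not run any IA-64 code: the fused behaviour is emulated with exact rational arithmetic from `fractions.Fraction` followed by a single rounding, which is an assumption standing in for the hardware instruction, and the inputs `a`, `b` and `c` are picked purely for illustration.

```python
from fractions import Fraction

# Illustrative inputs (not from the thread), all exactly representable as doubles.
a = 1.0 + 2.0**-27
b = 1.0 - 2.0**-27
c = -1.0

# Separate multiply and add: the exact product 1 - 2**-54 is rounded to 1.0
# before the add, so the final sum is exactly 0.0.
separate = a * b + c

# Fused behaviour, emulated: compute a*b + c exactly, then round only once.
fused = float(Fraction(a) * Fraction(b) + Fraction(c))

print(separate)  # 0.0
print(fused)     # -5.551115123125783e-17
```

The two results differ because the separate version rounds the intermediate product, while the fused version keeps it at full precision until the end. Code that expects the separately rounded result will see different answers on hardware that fuses the two operations.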
 