RISC vs CISC and John Carmack

JonWoodruff

Newcomer
John Carmack brought up some interesting points about x86 these days being just fine compared to modern RISC processors, and also about Itanium's VLIW architecture. One of the key points was code compression. This didn't make too big an impression on me until I read (http://www.theinquirer.net/?article=7993) that compiled Itanium code for VMS is 3x larger than x86 code. If code can be 3x larger, then that means the data cache is 1/3 as effective! (No wonder Itaniums have huge caches.) And bandwidth for instructions is also 1/3 as effective.

So the question is: are we better off with a pure RISC architecture or a CISC architecture? If the P4 were running on pure native microcode instead of unpacking x86 instructions, would it be faster or slower? Considering that decoding only takes a few (2?) stages of a 20-stage pipeline, could it be that the compression you get from smaller instructions that each imply more work is worth it?
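The cache arithmetic behind that claim can be sketched in a few lines. This is purely illustrative: the 16 KB I-cache size is a hypothetical number, and the 3x bloat factor is just the figure from the article quoted above.

```python
# Back-of-the-envelope: how code bloat shrinks the number of
# instructions an instruction cache can hold. Numbers are illustrative.

ICACHE_BYTES = 16 * 1024  # hypothetical 16 KB L1 I-cache


def instructions_cached(avg_bytes_per_insn, icache_bytes=ICACHE_BYTES):
    """Average number of instructions that fit in the I-cache."""
    return icache_bytes // avg_bytes_per_insn


dense = instructions_cached(3)    # compact variable-length encoding
bloated = instructions_cached(9)  # the same code at 3x the size

# The bloated encoding caches roughly a third as many instructions,
# which is the "1/3 as effective" claim made above.
print(dense, bloated, dense // bloated)
```

The same ratio applies to instruction-fetch bandwidth: tripling the bytes per instruction means a third as many instructions per fetched cache line.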

Disclaimers:
This is all disregarding the limited number of registers in x86 vs other architectures.

I don't expect all Itanium code would be 3x its x86 equivalent.
 
First of all, Itanium is running 64-bit code. That means a lot of things will be twice the size, so "3 times larger" doesn't mean much.

BTW, people have measured bytes/instruction across lots of architectures on various code, and x86 doesn't have that big an advantage. Not to mention that a lot of x86 optimizations mean lots of different code paths, which bloats your code anyway. You'll find more on this at RWT; they're discussing just this after Linus' thoughts on x86 and the rest.

Code density is nice, but the x86 brand of CISC sucks. A hybrid that took a GOOD RISC foundation and built more powerful --I'm avoiding "complex" for the obvious reason-- instructions on top of it, that I'd call good. Until then, with CISC basically meaning x86, it blows goats.

PS. I disagree with Linus. Money made x86 implementations run fast, not x86. If one were to employ all the "tricks" applied to the latest x86 MPUs on a 64-bit RISC chip with 32-64 GPRs, guess who'd have their ass handed to them in EVERY respect?

BTW, decoders, especially the x86 variety, suck for clock rate. This is precisely why the P4 moved the decoder out of the critical path. This is why I think Hammer will suck in the scalability department -- clock rate. I don't believe it'll clock high enough, and its IPC advantage over the P4 will be stripped away by Prescott's improvements and higher clock rates. Not saying Intel did it right, merely that AMD isn't necessarily doing it right.

I'm taking a course where we use IA-32 assembly, and it blows goats. I hate it soooo much. It's just stupid as hell.
 
If IPF code size were 3x larger than x86, it would make the instruction cache 3 times less effective, not the data cache. In any case, the figure is wrong: in reality the IPF bloat is ~1.5-2.25x (on SPEC). BTW, the reason Itanium needs such big caches isn't code bloat so much as the fact that it is in-order and thus has to sit and wait every time it goes out to main memory.
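That ~1.5-2.25x range is easy to sanity-check from the encoding itself: IA-64 packs three instructions into each fixed 128-bit bundle, so every instruction costs the same number of bits no matter how simple it is. (The bundle format is real IA-64; the x86 average below is just the ~2.5-3 byte figure from this thread.)

```python
# IA-64 bundles are 128 bits and hold 3 instructions (plus a template
# field inside those 128 bits), so per-instruction cost is fixed.
bundle_bits = 128
insns_per_bundle = 3
ipf_bytes_per_insn = bundle_bits / 8 / insns_per_bundle  # ~5.33 bytes

x86_avg_bytes = 2.75  # midpoint of the ~2.5-3 byte x86 average
bloat = ipf_bytes_per_insn / x86_avg_bytes

# Lands around 1.9x -- inside the ~1.5-2.25x range, not 3x.
print(f"{bloat:.2f}x")
```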

In any case, code size isn't really a strong advantage of CISC over RISC. General-purpose RISCs all use 4-byte instructions, whereas the average x86 instruction works out to ~2.5-3 bytes. But there are plenty of RISC variants for embedded processors that use 2-byte instructions, e.g. MIPS16 and the ARM "Thumb" ISA.
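Putting those averages side by side makes the point: the density gap between x86 and a classic 4-byte RISC is much smaller than 3x, and compressed RISC encodings beat x86. (The per-instruction widths below are the averages quoted in this thread, not measured data.)

```python
# Rough code size for the same 10,000-instruction program under the
# average encoding widths discussed above.
INSNS = 10_000

encodings = {
    "x86 (variable, ~3 B avg)": 3.0,
    "classic RISC (fixed 4 B)": 4.0,
    "MIPS16 / Thumb (~2 B)": 2.0,
}

for name, avg_bytes in encodings.items():
    size_kb = INSNS * avg_bytes / 1024
    print(f"{name}: {size_kb:.1f} KB")
```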

It's really kind of funny that here, 17 years or so after microprocessor designers stopped arguing over CISC vs. RISC, so many people still make such a big deal of it. Each approach was well suited to the circumstances of its time (i.e. fabrication technology/levels of integration, main memory cost and speed, compiler technology, etc.)--which means the late 60's and 70's for "CISC" and the mid 80's to mid 90's for RISC. There was a crossover period, starting in the early 80s when three teams independently conceived of what became known as RISC (Stanford, Berkeley and IBM), and a three- or four-year period in the mid 80s where there was some valid debate over whether these new research architectures were really better than CISC. And they were. And everyone soon saw and agreed that they were. And that should have been that.
 
I think it's finally being realised --I wonder WTF took so long-- that RISC and CISC have their place inside the same processor. RIS is nice for doing fast math, but you need some CIS to speed up some OS stuff. All in all, I think people are finally seeing the light of hybrid architectures.
 
Doesn't the 970 have some of these types of instructions?

They were mentioned in a RWT thread by PD, I believe. I can't find it right now.

Not CIS in the classical sense.
 