Should I bother to continue using x87 registers?

K.I.L.E.R · May 14, 2007

I run a 64-bit OS and only develop 64-bit programs.
AMD recommends that I use only SSE registers, but I do not see the problem with using x87 registers to do calculations on the side.

What's up with the XMM registers that make them so damn special, that exclusive use of them is recommended?
PS: I'm not talking about GPRs in this discussion, just x87 (MMX, 3DNow!) vs XMM.

Even GCC on a 64-bit OS uses SSE registers by default for FP stuff.

soylent · May 14, 2007

K.I.L.E.R said:
What's up with the XMM registers that make them so damn special, that exclusive use of them is recommended?

There are many good reasons to use SSE(x).

XMM registers are randomly accessible, instead of having an impractical stack-based register architecture that shares its registers with MMX. (AFAIK, switching to and from MMX requires a 'reset', either using FNSAVE and FRSTOR or by issuing EMMS when finished with MMX to clear the state.)

There are 16 XMM registers in x86-64 mode; I can't find anything to suggest that the FPU stack has been expanded beyond 8 registers.

SSE can handle both vectors (packed operations) and single values (scalar operations), depending on what you need.
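As a rough illustration of the packed/scalar point, here is a minimal C sketch using the standard SSE intrinsics from `<xmmintrin.h>`. The function names `add4_packed` and `add1_scalar` are made up for this example; it assumes an x86 or x86-64 target with SSE enabled (the default on x86-64).

```c
#include <xmmintrin.h>  /* SSE intrinsics */

/* Packed: adds four floats in one instruction (ADDPS).
   Hypothetical helper; a, b and out must each point to 4 floats. */
void add4_packed(const float *a, const float *b, float *out)
{
    __m128 va = _mm_loadu_ps(a);   /* unaligned load of 4 floats */
    __m128 vb = _mm_loadu_ps(b);
    _mm_storeu_ps(out, _mm_add_ps(va, vb));
}

/* Scalar: adds only the lowest lane (ADDSS), like ordinary float math. */
float add1_scalar(float a, float b)
{
    return _mm_cvtss_f32(_mm_add_ss(_mm_set_ss(a), _mm_set_ss(b)));
}
```

The same XMM registers serve both cases; only the instruction (ADDPS vs ADDSS) differs.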

AMD has said that they do not execute X87 code as efficiently as SSE AFAIK.

SSE has fast low-precision approximations for reciprocals and reciprocal square roots (and hence square roots, as x * rsqrt(x)), which can be very helpful when loss of precision is a non-issue.
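A small sketch of those approximation instructions in C, assuming an x86/x86-64 target with SSE. The wrapper names are hypothetical; the intrinsics `_mm_rcp_ss` and `_mm_rsqrt_ss` map to RCPSS and RSQRTSS, which are documented as accurate to roughly 12 bits. The optional Newton-Raphson step is a common refinement, not something the thread itself prescribes.

```c
#include <xmmintrin.h>  /* SSE intrinsics */

/* Approximate 1/x using RCPSS (about 12 bits of precision). */
float rcp_approx(float x)
{
    return _mm_cvtss_f32(_mm_rcp_ss(_mm_set_ss(x)));
}

/* Approximate 1/sqrt(x) using RSQRTSS (about 12 bits of precision). */
float rsqrt_approx(float x)
{
    return _mm_cvtss_f32(_mm_rsqrt_ss(_mm_set_ss(x)));
}

/* One Newton-Raphson step: y' = y * (1.5 - 0.5 * x * y * y).
   Roughly doubles the number of correct bits of the estimate. */
float rsqrt_refined(float x)
{
    float y = rsqrt_approx(x);
    return y * (1.5f - 0.5f * x * y * y);
}
```

If the raw estimate is good enough (normalizing vectors for lighting, say), the refinement step can be dropped entirely.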

It's easier to debug and write code for a register structure where loading a float into a register means that float will stay in that register until you deliberately change it, instead of moving around every time you do an operation that affects the number of floats on the stack. Stack underflows and overflows can't happen in SSE.

SSE supports mixed integer and float operations as well as bitwise logical ops.
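Two tiny examples of what that looks like in practice, sketched in C with SSE intrinsics (x86/x86-64 with SSE assumed; the function names are invented for illustration). The first clears the sign bit with a bitwise AND-NOT to get an absolute value; the second converts a float to an integer directly from an XMM register.

```c
#include <xmmintrin.h>  /* SSE intrinsics */

/* Absolute value via a bitwise op: -0.0f has only the sign bit set,
   and _mm_andnot_ps(a, b) computes (~a) & b, so the sign bit is cleared. */
float abs_sse(float x)
{
    const __m128 sign_mask = _mm_set_ss(-0.0f);
    return _mm_cvtss_f32(_mm_andnot_ps(sign_mask, _mm_set_ss(x)));
}

/* Float-to-int conversion with truncation toward zero (CVTTSS2SI). */
int trunc_to_int(float x)
{
    return _mm_cvttss_si32(_mm_set_ss(x));
}
```

On x87 the equivalent of the second function involves changing the rounding mode in the control word or using FISTTP (SSE3), which is part of why compilers prefer the SSE route.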

Support for x87 code may disappear in future OSes running in x86-64 mode.

K.I.L.E.R · May 14, 2007

Thanks. The point "AMD has said that they do not execute X87 code as efficiently as SSE AFAIK." is very interesting; I thought this was the case but couldn't back it up and saw no proof of it in their docs.

Zengar · May 14, 2007

Really? There is an AMD document that describes different ASM optimizations. It has tables of instruction execution times; SSE is, in the worst case, at least 50% faster than the FPU.

K.I.L.E.R · May 15, 2007

Do you know the name or number of that document?
I've downloaded just about every document off their site.

Sorry, I'm being silly; you're talking about the table at the end of the PDF.

Blazkowicz · May 15, 2007

but, x87 is 80-bit precision

K.I.L.E.R · May 16, 2007

I thought the SSE latencies were bad; I thought wrong.

Simon F · May 16, 2007

Blazkowicz said:
but, x87 is 80-bit precision

..and what an evil thing that is too.
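One common reading of this complaint (an interpretation, not something spelled out in the post) is that the same arithmetic can give different answers depending on whether an intermediate stays in an 80-bit x87 register or gets rounded to a 64-bit double. The C sketch below shows the underlying difference. It assumes `long double` is the x87 80-bit extended format, as it is with GCC on x86-64 Linux; on compilers where `long double` is just a 64-bit double (MSVC, for instance) both functions behave the same. The function names are hypothetical.

```c
#include <float.h>

/* 2^53 + 1 is not representable in a 64-bit double (53-bit mantissa),
   so the sum rounds back to 2^53. Returns 1 if the +1 was lost. */
int double_loses_one(void)
{
    volatile double big = 9007199254740992.0;   /* 2^53 */
    volatile double sum = big + 1.0;
    return sum == big;
}

/* With a 64-bit mantissa (x87 extended precision), 2^53 + 1 is exact.
   Returns 1 if the +1 was lost, 0 if it survived. */
int extended_loses_one(void)
{
    volatile long double big = 9007199254740992.0L;  /* 2^53 */
    volatile long double sum = big + 1.0L;
    return sum == big;
}
```

With SSE, doubles are 64 bits both in registers and in memory, so results do not change when the compiler decides to spill a value.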
