How many bits are in a float in x86-64?

lost

Newcomer
In AnandTech's preview of Windows 64-bit, the theoretical performance of the A64 goes up in terms of MIPS, and I was wondering what format a float follows on the x86-64 architecture. Is a float then 64 bits and a double 128 bits when running in 64-bit kernel mode, or still 32/64? If it is 64/128, would that account for the decrease in game scores, because although the theoretical numbers are up, the actual instruction throughput is reduced?

If the size of floats/doubles in x86-64 is still 32/64, does that mean the only real gain in x86-64 is enlarged memory addressing (40-bit? ~1 TB)?
 
lost said:
Is a float then 64 bits and a double 128 bits when running in 64 bit kernel mode or still the 32/64?
Float = 32 bits, double = 64 bits. These datatypes haven't changed under x86-64.

What has changed in x86-64 is mainly the width of the integer/pointer registers and the number of registers. Integer and pointer registers grow from 32 to 64 bits, which allows more than 4 GB of addressable memory and makes 64-bit integer arithmetic much faster; the downside is that wider pointers reduce data cache effectiveness somewhat, since fewer of them fit in a cache line. The number of integer and SSE registers doubles from 8 to 16, which should bring roughly a 10-15% performance increase to 64-bit applications after a recompile, because the compiler no longer has to insert as many instructions to shuffle data between the register file and the stack.

If the Windows 64-bit kernel brings a performance decrease, it is most likely due to the overhead of switching between 32- and 64-bit modes, unoptimized drivers, or data cache thrashing caused by the larger pointer size.
 
I remember reading that the x87 instructions are deprecated in x86-64 mode. AMD apparently recommends shifting over to SSE2 for floating-point calculations.

It does strike me as a step down of sorts for those who liked the extra accuracy afforded by the x87 unit's 80-bit stack registers, but it seems more people were wishing for a more RISC-like FPU: three register operands, fewer problems moving values to and from memory, and no stack to juggle. It might also explain why AMD's scalar SSE2 performance is so much higher than its vector capability, since that unit is taking on the role of the FPU (well, that part of the big FPU/3DNow!/MMX/SSE/SSE2 block).

Heck, does anyone remember how AMD was originally going to junk x87 entirely and implement technical floating point? Apparently SSE2 mostly eliminated the need for it.
 
There was a rumor that AMD intended to introduce a new set of three-operand FP instructions, probably to completely replace the old x87 FP instructions in 64-bit mode. However, a true three-operand FP instruction set is just too different from normal x86 instructions (although x87 instructions are also very different from the rest of x86).

SSE2 is not a bad choice: it's designed for IEEE 754 compliance, and it's simple. Not making scalar SSE2 run fast on the P4 was, IMHO, a bad design decision. Although you lose the 80-bit precision of x87 if you use SSE2, it's actually easier and more efficient to be 754-compliant with SSE2 than with x87. And many compilers just don't support 80-bit precision anyway.
 