I think an important point here is the distinction between 64 bit as the size of a memory address and 64 bit as the width of the internal workspace (the registers), along with the performance that goes with it. In the Olden Days, an 8 bit processor would perform calculations on 8 bits at a time, a 16 bit processor on 16 bits at a time. If you wanted big numbers on an 8 bit processor, you had to work around the register limits by doing the calculation in several steps. The same held for 16 bit and 32 bit registers: if you were dealing with 32 bits of data at a time, a 32 bit processor was faster than a 16 bit processor.
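To make "working around the register limits" concrete, here is a rough sketch of a 32 bit addition done the way an 8 bit processor has to do it: one byte at a time, passing a carry along. The function name `add_u32_on_8bit` is made up for illustration, and Python is only standing in for what would really be a handful of add-with-carry instructions.

```python
def add_u32_on_8bit(a: int, b: int) -> int:
    """Add two unsigned 32 bit values using only 8 bit additions and a carry flag."""
    result = 0
    carry = 0
    for i in range(4):  # four bytes, least significant first
        a_byte = (a >> (8 * i)) & 0xFF
        b_byte = (b >> (8 * i)) & 0xFF
        total = a_byte + b_byte + carry       # at most 0xFF + 0xFF + 1 = 0x1FF
        result |= (total & 0xFF) << (8 * i)   # keep the low 8 bits in place
        carry = total >> 8                    # 0 or 1, fed into the next byte
    return result  # the final carry is dropped, so the sum wraps modulo 2**32


print(hex(add_u32_on_8bit(0x0000FFFF, 0x00000001)))  # prints 0x10000
```

Four dependent steps instead of one is where the speed difference between narrow and wide registers came from.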
That's irrelevant now. We have vector processing units that are 128 bits wide or wider. The 'bit-ed-ness' of CPUs, always misleading (see the Motorola 68000, with its 32 bit registers behind a 16 bit data bus), is now just a matter of memory addressing, not internal performance. Internally they all have vector units of 128 bits or more for calculations.