What makes you believe that a sw emulation may be (significantly?) faster than the actual not-so-fast hw implementation?

I know that, but like you said, it's not very fast. That's the point of approximating with FP numbers: you can boost the speed rather significantly.