Basically, switching between hardware threads costs effectively nothing, because the core keeps each hardware thread's register state resident at all times. Software threads, by comparison, usually have a relatively large context-switch overhead.
This matters because a large portion of a processor's computation resources goes unused during stalls (usually from memory access). The idea with hyperthreading and similar technologies is that you can utilise these otherwise wasted cycles on the second hardware thread.
As pointed out, in general you will not get 2x the performance, because you now have two threads contending for the same cache and memory.
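To make the argument concrete, here is a rough toy model in Python. It is only a sketch, not how a real core schedules work: the function name `simulate_core`, its parameters, and all the cycle counts are made up for illustration. It assumes a core that issues one unit of work per cycle from whichever hardware thread is ready, with zero switch cost, and it models cache/memory contention simply as longer stalls when two threads share the core.

```python
def simulate_core(n_threads, work_cycles, stall_cycles, total_cycles):
    """Toy model: return how many cycles the core spent doing useful work.

    Each hardware thread repeats: `work_cycles` of computation, then a
    memory stall of `stall_cycles`. Every cycle the core runs the first
    ready thread, with no cost for switching between threads.
    """
    remaining = [work_cycles] * n_threads  # work left before the next stall
    ready_at = [0] * n_threads             # cycle at which each thread is ready
    busy = 0
    for cycle in range(total_cycles):
        for t in range(n_threads):
            if ready_at[t] <= cycle:
                remaining[t] -= 1
                busy += 1
                if remaining[t] == 0:
                    # Thread hits a memory access and stalls.
                    remaining[t] = work_cycles
                    ready_at[t] = cycle + 1 + stall_cycles
                break  # only one thread can use the core per cycle
    return busy


if __name__ == "__main__":
    cycles = 24000
    # Illustrative numbers only: 2 cycles of work, then a 6-cycle stall.
    one = simulate_core(1, work_cycles=2, stall_cycles=6, total_cycles=cycles)
    # Second thread fills the stall cycles; no contention modelled.
    two_ideal = simulate_core(2, work_cycles=2, stall_cycles=6, total_cycles=cycles)
    # Assumption: sharing the cache makes each stall longer (10 instead of 6).
    two_shared = simulate_core(2, work_cycles=2, stall_cycles=10, total_cycles=cycles)

    print(f"1 thread:                utilisation {one / cycles:.2f}")
    print(f"2 threads, no contention: speedup {two_ideal / one:.2f}x")
    print(f"2 threads, contention:    speedup {two_shared / one:.2f}x")
```

With these made-up numbers the single thread keeps the core busy only a quarter of the time, the second thread doubles throughput in the idealised case, and the gain drops well below 2x once the stalls get longer to represent contention.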