In practice, what I've seen is that the OS load-balances across two or more processors. So even if you have a single (single-threaded) process that could conceivably swamp one processor, you would not see one CPU pegged by default. The OS would load twin CPUs at roughly 50% each, which adds up to one CPU's worth of work. You can, however, manually pick a process and assign it to a specific processor, and then you would see that one processor pegged at 100%. (My advice is to just let the OS schedule the way it likes, to take advantage of granularity, unless you have a very specific reason to bind to a single CPU.) Programs designed to use multiple processors can, naturally, load both processors above 50% each. Getting a full 100% on both can actually be hard to achieve, since many other things usually get in the way of continuous full utilization: waits for memory access, disk accesses, contention from other processes, etc.
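For anyone who wants to try the manual assignment, here is a minimal sketch in Python. It assumes Linux, since os.sched_setaffinity is not available on every platform, and the helper name pin_to_cpu is just something I made up for illustration. On Windows you would do the same thing through Task Manager or a different API.

```python
import os


def pin_to_cpu(cpu, pid=0):
    """Restrict process `pid` (0 means the calling process) to a single CPU.

    Returns the resulting affinity set, or None on platforms that do not
    provide os.sched_setaffinity (it is Linux-specific).
    pin_to_cpu is a hypothetical helper name, not a standard function.
    """
    if not hasattr(os, "sched_setaffinity"):
        return None

    # Only CPUs the process is currently allowed to use are valid targets.
    allowed = os.sched_getaffinity(pid)
    if cpu not in allowed:
        raise ValueError(f"CPU {cpu} is not in the allowed set {sorted(allowed)}")

    # From here on the scheduler may run the process only on `cpu`.
    os.sched_setaffinity(pid, {cpu})
    return os.sched_getaffinity(pid)
```

If you call pin_to_cpu on one of your allowed CPUs and then run a busy loop in the same process, a per-CPU monitor should show that one processor near 100% instead of the load being spread around, assuming nothing else is competing for it.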
Now, isn't there an issue where local cache capacity is somewhat compromised, since the data for a particular process will now reside in the local caches of both processors? So you have some cache space wasted on duplicate entries in various states of revision. The worst-case scenario is a process waiting to run on one processor while the piece of data it needs happens to sit in the other processor's local cache, so you have a "few" clock cycles wasted while that data makes its way over. Potentially, this is where hyperthreading comes in handy, giving the core something else to do while a pipeline is held up waiting for data that is not local (as far as my layman's understanding of this stuff goes). So I guess this cache wastage issue is a potential reason to try assigning a process to a single processor. Then again, cache management algorithms aren't exactly stupid either, and in most cases they will succeed at masking these problems (less wasted space). So you can't just assume you will get better cache performance for your process by binding it to a specific processor.
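To put rough numbers on the duplication idea, here is a toy model in Python. It is only a sketch under heavy assumptions: two CPUs with private LRU caches, one process reading its working set in a fixed cycle, and no modeling of coherence traffic, writes, shared caches, or other processes. The names LRUCache and simulate are made up for illustration and do not correspond to any real hardware interface.

```python
from collections import OrderedDict


class LRUCache:
    """Toy private cache that holds at most `capacity` lines (LRU eviction)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()

    def access(self, line):
        """Touch one line; return True on a hit, False on a miss."""
        if line in self.lines:
            self.lines.move_to_end(line)      # mark as most recently used
            return True
        if len(self.lines) >= self.capacity:
            self.lines.popitem(last=False)    # evict the least recently used line
        self.lines[line] = True
        return False


def simulate(working_set, accesses, capacity, quantum=None):
    """Run one process across two CPUs that each have a private cache.

    quantum=None keeps the process on CPU 0 (pinned); otherwise the process
    switches CPU every `quantum` accesses. Returns (misses, duplicated_lines),
    where duplicated_lines counts lines present in both caches at the end.
    """
    caches = [LRUCache(capacity), LRUCache(capacity)]
    misses = 0
    for i in range(accesses):
        cpu = 0 if quantum is None else (i // quantum) % 2
        if not caches[cpu].access(i % working_set):
            misses += 1
    duplicated = len(set(caches[0].lines) & set(caches[1].lines))
    return misses, duplicated
```

With a 64-line working set, 128-line caches, and 10000 accesses, simulate(64, 10000, 128) gives (64, 0) for the pinned case, while simulate(64, 10000, 128, quantum=100) gives (128, 64). So in this model the migrating process pays a second round of cold misses and ends up with its whole working set held twice, but once both caches are warm there are no further misses, which fits the point that binding is not an automatic win.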