The problem is that if your two viewpoints are not in the same position, then what they see of the scene is different. There will be areas of the background that are covered by foreground objects in a single view but should become visible once you use two cameras instead of one.
You can test it yourself: hold up your index finger about 5 inches from your face and look at it with only your left eye, then only your right eye. Now hold your hand about a foot in front of your face so that it covers up a good chunk of whatever is beyond it (a wall, the TV, or the computer screen), and repeat. In both cases you get two very different images; each eye gets to see stuff that the other eye cannot. Your brain does a lot of image processing to deal with this when both of your eyes are open, so you don't really notice it, but it's there, and stereo games and movies have to deal with it as well.
Now even if you have a full G-buffer, it doesn't store occluded pixels, but you'd need them to get this stereo effect. So what needs to be done is a full reconstruction of the background. There are techniques for this in movie VFX (think about how they painted out Andy Serkis in LOTR in order to replace him with the far thinner Gollum), which rely on 2D image editing, retouching, etc. The above-mentioned fake 3D upgrade would of course have to cover this as well. But games don't have a human brain and painting skills on hand to replicate this trick.
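To make the missing-pixel problem concrete, here is a minimal sketch of shifting one row of a color + depth buffer sideways to fake a second eye. It's a toy, not how any particular game or converter does it: the function name `reproject_scanline`, the `shift_at_unit_depth` parameter and the "B"/"F" labels are all made up for illustration, and I'm assuming a pixel simply slides horizontally by an amount inversely proportional to its depth.

```python
def reproject_scanline(colors, depths, shift_at_unit_depth):
    """Forward-warp one scanline to a sideways-shifted viewpoint.

    Assumption: a pixel moves horizontally by shift_at_unit_depth / depth
    pixels, so near pixels move a lot and far pixels barely move.
    Target pixels that nothing lands on stay None (disocclusion holes).
    """
    width = len(colors)
    warped = [None] * width
    warped_depth = [float("inf")] * width
    for x in range(width):
        disparity = round(shift_at_unit_depth / depths[x])
        nx = x + disparity
        # Keep the nearest surface if several source pixels land on one target.
        if 0 <= nx < width and depths[x] < warped_depth[nx]:
            warped[nx] = colors[x]
            warped_depth[nx] = depths[x]
    return warped


# Toy scanline: far background "B" with a near foreground object "F" in the middle.
colors = ["B"] * 4 + ["F"] * 2 + ["B"] * 4
depths = [10.0] * 4 + [1.0] * 2 + [10.0] * 4

warped = reproject_scanline(colors, depths, shift_at_unit_depth=2.0)
holes = [x for x, c in enumerate(warped) if c is None]

print(warped)  # the "F" pixels have moved two pixels over
print(holes)   # -> [4, 5]: background the original view never stored
```

The two `None` entries are exactly the background that was hidden behind the foreground object in the original view. Nothing in the buffer says what belongs there, so it has to be invented somehow.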
Also, the above-mentioned fake 3D can't deal with the first example either, the case where you have two different images of your finger: one with the front and right side, one with the front and left side, whereas a mono view would only have the front side. This detail is lost from a 2D movie as well, which is why I've mentioned the cardboard effect before. If you have static objects and backgrounds and a moving camera, then it is possible to use simple 3D geometry and re-project the movie's frames onto it to get rid of the effect, but it's quite unlikely that they'd do it for the actors, or for complex stuff like trees, etc.
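The finger case can be sketched the same way. This is a top-down 2D toy where the finger is a little box and a face counts as seen when it points toward the eye. The numbers are assumptions for illustration only (a box 0.6 inches wide at 5 inches, eyes 1.25 inches either side of center), and `visible_faces` is a hypothetical helper, not anything from a real engine.

```python
def visible_faces(eye_x, half_width=0.3, distance=5.0):
    """Return which faces of a small box (the 'finger') point toward an eye.

    Top-down 2D view: x runs to the viewer's right, z runs away from the face.
    The eye sits at (eye_x, 0); the box is centered at (0, distance).
    A face is considered seen when its outward normal points toward the eye.
    """
    eye = (eye_x, 0.0)
    # name -> (face center, outward normal), both as (x, z)
    faces = {
        "front": ((0.0, distance - half_width), (0.0, -1.0)),
        "left": ((-half_width, distance), (-1.0, 0.0)),
        "right": ((half_width, distance), (1.0, 0.0)),
    }
    seen = []
    for name, (center, normal) in faces.items():
        to_eye = (eye[0] - center[0], eye[1] - center[1])
        if normal[0] * to_eye[0] + normal[1] * to_eye[1] > 0:
            seen.append(name)
    return seen


left_eye = visible_faces(-1.25)
mono = visible_faces(0.0)
right_eye = visible_faces(1.25)

print(left_eye)   # -> ['front', 'left']
print(mono)       # -> ['front']
print(right_eye)  # -> ['front', 'right']
```

The mono camera only ever records the front face, so no amount of shifting that image around can produce the side faces each eye would really see. That missing side detail is what ends up looking like a cardboard cutout.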