Interesting results you got from the instancing tests on your Kepler (GTX 680) GPU:
https://nbertoa.wordpress.com/2016/02/02/instancing-vs-geometry-shader-vs-vertex-shader/
https://nbertoa.wordpress.com/2016/02/04/instancing-vs-geometry-shader-vs-vertex-shader-round-2/
Geometry shaders usually don't score wins, but in this special case your instance vertex count is tiny: only 8 vertices. The GTX 680 doesn't seem to pack vertices from multiple instances into a single warp. As a warp is 32 threads and each instance is only 8 vertices, you are likely seeing only 25% utilization of the vertex warps. There could be some other bottlenecks as well. The bottleneck disappears when you bump up the vertex count to 44 per instance, which is still a very low vertex count per object by today's standards. In practice most objects have more vertices than that.
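To make the 25% figure concrete, here is a rough back-of-the-envelope sketch. It assumes one vertex per thread and that every instance starts a fresh warp (no packing across instances); it ignores vertex reuse and any other bottleneck. The function name `lane_utilization` is made up for this illustration.

```python
import math

def lane_utilization(verts_per_instance, warp_size=32):
    """Fraction of vertex shader lanes doing useful work.

    Assumption: vertices of different instances never share a warp,
    so the last warp of every instance is padded with idle lanes.
    """
    warps = math.ceil(verts_per_instance / warp_size)
    return verts_per_instance / (warps * warp_size)

# Compare the two vertex counts mentioned above.
for count in (8, 44):
    print(count, "vertices:", lane_utilization(count))
```

Under this simple model 8 vertices fill a quarter of one warp, while 44 vertices fill 44 of the 64 lanes in two warps, so the waste per instance shrinks quickly as the vertex count grows.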
On an AMD card (GCN) you would see different results, since AMD packs multiple instances into each wave (64 threads). AMD GCN1-3 (Radeon 7000 series, 200 series, 300 series) also has poor strip rendering performance. A geometry shader outputs strips and strip cuts, which is bad for AMD's architecture. Polaris and Vega have improved strip rendering performance, but I would still expect instancing to beat geometry shaders, even at very low vertex counts.
There is a workaround for the instance packing inefficiency. You create a vertex buffer with N copies of the same object, for example N=4. Then you use SV_InstanceID and SV_VertexID to calculate the actual instance id, and do a custom fetch of the instance data from a buffer. Use a constant buffer for the instance data if your instance count is small, since Nvidia and Intel have special hardware (and special on-chip memory) for fetching and storing constants. Draw calls with a huge number of instances need to use Buffer<T>, StructuredBuffer<T> or ByteAddressBuffer for the instance data.
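The index math for this workaround could look roughly like the sketch below. It is plain Python standing in for the vertex shader logic, not actual HLSL. It assumes the vertex id runs from 0 to N * verts_per_object - 1 across the N packed copies, and all function and parameter names are hypothetical.

```python
def packed_draw_instance_count(instance_count, copies):
    """Hardware instances to draw when each one covers `copies` objects."""
    return -(-instance_count // copies)  # ceiling division

def decode_packed_vertex(instance_id, vertex_id, verts_per_object,
                         copies, instance_count):
    """Map (instance id, vertex id) to (actual instance, local vertex).

    Returns None for padding vertices in the last batch. A real shader
    would have to emit a degenerate or culled vertex in that case.
    """
    copy_index = vertex_id // verts_per_object    # which packed copy
    local_vertex = vertex_id % verts_per_object   # vertex within the object
    actual_instance = instance_id * copies + copy_index
    if actual_instance >= instance_count:
        return None
    return actual_instance, local_vertex

# Example: 10 cubes of 8 vertices each, packed 4 per hardware instance.
draws = packed_draw_instance_count(10, 4)
print(draws, decode_packed_vertex(1, 10, 8, 4, 10))
```

With N=4 and 8 vertices per object, each hardware instance supplies 32 vertices, which is exactly one full warp. The actual instance index is then used for the custom fetch of instance data from the constant buffer or buffer resource.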
There are also various tricks you can use to avoid instancing completely and reduce the vertex data size. These are very helpful when rendering lots of instances with tiny vertex counts:
Thread:
https://forum.beyond3d.com/threads/programmable-vertex-fetching-and-index-buffering.57591/
Post about emulating multidraw with index packing:
https://forum.beyond3d.com/posts/1900656/