Win32 DLLs are not position-independent. A DLL must be relocated at load time unless the fixed base address it was built with happens to be free in the loading process. Relocations to the same address can be shared, but if different processes have conflicting memory layouts, the loader has to generate multiple copies of the DLL in memory. When the Windows loader loads a DLL, it opens the file and tries to map it at its preferred base address. As the mapped pages are touched, the paging system checks whether they are already present in memory. If they are, it simply maps the same pages into the new process, since code built for the preferred base address needs no further fix-ups there. Otherwise, the pages are fetched from disk.
If the preferred address range for the DLL is not available, the loader maps the pages into a free location in the process address space. In this case, it marks the code pages, which were previously read+execute, as COW (copy-on-write), because the loader has to apply code fix-ups at relocation time. The modified pages can then no longer be backed by the DLL file and must be backed by the paging file instead.
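The fix-up step described above can be sketched as follows. This is a toy model, not the actual Windows loader: the function name `apply_relocations`, the six-byte "image", and both base addresses are assumptions made up for illustration. It only shows the arithmetic of rebasing, where the difference between the actual and the preferred base is added to every absolute 32-bit address listed in the relocation data.

```python
import struct

def apply_relocations(image, fixup_offsets, preferred_base, actual_base):
    """Return a copy of `image` with each 32-bit absolute address rebased.

    Hypothetical helper: a real loader walks the DLL's relocation table
    and patches the mapped pages in place.
    """
    delta = actual_base - preferred_base
    patched = bytearray(image)
    for offset in fixup_offsets:
        (address,) = struct.unpack_from("<I", patched, offset)
        struct.pack_into("<I", patched, offset, (address + delta) & 0xFFFFFFFF)
    return bytes(patched)

PREFERRED_BASE = 0x10000000  # assumed preferred base of the toy DLL

# Toy code page: `mov eax, [abs32]` (opcode 0xA1) reading a global that
# lives 0x2000 bytes into the image, followed by `ret` (0xC3).
image = b"\xA1" + struct.pack("<I", PREFERRED_BASE + 0x2000) + b"\xC3"
fixups = [1]  # byte offset of the embedded absolute address

# Loaded at the preferred base: delta is 0, the page is byte-identical
# to the file and can stay shared between processes.
at_preferred = apply_relocations(image, fixups, PREFERRED_BASE, PREFERRED_BASE)

# Loaded elsewhere: the page content changes, so this process needs its
# own private (copy-on-write) copy.
rebased = apply_relocations(image, fixups, PREFERRED_BASE, 0x20000000)
```

In this model the page loaded at the preferred base is unchanged, while the rebased page differs from the file on disk, which is why it can no longer be shared or backed by the DLL file.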
Linux solves this issue by using PIC (position-independent code). Shared objects on Linux usually contain PIC, which avoids the need to relocate the library's code at load time. All code pages can be shared among all processes using the same library and can be paged to/from the file system. On 32-bit x86, there is no simple way to address data relative to the current location, even though all jumps and calls are instruction-pointer relative. Hence, all references to static globals are directed through a table known as the Global Offset Table (GOT).
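The indirection through the GOT can be illustrated with a small model. This is a sketch under stated assumptions, not the real ELF layout: the symbol names, offsets, base addresses, and the helpers `build_got` and `addresses_touched` are all hypothetical. The point it illustrates is that the code refers to data only by GOT slot, so the code itself is identical in every process, and only the per-process table holds absolute addresses.

```python
# Hypothetical offsets of two globals inside the library's data segment.
SYMBOL_OFFSETS = {"counter": 0x10, "buffer": 0x40}

# Slot order in the toy GOT; slot 0 is `counter`, slot 1 is `buffer`.
GOT_SLOTS = ["counter", "buffer"]

def build_got(data_base):
    """Fill a per-process GOT with absolute addresses (the loader's job)."""
    return [data_base + SYMBOL_OFFSETS[name] for name in GOT_SLOTS]

# The "code" only names GOT slots, never absolute addresses, so the same
# immutable object can stand in for code pages shared by every process.
SHARED_CODE = (0, 1, 0)

def addresses_touched(code, got):
    """Resolve each slot reference through the given process's GOT."""
    return [got[slot] for slot in code]

# Two processes map the library's data at different (assumed) addresses.
got_a = build_got(0x40010000)
got_b = build_got(0x7F000000)

touched_a = addresses_touched(SHARED_CODE, got_a)
touched_b = addresses_touched(SHARED_CODE, got_b)
```

Both processes run the same `SHARED_CODE`, yet each reaches its own copy of the data, because the only thing that differs between them is the small writable table.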