Just out of curiosity, is that purely speculation, or is it based on some kind of inside source? Given how little-used command lists are (and how troublesome/unhelpful they are), I would be surprised by that. But stranger things have happened.
It is not based on a quotable inside source. But even without that, every indication points in that direction.
Apart from that: for a few years now, Nvidia drivers have had the reputation of being able to extract more performance in apparently CPU-limited scenarios, but only if the CPU in question had at least four threads available. This has been attributed from time to time to a "low-res strength of Fermi/Kepler/whatevs" or a "hi-res strength of Radeon XYZ", where the latter looked comparatively better the higher the resolution went, available vid-mem notwithstanding.
When I compared the R9 290X and the 780 Ti a couple of months ago for our CPU tests on older dual- and quad-cores from the Core 2 generation, it showed that they were basically on par again (and thus the tests quite OK for comparing CPU performance). This is a step up for Nvidia in the case of the dual-cores, where they didn't perform that well earlier. Notably, though, in BF4 the Nvidia card managed to pull ahead of the Radeon (both using DX11, in single-player, no AA, AF, AO, 720p!). Further tests with Mantle showed that with AMD's API, the Radeon could really pull ahead, as was to be expected. This further confirms that the performance difference is indeed due to how the cards' drivers utilize processor resources in this case, since we were far away from the graphics cards' graphical limits with these old CPUs.
So, given that there are other DX11 games in our testing parcours, here's a little on that: Anno 2070 (I think it is or was called Dawn of Discovery in the rest of the world?) is multithreaded, but a very strong primary thread dominates performance here; the same apparently goes for F1 2013, since we're seeing a great disparity according to IPC, not just core count, between CPU types. Crysis 3 seems to be an example where the dev really took care to distribute the load across many lighter threads on a lot of cores (and this is not only speculation), which is especially apparent in our testing scene. Here, AMD's CPUs are able to keep up very well with Intel's, so maybe (and this is a speculative maybe) we're seeing a custom case of DX11 command lists at work already. This test's scores increased quite a bit with the recently released REL337 drivers from Nvidia - maybe (again, speculation) they are able to better/fully utilize command lists as well.
And the one accusation that can unambiguously be checked, the disappearing source code, turns out to be flat out wrong. Is that a good summary of the situation?
Well, I don't know if it was ever freely available in the first place, but the GameWorks-specific modules are not freely downloadable from Nvidia's site (anymore?) - instead of a download link, there's a "contact us for licensing".
The other DX11 and OpenGL samples are (still) readily available, though.