Architecturally, no - compute is required for D3D. In terms of execution, though, NVidia clearly has serious problems.
NVidia's old architecture has a severe setup bottleneck in theory - though one that never showed up in games in practice. NVidia seemingly had no choice but to make the kind of significant architectural change we see in order to get decent tessellation performance. Do we believe the engineers who say that the distributed setup system was a ball breaker?...
There's very little about compute in Fermi that's beyond D3D spec: virtual function support is required, so we're looking at a few percent spent on double-precision and ECC as the CUDA 3.0 tax.
NVidia even attempted to pre-empt 40nm by moving "early", originally planning to release 40nm chips in autumn 2008, before AMD. But NVidia's execution is generally so bad (see the string of GPUs before this) that the plan came to naught. Charlie's argument that the architecture (essentially G80-derived) is unmanufacturable appears to hold some water, because G80 is the only large chip that appeared in a timely fashion.
GDDR5 might have been the straw that broke the camel's back: leaving its implementation until the 40nm chips seems like a mistake, but the execution quagmire drags the whole thing down anyway.
Jawed
That doesn't explain the larger chips (and their two highest-end cards) from AMD being very hard to get. HD5870 availability seems somewhat better now, but I can't find the HD5970 anywhere on the major internet shops.