Why do you want 8 pipelines (as in 8 pixels/clock) so bloody much?
Let's do a little math, shall we:
At 1600x1200 there are 1.92 million pixels on screen, and if we are to draw this at 60 frames per second we need 115.2 million pixels per second of fill rate. Now take the fill rate of the Radeon 9700 Pro for example: at 2600 million pixels per second, its fill rate would suffice to overdraw each frame 22.57 times (now which game has or will have that kind of overdraw?). That's 60 fps at 22.57x overdraw at a resolution of 1600x1200!!
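For anyone who wants to check the numbers, here is a minimal sketch of the overdraw math above. The function name and parameter names are just made up for illustration; the inputs are the figures quoted in this post.

```python
def overdraw_budget(width, height, fps, fill_rate_pixels_per_s):
    """How many times each frame could be overdrawn at the given fill rate."""
    pixels_per_frame = width * height          # 1600 * 1200 = 1,920,000
    required_fill = pixels_per_frame * fps     # pixels/s needed for 1x overdraw
    return fill_rate_pixels_per_s / required_fill


# Figures from the post: 1600x1200 at 60 fps, 2600 million pixels/s fill rate.
budget = overdraw_budget(1600, 1200, 60, 2_600_000_000)
print(f"required fill rate: {1600 * 1200 * 60 / 1e6:.1f} Mpixels/s")
print(f"overdraw budget: {budget:.2f}x")
```

This only counts raw fill rate; it ignores memory bandwidth and shader cost entirely.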
Of course you would run out of memory bandwidth even on the Radeon 9700 Pro if you tried to do this. Let's say we have to draw 2600 million pixels per second back to front (worst possible case). For every pixel we have to: read Z (4 bytes), write Z (4 bytes), write color (4 bytes), and read a texture (4 bytes, simple point filtering so as not to complicate things further). That is 16 bytes per pixel, so you'd need roughly 40 GB per second of memory bandwidth (2600 million x 16 bytes = 41.6 GB/s).
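The bandwidth estimate can be sketched the same way. This assumes the worst case described above (every pixel pays for all four memory accesses, no caching, no Z compression), and the dictionary keys are just illustrative labels.

```python
# Bytes moved per pixel in the worst case described in the post.
BYTES_PER_PIXEL = {
    "z_read": 4,
    "z_write": 4,
    "color_write": 4,
    "texture_read": 4,  # one point-filtered texel
}


def bandwidth_needed(pixels_per_s, bytes_per_pixel=BYTES_PER_PIXEL):
    """Memory traffic in bytes/s if every pixel touches all of the above."""
    return pixels_per_s * sum(bytes_per_pixel.values())


needed = bandwidth_needed(2_600_000_000)
print(f"{needed / 1e9:.1f} GB/s")  # using 1 GB = 10^9 bytes
```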
On the other hand: how many games today use simple single texturing or "one instruction pixel shaders" (multiplying a texture by a constant or something)? How many games will do this in the future? There are very few situations in modern games where pure fill rate is the problem, and in these cases GeForce FX will act like an 8 pipe solution (stencil shadows for example).
But all in all, Radeon 9700 Pro still rules due to its 8 pipe architecture, because these 8 pipes also come with 8 floating point arithmetic units and 8 texturing units that can operate in parallel. On the other hand, NV30 is a total mess when it comes to its arithmetic capabilities. From tests we have done here on this forum, it seems like NV30 is just an overclocked GeForce 3 or 4 when it comes to integer arithmetic speed, and even slower on floating point arithmetic. And that's why NV30 sucks.
If you look at it from a slightly different angle (a better angle): Radeon 9700 Pro can do 2600 million floating point operations per second, while NV30 can do 2000 million fixed point operations per second (GeForce 4 Ti 4600 can do 1200 million fixed point operations per second).
If NV30 were capable of 8 floating point operations per clock, it would be able to do 4000 million floating point operations per second and would leave Radeon 9700 Pro far behind. But then again NVIDIA probably wouldn't target 500MHz and would need a dust buster cooling solution...
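The operations-per-second figures above are just operations per clock times core clock. Here is a rough sketch; note that only the 500MHz NV30 clock and the 8 arithmetic units of the Radeon come from this post. The 325MHz and 300MHz clocks and the 4 ops/clock values are assumptions, picked because they reproduce the totals quoted above.

```python
def ops_per_second(ops_per_clock, clock_mhz):
    """Peak arithmetic rate: operations per clock times clock frequency."""
    return ops_per_clock * clock_mhz * 1_000_000


# (ops per clock, core clock in MHz). Values marked "assumed" are not stated
# in the post; they are back-derived from the totals it quotes.
chips = {
    "Radeon 9700 Pro (float)":       (8, 325),  # 325 MHz assumed
    "NV30 (fixed point)":            (4, 500),  # 4 ops/clock assumed
    "GeForce 4 Ti 4600 (fixed)":     (4, 300),  # both values assumed
    "hypothetical 8-op NV30 (float)": (8, 500),  # the "what if" case
}

for name, (ops, mhz) in chips.items():
    print(f"{name}: {ops_per_second(ops, mhz) // 1_000_000} million ops/s")
```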
In the end, with fill rates as high as they are today, you don't really need to push them even higher. You should just make sure that pixel shaders run fast enough. To do this you don't need to output 8 pixels per clock instead of 4, as 99% of stuff will take a couple of clocks per pixel anyway.
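To illustrate why the number of output pipes stops mattering once shaders get longer, here is a very simplified toy model. It is an assumption of mine, not how any real chip schedules work: pixel output per clock is limited either by the number of pipes or by how many arithmetic operations the chip can issue per clock, whichever is lower.

```python
def pixels_per_clock(pipes, alu_ops_per_clock, ops_per_pixel):
    """Toy model: throughput is capped by pipes or by arithmetic, whichever is lower."""
    return min(pipes, alu_ops_per_clock / ops_per_pixel)


# Two hypothetical chips, both able to issue 8 arithmetic ops per clock,
# one with 4 output pipes and one with 8.
for ops in (1, 2, 4, 8):
    four_pipe = pixels_per_clock(4, 8, ops)
    eight_pipe = pixels_per_clock(8, 8, ops)
    print(f"{ops} ops/pixel: 4-pipe = {four_pipe}, 8-pipe = {eight_pipe}")
```

In this model the 8 pipe chip only wins for one-instruction shaders; from 2 operations per pixel upward both chips are limited by arithmetic and come out the same.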
No problem if NV35 or even NV40 are still just 4 pipe solutions. But they need to push 8+ floating point arithmetic instructions per clock in parallel with 8+ texturing instructions, just like Radeon 9700 and Radeon 9800, because NV30 doesn't do that.