I love monster ALU power as much as the next guy, but every single generation, people have been raving about more ALU power and prettier pixels, and every generation we still see games getting tex/fillrate/bandwidth limited.
It's probably fair to dissociate 'fillrate' (aka ROPs) and texturing, IMO. The former is *always* going to be a limitation some of the time (unless you're triangle setup limited, for example): during z-passes, shadow generation passes, etc. You just can't really get away from that, so it's a very different case.
And yes, with more ROP power, you can nearly always make some things look prettier fairly easily. You can have more AA, potentially larger shadowmaps (but that can get expensive for the rest of the GPU if you do advanced filtering) or at least you might have more time to work on the rest of the scene, thus potentially making it prettier. However, whether that makes as much of a difference (per mm2) as more TMUs and ALUs has to be judged on a case-by-case basis.
There is one limitation to this, of course, and that's bandwidth - as G80 highlights, however, those limitations are probably not quite as strict as most would have imagined as long as your compression algorithm is good enough. But you probably couldn't go much beyond what G80 does, ROP-wise, without embedded memory or something similar. And G80 arguably already has too much ROP power (exception: stencil isn't so hot) for its bandwidth in many (but not all!) cases.
I'm willing to agree with you that scaling up the TEX : ALU ratio is useful
With TEX : ALU, the problem is different, because you always have the two working at the same time. So all that matters is the ratio, and it will vary substantially during a single frame (IMO, that is even more important than varying from game to game). It varies from pass to pass (shadow filtering, lighting, post-process motion blur, post-process depth of field, tonemapping etc.) and within a single pass.
The typical case is that it varies from material to material, for example the rocks in 3DMark had a longer instruction count than anything else. This also applies to particles which tend to use fewer ALU ops (but they might be ROP/Bandwidth limited anyway). Another example is determining the average luminance of a scene for tonemapping - that's mostly TMU-limited, or perhaps ROP-limited if you do it naively.
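To illustrate why the luminance pass leans so heavily on texture fetches, here is a rough CPU-side sketch of one common way to do it: convert to log-luminance, then repeatedly average 2x2 blocks down to a single value. The geometric (log) mean, the Rec. 709 weights, the epsilon and the names `log_luminance` and `average_luminance` are all my assumptions, not something stated above, and real hardware could use bilinear filtering to cut the fetch count.

```python
import math

def log_luminance(rgb):
    """Log of Rec. 709 luminance; the epsilon avoids log(0) on black pixels."""
    r, g, b = rgb
    return math.log(1e-4 + 0.2126 * r + 0.7152 * g + 0.0722 * b)

def average_luminance(image):
    """Return (log-average luminance, texel fetches, reduction passes).

    `image` is a square list of rows of (r, g, b) tuples whose side is a
    power of two. Each loop iteration stands in for one render pass that
    writes a half-size target, reading 4 texels per output pixel.
    """
    level = [[log_luminance(px) for px in row] for row in image]
    fetches = len(level) * len(level)  # one fetch per source pixel
    passes = 0
    while len(level) > 1:
        half = len(level) // 2
        level = [
            [
                (level[2 * y][2 * x] + level[2 * y][2 * x + 1]
                 + level[2 * y + 1][2 * x] + level[2 * y + 1][2 * x + 1]) / 4.0
                for x in range(half)
            ]
            for y in range(half)
        ]
        fetches += 4 * half * half  # 4 reads per output pixel
        passes += 1
    return math.exp(level[0][0]), fetches, passes
```

Each reduction pass does three adds and one multiply for every four fetches, so in this sketch there is very little ALU work to hide the texturing behind.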
However, it's interesting that you mention lighting, because the lighting model for a specific material is arguably precisely the one thing that could benefit from a lot more ALU ops. The only good example of that I could find via Google is
this thread on gamedev - look at the amount of ALU operations *per light* in that Cook-Torrance shader, along with the complete lack of TEX operations used to simulate that effect. The screenshot in the last post is pretty nice too!
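For readers who haven't seen one, below is a minimal sketch of a Cook-Torrance style specular term. It is *not* the shader from that thread; it assumes a Beckmann distribution, Schlick's Fresnel approximation and the classic geometric attenuation term, and the names, `roughness` and `f0` defaults are purely illustrative. The point is only that everything is arithmetic on a few vectors, with no texture lookups, and that the whole chain is repeated for every light.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    length = math.sqrt(dot(v, v))
    return tuple(x / length for x in v)

def cook_torrance_specular(n, v, l, roughness=0.3, f0=0.04):
    """Specular BRDF value D*F*G / (4 (n.l) (n.v)); pure ALU work."""
    n, v, l = normalize(n), normalize(v), normalize(l)
    n_l = dot(n, l)
    n_v = dot(n, v)
    if n_l <= 0.0 or n_v <= 0.0:
        return 0.0  # light or viewer is below the surface
    h = normalize(tuple(a + b for a, b in zip(v, l)))  # half vector
    n_h = max(dot(n, h), 1e-6)
    v_h = max(dot(v, h), 1e-6)
    # Beckmann microfacet distribution
    m2 = roughness * roughness
    nh2 = n_h * n_h
    d = math.exp((nh2 - 1.0) / (m2 * nh2)) / (math.pi * m2 * nh2 * nh2)
    # Schlick approximation of the Fresnel term
    f = f0 + (1.0 - f0) * (1.0 - v_h) ** 5
    # Geometric attenuation (masking and shadowing)
    g = min(1.0, 2.0 * n_h * n_v / v_h, 2.0 * n_h * n_l / v_h)
    return d * f * g / (4.0 * n_l * n_v)

def specular_sum(n, v, lights, roughness=0.3, f0=0.04):
    """Sum over (direction, intensity) lights; the cost scales per light."""
    total = 0.0
    for l_dir, intensity in lights:
        n_l = max(dot(normalize(n), normalize(l_dir)), 0.0)
        total += intensity * n_l * cook_torrance_specular(n, v, l_dir, roughness, f0)
    return total
```

Even this stripped-down version needs several normalizes, an exp, a fifth power and a handful of divides per light, before any diffuse term is added.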
Another interesting case of ALU-limitations is when you have a character with very complex animation that is relatively far away, and thus the triangles are only a few pixels big. If that character is not triangle setup-limited, then that extra VS work will affect the overall TEX : ALU ratio on a unified architecture. It could be thought of as a corner case, but it definitely could happen AFAICT...
but I happen to think that disproportionate scaling will yield more benefit to GPGPU workloads, physics, audio processing, etc. and less 'prettier pixels' because it appears to me that the biggest problems yet to be solved on GPUs with respect to lighting also happen to be those which require a lot of scatter/gather/TEX functionality.
Well, sure - GPGPU is the field that benefits the most from truckloads of ALUs. Certain subsets of it also benefit a lot from bandwidth though, interestingly. And physics/audio on the GPU would have a very low TEX : ALU ratio too (although still quite far from zero, I suspect!)...
But I disagree completely that TEX is the primary thing that is holding graphics quality back. It remains *very* important, but ALUs are also extremely important (and in terms of the lighting equation, probably more so!) and I would be incredibly surprised if the ALU/TEX ratio didn't keep growing in the 'main pass'.
However, I also think that the other passes might have a much more variable ratio, so that increasing the amount of TEX you have will nearly always improve your performance too. The real question is how much it will increase it per-mm2, compared to increasing ALUs... And that depends on your existing ratio, obviously. If you take the G80, it seems kind of obvious to me that increasing the ratio is the best way to improve performance. On R600, that's probably not quite the case, however!
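As a back-of-the-envelope way to see the "depends on your existing ratio" point, here is a toy model in which ALU and TEX work overlap perfectly within a pass, so the slower of the two sets the pass time. Every number in it is made up for illustration, the pass names are hypothetical, and it ignores die area, ROPs, bandwidth and latency hiding entirely.

```python
def pass_time(alu_ops, tex_ops, alu_units, tex_units):
    """ALU and TEX run concurrently, so the slower side is the bottleneck."""
    return max(alu_ops / alu_units, tex_ops / tex_units)

def frame_time(passes, alu_units, tex_units):
    """Sum the bottleneck time over every (alu_ops, tex_ops) pass."""
    return sum(pass_time(a, t, alu_units, tex_units) for a, t in passes)

# Hypothetical workload: (ALU ops, TEX ops) per pass, arbitrary units.
passes = [
    (800, 100),  # ALU-heavy 'main' lighting pass
    (100, 200),  # shadow filtering, mostly texture fetches
    (50, 150),   # post-processing, mostly texture fetches
]

baseline = frame_time(passes, alu_units=8, tex_units=4)   # 187.5
more_alu = frame_time(passes, alu_units=16, tex_units=4)  # 137.5
more_tex = frame_time(passes, alu_units=8, tex_units=8)   # 143.75
```

With these invented numbers both changes help, because different passes are limited by different units; which one wins per mm2 would depend on how much area each unit costs, which this sketch does not model.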