By now, everything you need to know about RSX is already in the public domain: the clock, the bus, the number of pixel and vertex pipelines, and so on. Accept that it is what it is.
The texture caches you talk about were indeed increased, but only to cope with the higher latency when pulling data from XDR memory. And just so you don't get any ideas, the texture buffers were increased from 48k to 96k. Nothing groundbreaking.
Also, the shaders got a couple of extra instructions: a fast vector normalize and some extra texture lookup logic. Again, nothing fancy.
So don't look for any miracles here. If you want to feel better, understand this: no one has ever used the 'lowly' 7800GT to its full potential yet. That's the beauty of fixed hardware: you know everything the hardware is capable of, and you don't have to worry about compatibility with 5-year-old crappy cards. So that's where the true power of RSX lies: in developers' talent to extract incredible effects from a very capable and robust piece of silicon.