From what I've gathered, less from application research than from common sense, processor designs are now much more power-focused than area-focused. Take for example Nvidia dropping their hot clock: the hot clock was an area-saving feature, and dropping it is a power-saving one.

They should just be generic SIMD operations. Texturing is little more than a generic mipmap LOD calculation, a generic texel address calculation, a generic gather operation, and a generic filter operation. All of this can be done in shaders, and already has been. Likewise, programmable rasterization is currently a hot topic in graphics research.
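To make the four texturing steps concrete, here is a minimal software sketch of them in plain Python. Everything in it is an assumption for illustration: the function names (`mip_lod`, `texel_addresses`, `gather`, `bilinear`, `sample`) are made up, the texture is a square power-of-two list of lists with a single channel, addressing is clamp-to-edge, and the mip level is picked by rounding to the nearest level rather than blending two levels. It is not how any particular GPU or driver does it.

```python
import math

def make_mip_chain(base):
    """Build a mip chain by 2x2 box filtering (square, power-of-two input)."""
    chain = [base]
    while len(chain[-1]) > 1:
        src = chain[-1]
        n = len(src) // 2
        chain.append([[(src[2*y][2*x] + src[2*y][2*x+1] +
                        src[2*y+1][2*x] + src[2*y+1][2*x+1]) / 4.0
                       for x in range(n)] for y in range(n)])
    return chain

def mip_lod(dudx, dvdx, dudy, dvdy, size):
    """Step 1: LOD = log2 of the larger pixel footprint, measured in texels."""
    rho = max(math.hypot(dudx, dvdx), math.hypot(dudy, dvdy)) * size
    return max(0.0, math.log2(rho)) if rho > 0 else 0.0

def texel_addresses(u, v, size):
    """Step 2: integer coordinates of the 2x2 neighborhood plus blend weights."""
    x, y = u * size - 0.5, v * size - 0.5
    x0, y0 = math.floor(x), math.floor(y)
    clamp = lambda i: min(max(i, 0), size - 1)  # clamp-to-edge addressing
    return (clamp(x0), clamp(x0 + 1)), (clamp(y0), clamp(y0 + 1)), x - x0, y - y0

def gather(level, xs, ys):
    """Step 3: fetch the four texels (order: t00, t10, t01, t11)."""
    return [level[ys[0]][xs[0]], level[ys[0]][xs[1]],
            level[ys[1]][xs[0]], level[ys[1]][xs[1]]]

def bilinear(texels, fx, fy):
    """Step 4: bilinear filter of the gathered texels."""
    t00, t10, t01, t11 = texels
    top = t00 + (t10 - t00) * fx
    bottom = t01 + (t11 - t01) * fx
    return top + (bottom - top) * fy

def sample(chain, u, v, dudx, dvdx, dudy, dvdy):
    """Run all four steps; the mip level is chosen by rounding to nearest."""
    lod = mip_lod(dudx, dvdx, dudy, dvdy, len(chain[0]))
    level = chain[min(int(round(lod)), len(chain) - 1)]
    xs, ys, fx, fy = texel_addresses(u, v, len(level))
    return bilinear(gather(level, xs, ys), fx, fy)

chain = make_mip_chain([[0.0, 1.0], [2.0, 3.0]])
center = sample(chain, 0.5, 0.5, 0.5, 0.0, 0.0, 0.5)  # one texel per pixel
```

Each step is ordinary arithmetic plus an indexed load, which is the point being made: nothing here needs anything beyond generic ALU and gather support.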
With even smaller process geometries, you get an increasing number of transistors per area (in effect: per dollar), but less improvement in calculations per watt. So unless you're in a business that does not care about power (yet), you might already be designing to meet specific power targets rather than specific area targets. In other words, you deliberately spend more area because you know your processor cannot switch all its gates (in the cores) at once within your power target anyway.
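A back-of-the-envelope sketch of that argument, with loudly made-up numbers: assume each process generation doubles the transistor count (`density_gain=2.0`) but only cuts switching energy to 0.7x (`energy_gain=0.7`), at a fixed clock and a fixed power budget. Both factors and both function names are hypothetical placeholders, not measurements of any real process.

```python
def active_fraction(transistors, energy_per_switch, clock, power_budget):
    """Fraction of transistors that could toggle every cycle within the budget."""
    full_power = transistors * energy_per_switch * clock
    return min(1.0, power_budget / full_power)

def scale_generations(generations, density_gain=2.0, energy_gain=0.7):
    """Active fraction per generation under the hypothetical scaling factors.

    Units are normalized so that generation 0 can just switch everything.
    """
    transistors, energy, clock, budget = 1.0, 1.0, 1.0, 1.0
    fractions = []
    for _ in range(generations + 1):
        fractions.append(active_fraction(transistors, energy, clock, budget))
        transistors *= density_gain  # more transistors per area
        energy *= energy_gain        # but energy per switch shrinks more slowly
    return fractions

fractions = scale_generations(3)
```

Under these assumed factors the fraction that can switch at once drops every generation, so area that would sit idle anyway is cheap to spend on anything that lowers power.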