If older chips had 30 clock domains, don't you think newer ones would as well?
Allow me to point to earlier comments in this thread.
Context is everything.
Lukfi was clearly talking about a separate shader clock, not clock domains in general...
I'd be very surprised if it clocked lower than the 3870. Hopefully they could just focus on improving the speed of the stream processors (and texture units if possible) and clock those higher.
Well, that's really the point of my post: if you want to increase the clock speed of a monolithic clock domain, you don't have the luxury of improving one block and not the other. It's all or nothing. Improving 'just' the shader and texture unit would imply that they are running on different clocks. Since this is not the case for RV670, it can only be done by changing the architecture.
The only thing I'm not sure I agree with (or understand) is your point about 45nm/40nm. I mean, that kinda goes against what I said in my latest 40G news piece: even if you optimized mostly for power efficiency, it'd still be very easy to get a 100MHz bump. Am I missing something here?
I haven't had the chance to play with 40/45nm libraries, but I'm not holding my breath: the trend is very clear in that the speed improvement in going to smaller processes is getting progressively smaller. The step from 90 to 65nm was really quite disappointing. Also note that fab houses (pretty much all of them) have a long history of being too optimistic about the performance of new processes. I've seen cases where initial SPICE decks were 20% faster than the final production ones (over the years they've been getting better at it, but it still pays to be very skeptical).
As for stepping up 100MHz: that depends on your initial speed, right? Going from 1.5GHz to 1.6GHz is going to be much easier than going from 200MHz to 300MHz, but you knew that.
In the context of a hypothetical RV770 in 40nm: beats me. These days, RAM speed is particularly dicey, but I guess going from 750MHz to 850MHz is not all that unreasonable...
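To put rough numbers on that, here is a quick back-of-the-envelope sketch. It only uses the three clock pairs mentioned above; the function name `relative_bump` is made up for illustration. It shows the relative frequency increase and how much delay would have to come out of the critical path (1 - f_old/f_new) to get there.

```python
def relative_bump(f_old_mhz, f_new_mhz):
    """Return (percent frequency increase, percent critical-path delay to remove)."""
    freq_increase_pct = (f_new_mhz - f_old_mhz) / f_old_mhz * 100.0
    # The cycle time scales as 1/f, so the longest path must shrink by 1 - f_old/f_new.
    delay_cut_pct = (1.0 - f_old_mhz / f_new_mhz) * 100.0
    return freq_increase_pct, delay_cut_pct


# The three 100MHz steps discussed in the posts above.
for old, new in [(1500, 1600), (200, 300), (750, 850)]:
    freq_pct, delay_pct = relative_bump(old, new)
    print(f"{old} -> {new} MHz: +{freq_pct:.1f}% clock, "
          f"critical path must get {delay_pct:.1f}% faster")
```

The same 100MHz costs about 6% of path delay at 1.5GHz, about a third at 200MHz, and roughly 12% for the 750MHz to 850MHz case.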
I don't know the trade-offs or complexities involved in using multiple kinds of transistors for the same chip, but perhaps others would know better.
There are processes that support 3 types of standard cells, with different transistor threshold voltages: LVT (Low), SVT (Standard) and HVT (High), in order of decreasing leakage and decreasing speed. (See this presentation.) They can be freely mixed, but LVT cells should be avoided like the plague: their leakage can be many orders of magnitude higher than HVT cells, for only a 2x or so speed increase!
Backend tools are supposedly robust enough to upgrade cells to HVT when there's enough timing slack.
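As a toy illustration of that slack-driven swapping (not how any particular backend tool actually does it), here is a greedy pass over a single timing path. The delay and leakage figures are invented for the example and don't come from any real library, and real tools work on the whole timing graph rather than one path.

```python
# Hypothetical per-cell numbers, for illustration only (not from a real library).
DELAY_PS = {"LVT": 50, "SVT": 70, "HVT": 100}
LEAKAGE_NW = {"LVT": 100.0, "SVT": 10.0, "HVT": 1.0}
SLOWER = {"LVT": "SVT", "SVT": "HVT"}  # next higher-Vt (slower, less leaky) variant


def total_delay(cells):
    return sum(DELAY_PS[c] for c in cells)


def total_leakage(cells):
    return sum(LEAKAGE_NW[c] for c in cells)


def recover_leakage(path, period_ps):
    """Greedily move cells on one path to higher Vt while the path still meets timing."""
    cells = list(path)
    changed = True
    while changed:
        changed = False
        for i, vt in enumerate(cells):
            slower = SLOWER.get(vt)
            if slower is None:
                continue  # already HVT, nothing left to gain
            slack = period_ps - total_delay(cells)
            # Only swap if the extra delay fits in the remaining slack.
            if DELAY_PS[slower] - DELAY_PS[vt] <= slack:
                cells[i] = slower
                changed = True
    return cells


path = ["SVT", "SVT", "SVT", "SVT"]
result = recover_leakage(path, period_ps=340)
print(result, total_delay(result), "ps,", total_leakage(result), "nW")
```

With these made-up numbers the path has 60ps of slack, so two of the four cells can go to HVT before the slack runs out; the rest have to stay SVT.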
There are also options to mix multiple processes on the same chip, but AFAIK this is not very common and tool flows are still immature.