K.I.L.E.R. brings up a good point. To flesh this out a bit:
...it depends on what you mean by "cost". Does "cost" mean:
1) # of transistors (after all, same bandwidth -> same PCB costs?)
2) actual marginal cost per good GPU die from 3rd-party fab
3) marginal cost per die plus total R&D/driver development costs
4) marginal cost per die plus R&D/driver costs amortized over total sales (quick numerical sketch below the list)
5) above plus IP licensing fees
6) wholesale price of GPU for board integrator
7) wholesale board price
8) retail board price
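To make the difference between a couple of those concrete (say #2 vs. #4), here's a back-of-envelope sketch in Python; every number in it is a made-up placeholder, not a real IHV figure:

    # Back-of-envelope: per-unit "cost" under two of the definitions above.
    # Every figure here is a hypothetical placeholder, not a real IHV number.

    wafer_cost = 5000.0        # $ per processed wafer (assumed)
    dies_per_wafer = 250       # candidate dies per wafer (assumed)
    yield_rate = 0.60          # fraction of dies that come out good (assumed)

    # Definition 2: marginal cost per good die from the fab
    cost_per_good_die = wafer_cost / (dies_per_wafer * yield_rate)

    rnd_and_drivers = 150e6    # total R&D + driver development spend (assumed)
    units_sold = 5e6           # lifetime unit sales (assumed)

    # Definition 4: marginal cost plus R&D/driver costs amortized over sales
    amortized_cost = cost_per_good_die + rnd_and_drivers / units_sold

    print(f"#2 marginal cost per good die: ${cost_per_good_die:,.2f}")
    print(f"#4 with amortized R&D/drivers: ${amortized_cost:,.2f}")

The only point is that #4 can easily be dominated by the amortization term when unit volumes are low, which is part of why it matters so much which IHV we're talking about.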
As we all know, "cost" (calculated in some of the ways above) will presumably vary a great deal depending on which IHV is developing these hypothetical TBDRs/IMRs. Are we comparing, say, the costs to Nvidia of doing NV40 as an evolutionary IMR vs. utilizing their Gigapixel IP and moving to a TBDR? Are we comparing, say, a new PVR TBDR to a new IMR from ATI?
Offhand, I'm inclined to choose "cost = # of transistors", because it's removed enough from real costs that I don't feel bad about then assuming hypothetical, equally talented engineering teams. From there I would assume the IMR has the full complement of current bandwidth-saving/overdraw-reducing technologies implemented, because honestly, none of us is too interested in a chip that doesn't. And therefore I would assume the TBDR "costs less", because I think the transistor count needed to process scene geometry and implement a tile-sized on-die depth-sorting cache is smaller than the transistor count for the IMR's efficiency features. Note that this assumption rests only on comparing how IMR transistor counts have grown as such features were added against a guess at how many transistors the TBDR functionality of e.g. Kyro required, which is to say it's based on not very much.
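For a sense of the scale I'm hand-waving at with the tile-sized on-die cache, here's a crude storage-only estimate; the tile dimensions, bit depths, and 6T-SRAM figure are all my assumptions, and it ignores the binning/sorting logic entirely:

    # Crude estimate of the storage for a tile-sized on-die depth/color buffer.
    # Tile size, bit depths, and 6T SRAM cells are assumptions on my part,
    # and the binning/sorting logic is ignored entirely.

    tile_w, tile_h = 32, 16    # Kyro-style tile dimensions (assumed)
    depth_bits = 32            # per-pixel depth/stencil bits (assumed)
    color_bits = 32            # per-pixel color bits (assumed)
    transistors_per_bit = 6    # classic 6T SRAM cell (assumed)

    pixels = tile_w * tile_h
    storage_bits = pixels * (depth_bits + color_bits)
    storage_transistors = storage_bits * transistors_per_bit

    print(f"{pixels} pixels/tile, {storage_bits} bits, "
          f"~{storage_transistors:,} transistors for storage alone")

Even if that's off by a good margin, it comes out small next to the multi-million transistor budgets of the chips we're discussing, which is what makes me lean toward the TBDR "costing less" under this definition.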
In the end, this conclusion is, IMO, so meaningless to the real-world desirability of TBDR vs. IMR that I won't bother voting until the question is clarified a bit.
PS - funny how item #8 in the list turns into the "cool shades dude" emoticon...