I notice quite a lot of "mm2" or transistor numbers being thrown around. Why is that?
"more is better"
Acer93 was using it as some kind of... performance measurement (?) though.
I like the number of transistors a lot more for that kind of comparison (though it still doesn't really relate to performance much, if at all).
edit: maybe someone has a list of "fps per mm2" or "transistors per fps" for a number of GPUs; then you would see how much sense that makes (or doesn't) when comparing architectures.
Great explanation!
I guess that comparing 2005 and 2012 GPU designs on the basis of mm^2 is pretty useless then.
Maybe you could quote me because I don't think you understood what I said or the meaning of my posts.
Transistors are the physical microscopic "parts" that compose the logic and memory in modern electronics. Transistors aren't a great gauge of performance because (a) people count transistors differently, (b) different architectures use transistors more or less efficiently, and (c) memory is denser than logic, hence one design may have a large amount of cache but less logic, etc. There have even been examples where moving down a process (hence closer transistor proximity), or taking time to refine a design, has resulted in an architecture reducing its transistor count. Importantly, due to these reasons, my OP was only looking at Moore's Law as a guideline (~2x density every 18-24 months) and the scaling of transistors as a general guide to what we could project n process nodes into the future, with the caveat of architectural issues (e.g. features often come at the expense of performance; yet more advanced features may make certain desirable techniques performant whereas the older architecture, scaled up, would not, etc).
mm^2 (area, e.g. 10mm x 10mm is a 100mm^2 chip) is not a direct comparison of performance either. A ~250mm^2 RSX on 90nm is going to have about 1/10th the transistors of a 250mm^2 GPU of a similar architecture on 28nm. Area also doesn't tell us about the architecture and what kind of frequencies that architecture allows.
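To put rough numbers on that, here is a minimal sketch of ideal density scaling between nodes plus the ~2x-per-node guideline from my OP; real processes only approximate the ~(old/new)^2 relationship:

```python
# Rough transistor-density arithmetic. Real processes only approximate
# the ideal ~ (old_node/new_node)^2 scaling used here.

def density_ratio(old_node_nm: float, new_node_nm: float) -> float:
    """Ideal density multiplier when moving from old_node_nm to new_node_nm."""
    return (old_node_nm / new_node_nm) ** 2

def moores_law_projection(base_transistors_m: float, nodes: int) -> float:
    """Project a transistor count n full nodes out at ~2x density per node."""
    return base_transistors_m * 2 ** nodes

print(f"90nm -> 28nm ideal density gain: ~{density_ratio(90, 28):.1f}x")
# ~10.3x: a 250mm^2 die on 28nm holds roughly 10x the transistors of a
# 250mm^2 die on 90nm, i.e. the 90nm chip has about 1/10th.

print(f"300M transistors, 3 nodes later: ~{moores_law_projection(300, 3):.0f}M")
# Just the ~2x-per-node guideline, nothing more.
```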
What mm^2 does allow us to do, as long as we take market conditions into consideration, is get a barometer of cost. This is not a 100% correlation due to said market considerations e.g. costs change over time (namely nodes tend to get more expensive), production early on a process is more expensive than when it is mature, wafers get bigger (which can reduce chip costs in the long run), etc.
So whether we are on a 90nm process or a 28nm process, a 225mm^2 (15mm x 15mm) chip on a 300mm (12 inch) wafer (which are round) nets about 245 die. Assuming that your wafer cost is exactly the same in 2005 on 90nm as it is in 2013 on 28nm (a bad assumption), you could get the same number of chips at the same cost. There are too many variables at play to get an exact cost change--e.g. we would have to know when the consoles would ship, for one. But assuming a late 2013 launch, 28nm will probably be more mature than 90nm was in 2005 at TSMC. Then again, iirc there was more competition in the fab space in 2005, and costs have been increasing over time. The transition to 300mm wafers is pretty standard by now (I don't remember if all 90nm production was on 300mm or if some was on 200mm).
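For illustration, a die-per-wafer sketch using a common approximation; the formula ignores scribe lanes and edge exclusion (which pull the real count down toward the ~245 quoted above), and the wafer price is a made-up placeholder:

```python
import math

def dies_per_wafer(die_area_mm2: float, wafer_diameter_mm: float = 300) -> float:
    """Common approximation: usable wafer area minus an edge-loss correction.
    Ignores scribe lanes and edge exclusion, so it overestimates slightly."""
    r = wafer_diameter_mm / 2
    return (math.pi * r**2) / die_area_mm2 \
        - (math.pi * wafer_diameter_mm) / math.sqrt(2 * die_area_mm2)

die_area = 15 * 15  # 225mm^2, as in the example above
gross = dies_per_wafer(die_area)
print(f"Gross die per 300mm wafer: ~{gross:.0f}")  # ~270 before scribe/edge losses

wafer_cost = 5000.0  # hypothetical wafer price, purely a placeholder
print(f"Cost per (gross) die: ~${wafer_cost / gross:.2f}")
```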
What may also be missed in there is that costs have not always gone up. Dies have gotten bigger over time, and IHVs have been able to fit more product onto a board and still make profits. Another dynamic is how GPUs and CPUs have swallowed up other onboard chips (e.g. Intel's memory controller was on the northbridge iirc, and IGPs that were on the motherboard chipset are now APUs on the CPU die, etc).
I fully admit there are many, many factors and variables. My recent posts looking at die size (area, mm^2) did touch on the very fact that die size alone doesn't tell the whole story--e.g. the new "barrier" may have shifted from the cost of the die to TDP. If that is the case, a larger chip aimed at a specific TDP (which would likely have lower voltage, lower frequency, and an architecture designed for a set TDP) may provide more performance than a smaller, higher-clocked chip which will hit the TDP wall sooner. The other aspect is the bus size. If you want a 256bit bus, the die needs to be a certain size. Just as importantly, you may need to go larger still to account for the limits at the next node: e.g. if your chip just fits a 256bit bus on 28nm, then a shrink to 20nm may not help you, as the bus interfaces don't shrink much at all. Hence so much talk of 128bit buses. Of course it may be cheaper to target 28nm for a much longer product cycle, as 20nm may be expensive and integrating FinFETs may not be an option (or even available).
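As a back-of-the-envelope illustration of how the bus can set a floor on die size, here is a sketch; the per-bit edge length and the two-edge layout are made-up round numbers for illustration, not real PHY figures:

```python
# Hypothetical: assume each bit of the memory interface needs ~0.1mm of die
# edge for its PHY/pads (a made-up round number), and that the interface can
# occupy at most two edges of a square die.
MM_PER_BIT = 0.1

def min_die_edge_mm(bus_width_bits: int) -> float:
    """Smallest square-die edge that fits the bus PHY along two edges."""
    return bus_width_bits * MM_PER_BIT / 2

for bits in (128, 256):
    edge = min_die_edge_mm(bits)
    print(f"{bits}bit bus -> min edge ~{edge:.1f}mm, min area ~{edge**2:.0f}mm^2")
# Key point: these edge requirements barely change with a node shrink, so a
# die sized around its bus on 28nm gains little area headroom at 20nm.
```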
Anyways, I think you misunderstood my posts.
One of the walk-away points in general is that I think we will see Sony and MS reduce the silicon budget (but rest assured they will claim the chips cost them even more...) and those budgets will be shifted to things like Kinect2 and media experiences (think of how Sony sold out for Blu-ray; this gen it will be selling out to the "Cloud"). I guarantee you that we will soon be inundated with a flood of "n more transistors!" and the like, and gaudy numbers like "6x faster!", when, I predict, we are going to see a large reduction in investment in silicon. Which is fine, just as long as it is not obscured by useless expressions like "But it is 6x faster with 5x the transistors!"
Likewise the performance levels being discussed these days (e.g. Cape Verde) are 2009 GPU performance and don't even crack 30fps in Crysis 2 at 1080p. Which, again, is fine as long as it isn't trumpeted as some leap in performance. Hence my responses that these chip specs are quite old by today's standards, don't show any benefit of the extra-long generation, don't offer technology that puts products far and away a generational leap over the current consoles, and would reflect a steep decline in silicon budgets. That last point is the one I have been mainly discussing. Steep cuts in chip size may be the best move for these companies, but my purpose is to look at the flip side of "Cape Verde is sooo much faster than Xenos!" and say, "Yes, but in terms of silicon footprint it is a massive reduction over RSX/Xenos, and for those angling at an n-fold performance increase, Cape Verde class hardware offers little in the way of the performance needed for a generational leap graphically." And the benchmarks show that. Alas, this is more a back and forth of what people want/desire, expect, what is good strategy, what is possible, etc. If you take anything away from the previous posts, it is that 1TFLOPS GPUs or Cape Verde class hardware would be much smaller than Xenos/RSX, and there is no reason more could not be obtained from these GPUs within similar TDP limits if that, instead of die size, is the limiter.
The other point I was discussing was the claim that larger chips must have a very high TDP, when this isn't necessarily true. It seems frequency accelerates toward the TDP wall faster than die size on some designs and processes, so a larger, lower-clocked GPU will likely provide more performance per Watt. Hence a larger die does not, out of hand, have to violate power constraints. In fact, a larger chip within TDP budgets would indicate the designers didn't run away from the performance issues, toss their hands up, and say, "Well, let's just invest the extra money in Cloud Services." Which, again, may be the best strategy, but it is not necessarily a technical barrier.
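To sketch why wide-and-slow tends to win on performance per Watt: dynamic power scales roughly with C*V^2*f, and chasing frequency usually costs extra voltage. A minimal comparison, with all figures made up for illustration:

```python
# Dynamic power scales roughly as P ~ C * V^2 * f; capacitance C tracks
# active silicon area. All figures below are made-up for illustration.

def dynamic_power(c_rel: float, v: float, f_ghz: float) -> float:
    """Relative dynamic power for capacitance c_rel, voltage v, frequency f."""
    return c_rel * v**2 * f_ghz

# Narrow, fast chip: 1.0x area, but needs higher voltage for the clocks.
narrow = {"units": 1.0, "v": 1.2, "f": 1.0}
# Wide, slow chip: 1.5x area at lower voltage and lower clocks.
wide = {"units": 1.5, "v": 1.0, "f": 0.8}

for name, c in (("narrow/fast", narrow), ("wide/slow", wide)):
    perf = c["units"] * c["f"]  # throughput ~ units * clock
    power = dynamic_power(c["units"], c["v"], c["f"])
    print(f"{name}: perf {perf:.2f}, power {power:.2f}, perf/W {perf/power:.2f}")
# The wide/slow chip delivers ~20% more throughput at ~17% less power here.
```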
EDIT:
A chip that is 50% smaller on the outside could have 5 times the performance (2012 vs 2005).
Ok, I know you don't understand my posts. Please go back over them again before telling me my comparison is incorrect and has faulty logic. I am obviously not denying that a GPU 1/2 the size on 28nm will be faster than Xenos. Between frequency increases, architecture, and sheer logic increases, it will be much faster.
Yet a 135mm^2 GPU is a huge reduction in physical real estate from, say, a 258mm^2 RSX, and it represents a massive shift in console design and, frankly, purpose. If you are fine with a 135mm^2 GPU, that is fine, but it is also neither here nor there in terms of my point.
What I would say is that, looking at gaming benchmarks, Cape Verde class GPUs struggle with games on high quality at 1080p, which doesn't paint a rosy picture for traditional and progressive visual enhancements. It also raises specific architectural questions, as I noted in those posts: why would someone argue for both DDR4 with a lot of bandwidth AND eDRAM, when Cape Verde obviously doesn't need much more bandwidth than what DDR4 would offer?
It actually seems that an APU with a Cape Verde class GPU and a large pool of DDR4 on a wide bus would be a very well balanced system design--one large pool of memory (e.g. 8GB), CPU and GPU on a single large die (which makes the wide DDR4 bus possible), and the close proximity of the GPU and CPU should cut down on some bandwidth needs.
If you wanted a cheap console with a CPU, GPU, and memory well equipped to work together, this seems perfect, to be quite honest. eDRAM doesn't seem necessary for this class of GPU and only complicates the design and makes it more expensive. Which was one of my points.
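For the bandwidth side of that argument, the arithmetic is simple; the DDR4 speed grade below is an assumption for illustration, while the GDDR5 line lands roughly where a retail Cape Verde card sits:

```python
# Quick memory-bandwidth arithmetic: GB/s = (bus bits / 8) * transfer rate (GT/s).
# The DDR4 speed grade is an assumption for illustration.

def bandwidth_gbs(bus_bits: int, transfers_gt_s: float) -> float:
    """Peak theoretical bandwidth in GB/s."""
    return (bus_bits / 8) * transfers_gt_s

configs = [
    ("128bit DDR4-2133", 128, 2.133),
    ("256bit DDR4-2133 (wide bus)", 256, 2.133),
    ("128bit GDDR5 @ 4.5GT/s (Cape Verde-like)", 128, 4.5),
]
for name, bits, rate in configs:
    print(f"{name}: ~{bandwidth_gbs(bits, rate):.0f} GB/s")
# A wide DDR4 bus gets within striking distance of the GDDR5 figure,
# which is the point about eDRAM being unnecessary for this GPU class.
```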