[Analysis] TSMC 40G to deliver up to 3.76x the perf/mm² of 65G & Power Implications

Discussion in 'Graphics and Semiconductor Industry' started by B3D News, May 1, 2008.

  1. B3D News

    B3D News Beyond3D News
    Regular

    Joined:
    May 18, 2007
    Messages:
    440
    Likes Received:
    1
    [Analysis] TSMC 40G to deliver up to 3.76x the perf/mm² of 65G & Power Implications

    It turns out TSMC's 40nm general-purpose process will be even more impressive than previously expected: we knew it was going to sport 2.35x the gate density of 65nm, but now it turns out it'll deliver 60% higher performance too, for a theoretical 3.76x perf/mm² boost. The big question now? Power.
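
    (For reference, the headline figure is simply the product of those two claims: 2.35 (gate density) × 1.60 (performance) = 3.76 (theoretical perf/mm²).)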

    Read the full news item
     
  2. MTd2

    Newcomer

    Joined:
    May 13, 2004
    Messages:
    212
    Likes Received:
    0
    I emailed the administrator, but it didn't work.

    The link from the front page to this page is broken...

    Friendly Neighborhood Mod Edit: Thanks, it's now fixed. FYI, Site Feedback might be a better second option.
     
  3. Mart

    Newcomer

    Joined:
    Sep 20, 2007
    Messages:
    27
    Likes Received:
    0
    Location:
    Netherlands
    This news got me a bit confused. Would anybody be so kind as to explain the difference between power saving and performance increase? From what I've always learned, chips do wonderful things, use power to do them, and generate heat in the process. A chip can be clocked at a certain speed, but if you clock it higher, it will use more power and get hotter. So there's a ceiling on clock speed, set by the power you can supply and the heat you can dissipate. AMD's Phenom cannot be clocked much higher because it's already using more than 125 watts. Correct so far, or am I getting something wrong?

    Now, if you save power, you would also generate less heat. Doesn't that all mean you can just clock your chips higher? However, the article makes a distinction between power saving and performance increase. How does that work?
     
  4. Arun

    Arun Unknown.
    Moderator Legend Veteran

    Joined:
    Aug 28, 2002
    Messages:
    5,023
    Likes Received:
    299
    Location:
    UK
    The easiest way to explain that is to point out that 20 years ago, chips drew very little power, yet they couldn't have clocked anywhere near as high as chips do today. And single-cores don't clock 4x as high as quad-cores.

    Power and heat can limit clock speeds; however, they're not the main factor, especially outside the ultra-high-end. I won't go into this too much, but clock speed is very much a function of both the process and the design (more specifically, of the weakest link in a given clock domain of the design).

    The reason modern chips draw more power than old ones is that dynamic power per transistor hasn't scaled down as fast as transistor density has scaled up, while leakage has also gone up. As I said, there's no magical solution to that problem on the horizon. So for the reasons I explained in the news item, it's going to become more and more important to optimize for performance/watt, even in domains that traditionally haven't done so much, if at all.
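
    To put rough numbers on that, here's a back-of-the-envelope sketch; the capacitance and voltage factors are my own illustrative assumptions, not TSMC's figures:

        # Illustrative CMOS scaling arithmetic; cap_scale and volt_scale are
        # assumptions, not TSMC data. Dynamic power per transistor: P ~ a*C*V^2*f.
        density_gain = 2.35    # transistors per mm^2, 65nm -> 40nm (from the news item)
        cap_scale    = 1/1.5   # assumed: switched capacitance per transistor shrinks ~1.5x
        volt_scale   = 0.9     # assumed: supply voltage drops only ~10%

        # Relative power per mm^2 at an unchanged clock speed:
        print(density_gain * cap_scale * volt_scale**2)   # ~1.27 -> power density rises

    Historically, voltage scaling did much of the work of keeping power density in check; with voltage now nearly stalled, density gains translate into more watts per mm², which is exactly the problem.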
     
  5. Time

    Newcomer

    Joined:
    Mar 15, 2008
    Messages:
    13
    Likes Received:
    0
    Location:
    London
    No, no, no, no, no. Check out the last slide on this page: http://www.bit-tech.net/news/2008/03/27/amd_announces_new_phenom_processors/1 . See the massive jump in TDP when going from 2.4GHz to 2.5GHz?

    That's because AMD have set the allowable voltage for the 9850 higher than for the rest of their quad-core line; that's also why AMD's tri-cores are stuck at the same TDP as their speed-equivalent quad-cores. As a result, AMD are able to grab any chips that didn't quite make the grade and sell them anyway.

    As for the reason Phenom is clocked so low: it was designed to be efficient, targeting the (fast-growing) high-efficiency server market. Then again, that may not be the whole story; it also has a weird instability issue where it's unstable at idle but stable under load.

    Anyway, the only point I'm trying to make is that it isn't a thermal issue that stops Phenom (or, to be more accurate, K10, K10h, Barcelona or Agena) from being clocked higher.
     
    #5 Time, May 6, 2008
    Last edited by a moderator: May 7, 2008
  6. Mart

    Newcomer

    Joined:
    Sep 20, 2007
    Messages:
    27
    Likes Received:
    0
    Location:
    Netherlands
    Thanks Arun, that really cleared things up for me :)

    Point taken, thanks for clarifying!
     
  7. aca

    aca
    Newcomer

    Joined:
    May 4, 2007
    Messages:
    44
    Likes Received:
    0
    Location:
    Delft, The Netherlands
    The article provides a good discussion of the power matter. But I can't really place the sarcastic remark that I put in bold above; to me, it somewhat detracts from the level of the article. To preserve the same content, it would be better (imho) to write it like: "Despite good prospects concerning integration and bla bla bla, the power density was not improved. It is reckoned that the latter will have important effects bla bla bla". But anyway, the article provided good content. Interesting to read, and I hope to see more of these coming.
     
  8. Arun

    Arun Unknown.
    Moderator Legend Veteran

    Joined:
    Aug 28, 2002
    Messages:
    5,023
    Likes Received:
    299
    Location:
    UK
    Well, to my mind, that doesn't really say the same thing. I admit I could have been more detailed there, though - my point was that obviously TSMC would be able to add a lot of value for their customers if power/perf scaled down as fast as perf/(mm²*wafer cost). So if there were a way to achieve that, they would have done it - but they clearly didn't see one.

    So arguably lower power consumption would add more value for many of TSMC's customers than higher performance per transistor, yet they just can't achieve that. TSMC's statements about high-k indicate they don't think it'll be a major help either; so there are just incremental improvements on the horizon, and it'll just become more and more of a problem.

    Thanks! :) I've been thinking of doing more of that kind of analysis in the form of articles, rather than as part of news pieces. Stay tuned...
     
  9. Time

    Newcomer

    Joined:
    Mar 15, 2008
    Messages:
    13
    Likes Received:
    0
    Location:
    London
    Are you kidding? It's sarcastic remarks like that that keep me awake through press releases :razz:.

    Back to the article: so heat density is going up and the IHS can't handle it? Will ATI's ringbus design now be more useful compared to NVIDIA's crossbar?
     
  10. aca

    aca
    Newcomer

    Joined:
    May 4, 2007
    Messages:
    44
    Likes Received:
    0
    Location:
    Delft, The Netherlands
    Probably they know how they could add more value. Still, there is a difference between knowing the path and walking it. It seems like TSMC is focusing more on the 32nm node. Maybe they had issues with metal gates/high-k at 45nm, and decided to move their efforts to the next node. And given the troublesome introduction of high-k dielectrics, that's no shame either.
    So I think they will make a step forward again with their 32nm products. These look quite promising: triple gates, very-low-k isolation and high-k dielectrics, complemented by copper interconnects and metal gates. Basically, all of these should reduce power through smaller capacitive loads (less leakage, less current drive needed) and lower resistive losses. But it will definitely be interesting to see how the W/µm² evolves, and with what kind of designs customers will respond.
     
  11. Arun

    Arun Unknown.
    Moderator Legend Veteran

    Joined:
    Aug 28, 2002
    Messages:
    5,023
    Likes Received:
    299
    Location:
    UK
    32LP and 32G won't support high-k/metal gates, only 32HP will. And TSMC in public statements didn't seem overly excited about high-k; i.e. it's an advantage, but not an overwhelming one given the wafer cost difference.

    Anyhow, one thing I realize I might have wanted to make clearer in my news piece: while higher performance won't magically improve perf/watt, it *will* improve (perf/watt)/$ if it's used towards that goal rather than towards raw perf/$. This is because you can then sacrifice die area and performance in favour of power throughout the design and synthesis processes, with the higher-performance process compensating for the loss. So higher performance is always good; but it means the 'free lunch' for engineers is over (arguably it has been for a while, but it's been really gradual imo), and they'll need to start thinking about more complex trade-offs, using that perf/mm² advantage as a way to improve (perf/watt)/$ instead.
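
    As a toy illustration of that trade-off (the numbers are invented, and the linear frequency-versus-voltage model is only a crude first-order approximation that breaks down over large swings):

        # Toy model: spend a raw-speed surplus on lower voltage instead of clocks.
        # Assumptions: fmax roughly linear in V near nominal; dynamic P ~ V^2 * f.
        speed_gain = 1.60              # iso-voltage speed-up (from the news item)
        volt_scale = 1.0 / speed_gain  # drop V until we're back at the old clock
        print(volt_scale**2)           # ~0.39 -> same clock, ~2.5x less dynamic power

    In practice you'd only bank a fraction of that, via slower and smaller cells as much as via voltage, but it shows which direction the surplus can be spent in.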
     
  12. aca

    aca
    Newcomer

    Joined:
    May 4, 2007
    Messages:
    44
    Likes Received:
    0
    Location:
    Delft, The Netherlands
    I agree. So what about developments in this power-aware synthesis? I haven't seen much of it. Aren't current approaches merely qualitative? If we're going this route, I'd expect more quantitative analyses in EDA, especially in the field of layout.
     
  13. silent_guy

    Veteran Subscriber

    Joined:
    Mar 7, 2006
    Messages:
    3,754
    Likes Received:
    1,379
    I'm not entirely sure what you're hinting at. At this point, it is possible to estimate final power consumption with, say, 10% accuracy, based on just your RTL code. In many cases, this can be done even without any simulation stimuli, if you're being smart about toggle factors. (The current tools, like PT-PI, can estimate toggle rates of combinational gates based on the logic cones that drive them.)
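
    In case it helps make that concrete, the arithmetic behind such estimates is just the classic switching-power formula applied per net; the nets and numbers below are hypothetical, not from any real tool or design:

        # Sketch of activity-based dynamic power estimation (hypothetical netlist).
        VDD, F_CLK = 0.9, 1.0e9   # assumed supply (V) and clock (Hz)

        nets = [                  # (toggle factor per cycle, load capacitance in F)
            (0.20, 5e-15),
            (0.05, 12e-15),
            (0.50, 2e-15),
        ]

        # Energy per transition is 0.5*C*V^2; transitions per second = toggle * f.
        p_dyn = sum(0.5 * tf * c * VDD**2 * F_CLK for tf, c in nets)
        print(p_dyn)              # ~1.05e-06 W for this three-net toy

    The hard part the tools solve is getting those toggle factors right without full simulation stimuli.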

    When you're talking about power EDA tools for layout, you're really already so far ahead in the process that there's no room for design iterations: your estimates will just become more accurate.

    What's really needed are guidelines about saving power at the architectural level. This is much, much harder, and doesn't go much farther than 'don't move data all around the chip' or 'try not to recalculate what you've already calculated'. And even then I wouldn't expect miracles: at the end of the day, transforming the same input into the same final output will still require a certain minimum of logical operations.
     
  14. soylent

    Newcomer

    Joined:
    May 4, 2005
    Messages:
    165
    Likes Received:
    8
    Location:
    Sweden
    It was my understanding that the (non-leakage) power of CMOS is effectively proportional to the cube of frequency (because P is proportional to V^2 * f, and the minimum voltage at which the IC can operate is roughly proportional to frequency), but only linear in surface area.

    Given that 3D graphics is inherently so embarrassingly well suited to parallelization, can't GPUs capture a big increase in performance from a smaller process, even if each transistor has the same power output as last generation, by just using more of them operating at a slightly lower frequency? (Assuming leakage is kept under control.)
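
    To illustrate (cube-law assumption only; this ignores leakage and the practical floor on supply voltage):

        # Cube-law sketch: dynamic power per unit ~ f^3 (assuming V tracks f),
        # total power linear in the number of parallel units. Leakage ignored.
        def relative_power(units, freq):
            return units * freq**3   # relative to one unit at f = 1.0

        # Same throughput (units * freq = 1.0), different shapes:
        print(relative_power(1, 1.00))   # 1.00 -> baseline
        print(relative_power(2, 0.50))   # 0.25 -> twice the area, a quarter the power
        print(relative_power(4, 0.25))   # ~0.06, if Vmin and leakage allowed it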
     
    #14 soylent, May 17, 2008
    Last edited by a moderator: May 17, 2008