[Analysis] TSMC 40G to deliver up to 3.76x the perf/mm² of 65G & Power Implications

Discussion in 'Graphics and Semiconductor Industry' started by B3D News, May 1, 2008.

  1. B3D News

    B3D News Beyond3D News
    Regular

    Joined:
    May 18, 2007
    Messages:
    440
    Likes Received:
    1
    [Analysis] TSMC 40G to deliver up to 3.76x the perf/mm² of 65G & Power Implications

    It turns out TSMC's 40nm general-purpose process will be even more impressive than previously expected: we knew it was going to sport 2.35x the gate density of 65nm, but now it turns out it'll deliver 60% higher performance too, for a theoretical 3.76x perf/mm² boost. The big question now? Power.
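
    (For reference, the headline figure is simply the product of those two claims: 2.35 (gate density) × 1.60 (performance) = 3.76 (theoretical perf/mm²).)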

    Read the full news item
     
  2. MTd2

    Newcomer

    Joined:
    May 13, 2004
    Messages:
    212
    Likes Received:
    0
    I emailed the administrator, but it didn't work.

    The link from the front page to this page is broken...

    Friendly Neighborhood Mod Edit: Thanks, it's now fixed. FYI, Site Feedback might be a better second option.
     
  3. Mart

    Newcomer

    Joined:
    Sep 20, 2007
    Messages:
    27
    Likes Received:
    0
    Location:
    Netherlands
    This news got me a bit confused. Would anybody be so kind as to explain the difference between power saving and performance increase? From what I've always learned, chips do wonderful things, use power to do them, and generate heat in the process. A chip can be clocked at a certain speed, but if you clock it higher, it will use more power and get hotter. So there's a ceiling on clock speed, set by the power you can supply and the heat you can dissipate. AMD's Phenom cannot be clocked much higher because it's already using more than 125 watts. Correct so far, or am I getting something wrong?

    Now, if you save power, you would also generate less heat. Doesn't that all mean you can just clock your chips higher? However, the article makes a distinction between power saving and performance increase. How does that work?
     
  4. Arun

    Arun Unknown.
    Moderator Legend Veteran

    Joined:
    Aug 28, 2002
    Messages:
    5,023
    Likes Received:
    299
    Location:
    UK
    The easiest way to explain that is to point out that 20 years ago, chips drew very little power, yet they couldn't have clocked anywhere near as high as chips do today. And single-cores don't clock 4x as high as quad-cores.

    Power and heat can limit clock speeds; however, they're not the main factor, especially outside the ultra-high-end. I won't go into this too much, but clock speed is very much a function of both the process and the design (more specifically, of the weakest link in a given clock domain of the design).

    The reason modern chips draw more power than old ones is that dynamic power per transistor hasn't scaled down as fast as transistor density has scaled up, while leakage has also gone up. As I said, there's no magical solution to that problem on the horizon. So for the reasons I explained in the news item, it's going to become more and more important to optimize for performance/watt, even in domains that traditionally haven't done so much, if at all.
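
    To put rough numbers on that, here's a back-of-the-envelope sketch; the capacitance and voltage factors are my own illustrative assumptions, not TSMC's figures:

        # Illustrative CMOS scaling arithmetic; cap_scale and volt_scale are
        # assumptions, not TSMC data. Dynamic power per transistor: P ~ a*C*V^2*f.
        density_gain = 2.35    # transistors per mm^2, 65nm -> 40nm (from the news item)
        cap_scale    = 1/1.5   # assumed: switched capacitance per transistor shrinks ~1.5x
        volt_scale   = 0.9     # assumed: supply voltage drops only ~10%

        # Relative power per mm^2 at an unchanged clock speed:
        print(density_gain * cap_scale * volt_scale**2)   # ~1.27 -> power density rises

    Historically, voltage scaling did much of the work of keeping power density in check; with voltage now nearly stalled, density gains translate into more watts per mm², which is exactly the problem.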
     
  5. Time

    Newcomer

    Joined:
    Mar 15, 2008
    Messages:
    13
    Likes Received:
    0
    Location:
    London
    No, no, no, no, no. Check out the last slide on this page: http://www.bit-tech.net/news/2008/03/27/amd_announces_new_phenom_processors/1 . See the massive jump in TDP when going from 2.4GHz to 2.5GHz?

    That's because AMD have set the allowable voltage for the 9850 higher than for the rest of their quad-core line; that's also why AMD's tri-cores are stuck at the same TDP as their speed-equivalent quad-cores. As a result, AMD are able to grab any chips that didn't quite make the grade and sell them anyway.

    As for the reason Phenom is clocked so low: it was designed to be efficient, targeting the (fast-growing) high-efficiency server market. Then again, that may not be the whole story; it also has a weird instability issue where it's unstable at idle but stable under load.

    Anyway, the only point I'm trying to make is that it isn't a thermal issue that stops Phenom (or, to be more accurate, K10, K10h, Barcelona or Agena) from being clocked higher.
     
    #5 Time, May 6, 2008
    Last edited by a moderator: May 7, 2008
  6. Mart

    Newcomer

    Joined:
    Sep 20, 2007
    Messages:
    27
    Likes Received:
    0
    Location:
    Netherlands
    Thanks Arun, that really cleared things up for me :)

    Point taken, thanks for clarifying!
     
  7. aca

    aca
    Newcomer

    Joined:
    May 4, 2007
    Messages:
    44
    Likes Received:
    0
    Location:
    Delft, The Netherlands
    The article provides a good discussion of the power matter. But I can't really place the sarcastic remark that I put in bold above; to me, it somewhat detracts from the level of the article. To preserve the same content, it would be better (imho) to write it like: "Despite good prospects concerning integration and bla bla bla, the power density was not improved. It is reckoned that the latter will have important effects bla bla bla". But anyway, the article provided good content. Interesting to read, and I hope to see more of these coming.
     
  8. Arun

    Arun Unknown.
    Moderator Legend Veteran

    Joined:
    Aug 28, 2002
    Messages:
    5,023
    Likes Received:
    299
    Location:
    UK
    Well, to my mind, that doesn't really say the same thing. I admit I could have been more detailed there, though - my point was that obviously TSMC would be able to add a lot of value for their customers if power/perf scaled down as fast as perf/(mm²*wafer cost). So if there were a way to achieve that, they would have done it - but they clearly didn't see one.

    So arguably lower power consumption would add more value for many of TSMC's customers than higher performance per transistor, yet they just can't achieve that. TSMC's statements about high-k indicate they don't think it'll be a major help either; so there are just incremental improvements on the horizon, and it'll just become more and more of a problem.

    Thanks! :) I've been thinking of doing more of that kind of analysis in the form of articles, rather than as part of news pieces. Stay tuned...
     
  9. Time

    Newcomer

    Joined:
    Mar 15, 2008
    Messages:
    13
    Likes Received:
    0
    Location:
    London
    Are you kidding? It's sarcastic remarks like that that keep me awake through press releases :razz:.

    Back to the article: so heat density is going up and the IHS can't handle it? Will ATI's ringbus design now be more useful compared to NVIDIA's crossbar?
     
  10. aca

    aca
    Newcomer

    Joined:
    May 4, 2007
    Messages:
    44
    Likes Received:
    0
    Location:
    Delft, The Netherlands
    Probably they know how they could add more value. Still, there is a difference between knowing the path and walking it. It seems like TSMC is focusing more on the 32nm node. Maybe they had issues with metal gates/high-k at 45nm, and decided to move their efforts to the next node. And given the troublesome introduction of high-k dielectrics, that's no shame either.
    So I think they will make a step forward again with their 32nm products. These look quite promising: triple gates, very-low-k isolation and high-k dielectrics, complemented by copper interconnects and metal gates. Basically, all of these should reduce power through smaller capacitive loads (less leakage, less current drive needed) and lower resistive losses. But it will definitely be interesting to see how the W/µm² evolves, and with what kind of designs customers will respond.
     
  11. Arun

    Arun Unknown.
    Moderator Legend Veteran

    Joined:
    Aug 28, 2002
    Messages:
    5,023
    Likes Received:
    299
    Location:
    UK
    32LP and 32G won't support high-k/metal gates, only 32HP will. And TSMC in public statements didn't seem overly excited about high-k; i.e. it's an advantage, but not an overwhelming one given the wafer cost difference.

    Anyhow, one thing I realize I might have wanted to make clearer in my news piece: while higher performance won't magically improve perf/watt, it *will* improve (perf/watt)/$ if it's used towards that goal rather than towards raw perf/$. This is because you can then sacrifice die area and performance in favour of power throughout the design and synthesis processes, with the higher-performance process compensating for the loss. So higher performance is always good; but it means the 'free lunch' for engineers is over (arguably it has been for a while, but it's been really gradual imo), and they'll need to start thinking about more complex trade-offs, using that perf/mm² advantage as a way to improve (perf/watt)/$ instead.
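
    As a toy illustration of that trade-off (the numbers are invented, and the linear frequency-versus-voltage model is only a crude first-order approximation that breaks down over large swings):

        # Toy model: spend a raw-speed surplus on lower voltage instead of clocks.
        # Assumptions: fmax roughly linear in V near nominal; dynamic P ~ V^2 * f.
        speed_gain = 1.60              # iso-voltage speed-up (from the news item)
        volt_scale = 1.0 / speed_gain  # drop V until we're back at the old clock
        print(volt_scale**2)           # ~0.39 -> same clock, ~2.5x less dynamic power

    In practice you'd only bank a fraction of that, via slower and smaller cells as much as via voltage, but it shows which direction the surplus can be spent in.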
     
  12. aca

    aca
    Newcomer

    Joined:
    May 4, 2007
    Messages:
    44
    Likes Received:
    0
    Location:
    Delft, The Netherlands
    I agree. So what about developments in this power-aware synthesis? I haven't seen much of it. Aren't current approaches merely qualitative? If we're going this route, I'd expect more quantitative analyses in EDA, especially in the field of layout.
     
  13. silent_guy

    Veteran Subscriber

    Joined:
    Mar 7, 2006
    Messages:
    3,754
    Likes Received:
    1,379
    I'm not entirely sure what you're hinting at. At this point, it is possible to estimate final power consumption with, say, 10% accuracy, based on just your RTL code. In many cases, this can be done even without any simulation stimuli, if you're being smart about toggle factors. (The current tools, like PT-PI, can estimate toggle rates of combinational gates based on the logic cones that drive them.)
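
    In case it helps make that concrete, the arithmetic behind such estimates is just the classic switching-power formula applied per net; the nets and numbers below are hypothetical, not from any real tool or design:

        # Sketch of activity-based dynamic power estimation (hypothetical netlist).
        VDD, F_CLK = 0.9, 1.0e9   # assumed supply (V) and clock (Hz)

        nets = [                  # (toggle factor per cycle, load capacitance in F)
            (0.20, 5e-15),
            (0.05, 12e-15),
            (0.50, 2e-15),
        ]

        # Energy per transition is 0.5*C*V^2; transitions per second = toggle * f.
        p_dyn = sum(0.5 * tf * c * VDD**2 * F_CLK for tf, c in nets)
        print(p_dyn)              # ~1.05e-06 W for this three-net toy

    The hard part the tools solve is getting those toggle factors right without full simulation stimuli.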

    When you're talking about power EDA tools for layout, you're really already so far ahead in the process that there's no room for design iterations: your estimates will just become more accurate.

    What's really needed are guidelines about saving power at the architectural level. This is much, much harder, and doesn't go much farther than 'don't move data all around the chip' or 'try not to recalculate what you've already calculated'. And even then I wouldn't expect miracles: at the end of the day, transforming the same input into the same final output will still require a certain minimum of logical operations.
     
  14. soylent

    Newcomer

    Joined:
    May 4, 2005
    Messages:
    165
    Likes Received:
    8
    Location:
    Sweden
    It was my understanding that the (non-leakage) power of CMOS is effectively proportional to the cube of frequency (because P is proportional to V^2 * f, and the minimum voltage at which the IC can operate is roughly proportional to frequency), but only linear in surface area.

    Given that 3D graphics is inherently so embarrassingly well suited to parallelization, can't GPUs capture a big increase in performance from a smaller process, even if each transistor has the same power output as last generation, by just using more of them operating at a slightly lower frequency? (Assuming leakage is kept under control.)
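
    To illustrate (cube-law assumption only; this ignores leakage and the practical floor on supply voltage):

        # Cube-law sketch: dynamic power per unit ~ f^3 (assuming V tracks f),
        # total power linear in the number of parallel units. Leakage ignored.
        def relative_power(units, freq):
            return units * freq**3   # relative to one unit at f = 1.0

        # Same throughput (units * freq = 1.0), different shapes:
        print(relative_power(1, 1.00))   # 1.00 -> baseline
        print(relative_power(2, 0.50))   # 0.25 -> twice the area, a quarter the power
        print(relative_power(4, 0.25))   # ~0.06, if Vmin and leakage allowed it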
     
    #14 soylent, May 17, 2008
    Last edited by a moderator: May 17, 2008