Rv770/Rv790/rv740 and beyond - linear frequency scaling FTW?

Discussion in 'Architecture and Products' started by turtle, Mar 30, 2009.

  1. turtle

    Regular

    Joined:
    Aug 20, 2005
    Messages:
    279
    Likes Received:
    8
    Rv770/Rv790/rv740 and beyond - linear frequency scaling FTW? (or why AMD wins and your electric bill loses)


    After looking at the recent leaks of 4890, it all seemed to fall in line with what rv770 could do if given the power and a capable process. I broke down the math, and it falls in line. If you think this is crazy, I encourage you to read the rationale for why this is important a few paragraphs down under the italic header.

    Percentage of mhz increase vs. freq/tdp if linear:

    100% = 690-700mhz = 110W (4850)
    120% = 828-840mhz = 160W (4870)
    140% = 966-980mhz = 210W
    160% = 1104-1120mhz = 260W

    mhz vs % increase/wattage if linear:

    850mhz = 122-124% = 160-162W
    1000mhz = 143-145% = 217-222W

    wattage versus % if linear:

    225W = 146% = 1007-1022mhz
    300W = 176% = 1214-1232mhz

    The first two rows are the 'knowns': the TDPs of the 4850 and 4870 and their average max clockspeed at their set voltage, and they do fall in line. The 4850 is used as the baseline. Linear scaling also shows how 1000mhz at under 225W (2x6-pin) is possible (as well as, 'coincidentally', 4890's max oc slider in CCC), and gives a possible tdp @ 850mhz. A rough rule of thumb: take the fractional frequency increase over 700mhz (freq/700 minus 1), count 50W for every .2 of it, and add that to 110W to get a TDP estimate.

    ex: 1000/700 = 1.428..., so the increase is .428... With .2 = 50W, .428/.2 = 2.14. 2.14 x 50 = 107. 107 + 110 = 217W.

    I'm way too tired to try to figure out the theoretical algebraic formula.
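
    For what it's worth, the rule of thumb above seems to reduce to TDP = 110 + 250 x (freq/700 - 1), since ".2 = 50W" is the same as 250W per 100% of clock increase. A minimal sketch, assuming the 700mhz / 110W baseline from the post; the function and constant names are made up for illustration, and this only restates the linear guess, it does not validate it:

```python
# Linear TDP rule of thumb from the post, written out as a formula.
# Assumptions taken from the post: 4850 baseline of 700 MHz at 110 W,
# and every 0.2 (20%) of extra clock costing 50 W.
BASE_MHZ = 700.0
BASE_TDP_W = 110.0
WATTS_PER_UNIT_INCREASE = 50.0 / 0.2  # 250 W per 100% clock increase


def linear_tdp(mhz):
    """Estimated TDP in watts at a given core clock, if scaling were linear."""
    fractional_increase = mhz / BASE_MHZ - 1.0
    return BASE_TDP_W + WATTS_PER_UNIT_INCREASE * fractional_increase


if __name__ == "__main__":
    for mhz in (700, 840, 1000):
        print(mhz, "MHz ->", round(linear_tdp(mhz)), "W")
```

    At 1000mhz this gives the same ~217W as the worked example above.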

    IMO, matching a stock GTX285 would require an rv790 @ ~1100mhz. Such is the world of coincidences: the original GTX280 has a 236W TDP, and a 1050mhz rv790 would theoretically have a tdp of 235W. On the flip side, at GTX285's TDP, you'd end up with an rv790 @ 910mhz, which would compete well with the original 280.

    In other words:

    GTX280 TDP ~= rv790 tdp @ GTX285 performance
    GTX285 TDP ~= rv790 tdp @ GTX280 performance
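
    The same linear rule can be run backwards to get the clock that would fit a given TDP. A rough sketch under the same assumptions (700mhz / 110W baseline, 250W per 100% of clock); the 236W figure is from the post, while the 186W figure for the GTX285 is an assumption here, taken as 50W below the GTX280 as quoted later in the thread:

```python
# Inverse of the linear rule of thumb: clock that would fit a given TDP.
# Baseline values come from the post; the helper name is hypothetical.
def mhz_at_tdp(tdp_w, base_mhz=700.0, base_tdp_w=110.0, watts_per_unit=250.0):
    """Core clock in MHz whose linearly estimated TDP equals tdp_w."""
    return base_mhz * (1.0 + (tdp_w - base_tdp_w) / watts_per_unit)


GTX280_TDP_W = 236.0  # quoted in the post
GTX285_TDP_W = 186.0  # assumption: GTX 280 TDP minus 50 W

if __name__ == "__main__":
    print("rv790 clock at GTX280 TDP:", round(mhz_at_tdp(GTX280_TDP_W)), "MHz")
    print("rv790 clock at GTX285 TDP:", round(mhz_at_tdp(GTX285_TDP_W)), "MHz")
```

    Both results land within a few mhz of the ~1050mhz and ~910mhz figures quoted above.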


    Is this coincidence a prelude to the future?

    Bye-bye performance-per-watt, hello performance-per-mm2 (or How Everything Old is New Again)

    This makes sense for several reasons. Obviously, small die sizes help AMD with profit. With such a frequency-scalable architecture, the trickle-down philosophy makes sense: parts with fewer shaders can be clocked higher to compensate against the higher-tier parts of the previous generation they are replacing, since the newer chip is built on a smaller process, or even if it's not... just give it more power. For instance, 800mhz and 950mhz rv740 parts replacing rv770, after the initial launch at lower frequencies to replace rv730. I imagine this leads to smaller architectural changes and greater frequency hikes in the future, even at the high end. Could rv870, for instance, be around 1ghz, and its 32nm counterpart 1200mhz? With as few as 1280 shaders this could be a formidable opponent to a 384sp nvidia counterpart on 40nm and 32nm, as smaller die sizes could allow for higher obtainable frequencies in a similar power envelope. If the die is kept small but somewhat similar in size, the option to bump frequencies is always there if competition demands, just like rv790. You have much less performance flexibility with a larger die.

    For instance, if nvidia's 40nm performance chip is close to 300mm2, and ATi's is closer to 200mm2, it's not absurd to think that ATi is shooting for a very high clockspeed (1ghz), and nvidia is shooting again for a greater number of units with a lower clockspeed (700c/1750s and/or 800c/2000s). In such a scenario, ATi would likely have a similarly performing product using 2/3 the wafer space. Think of it as a massively overclocked 3870 versus a 9800GTX. Rv870 will have a similar die size to rv670. GT212 will have a size similar to G92. 3870 had a TDP of 105W. 9800gtx had a tdp of 156W. What if 3870 had had a second 6-pin connector, a 225W tdp, and a voltage hike? How would they have competed in that case?

    I think we're about to find out.

    I find the die size game between nvidia and ATi amusing; medium, larger, huge, back to , and now smaller with a freq hike. It seems to me that ATi is one step ahead in this game, and it may culminate this next round. I imagine they have some crazy formula that the ghost of Noodle whispered to Dave one night sometime around the time of rv670. Something along the lines of "The chips must be proportional in size...205mm2 (rv870), 137mm2 (rv740), 68mm2 (rv810). The ratio of units must be in proportion to power envelopes for frequencies now and in the future in adjacent performance sectors, build the smallest 256-bit you can, the memory controller on trickle-down 128-bit and 64-bit chips will be compensated by trickling down of faster memory, and in doing this, will allow for higher frequencies to compensate on the high end."...etc.

    "Oh yeah, and don't forget to wipe."

    That's one smart cat, I tell you.

    At any rate, I find it interesting that everything old is new again. Back to smaller chips, only now with a new spin. Now it seems the frequency that can be cranked out from smaller dice actually can compete with the architecture additions that that clockspeed has to be balanced against. That's a big deal, and a large fundamental change. ATi sees this coming...Does Nvidia?

    Another question remains: Is the smallest possible 256-bit bus chip that fits the architecture and its fab processes the ideal candidate to test the theory, or is there a better balance of slightly larger dice still able to obtain high speeds in a similar power envelope?

    If all inventive competitions are based around building a better mousetrap, does the tdp/architecture balance now heavily tilt towards adding frequency to that equation with a weight much higher than before?

    I think so.
     
  2. CarstenS

    Veteran Subscriber

    Joined:
    May 31, 2002
    Messages:
    4,837
    Likes Received:
    2,132
    Location:
    Germany
    To put this into perspective, I'd love to get the IHVs' stance on how content they were/are with the results of recent processes.

    If both say (and honestly mean it!) that everything went smooth as silk, then you'd definitely have a point.
     
  3. Tchock

    Regular

    Joined:
    Mar 4, 2008
    Messages:
    849
    Likes Received:
    2
    Location:
    PVG
    Turtle, you didn't consider the RAM?

    It would appear to give the 4870 an even better case, but screws around with the percentages.
     
  4. Entropy

    Veteran

    Joined:
    Feb 8, 2002
    Messages:
    3,106
    Likes Received:
    1,071
    With only two datapoints, both subject to noise (and over a narrow interval), anything will look linear.
    To make a better investigation, simply take a 4870 and downclock and upclock it in fairly close steps. A plug level power meter is sufficient to read power at the different frequencies. Subtract the baseline for clarity of curve shape.
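
    A rough sketch of how such readings could be turned into a curve shape: fit the slope of log(power minus baseline) against log(clock), so a slope near 1 means linear and a slope near 3 means cubic. The helper name is hypothetical, "baseline" is assumed here to mean the wall draw of everything that does not scale with the GPU clock, and the numbers below are synthetic placeholders, not measurements:

```python
import math


def fit_power_exponent(clocks_mhz, wall_watts, baseline_watts):
    """Least-squares slope of log(power - baseline) vs log(clock).

    A result near 1 suggests linear scaling, near 3 cubic scaling.
    """
    xs = [math.log(f) for f in clocks_mhz]
    ys = [math.log(p - baseline_watts) for p in wall_watts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


if __name__ == "__main__":
    # Synthetic readings only: generated from a known linear law to show
    # that the fit recovers the exponent. Replace with real meter readings.
    clocks = [500, 600, 700, 800, 900]
    baseline = 120.0
    synthetic_watts = [baseline + 0.2 * f for f in clocks]
    print(round(fit_power_exponent(clocks, synthetic_watts, baseline), 2))
```

    With real readings taken at a fixed voltage and again with voltage adjusted per clock, the two fitted slopes would show how much of the curve comes from voltage.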

    This is a reasonable methodology BUT it misses the most important point - voltage.
    If you downclock, you can get away with lower voltages and enjoy drastically lower power consumption. This is a common path among graphics enthusiasts who also value power conservation and/or silence. If you increase clocks, on the other hand, you will need to increase voltage (and cooling) to maintain stability, drastically increasing power draw.

    Over the years and under competitive pressure, the PC semiconductor industry has pushed its parts ever higher on the power curve, and closer to the limits of stable operation. The data I've seen show the function of power vs. clock for most parts to be roughly O(n^3), including the voltage effects above.
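
    One way to see where a roughly cubic curve could come from: dynamic power in CMOS goes approximately as capacitance x voltage^2 x frequency, so if voltage has to rise in proportion to clock, power goes as clock^3. A minimal sketch, assuming that proportional voltage scaling (a simplification; real parts have a voltage floor and leakage) and using made-up names:

```python
def relative_power(clock_ratio, voltage_tracks_clock=True):
    """Dynamic power relative to stock, using P ~ V^2 * f.

    clock_ratio is new clock / stock clock. If voltage_tracks_clock is
    True, voltage is assumed to scale in proportion to clock (an
    idealisation); otherwise voltage is held at its stock value.
    """
    voltage_ratio = clock_ratio if voltage_tracks_clock else 1.0
    return voltage_ratio ** 2 * clock_ratio


if __name__ == "__main__":
    print("clock  fixed-V  scaled-V")
    for ratio in (0.8, 1.0, 1.2, 1.43):
        fixed = relative_power(ratio, voltage_tracks_clock=False)
        scaled = relative_power(ratio)
        print(f"{ratio:5.2f}  {fixed:7.2f}  {scaled:8.2f}")
```

    The fixed-voltage column is the linear case from the first post; the scaled-voltage column is the cubic case described here.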
     
  5. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    10,873
    Likes Received:
    767
    Location:
    London
    Also, with HD4890X2 unlikely to be "official" (HD4870X2 is pretty much at the limit) it seems AMD's garden is not particularly rosy.

    Jawed
     
  6. aaronspink

    Veteran

    Joined:
    Jun 20, 2003
    Messages:
    2,641
    Likes Received:
    64
    In general, all the vendors are at the limit. Everyone has pretty much reached peak power in each of the graphics market segments.

    You can look at it thus:

    Max perf without external power connectors
    Max perf with 1 power connector
    Max perf with 2 power connectors

    Both ATI and Nvidia have hit those points, and in reality have actually exceeded them, though not on purpose it seems.

    So going forward, unlike in the past, neither has the easy solution of just throwing more watts at the problem. In general this will probably result in a greater time between releases (which some will argue has already happened to some extent and will only get longer) and/or slower overall increases in performance. One can in some regard liken it to the slowdown in performance that affected CPUs once they hit the 100+ W point.
     
  7. Freak'n Big Panda

    Regular

    Joined:
    Sep 28, 2002
    Messages:
    898
    Likes Received:
    4
    Location:
    Waterloo Ontario
    It's going to be interesting to see how 40nm changes the landscape. Nvidia didn't gain much in the way of power savings with the transition from 65nm to 55nm. If the transition to 40nm doesn't save much power either, we're in for a pretty boring DX11 generation from a performance standpoint.
     
  8. AnarchX

    Veteran

    Joined:
    Apr 19, 2007
    Messages:
    1,559
    Likes Received:
    34
    Hmm...

    GTX 280 -> 285: 50W lower TDP with ~10% higher clocks.

    On G92, too, 55nm and time brought some nice improvements:
    http://www.xbitlabs.com/articles/video/display/palit-gf250gts_6.html#sect0
    ...GTS 250 with 738/1836MHz (128SPs) has similar consumption to the 65nm 8800 GT with 600/1500MHz (112SPs):
    http://www.xbitlabs.com/articles/video/display/gainward-bliss-8800gt_5.html#sect0
     
  9. Ailuros

    Ailuros Epsilon plus three
    Legend Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    9,428
    Likes Received:
    181
    Location:
    Chania
    G80 turned out over 2.5x faster than G71 without increasing power consumption by that margin, while being on the same 90nm manufacturing process. Whatever objections you might have to that example, consider that GT3x0 has been laid out for 40nm from the get-go and may include architectural changes that help control power consumption better.

    I don't think either of the two IHVs would rely exclusively on 40nm to save their day in that department. Especially since the performance increase targets with each new technology generation are bigger than with anything else.
     
  10. Blazkowicz

    Legend Veteran

    Joined:
    Dec 24, 2004
    Messages:
    5,607
    Likes Received:
    256
    The 55nm GTX 260 did use the same amount of power as the 65nm version, but that's because the G200-B2 revision was used (the 285 and 295 always use a B3 revision).
     
  11. AnarchX

    Veteran

    Joined:
    Apr 19, 2007
    Messages:
    1,559
    Likes Received:
    34
    Wrong:
    http://www.xbitlabs.com/articles/video/display/evga-geforce-gtx260-216-55nm_5.html#sect0
    ...Xbit-Labs also saw some nice improvements on the 55nm B2 EVGA GTX 260.

    I think the problem was rather that some sites compared 1.12V 55nm cards with the 1.06V 65nm GTX 260 (which entered the market in late 08), so they got equal numbers.

    B3's strengths are higher clocks (GTX 285, B3-GTX260-OC results) and the possibility to power the GTX 295 die @ only 1.05V.
     
  12. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    10,873
    Likes Received:
    767
    Location:
    London
    It'll be interesting to see if there's a decent bump in performance/W from Global Foundries - something that won't become relevant for a while though.

    Jawed
     
  13. CarstenS

    Veteran Subscriber

    Joined:
    May 31, 2002
    Messages:
    4,837
    Likes Received:
    2,132
    Location:
    Germany
    Seconded. There was large variation in the samples, so one might have been better off taking a look at the supplied voltages.
     
  14. Ailuros

    Ailuros Epsilon plus three
    Legend Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    9,428
    Likes Received:
    181
    Location:
    Chania