AMD: R9xx Speculation

Discussion in 'Architecture and Products' started by Lukfi, Oct 5, 2009.

  1. rjc

    rjc
    Regular

    Joined:
    Oct 27, 2008
    Messages:
    270
    Likes Received:
    0
    Just following up on the previous post... it's spelt "Hecatonchires", of which there are 3 members: Briareus, Cottus and Gyges.

    Interestingly Briareus ~ "sea goat" has another name "Aigaion", or "Aegean" in English.

    Looking at the NI codenames - Mykonos, Ibiza and Cozumel. Mykonos is an island in the Aegean sea, would be nice if the 2 chips were somehow related, make things a bit easier to remember.

    Sadly, I suspect AMD's intention is the opposite. :cry:

    Edit: Misremembered the strings that Gipsel found in the driver: it should be Kauai, not Mykonos, so there is no obvious relationship between the two series.
     
    #441 rjc, Mar 30, 2010
    Last edited by a moderator: Mar 31, 2010
  2. jaredpace

    Newcomer

    Joined:
    Sep 28, 2009
    Messages:
    157
    Likes Received:
    0
    Could someone please explain how ATI designs are denser in transistors but use less power than Nvidia designs? Is that considered an example of superior engineering? I would assume it is, but how do they do it?

    Also,
    N.I. = rv1070 (new design, new node, next year), and
    Hecatonchires = rv970 (refresh, same node, this year)
    ?
     
  3. aaronspink

    Veteran

    Joined:
    Jun 20, 2003
    Messages:
    2,641
    Likes Received:
    64
    It comes down essentially to overhead per ALU. Nvidia effectively has a control overhead per 16 ALUs (8 in G80), iirc. ATI has a control overhead per 80 ALUs. This has an impact not only in the pipeline but also in the register file and any bypass networks.
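
    A rough sketch of the arithmetic behind this point, assuming a hypothetical chip with 1600 ALUs (the figure quoted later in the thread for ATI's SP count) and the group sizes mentioned above. The function name is illustrative only, and real chips differ in total ALU count, so this only shows how the number of control blocks scales with group size.

```python
def control_blocks(total_alus: int, alus_per_control: int) -> int:
    """Control blocks needed when each block drives `alus_per_control` ALUs."""
    # Ceiling division: a partially filled group still needs its own control block.
    return -(-total_alus // alus_per_control)


# Hypothetical comparison at a fixed ALU count (an assumption, not a real chip spec).
TOTAL_ALUS = 1600

for group_size in (8, 16, 80):
    blocks = control_blocks(TOTAL_ALUS, group_size)
    print(f"{group_size:>2} ALUs per control block -> {blocks} control blocks")
```

    Fewer, wider groups mean less control logic to replicate, which is the density argument being made here.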
     
  4. John021

    Newcomer

    Joined:
    Jan 1, 2010
    Messages:
    29
    Likes Received:
    0
  5. rpg.314

    Veteran

    Joined:
    Jul 21, 2008
    Messages:
    4,298
    Likes Received:
    0
    Location:
    /
    Fits Lixian's theory to a T.

    @TSMC, what is up with you guys? Is cancelling/fucking-up processes the new black?
     
  6. rpg.314

    Veteran

    Joined:
    Jul 21, 2008
    Messages:
    4,298
    Likes Received:
    0
    Location:
    /
    BTW, do the contracts include penalties should a fab fuck up a process, screwing customer roadmaps?
     
  7. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,579
    Likes Received:
    4,799
    Location:
    Well within 3d
    The memory controllers and miscellaneous doodads would seem to fit the somewhat nebulous uncore (not sure what counts as core in a Cypress-style GPU).
    The L2 cache seems tied closely enough to the memory controllers, so that might change.

    The shader arrays are tightly linked to the TMU, and LDS, and the GDS has to interface with them. That seems to cap what can be done about those.
    The scheduling hardware is also tightly linked to the current SIMD structure and instruction support.
    ROPs interact with memory and shader writeback, but what point is there in modifying them if the shaders that feed them don't change?

    Perhaps whatever is on the other side of the setup engine can be fiddled with. The setup engine feeds into the scheduler, so it is one step removed from the shaders. Would that count as uncore, or at least notcore? (the latter not being a serious word)
     
  8. rpg.314

    Veteran

    Joined:
    Jul 21, 2008
    Messages:
    4,298
    Likes Received:
    0
    Location:
    /
    Exactly what Lixian said. :smile: Setup/raster+tess+cache. And since these were the architectural advantages of Fermi, then without B1 it seems the 6870 should be able to beat a 480.
     
  9. Squilliam

    Squilliam Beyond3d isn't defined yet
    Veteran

    Joined:
    Jan 11, 2008
    Messages:
    3,495
    Likes Received:
    114
    Location:
    New Zealand
    I guess then that might add credence to the idea that there will be a B1 variant of G100? (If the above is true).

    However early information can often be false information and if ATI are going through GloFo then the regular rumour mill doesn't exactly apply here.
     
  10. rpg.314

    Veteran

    Joined:
    Jul 21, 2008
    Messages:
    4,298
    Likes Received:
    0
    Location:
    /
    I wonder what will happen to the leaks from ATi side should they switch to GF. Are the fab workers in Germany/NY any less loose lipped?
     
  11. Squilliam

    Squilliam Beyond3d isn't defined yet
    Veteran

    Joined:
    Jan 11, 2008
    Messages:
    3,495
    Likes Received:
    114
    Location:
    New Zealand
    I would say that the standard rumour mill isn't developed as much in Germany as compared to Taiwan etc because there hasn't been any reason to really point our noses in that direction for rumours and tidbits.
     
  12. dizietsma

    Banned

    Joined:
    Mar 1, 2004
    Messages:
    1,172
    Likes Received:
    13
    With TSMC making a mess of 40nm, and 28nm causing concerns, it does make you wonder whether AMD and Nvidia will be thinking more and more about GlobalFoundries or others.
     
  13. rpg.314

    Veteran

    Joined:
    Jul 21, 2008
    Messages:
    4,298
    Likes Received:
    0
    Location:
    /
    http://www.semiaccurate.com/2009/08/18/nvidia-takes-huge-risk/
    http://www.semiaccurate.com/2010/03/30/atis-next-generation-outed/

    After 40/32/28, can't say I blame nv/ati. :???:
     
  14. GZ007

    Regular

    Joined:
    Jan 22, 2010
    Messages:
    416
    Likes Received:
    0
    They could also change the ALU clocks like Nvidia did long ago. ROPs and TMUs are likely bandwidth bound, but ALUs usually are not.
    It's quite hard to increase the clocks to 1.2 GHz for the whole chip, but not if you just increase the SP clocks to 1.2 GHz.
    Something like 900 MHz for the whole GPU and 1.2 GHz for the SPs (1.3333 multiplier). Just this mild frequency increase could yield nearly 40% more performance from the same 1600 SPs for the same die area.
    But it surely has some penalties to keep two clock domains on the chip.
    Edit: Actually, what's the disadvantage of keeping two or more clock domains on the chip?
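
    A quick sketch of the numbers in this post. The 1600 SPs and the 900 MHz / 1.2 GHz clocks come from the post itself; the 850 MHz baseline (the stock HD 5870 core clock) and the 2 FLOPs per SP per clock (one multiply-add) are assumptions added here, and the function names are illustrative.

```python
def peak_gflops(sp_count: int, shader_clock_mhz: float,
                flops_per_sp_per_clock: int = 2) -> float:
    """Theoretical peak ALU throughput in GFLOPS (assumes one multiply-add per clock)."""
    return sp_count * flops_per_sp_per_clock * shader_clock_mhz / 1000.0


def shader_speedup(new_clock_mhz: float, base_clock_mhz: float) -> float:
    """Ratio of ALU throughput when only the shader clock changes."""
    return new_clock_mhz / base_clock_mhz


SP_COUNT = 1600          # from the post
SHADER_CLOCK_MHZ = 1200  # proposed shader-domain clock from the post

# 900 MHz is the core clock proposed in the post; 850 MHz is an assumed baseline.
for base_mhz in (900, 850):
    gain = shader_speedup(SHADER_CLOCK_MHZ, base_mhz) - 1.0
    print(f"vs {base_mhz} MHz: {gain:.1%} more peak ALU throughput")

print(f"peak at {SHADER_CLOCK_MHZ} MHz: {peak_gflops(SP_COUNT, SHADER_CLOCK_MHZ):.0f} GFLOPS")
```

    Against the 900 MHz core clock the gain matches the 1.3333 multiplier; the "near 40%" figure only appears when measured against the assumed 850 MHz baseline, and only for workloads that are purely ALU bound.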
     
  15. hkultala

    Regular

    Joined:
    May 22, 2002
    Messages:
    297
    Likes Received:
    38
    Location:
    Herwood, Tampere, Finland
    1) Having multiple clocks that are not multiples of each other is problematic for data integrity; it needs additional buffers and data integrity logic

    2) These increase the latency of the data going through the clock speed boundary

    3) Distribution of clock signals becomes more complex when there are multiple different clock signals

    4) Control logic becomes more complex when not everything can be calculated in simple clock cycles.
     
  16. GZ007

    Regular

    Joined:
    Jan 22, 2010
    Messages:
    416
    Likes Received:
    0
    The 9xxx Nvidias had a 2.5 multiplier. My 1.3333 was a bad example :lol:. But of course keeping the scalar SPs in sync could be much easier than the vector ones.
     
  17. eastmen

    Legend Subscriber

    Joined:
    Mar 17, 2008
    Messages:
    13,878
    Likes Received:
    4,727
    It's looking more and more like the 5870 is bandwidth limited at high resolutions, judging by the 2GB edition benchmarks.

    Do you guys think ATI will go with a wider bus with the refresh?
     
  18. hkultala

    Regular

    Joined:
    May 22, 2002
    Messages:
    297
    Likes Received:
    38
    Location:
    Herwood, Tampere, Finland
    2.5 is also a fractional multiplier; it's equally bad (or might be worse, as it requires multiplying/dividing by 5, while 4/3 requires multiplying/dividing by only 3 and 4)

    2 is an easy one. 4 is an easy one. 8 is an easy one. 3, 5 and 6 are a bit more difficult, but easier than 2.5 or 1.33
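
    One way to see why fractional multipliers are awkward, as a minimal sketch: write the multiplier as a reduced fraction p/q and count how many cycles pass before the two clocks line up again. This assumes both clocks derive from a common source and start in phase; the function name is made up for illustration.

```python
from fractions import Fraction
from typing import Tuple


def alignment_period(multiplier: Fraction) -> Tuple[int, int]:
    """Return (slow_cycles, fast_cycles) between coinciding clock edges.

    With the multiplier reduced to p/q, the edges coincide once every
    q slow-clock cycles, which equals p fast-clock cycles.
    """
    reduced = Fraction(multiplier)  # Fraction always stores lowest terms
    return reduced.denominator, reduced.numerator


# Multipliers discussed in the thread: integer ones, 2.5 (= 5/2) and 1.33 (= 4/3).
for m in (Fraction(2), Fraction(4), Fraction(8), Fraction(5, 2), Fraction(4, 3)):
    slow, fast = alignment_period(m)
    print(f"multiplier {str(m):>3}: edges align every {slow} slow / {fast} fast cycles")
```

    With an integer multiplier every slow-clock edge coincides with a fast-clock edge; with 5/2 or 4/3 most edges do not, so the crossing logic has to track where it is in the repeating pattern.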
     
  19. Mindfury

    Newcomer

    Joined:
    Oct 6, 2009
    Messages:
    232
    Likes Received:
    0
    According to the B3D benchmark, the 5870 is not bandwidth limited.

    I'm pretty sure ATI won't use a weird bus like 384-bit. A 512-bit bus will cost too much die space. I think they will stay with a 256-bit bus this year.
     
  20. GZ007

    Regular

    Joined:
    Jan 22, 2010
    Messages:
    416
    Likes Received:
    0
    If the GPU was designed for a 256-bit bus and around 4.8 GHz clocks, then just increasing the clocks won't show too much of an increase. You have the same 32 ROPs for the same 8*32-bit controllers and buffers.
    As the caches are now several hundred GB/s, and the GPU is designed for a given bus width, bandwidth, ROPs and buffers, you won't see much difference with just memory clock changes. At least that's my theory. :oops:
    I just want to say that you could see much more difference with 384-bit if the whole GPU were designed around it.
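
    For reference, a small sketch of the raw bandwidth arithmetic behind the bus widths discussed here. It assumes the "4.8GHz" figure means an effective data rate of 4.8 Gbit/s per pin, and that a wider bus would run at the same rate; both are assumptions, and the function name is illustrative.

```python
def memory_bandwidth_gbs(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Theoretical peak memory bandwidth in GB/s.

    bus_width_bits: total bus width, e.g. 8 controllers * 32 bits = 256.
    data_rate_gbps: effective per-pin data rate in Gbit/s (assumed meaning of "4.8GHz").
    """
    return bus_width_bits / 8 * data_rate_gbps


DATA_RATE_GBPS = 4.8  # from the post, interpreted as an effective per-pin rate

# 256-bit is the bus in the post (8 * 32-bit controllers); 384 and 512 are the
# alternatives mentioned in the thread, assumed here to run at the same data rate.
for width in (8 * 32, 384, 512):
    print(f"{width}-bit bus: {memory_bandwidth_gbs(width, DATA_RATE_GBPS):.1f} GB/s")
```

    This only gives the theoretical peak; whether the extra bandwidth helps depends on the point made above, namely whether the ROPs, buffers and controllers are sized to use it.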
     