AMD: R7xx Speculation

Discussion in 'Architecture and Products' started by Unknown Soldier, May 18, 2007.

Thread Status:
Not open for further replies.
  1. dnavas

    Regular

    Joined:
    Apr 12, 2004
    Messages:
    375
    Likes Received:
    7
    Heh, thanks "silent_guy" :)
    Lots of stuff there I didn't know -- and here I was only worried about alignment and heat removal....

    'k. Not yet convinced of the utility of the ringbus, but it does have the attribute that it doesn't need a centralized piece. On the other hand, it isn't clear to me where some of the other bits hang out yet either -- tessellator, rasterizer....

    -Dave [redonning skeptic hat, wondering how this approach is anything other than a way to avoid building high-end chips....]
     
  2. _xxx_

    Banned

    Joined:
    Aug 3, 2004
    Messages:
    5,008
    Likes Received:
    86
    Location:
    Stuttgart, Germany
    Feel free to elaborate :) It's probably a trade-off in other areas then.
     
  3. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,125
    Likes Received:
    2,885
    Location:
    Well within 3d
    AMD's K8 has a 12-stage integer pipeline that has a decoded instruction cross a significant portion of the processor's die.
    Floating point might take up to 17, and it crosses the width (narrow side, but still not 1/10 the length of R600) of the Opteron die.

    AMD's L2 cache latency is 12 cycles, of which about 9 are actually involved in accessing the cache, which is over half the die right there.

    At the same time K8 clocks in at 3 GHz, much higher than R600.

    Itanium has a huge cache that covers large portions of the chip, but it has a 14 cycle L3 cache latency.
    That means the worst-case where an L3 line the furthest from the core is accessed takes that time to make it back. Several cycles are probably needed for tag comparisons, so it takes less than 14 cycles for the signal to cross the cache.
    Couple that with Itanium's pipeline length, we're talking 8 cycles for an instruction to cross the core portion.
    That's on the order of maybe 20 cycles' worth of time for a load from the L3, needed as an operand by an executing instruction, to propagate through a cache and layers of complex logic.

    Last I checked, Itanium is larger than R600 and clocked twice as high.
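
    To put those cycle counts in wall-clock terms, here is a minimal sketch that converts cycles to nanoseconds. The K8 numbers (12 and 9 cycles at 3 GHz) come from the post above. The Itanium clock is not stated in the thread, so ASSUMED_ITANIUM_GHZ is purely an illustrative assumption.

```python
def cycles_to_ns(cycles: float, clock_ghz: float) -> float:
    """Wall-clock time in nanoseconds for a cycle count at a given clock.

    One cycle at f GHz lasts 1/f ns, so the time is simply cycles / f.
    """
    return cycles / clock_ghz


# Figures quoted in the post: K8 L2 latency of 12 cycles at 3 GHz,
# of which about 9 cycles are the cache access itself.
k8_l2_total_ns = cycles_to_ns(12, 3.0)
k8_l2_access_ns = cycles_to_ns(9, 3.0)

# ASSUMPTION: the thread gives no Itanium clock; 1.6 GHz is a placeholder.
ASSUMED_ITANIUM_GHZ = 1.6
itanium_load_ns = cycles_to_ns(20, ASSUMED_ITANIUM_GHZ)

print(f"K8 L2 latency:      {k8_l2_total_ns:.2f} ns")
print(f"K8 L2 access part:  {k8_l2_access_ns:.2f} ns")
print(f"Itanium ~20 cycles: {itanium_load_ns:.2f} ns (assumed clock)")
```

    The point of the conversion is only that a signal crossing a large part of a big die in a handful of nanoseconds is routine for CPUs; swap in whatever clock you trust for the Itanium line.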
     
    #63 3dilettante, May 23, 2007
    Last edited by a moderator: May 23, 2007
  4. _xxx_

    Banned

    Joined:
    Aug 3, 2004
    Messages:
    5,008
    Likes Received:
    86
    Location:
    Stuttgart, Germany
    But isn't the situation much different with the kind of workload we have in the GFX-cards? The pipelines are much longer etc. AFAICR.

    EDIT: and thanks :)
     
    #64 _xxx_, May 24, 2007
    Last edited by a moderator: May 24, 2007
  5. aca

    aca
    Newcomer

    Joined:
    May 4, 2007
    Messages:
    44
    Likes Received:
    0
    Location:
    Delft, The Netherlands
    I think it's about time to replace those classical mechanical saws by laser cutting machines.
     
  6. Bouncing Zabaglione Bros.

    Legend

    Joined:
    Jun 24, 2003
    Messages:
    6,363
    Likes Received:
    82
    Wouldn't the heat generated by a laser needed to cut the substrate destroy the chips? Sure, a saw may generate heat as a side effect, but its primary cutting mechanism is mechanical. To use a laser that has no mechanical cutting edge means you need a lot of heat to perform the cut, and I can't see today's modern chips being able to survive that.
     
  7. IbaneZ

    Regular

    Joined:
    Apr 15, 2003
    Messages:
    743
    Likes Received:
    17
    The Next Last Last Last R700 Speculation Thread (2009) is gonna rock. :lol:
     
  8. silent_guy

    Veteran Subscriber

    Joined:
    Mar 7, 2006
    Messages:
    3,754
    Likes Received:
    1,379
    (Deleted, as BZB wrote exactly the same thing.)
     
  9. nicolasb

    Regular

    Joined:
    Oct 21, 2006
    Messages:
    421
    Likes Received:
    4
    Maybe you could use an ultra-violet laser? That wouldn't generate heat.
     
  10. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,125
    Likes Received:
    2,885
    Location:
    Well within 3d
    The point of a laser cutter is to burn or vaporize away whatever connects the two sides of what you want separated.
    How would a UV laser burn the silicon away without heating it?
     
  11. aca

    aca
    Newcomer

    Joined:
    May 4, 2007
    Messages:
    44
    Likes Received:
    0
    Location:
    Delft, The Netherlands
    When used in a pulsed manner, I think it would be fine: you can have an average below thermal annealing temperature. Furthermore, the accuracy of the beam helps to localize the heat as well as possible. In other words: the cutting line can be really sharp (=less area=> less heat).
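
    The pulsed-operation argument rests on the gap between peak and average power. A rough sketch of that arithmetic follows; every number in it (pulse width, repetition rate, pulse energy) is a hypothetical placeholder, not a figure for any real dicing laser, and it says nothing about the resulting temperature.

```python
def duty_cycle(pulse_width_s: float, rep_rate_hz: float) -> float:
    """Fraction of time the laser is actually emitting."""
    return pulse_width_s * rep_rate_hz


def average_power_w(pulse_energy_j: float, rep_rate_hz: float) -> float:
    """Average power is the energy per pulse times pulses per second."""
    return pulse_energy_j * rep_rate_hz


def peak_power_w(pulse_energy_j: float, pulse_width_s: float) -> float:
    """Peak power, treating each pulse as rectangular."""
    return pulse_energy_j / pulse_width_s


# HYPOTHETICAL parameters, chosen only to illustrate the ratio.
PULSE_WIDTH_S = 10e-9     # 10 ns pulse
REP_RATE_HZ = 50e3        # 50 kHz repetition rate
PULSE_ENERGY_J = 100e-6   # 100 microjoules per pulse

print(f"duty cycle:    {duty_cycle(PULSE_WIDTH_S, REP_RATE_HZ):.1e}")
print(f"average power: {average_power_w(PULSE_ENERGY_J, REP_RATE_HZ):.1f} W")
print(f"peak power:    {peak_power_w(PULSE_ENERGY_J, PULSE_WIDTH_S):.1e} W")
```

    With a small duty cycle the peak power can be orders of magnitude above the average, which is the effect being appealed to here.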
     
  12. silent_guy

    Veteran Subscriber

    Joined:
    Mar 7, 2006
    Messages:
    3,754
    Likes Received:
    1,379
    This is about producing millions of pieces. You don't exactly have the luxury to go at a leisurely pace. Do you really think you can heat up silicon enough to vaporize it without the heat extending farther than 80um?
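
    One way to get a feel for the 80um question is the textbook thermal diffusion length, roughly sqrt(alpha * t). The sketch below is an order-of-magnitude estimate only: the silicon properties are approximate room-temperature values assumed for illustration, and it ignores heat build-up over many pulses, temperature-dependent properties, and the throughput concern raised above.

```python
import math

# ASSUMED approximate room-temperature properties of bulk silicon.
K_SI = 148.0     # thermal conductivity, W/(m*K)
RHO_SI = 2329.0  # density, kg/m^3
CP_SI = 700.0    # specific heat, J/(kg*K)


def thermal_diffusivity(k: float, rho: float, cp: float) -> float:
    """alpha = k / (rho * cp), in m^2/s."""
    return k / (rho * cp)


def diffusion_length_um(alpha: float, time_s: float) -> float:
    """Rough distance heat spreads in time_s, in micrometres."""
    return math.sqrt(alpha * time_s) * 1e6


def time_to_diffuse_s(alpha: float, length_m: float) -> float:
    """Rough time for heat to spread over length_m (inverse of the above)."""
    return length_m ** 2 / alpha


alpha_si = thermal_diffusivity(K_SI, RHO_SI, CP_SI)
for t in (10e-9, 1e-6, 1e-3):  # hypothetical heating durations
    print(f"t = {t:.0e} s -> ~{diffusion_length_um(alpha_si, t):.1f} um")
print(f"time to reach 80 um: ~{time_to_diffuse_s(alpha_si, 80e-6):.1e} s")
```

    Under these assumptions the answer depends almost entirely on how long the heat source stays on one spot, which is why pulse length and cutting speed end up pulling in opposite directions.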
     
  13. Bouncing Zabaglione Bros.

    Legend

    Joined:
    Jun 24, 2003
    Messages:
    6,363
    Likes Received:
    82
    Here's an interesting document:
    If lasers were a preferable replacement for saws, then Intel, IBM, TSMC, AMD or any one of the other chip manufacturers would be using them. They already use lasers for etching the surface of chips or cutting jumpers on the die packaging, so the fact they don't use them for cutting dies from the wafer should tell you something.

    Lasers may become more useful as wafers get thinner, but again, chip layers are getting denser, so this may offset the viability of laser cutting. Chips are basically a glass-like substance full of heat-sensitive circuits (that can destroy themselves if operated for a few tens of seconds without a heatsink), so it doesn't seem very practical to use a cutting method that relies purely on the heat from a beam of light to cut through the wafer.
     
    #73 Bouncing Zabaglione Bros., May 24, 2007
    Last edited by a moderator: May 24, 2007
  14. nicolasb

    Regular

    Joined:
    Oct 21, 2006
    Messages:
    421
    Likes Received:
    4
    That's what UV lasers do. The energy contained in a UV photon is high enough to destroy any chemical bond. So UV lasers directly sever the bonds between atoms.

    By way of a rather crude analogy, if you imagine atoms connected together with string :) a conventional laser makes all the atoms jerk to and fro faster and faster until eventually they are torn loose. To do that you have to make a large number of atoms vibrate furiously to and fro. A uv laser can go straight in and directly cut the strings without introducing vibrations.
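
    The photon-energy side of this can be checked with E = hc / wavelength. A minimal sketch follows; the two wavelengths (355 nm and 1064 nm, typical of frequency-tripled and fundamental Nd:YAG lasers) and the Si-Si bond energy of about 2.3 eV are assumptions picked for illustration, not values from this thread, and comparing one photon against one bond is a crude picture of what happens in a bulk crystal.

```python
# h*c expressed in eV*nm, so energy (eV) = HC_EV_NM / wavelength (nm).
HC_EV_NM = 1239.84

# ASSUMED approximate Si-Si single bond energy, in eV.
SI_SI_BOND_EV = 2.3


def photon_energy_ev(wavelength_nm: float) -> float:
    """Energy of a single photon of the given wavelength, in eV."""
    return HC_EV_NM / wavelength_nm


# ASSUMED example wavelengths: one UV line and one infrared line.
for name, wavelength_nm in (("UV, 355 nm", 355.0), ("IR, 1064 nm", 1064.0)):
    energy = photon_energy_ev(wavelength_nm)
    verdict = "above" if energy > SI_SI_BOND_EV else "below"
    print(f"{name}: {energy:.2f} eV per photon ({verdict} the assumed bond energy)")
```

    This only shows that a single UV photon can carry more energy than the assumed bond energy while an infrared photon cannot; it does not show where that energy ends up afterwards.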
     
  15. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,125
    Likes Received:
    2,885
    Location:
    Well within 3d
    Chemical bonds are states of lower energy between two atoms.
    If a bond is broken, then enough energy was injected into the system to push the atoms out of their bond.
    That energy eventually has to be dealt with.

    Is a UV laser simply able to do this without exciting too many of the atoms, since its mechanism is no different than a lower-frequency laser?
     
  16. 3vi1

    Newcomer

    Joined:
    Jan 25, 2007
    Messages:
    22
    Likes Received:
    3
    Jawed made a post a while back referencing this page, which I have been spending some time reviewing - I'm halfway through the course work now (listening to the lectures and reading the PowerPoint slides).

    Anyhow, this morning the NVIDIA guy (teaching) said some interesting stuff on the Lecture5 tape. He suggested (at 29:40) that currently the TFs (Texture Filters) are done through dedicated hardware but once revealed through the API with programmable elements the floating point power would be roughly doubled.

    Can anyone confirm whether that idea is a viable reason why G90 would reach nearly 1 TFlop?
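
    For scale, here is a back-of-the-envelope sketch of the "roughly doubled" idea. The G80 inputs (128 scalar ALUs at a 1.35 GHz shader clock, counted at 2 or 3 flops per clock) are the commonly quoted GeForce 8800 GTX figures and are assumptions here, not numbers from the lecture or this thread. Nothing below says anything about what G90 actually is.

```python
def peak_gflops(alus: int, clock_ghz: float, flops_per_alu_per_clock: int) -> float:
    """Theoretical peak throughput in GFLOPS."""
    return alus * clock_ghz * flops_per_alu_per_clock


# ASSUMED G80 (GeForce 8800 GTX) figures, for illustration only.
G80_ALUS = 128
G80_SHADER_CLOCK_GHZ = 1.35

mad_only = peak_gflops(G80_ALUS, G80_SHADER_CLOCK_GHZ, 2)      # MAD = 2 flops
mad_plus_mul = peak_gflops(G80_ALUS, G80_SHADER_CLOCK_GHZ, 3)  # MAD + MUL

print(f"MAD only:          {mad_only:.1f} GFLOPS, doubled {2 * mad_only:.1f}")
print(f"MAD + MUL:         {mad_plus_mul:.1f} GFLOPS, doubled {2 * mad_plus_mul:.1f}")
```

    Whether "doubling" lands near 1 TFlop depends on which baseline you count, so the arithmetic is consistent with the idea without confirming it.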

    He then goes on to say (at 1:04:20) that Bi/Tri linear filtering is also currently done through dedicated hardware and that the next generation or the generation after that will utilize all those floating point units for general computing.

    Is this info helpful?


    PS: I listen to the lessons at about 160% normal speed. Try it if you don't normally use that feature - it's very helpful.
     
    Jawed likes this.
  17. bdmosky

    Newcomer

    Joined:
    Jul 31, 2002
    Messages:
    167
    Likes Received:
    22
    What's wrong with cutting wafers using high pressured water?
     
  18. Arun

    Arun Unknown.
    Moderator Legend Veteran

    Joined:
    Aug 28, 2002
    Messages:
    5,023
    Likes Received:
    299
    Location:
    UK
    3vi1: Very nice find, now I regret not having listened to ALL those lectures, heh! :)
    Interestingly, what David Kirk seems to be describing there is programmable filtering hardware (which NVIDIA doesn't have a patent on), rather than using the ALUs for filtering and then load-balance (which they do have a patent on).

    I guess it is fairly logical that if you want to implement one, the other goes in hand with it. Otherwise you're just tempting programmers to use filtering operations not supported on the TMUs and you're wasting that silicon completely...
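
    For reference, this is roughly the work a bilinear filter does per sample, written as plain arithmetic of the kind general-purpose ALUs could run. It is a generic textbook sketch with clamp-to-edge addressing on a single-channel texture; the function names are made up for illustration and it is not a description of how any NVIDIA or AMD texture unit is built.

```python
import math


def lerp(a: float, b: float, t: float) -> float:
    """Linear interpolation between a and b."""
    return a + (b - a) * t


def bilinear_sample(texture: list[list[float]], u: float, v: float) -> float:
    """Sample a single-channel texture at texel coordinates (u, v).

    Four texel fetches and three lerps per channel; coordinates outside
    the texture are clamped to the edge.
    """
    height, width = len(texture), len(texture[0])

    def texel(x: int, y: int) -> float:
        x = min(max(x, 0), width - 1)
        y = min(max(y, 0), height - 1)
        return texture[y][x]

    x0, y0 = math.floor(u), math.floor(v)
    fx, fy = u - x0, v - y0  # fractional position inside the 2x2 footprint

    top = lerp(texel(x0, y0), texel(x0 + 1, y0), fx)
    bottom = lerp(texel(x0, y0 + 1), texel(x0 + 1, y0 + 1), fx)
    return lerp(top, bottom, fy)


tex = [[0.0, 1.0],
       [2.0, 3.0]]
print(bilinear_sample(tex, 0.5, 0.5))
```

    Trilinear filtering repeats this on two mip levels and adds one more lerp, which gives a sense of how much multiply-add work sits behind each filtered fetch.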

    And I don't think this is a G9x feature, if only because that'd be ridiculously more advanced and programmable than what the D3D10.1 API asks for. Furthermore, it's really not obvious to me how you'd be able to use that in a generic manner.

    You could think of it as a coprocessor to the main ALUs for GPGPU, but that's kind of messy. So that's probably what he's thinking of in terms of not being sure how to expose it. That seems to imply it would actually be exposable, so that G80's filtering hardware is already programmable, but I think he stopped just short of confirming that really and that it simply isn't the case. I'd love to be wrong, though.
     
  19. aca

    aca
    Newcomer

    Joined:
    May 4, 2007
    Messages:
    44
    Likes Received:
    0
    Location:
    Delft, The Netherlands
    Hehe, that is reasoning the other way around. And I can't really agree on this point. IBM, Intel and so on want to use proven tech, unless it's really really really really proven that lasers are better. And this doesn't stop just at the manufacturing level. Also architecture, devices and so on. I have some nice stories from a colleague from Intel Haifa (Israel) to illustrate it, but it would go a bit off topic. :wink:

    The combined thickness of the layers on top of the chip is quite small compared to the substrate. And the substrate needs to be thick for mechanical stability, mainly limited by 'the saw'. Especially in my own field (RF analog IC design) we would like the substrate to be as thin as possible to limit the parasitics. (We hate it when our 60 GHz signals leak away :mad: ).
    And I'm not really sure what you're referring to with the 'heat sensitive circuits'. Are you referring to burn-out or some kind of latchup? Or maybe junction breakdown? Usually wells in combination with specific doping profiles will make sure that these kinds of failures rarely occur.

    Sorry for the off topic btw. Back to R700 :!:
     
  20. trinibwoy

    trinibwoy Meh
    Legend

    Joined:
    Mar 17, 2004
    Messages:
    10,430
    Likes Received:
    433
    Location:
    New York
    Hmmm where's tertsi...? I believe he had some theories about the possibility that G80 could schedule MADs on the filtering units.
     