AMD: Pirate Islands (R* 3** series) Speculation/Rumor Thread

Discussion in 'Architecture and Products' started by iMacmatician, Apr 10, 2014.

  1. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,122
    Likes Received:
    2,873
    Location:
    Well within 3d
    If they did go as large as they could on the interposer, however, that would also raise questions.

    Following the assumption that you want each stack aligned with the GPU, and following the picture that has as little margin as possible, the math gets weird in the absence of implementation details like how much area is lost to spacing requirements or any components.
    Let's just say being flanked by HBM takes a rounded up 12mm off of one dimension of a GPU mounted on a big, say 26x32mm interposer.
    The other dimension of a 14 x ~32 or ~26 x 20 area is the length that the 7.3mm long HBM stacks plus some margin can fit into. If it were not significantly shorter in that dimension as well, it would seem a little bare.
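    The subtraction behind those two candidate regions can be sketched as below. This is only an illustration using the round numbers in this post; the helper name `gpu_region` is hypothetical, and spacing rules are ignored, so the areas are upper bounds.

```python
# Illustrative only: dimensions are the round numbers from the post above.
INTERPOSER_MM = (26, 32)   # "big, say 26x32mm interposer"
HBM_FLANK_MM = 12          # rounded-up width lost to HBM on both sides


def gpu_region(interposer, flank, flanked_axis):
    """Return (width, height) left for the GPU when HBM flanks one axis."""
    dims = list(interposer)
    dims[flanked_axis] -= flank
    return tuple(dims)


for axis in (0, 1):
    w, h = gpu_region(INTERPOSER_MM, HBM_FLANK_MM, axis)
    print(f"HBM flanking axis {axis}: {w} x {h} mm = {w * h} mm^2")
```

    Flanking the 26mm axis leaves 14 x 32, and flanking the 32mm axis leaves 26 x 20, matching the two cases mentioned.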
     
  2. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    10,873
    Likes Received:
    767
    Location:
    London
    The picture I linked earlier seems to show a long edge. I say that because two adjacent HBM chips take up 14mm of GPU side and there's about 5mm separating them. There appears to be about another 5mm of interposer from the main HBM chip to the bottom edge of the interposer, which I believe is just visible in the bottom right corner. So three gaps of 5mm plus 14mm takes us to ~30mm.

    My estimate of .25mm spacing twixt HBM module and the large chip seems too small, now. I reckon it's more like 0.5mm. The gap between the edge of the interposer and the HBM module seems to be about the same, for what it's worth. So it seems likely to me that the GPU would be about 1mm smaller than the long dimension of the interposer. So the earlier estimate of 30mm for the long dimension seems reasonable.

    That would then leave the other dimension in the region of 13mm. So that would lead to a GPU die size of something like 390mm².
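    As a rough check of the tally so far, here is a minimal sketch using only the figures read off the photo in this post. The variable names are illustrative, and the 30mm and 13mm values are the rounded estimates carried forward, not measurements.

```python
HBM_PAIR_MM = 14   # two adjacent HBM chips along the GPU side
GAP_MM = 5         # each of the three ~5 mm gaps estimated from the photo
N_GAPS = 3

long_edge = HBM_PAIR_MM + N_GAPS * GAP_MM   # 29, which the post rounds to ~30

gpu_long = 30      # rounded figure the post carries forward
gpu_short = 13     # "in the region of 13mm"
gpu_area = gpu_long * gpu_short

print(f"long edge ~{long_edge} mm, GPU ~{gpu_area} mm^2")
```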

    Here's an actual production interposer using UMC's 65nm tech on page 9, which is described as "~775mm²". Which is 31x25 :razz: :

    www.semiconjapan.org/en/sites/semiconjapan.org/files/docs/SPR7_25_Xilinx_SureshRamalingam_0.pdf

    The really interesting question is how much die space is gained from deletion of GDDR5 PHY? I'm not too sure of a size estimate for the PHY in Hawaii. Something like 80% of the size of Tahiti's 74mm²: 60mm²?

    Could we be talking about the PHYs for HBM being something like 20mm²? Saving 40mm². Making Fiji the equivalent of a 430mm² conventional chip? Smaller than Hawaii.

    45% more CUs in Fiji over Hawaii, with the CUs taking >50% of the die would be impossible to fit.

    So, I have to wonder if AMD is building Fiji with low double precision capability. Perhaps AMD will go for 1/16th DP rate to squeeze in all those CUs. Then there's the small question of the cost of double the ROPs, double the triangle rate and the new cost for delta colour compression.

    The theoretical transistor counts just don't add up for a mere 390mm² chip.

    Even if we switch dimensions, 19x24 = 456mm². Which would be equivalent to a 496mm² GDDR5 chip. That's only 13% bigger than Hawaii.
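    The "equivalent conventional chip" arithmetic can be sketched as follows. The PHY sizes are the guesses made above; the 438mm² Hawaii die size is an assumption brought in from outside the post, and `gddr5_equivalent` is a hypothetical helper.

```python
HAWAII_MM2 = 438          # assumption: Hawaii die size, not stated in the post
GDDR5_PHY_MM2 = 60        # post's guess (about 80% of Tahiti's 74 mm^2)
HBM_PHY_MM2 = 20          # post's guess
PHY_SAVING_MM2 = GDDR5_PHY_MM2 - HBM_PHY_MM2


def gddr5_equivalent(width_mm, height_mm):
    """Die area plus the PHY area that an HBM design no longer spends."""
    return width_mm * height_mm + PHY_SAVING_MM2


for w, h in ((30, 13), (19, 24)):
    eq = gddr5_equivalent(w, h)
    delta = eq / HAWAII_MM2 - 1
    print(f"{w}x{h}: {w * h} mm^2 -> ~{eq} mm^2 equivalent ({delta:+.0%} vs Hawaii)")
```

    Under these assumptions the 30x13 case lands slightly below Hawaii and the 19x24 case about 13% above it.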

    The only way I can reconcile that picture with a 19x24mm GPU is if the HBM chips are asymmetric along the long side of the GPU. The HBM module that's only just visible has its top edge aligned with the top edge of the GPU, so that both are 0.5mm away from the top edge of the interposer. That would make the long edge we can see 7 + 5 + 7 + 5 = 24mm.
     
    #1042 Jawed, May 21, 2015
    Last edited: May 21, 2015
  3. tunafish

    Regular

    Joined:
    Aug 19, 2011
    Messages:
    542
    Likes Received:
    171
    This slide is still relevant. It's from Synapse Design, who do some parts of the physical design for AMD GPUs. AMD very likely has some GPU that's >500mm², and at this point I think it's a good bet that that one is Fiji.
     
  4. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,122
    Likes Received:
    2,873
    Location:
    Well within 3d
    That could be the bottom edge. Not knowing what a GPU interposer should look like at its lower corner, I'm not sure how to interpret the patterning, or that particular shade of beige at the bottom margin of the picture.

    One question I have, looking at the Xilinx pdf, is what requires 5mm or so of spacing. The FPGA's various components don't need that much space, nor did Kalahari's memory stand-ins.

    Perhaps it is more important for the GPU's internal architecture to balance its data paths instead of cramming them too close together, or there are other reasons like better physical reliability if components are not all off on one side.
    Otherwise, it might be possible to pack them a bit closer, where having a long dimension that is 3-4x the length of the memory may be useful when a GPU comes out with higher bandwidth needs.

    Hynix gave a picture of the base die's ballout. I can't think of a reason why the GPU's side would be massively different, so I think it may be between 24-30mm² for all 4 interfaces. This is a rough guesstimate, as I haven't tried to pixel count to any level of accuracy. The slides say roughly 6.0x3.3mm, and the PHY is perhaps a little over 1/3 of that area.
    http://www.hotchips.org/wp-content/uploads/hc_archives/hc26/HC26-11-day1-epub/HC26.11-3-Technology-epub/HC26.11.310-HBM-Bandwidth-Kim-Hynix-Hot Chips HBM 2014 v7.pdf (page 19)
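    That guesstimate can be reproduced with a few lines. This is a sketch using the numbers quoted above; taking the PHY as exactly 1/3 of the base die is a simplification of "a little over 1/3", so the result sits toward the low end of the range.

```python
BASE_DIE_MM = (6.0, 3.3)   # "roughly 6.0x3.3mm" per the Hynix slides
PHY_FRACTION = 1 / 3       # "a little over 1/3", so treat this as a floor
N_STACKS = 4

base_die_area = BASE_DIE_MM[0] * BASE_DIE_MM[1]
phy_per_stack = base_die_area * PHY_FRACTION
phy_total = phy_per_stack * N_STACKS

print(f"base die {base_die_area:.1f} mm^2, "
      f"PHY per stack {phy_per_stack:.1f} mm^2, "
      f"all {N_STACKS} interfaces {phy_total:.1f} mm^2")
```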
     
  5. moozoo

    Newcomer

    Joined:
    Jul 23, 2010
    Messages:
    109
    Likes Received:
    1
    Unfortunately I suspect it will be 1/16.
    AMD sets their fp64 performance to be "much more than Nvidia". I don't think they have a particular target value or a commitment to pass on advancements outside of their FirePro range.
    If the R9 390X has an fp32 rate of ~8192 Gflops (http://www.techpowerup.com/gpudb/2633/radeon-r9-390x.html) then 1/16th gives an fp64 rate of 512 Gflops, which is still much more than Nvidia (barring the one Titan).
    Unless something changes re Nvidia or Intel, going forward, I suspect fp64 Gflops will end up being capped ~500-700Gflops, no matter how fast fp32 advances.
    Personally I'd prefer a 1Gflop cap
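    For reference, the rate arithmetic is just a ratio. A small sketch follows, with the rumoured 8192 Gflops fp32 figure taken as given; the 1/4 and 1/2 ratios are included purely for comparison and are not claims about any product.

```python
FP32_GFLOPS = 8192   # rumoured figure quoted in the post, not a confirmed spec


def fp64_gflops(fp32_gflops, ratio):
    """FP64 throughput when FP64 runs at `ratio` of the FP32 rate."""
    return fp32_gflops * ratio


for label, ratio in (("1/16", 1 / 16), ("1/4", 1 / 4), ("1/2", 1 / 2)):
    print(f"{label} rate: {fp64_gflops(FP32_GFLOPS, ratio):.0f} Gflops")
```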
     
  6. silent_guy

    Veteran Subscriber

    Joined:
    Mar 7, 2006
    Messages:
    3,754
    Likes Received:
    1,379
    You're assuming that AMD knew during the design of Fiji that GM200 would have reduced FP64 performance. I'll be very surprised if Fiji doesn't support 1/4 or even 1/2 in a professional version.
     
  7. flopper

    Newcomer

    Joined:
    Nov 10, 2006
    Messages:
    150
    Likes Received:
    6
    Titan X can't sustain 60fps in Witcher 3 at 1920x1080! That's a new game on a card with 12GB of RAM.
    At 1200 euro, it seems silly at best that a new card with so much RAM can't do better with new games.
    It sits with 9GB not doing anything when you game, so I'd rather have a card that is faster and allows me to utilize all the RAM.
    4K gaming is still years away for single-card setups.
     
  8. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    10,873
    Likes Received:
    767
    Location:
    London
    The picture could be showing a GPU with HBM2 modules: 8GB in 2 modules with 512GB/s bandwidth...
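    A quick sketch of where a 512GB/s figure for two HBM2 modules would come from. The 1024-bit interface and 2 Gbps per-pin rate are assumptions about HBM2 that are not stated in the post, and `stack_bandwidth_gbs` is a hypothetical helper.

```python
def stack_bandwidth_gbs(bus_bits, gbps_per_pin):
    """Peak bandwidth of one stack in GB/s (bits per second divided by 8)."""
    return bus_bits * gbps_per_pin / 8


HBM2_BUS_BITS = 1024     # assumption: per-stack interface width
HBM2_PIN_GBPS = 2.0      # assumption: HBM2 per-pin data rate
STACKS = 2
GB_PER_STACK = 4         # implied by "8GB in 2 modules"

total_bw = STACKS * stack_bandwidth_gbs(HBM2_BUS_BITS, HBM2_PIN_GBPS)
print(f"{STACKS * GB_PER_STACK} GB, {total_bw:.0f} GB/s")
```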

    That's the only other way I can think of to explain this picture, if it's a picture of an AMD GPU with HBM modules and the GPU is >500mm² and there's a single interposer.

    The alternative is that we're talking about a GPU being mounted upon multiple independent mini interposers. e.g. 2 interposers that run the length of the long side of the GPU, yet merely wide enough to support all of the power and interconnect duties for both the GPU and the HBM modules (microbumps/TSVs should be dense enough to support the two or three hundred amps of current required by the GPU). In this scenario we'd have a GPU that is mostly mounted on some kind of underfill, with two narrow strips of interposer along the long edges, to connect to the HBM modules.

    In that design, the interposer dimension limits (31 x 25mm) are relaxed. Each interposer might be 31 x 15mm. You could then get a GPU in the region of 550mm² or even larger, with HBM modules along two sides of the GPU.
     
  9. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    10,873
    Likes Received:
    767
    Location:
    London
    Agreed. But, knowing the maximum dimensions of an interposer are either 31mm or 25mm, there really isn't much choice about that part of the picture...

    EDIT: Although with 3 HBMs along that edge we'd be talking about 7 + 5 + 7 + 5 + 7mm = 31mm. 6 chips in total and 768GB/s? Or, where's the fourth HBM chip?
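    The edge tally and the bandwidth figure can be checked with a small sketch. The 128GB/s per-stack value is an assumption (first-generation HBM, 1024 bits at 1 Gbps per pin), and `edge_length` is a hypothetical helper that counts one gap between each neighbouring pair of stacks.

```python
STACK_LEN_MM = 7
GAP_MM = 5
MAX_EDGE_MM = 31          # longest interposer dimension assumed in the thread
HBM1_STACK_GBS = 128      # assumption: 1024 bits at 1 Gbps per pin, per stack


def edge_length(n_stacks):
    """Stacks laid end to end with one gap between each neighbouring pair."""
    return n_stacks * STACK_LEN_MM + (n_stacks - 1) * GAP_MM


per_edge = 3
total_stacks = 2 * per_edge   # three stacks on each of two edges

print(f"{per_edge} stacks need {edge_length(per_edge)} mm "
      f"(fits in {MAX_EDGE_MM} mm: {edge_length(per_edge) <= MAX_EDGE_MM})")
print(f"{total_stacks} stacks -> {total_stacks * HBM1_STACK_GBS} GB/s")
```

    Three stacks per edge use the full 31mm with no margin at either end, which is part of why the layout looks doubtful.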

    Yes, I was surprised by that spacing too.

    I assume that PHY in the GPU is larger because it's solely driving the address and command interfaces and other bits and bobs, whereas the HBM base die only has to drive data back to the GPU.

    But, I can't work out the labelling on that picture. Why are there channels numbered 0, 1, 4 and 5? Why are there 5 areas for each channel? I think the numbering is a copy-paste error.

    Also, PHY has to include HBM power. At the very least I would guess that the four thin strips in the centre of the "channel 0,1,4 and 5" area are the power, with each of those 4 strips being power for a single memory die in the stack.

    So it seems to me that the PHY area estimate needs to be lowered a smidgen for the GPU, since power is more widely dispersed across the whole surface of GPUs and wouldn't need to be so acutely localised. (Though one could argue that the difference is so tiny, e.g. 0.1mm², that it's pointless to even raise this.)

    So, erm, back to square 1.
     
    #1049 Jawed, May 22, 2015
    Last edited: May 22, 2015
  10. LordEC911

    Regular

    Joined:
    Nov 25, 2007
    Messages:
    789
    Likes Received:
    74
    Location:
    'Zona
    I guess you are talking about the 26x32mm interposer?
    I would imagine a GPU of 21x25mm would be the absolute largest they could do, depending on the spacing requirements and based off of AMD's 5x7mm for an HBM IC.
     
    #1050 LordEC911, May 22, 2015
    Last edited: May 23, 2015
  11. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    10,873
    Likes Received:
    767
    Location:
    London
    When I wrote that I assumed that AMD has taken the decision not to make Fiji a FirePro compute-monster chip, because it would need to support 16GB of memory.

    Of course, there's a theory that simply by attaching HBM2 modules to Fiji, you get a compute-monster chip with 16GB of memory.

    Instead I think AMD looks at this as a test chip for HBM, which is on 28nm (easy to test on). A bit like Tonga seems to be a test for colour delta compression (the chip is absurdly large for its performance, terribly unbalanced in its fundamentals).

    I don't know if AMD can get HBM2 and the next process node to synchronise, say in spring or summer 2016. I would expect it to, though, and for a compute-monster chip with 16GB of memory to be derived from that, with >10 TFLOPS single-precision performance and therefore ~5 TFLOPS double-precision for the compute monster.
     
  12. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    10,873
    Likes Received:
    767
    Location:
    London
    I'm talking about a 31 x 25 interposer, with 2 HBM modules solely on one side. Which I think allows for a 25 x 24mm GPU.
     
  13. Tridam

    Regular Subscriber

    Joined:
    Apr 14, 2003
    Messages:
    541
    Likes Received:
    47
    Location:
    Louvain-la-Neuve, Belgium
    Lightman and Jawed like this.
  14. fellix

    fellix Hey, You!
    Veteran

    Joined:
    Dec 4, 2004
    Messages:
    3,489
    Likes Received:
    400
    Location:
    Varna, Bulgaria
    I wonder how large the savings on the GPU I/O perimeter padding are with HBM? Probably enough to compensate for the tighter memory integration on the same package.
     
  15. ToTTenTranz

    Legend Veteran Subscriber

    Joined:
    Jul 7, 2008
    Messages:
    9,966
    Likes Received:
    4,561
    #1055 ToTTenTranz, May 22, 2015
    Last edited: May 22, 2015
    elect, DieH@rd, Lightman and 2 others like this.
  16. LordEC911

    Regular

    Joined:
    Nov 25, 2007
    Messages:
    789
    Likes Received:
    74
    Location:
    'Zona
    Edit- Awww beat by 2mins. Guess I shouldn't have read the other posts...
     
  17. AlBran

    AlBran Ferro-Fibrous
    Moderator Legend

    Joined:
    Feb 29, 2004
    Messages:
    20,719
    Likes Received:
    5,815
    Location:
    ಠ_ಠ
    RADEDN
     
    homerdog likes this.
  18. lanek

    Veteran

    Joined:
    Mar 7, 2012
    Messages:
    2,469
    Likes Received:
    315
    Location:
    Switzerland
    Well, the O and D can look a bit similar, but it is just the typography used that makes them look close on the upper left edge.

     
    #1058 lanek, May 22, 2015
    Last edited: May 22, 2015
  19. Kaotik

    Kaotik Drunk Member
    Legend

    Joined:
    Apr 16, 2003
    Messages:
    8,183
    Likes Received:
    1,840
    Location:
    Finland
    Sebbbi isn't Repi, though. Sebbbi is Sebastian Aaltonen
     
  20. ToTTenTranz

    Legend Veteran Subscriber

    Joined:
    Jul 7, 2008
    Messages:
    9,966
    Likes Received:
    4,561
    Derp.
    I feel so stupid right now..
     