AMD Vega 10, Vega 11, Vega 12 and Vega 20 Rumors and Discussion

Discussion in 'Architecture and Products' started by Deleted member 13524, Sep 20, 2016.

  1. Anarchist4000

    Veteran

    Joined:
    May 8, 2004
    Messages:
    1,439
    Likes Received:
    359
    More fun to call the HBM a cache, since it's on the package, and then call an SSD your video memory. That way the marketing department can advertise 1TB of VRAM compared to Nvidia's 12GB in the case of Titan.
     
    Kej and Lightman like this.
  2. Gipsel

    Veteran

    Joined:
    Jan 4, 2010
    Messages:
    1,620
    Likes Received:
    264
    Location:
    Hamburg, Germany
    I'm not sure. I was wrong; I somehow misunderstood it at first glance.
     
    CarstenS likes this.
  3. Gipsel

    Veteran

    Joined:
    Jan 4, 2010
    Messages:
    1,620
    Likes Received:
    264
    Location:
    Hamburg, Germany
    They specifically talked about the throughput of one NCU compared to one traditional CU with 64 SPs. And doubling the number of SPs is the simple way of doing it. How they would feed a higher number of SPs and how they are organized, no idea. Could be dual issue to two separate vector ALUs each clock. Or something more out of the ordinary like dual issue to the same vALU and computing over 8 cycles to match a round robin scheme over 8 vALUs (I don't think this will happen) or something else. There is also the possibility they can somehow fuse certain combinations of ops in the scheduler and issue the fused ops (meaning the higher throughput is only usable for relatively specific cases). This is still unknown right now.
     
  4. seahawk

    Regular

    Joined:
    May 18, 2004
    Messages:
    511
    Likes Received:
    141
    If you put more SPs into the CU, the Truck graphic makes even more sense.
     
  5. Arnold Beckenbauer

    Veteran Subscriber

    Joined:
    Oct 11, 2006
    Messages:
    1,756
    Likes Received:
    722
    Location:
    Germany
    entity279 likes this.
  6. Gipsel

    Veteran

    Joined:
    Jan 4, 2010
    Messages:
    1,620
    Likes Received:
    264
    Location:
    Hamburg, Germany
    Up to 11 triangles get clipped/rejected per clock I guess. Up to now, the geometry throughput of AMD GPUs doesn't change that much depending on the visibility of the triangles.
    [IMG]
     
    Razor1 likes this.
  7. Anarchist4000

    Veteran

    Joined:
    May 8, 2004
    Messages:
    1,439
    Likes Received:
    359
    Edited my response earlier, but it could be FMA4 style instructions with 4 operands. With all the packed math being performed that would make a lot of sense.

    EDIT: It would also work well with that scalar-per-SIMD design I was theorizing. When not using the 4th operand, it could feed 16x4 scalar registers into L0 registers for a scalar. Translating the opcodes to do that shouldn't be difficult. Bulldozer had the FMA4 instructions, and I think GCN had the extra operands, but they were used to feed the single scalar or move data around.
     
    #547 Anarchist4000, Jan 5, 2017
    Last edited: Jan 5, 2017
  8. xEx

    xEx
    Veteran

    Joined:
    Feb 2, 2012
    Messages:
    1,060
    Likes Received:
    543
    the ve.ga site is down right now :runaway:
     
  9. Arnold Beckenbauer

    Veteran Subscriber

    Joined:
    Oct 11, 2006
    Messages:
    1,756
    Likes Received:
    722
    Location:
    Germany
    Not for me. ~ 50 minutes.
     
  10. sebbbi

    Veteran

    Joined:
    Nov 14, 2007
    Messages:
    2,924
    Likes Received:
    5,296
    Location:
    Helsinki, Finland
    About time! I have been fixing and improving console GCN2 cache management code for the past two weeks. I am happy to hear that L2 handles ROPs now as well (even if there are still some tiny L1 ROP caches). Much less L2 flushing needed. Should be good for async compute as well :)
     
    Alexko, chris1515, gfoyle and 7 others like this.
  11. seahawk

    Regular

    Joined:
    May 18, 2004
    Messages:
    511
    Likes Received:
    141
    Yep, all those changes point to CUs with more SPs. 32CU @ 128SP - anybody?
     
  12. Urian

    Regular

    Joined:
    Aug 23, 2003
    Messages:
    622
    Likes Received:
    55
    I believe the CU stays the same; the 128 ops come from FMADD (2 ops per component).
     
    Ryan Smith likes this.
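    The two readings of the "128 ops per clock" figure discussed above can be put side by side as a back-of-the-envelope sketch. The 64-SP baseline and the FMA-counts-as-2-ops convention come from the posts; the function name is only illustrative, and the 128-SP case is pure speculation, not anything confirmed.

```python
def ops_per_clock(lanes: int, ops_per_lane: int = 2) -> int:
    """Peak ops per clock for one CU, counting an FMA as 2 ops (multiply + add)."""
    return lanes * ops_per_lane


# Reading 1: the CU keeps 64 SPs and "128" is simply FMA counted as 2 ops.
same_cu = ops_per_clock(64)

# Reading 2 (speculative): the CU is doubled to 128 SPs; by the same
# counting convention that would be quoted as 256 ops, not 128.
doubled_cu = ops_per_clock(128)

print(same_cu, doubled_cu)
```

    Under this counting, a figure of 128 is what an unchanged 64-SP CU would already deliver with FMA, so the number alone would not distinguish the two readings unless the marketing figure counts something other than FP32 FMA.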
  13. Rootax

    Veteran

    Joined:
    Jan 2, 2006
    Messages:
    2,401
    Likes Received:
    1,845
    Location:
    France
    I'm more curious about ROPs & geometry (the strange 11 triangles instead of 4 on Fiji seems nice). Fiji was already a "compute" monster imo...
     
  14. xEx

    xEx
    Veteran

    Joined:
    Feb 2, 2012
    Messages:
    1,060
    Likes Received:
    543
    again :cry:
     
  15. revan

    Newcomer

    Joined:
    Nov 9, 2007
    Messages:
    55
    Likes Received:
    18
    Location:
    look in the sunrise ..will find me
  16. Malo

    Malo Yak Mechanicum
    Legend Subscriber

    Joined:
    Feb 9, 2002
    Messages:
    8,931
    Likes Received:
    5,533
    Location:
    Pennsylvania
    Is there a live stream to the event? It just started.
     
  17. xEx

    xEx
    Veteran

    Joined:
    Feb 2, 2012
    Messages:
    1,060
    Likes Received:
    543
  18. pTmdfx

    Regular

    Joined:
    May 27, 2014
    Messages:
    417
    Likes Received:
    381
    It is quite interesting that they turn the local graphics memory into a cache. But it remains to be seen whether it is page tables plus software magic (like the Linux VMM) or a real hardware cache.
     
    #558 pTmdfx, Jan 5, 2017
    Last edited: Jan 5, 2017
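    To make the "page table and software magic" option concrete, here is a toy sketch of local memory treated as a page-granular cache in front of a larger backing store. Everything in it is an assumption for illustration: the class name, the 4096-byte page size, and the LRU eviction policy are hypothetical, and nothing is known here about how Vega actually does this.

```python
from collections import OrderedDict


class PagedLocalMemory:
    """Toy model: local memory holds a fixed number of pages; misses fault a page in."""

    def __init__(self, capacity_pages: int, page_size: int = 4096):
        self.capacity_pages = capacity_pages
        self.page_size = page_size
        # Resident page numbers, ordered from least to most recently used.
        self.resident = OrderedDict()
        self.hits = 0
        self.misses = 0

    def access(self, address: int) -> bool:
        """Touch an address; return True on a hit, False on a miss (page fault)."""
        page = address // self.page_size
        if page in self.resident:
            self.resident.move_to_end(page)  # mark as most recently used
            self.hits += 1
            return True
        self.misses += 1
        if len(self.resident) >= self.capacity_pages:
            self.resident.popitem(last=False)  # evict the least recently used page
        self.resident[page] = True
        return False


mem = PagedLocalMemory(capacity_pages=2)
for addr in (0, 4096, 0, 8192, 4096):
    mem.access(addr)
print(mem.hits, mem.misses)
```

    A real hardware cache would do the equivalent lookup with tags at a much finer granularity and without driver involvement; the sketch only shows the residency bookkeeping that a software-managed scheme would need.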
  19. Malo

    Malo Yak Mechanicum
    Legend Subscriber

    Joined:
    Feb 9, 2002
    Messages:
    8,931
    Likes Received:
    5,533
    Location:
    Pennsylvania
    ah ok
    WTF is the use of having a countdown on ve.ga? So they could load that web page (which was being hammered) at the CES event instead of running something local?
     
  20. SimBy

    Regular

    Joined:
    Jun 21, 2008
    Messages:
    700
    Likes Received:
    391
    Is the size (520-530 mm²) double confirmed?
     