AMD: Navi Speculation, Rumours and Discussion [2019]

Discussion in 'Architecture and Products' started by Kaotik, Jan 2, 2019.

  1. Entropy

    Veteran

    Joined:
    Feb 8, 2002
    Messages:
    3,072
    Likes Received:
    1,035
    Couldn’t find it either. One would hope Radeon VII, or the efficiency numbers will be dominated by lithography and not architecture. (And efficiency is subject to large change merely by changing clocks and voltages around, so God knows what that will mean for real products.)
    IPC is difficult when actual loads are strongly influenced by the memory solution. But they did mention their methods for achieving that, so there is probably something to it.
     
    #481 Entropy, May 27, 2019
    Last edited: May 27, 2019
    Lightman likes this.
  2. Frenetic Pony

    Regular Newcomer

    Joined:
    Nov 12, 2011
    Messages:
    346
    Likes Received:
    89
    A few quick mental estimates put a 40 CU card roughly level with a 2070 on titles like FFXV and Total War/hammer 2, at least at 4K, and those are Nvidia-favored titles. If it's the same clockspeed as a Radeon VII then a bit below; if it's, say, 2 GHz, which seems doable, it could be a bit above.

    Of course AMD cards have done much better when the limit is sheer pixel throughput rather than other overhead like geometry, where Nvidia has clearly won at lower resolutions. Whether this has been improved at all will have to wait, same for any word on raytracing capability.

    Any estimates of the die size from that shot? Power for the same performance dropping to half is a giant improvement even if it is just from 14nm Vega. Assuming it's not as inefficient as a Radeon VII then AMD could easily fit a "big Navi" 80 CU card next year.
     
    #482 Frenetic Pony, May 27, 2019
    Last edited: May 27, 2019
  3. itsmydamnation

    Veteran Regular

    Joined:
    Apr 29, 2007
    Messages:
    1,301
    Likes Received:
    397
    Location:
    Australia
    https://www.amd.com/en/press-releas...ion-leadership-products-computex-2019-keynote
     
  4. CarstenS

    Veteran Subscriber

    Joined:
    May 31, 2002
    Messages:
    4,803
    Likes Received:
    2,064
    Location:
    Germany
    I think with Rys working where he does, he'd have notified B3D-Suite users if those tests were blatantly misleading wrt any architecture in general and GCN specifically.

    That out of the way, the increase from random to black is not the all-telling value. For Radeon VII with its massive raw bandwidth, there's virtually no increase in throughput compared to using the texture caches over multiple runs. Whereas the V64 LCE, which can clock almost as high over short-term loads but has less than half the bandwidth available, can achieve a factor of 1.4 for 1 texture layer and 1.3 for 2.

    For GeForce cards, the lowly 1030 (G5 version) can achieve the highest jumps from random to all black with about 3.8, where a 2060 FE hovers at 2.1 (max). With a higher number of texture layers (up to 32), the increase goes massively down for the 1030, notably down for Turing in general and only very little for bigger Pascal chips (GP104, GP102). And no, I don't have a link handy, since I did not publish those results anywhere yet.
     
  5. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    10,873
    Likes Received:
    767
    Location:
    London
    Typically the reduction in ALU throughput is more severe than in other aspects (fillrate, bandwidth) for the salvage GPU...

    Does it need to be "balanced"...

    Agreed.

    Now imagine how a software renderer would do this. e.g. threads spawned across arbitrary processors to match workload. With data locality being a parameter of the spawn algorithm. And cost functions tuned to the algorithms.

    The problem with hardware is partly that the data pathways have to be defined in advance for all use cases and have to be sized for a specified peak workload. So the data structures are fixed, the buffer sizes are fixed and the ratios of maximum throughput are fixed.

    This would be similar to how unified shader architectures took over. To do that, substantial transistor budget was spent, but the rewards in performance were unignorable. Despite the fact that the hardware was no longer optimised specifically for the API of that time. Remember, for example, how vertex ALUs and pixel ALUs were different in their capability (ISA) before unified GPUs took over? (Though the API itself was changing to harmonise VS and PS instructions.)

    In the end, primitive culling has been identified as a serious bottleneck in historical AMD GPUs. Loosening that bottleneck costs in complexity and potentially requires that the "smallest" GPUs (say $75) are comparatively "over-sized" compared with their predecessors. But that's been the story of the GPU for a long time. It's a bit like the smallest GPUs being able to do 8x MSAA, when they would crumble if actually run that way.
     
    anexanhume likes this.
  6. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    10,873
    Likes Received:
    767
    Location:
    London
    Wild arsed guess: AMD will re-introduce odd and even hardware threads...

    Agreed.
     
  7. ToTTenTranz

    Legend Veteran Subscriber

    Joined:
    Jul 7, 2008
    Messages:
    10,073
    Likes Received:
    4,651
    Is anyone trying to measure that Navi's size?
    Seems to be a direct replacement for Polaris 10, something between 200mm^2 and 250mm^2.


    I'm actually betting on Vega 14nm parts as comparison.

    If they were comparing to Radeon VII, then the +50% efficiency comparison would be much closer to the +25% IPC.

    My guess is the architectural improvements are coming at very low power cost, so they're getting:
    - x1.25 more performance out of new cache hierarchy and CU adjustments
    - x1.2 higher clocks out of 14nm -> 7nm transition (which is the clock difference between a 1.45GHz Vega 64 and a 1.75GHz Radeon VII)

    1.25 x 1.2 = 1.5, hence the 50% higher efficiency, or higher performance at ISO power.

    There's also an update at Anandtech quoting Lisa Su who said the new efficiency comes partly from new process technologies.

    I think it's a 40 CU / 2560sp part running at 1.75GHz, with a power budget close to 200W.
    It should get Vega 64 performance at 300W / 1.5 = 200W.
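
    The arithmetic above can be checked with a quick sketch. It uses only the figures quoted in this post; the function names are made up for illustration, and treating the clock uplift as a pure performance-per-watt gain at equal power is the post's simplifying assumption, not a measured result.

```python
# Rough check of the perf/W estimate, using only figures quoted in the post.
# Function names are illustrative; none of this comes from AMD.

ARCH_GAIN = 1.25             # claimed per-clock uplift from the new architecture
VEGA64_CLOCK_GHZ = 1.45      # 14nm Vega 64 clock quoted in the post
RADEON_VII_CLOCK_GHZ = 1.75  # 7nm Radeon VII clock quoted in the post
VEGA64_POWER_W = 300.0       # Vega 64 board power assumed in the post


def clock_gain(old_ghz: float, new_ghz: float) -> float:
    """Clock uplift attributed to the 14nm -> 7nm transition."""
    return new_ghz / old_ghz


def combined_gain(arch_gain: float, node_gain: float) -> float:
    """Assumes the two gains are independent and simply multiply."""
    return arch_gain * node_gain


def iso_performance_power(baseline_w: float, perf_per_watt_gain: float) -> float:
    """Power needed to match the baseline's performance at the new perf/W."""
    return baseline_w / perf_per_watt_gain


node_gain = clock_gain(VEGA64_CLOCK_GHZ, RADEON_VII_CLOCK_GHZ)
total_gain = combined_gain(ARCH_GAIN, node_gain)
navi_power_w = iso_performance_power(VEGA64_POWER_W, total_gain)

print(f"node gain:  {node_gain:.2f}x")      # about 1.21x, rounded to 1.2x in the post
print(f"total gain: {total_gain:.2f}x")     # about 1.51x
print(f"power for Vega 64 performance: {navi_power_w:.0f} W")  # about 199 W
```

    With the rounded 1.2x clock figure the result is exactly 1.5x and 200W, matching the numbers above.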
     
    Lightman and Entropy like this.
  8. Ryan Smith

    Regular

    Joined:
    Mar 26, 2010
    Messages:
    611
    Likes Received:
    1,052
    Location:
    PCIe x16_1
    Best guess from Andrei Frumusanu based on some other photos we have: 275mm2, +/- 5mm2.
    For power efficiency, it would seem to be against a 14nm Vega 10 product. Lisa's specific words on the subject:

    "And then, when you put that together, both the architecture – the design capability – as well as the process technology, we're seeing 1.5x or higher performance per watt capability on the new Navi products" (emphasis mine)
     
  9. anexanhume

    Veteran Regular

    Joined:
    Dec 5, 2011
    Messages:
    1,551
    Likes Received:
    736
    I see a lot of problems with the latter statement. We know Radeon 7 improves on Vega 14nm performance 1.25X iso power due to AMD slides and benchmarks. If Navi is 1.25X better than Vega architecture wise, it should be 1.25X times 1.25X (1.56X) faster than Vega 14nm on a per watt basis. This implies a performance per watt regression on node compared to Radeon 7’s gains. What’s the reason for this? More power dedicated to memory, proportionally?
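
    To make the mismatch concrete, here is a small sketch using only the multipliers quoted in this thread. The variable names are illustrative, and it assumes the node and architecture gains multiply independently. Note that the 1.5x figure was quoted as "1.5x or higher", so it is a floor rather than an exact value.

```python
# Compares the perf/W gain implied by Radeon VII's node uplift with the
# 1.5x figure quoted for Navi. All inputs are figures quoted in the thread.

NODE_GAIN_RADEON_VII = 1.25  # Radeon VII vs 14nm Vega at iso power
ARCH_GAIN_NAVI = 1.25        # claimed per-clock gain for Navi
CLAIMED_GAIN_NAVI = 1.5      # "1.5x or higher" perf/W, so a lower bound

# What you would expect if Navi kept Radeon VII's full node gain.
expected_gain = NODE_GAIN_RADEON_VII * ARCH_GAIN_NAVI

# Node gain implied if the claim were exactly 1.5x.
implied_node_gain = CLAIMED_GAIN_NAVI / ARCH_GAIN_NAVI

# Fraction of the expected gain that is missing at exactly 1.5x.
shortfall = 1.0 - CLAIMED_GAIN_NAVI / expected_gain

print(f"expected:          {expected_gain:.4f}x")      # 1.5625x
print(f"implied node gain: {implied_node_gain:.2f}x")  # 1.20x
print(f"shortfall:         {shortfall:.1%}")           # 4.0%
```

    So the gap is about 4% if the claim is taken as exactly 1.5x, and it disappears entirely if "or higher" means 1.56x or more.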
     
  10. Globalisateur

    Globalisateur Globby
    Veteran Regular

    Joined:
    Nov 6, 2013
    Messages:
    3,011
    Likes Received:
    1,727
    Location:
    France
    It must be bigger than 40 CU if it has a size of 275mm2. No?
     
  11. anexanhume

    Veteran Regular

    Joined:
    Dec 5, 2011
    Messages:
    1,551
    Likes Received:
    736
    Probably. Polaris fits 36 CU in 232 mm^2 on 14nm.

    That would be some big CU growth if it wasn’t 48 CU or more, IMO. Vega 7nm fits 64 CU in 330 mm^2. I’m assuming the 4096-bit HBM interface is at least as big as a 256-bit GDDR6 interface here.
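
    A crude way to put numbers on this is whole-die area divided by CU count. This is only a sketch: it lumps memory interfaces, caches and front end into the per-CU figure, the function names are made up, and the 275 mm^2 Navi size is the estimate quoted earlier in the thread, not a confirmed number.

```python
# Crude die-area-per-CU comparison using the die sizes quoted in the thread.
# Whole-die area is divided by CU count, so uncore area is lumped in.


def area_per_cu(die_mm2: float, cu_count: int) -> float:
    """Whole-die area per CU in mm^2."""
    return die_mm2 / cu_count


def cus_at_density(die_mm2: float, mm2_per_cu: float) -> float:
    """CU count a die would hold at a given whole-die area per CU."""
    return die_mm2 / mm2_per_cu


polaris_mm2_per_cu = area_per_cu(232.0, 36)  # Polaris 10 on 14nm
vega7_mm2_per_cu = area_per_cu(330.0, 64)    # Vega on 7nm
NAVI_DIE_MM2 = 275.0                         # estimate quoted in the thread

navi_cus_at_vega7_density = cus_at_density(NAVI_DIE_MM2, vega7_mm2_per_cu)
navi_mm2_per_cu_if_40 = area_per_cu(NAVI_DIE_MM2, 40)
growth_vs_vega7 = navi_mm2_per_cu_if_40 / vega7_mm2_per_cu

print(f"Polaris 10: {polaris_mm2_per_cu:.2f} mm^2/CU")  # about 6.44
print(f"Vega 7nm:   {vega7_mm2_per_cu:.2f} mm^2/CU")    # about 5.16
print(f"Navi CUs at Vega 7nm density: {navi_cus_at_vega7_density:.1f}")  # about 53.3
print(f"Navi at 40 CU: {navi_mm2_per_cu_if_40:.2f} mm^2/CU, "
      f"{growth_vs_vega7:.2f}x Vega 7nm")               # about 6.88, 1.33x
```

    At Vega 7nm density a 275 mm^2 die would hold roughly 53 CU, so a 40 CU part would mean about a third more area per CU.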
     
  12. del42sa

    Newcomer

    Joined:
    Jun 29, 2017
    Messages:
    169
    Likes Received:
    90
    It could be like 56 CU or something like that. And I do agree, from that point of view it doesn't look that great...
     
  13. yuri

    Newcomer

    Joined:
    Jun 2, 2010
    Messages:
    184
    Likes Received:
    152
    There are probably major changes in the cache hierarchy and also the CU organization. On top of that there is AMD's first-gen GDDR6 interface. This might imply changes in perf/mm2, right?
     
    Lightman likes this.
  14. McHuj

    Veteran Regular Subscriber

    Joined:
    Jul 1, 2005
    Messages:
    1,449
    Likes Received:
    567
    Location:
    Texas
    Why? We don’t know what area impact the new enhancements had. I bet they increased on-chip cache significantly.
     
  15. anexanhume

    Veteran Regular

    Joined:
    Dec 5, 2011
    Messages:
    1,551
    Likes Received:
    736
    I’m actually hoping for some CU footprint growth. Nvidia has made it clear you can be bigger and more efficient. Apple has shown this too with mobile CPUs. Die cost hurts, yes, but when a better arch comes out the other end, I’ll take it.
     
  16. Malo

    Malo Yak Mechanicum
    Legend Veteran Subscriber

    Joined:
    Feb 9, 2002
    Messages:
    7,120
    Likes Received:
    3,182
    Location:
    Pennsylvania
    From the Anandtech article it seems to be compared to Vega 14nm.
     
    del42sa likes this.
  17. del42sa

    Newcomer

    Joined:
    Jun 29, 2017
    Messages:
    169
    Likes Received:
    90
    Vega already presented a big change in cache organization with an enormous 45MB of SRAM on die. Adding even more with Navi seems highly unlikely...
     
  18. ToTTenTranz

    Legend Veteran Subscriber

    Joined:
    Jul 7, 2008
    Messages:
    10,073
    Likes Received:
    4,651
    Not if they're significantly increasing cache sizes, front-end width and using a larger proportion of higher-frequency / lower-density transistors.
     
  19. Bondrewd

    Regular Newcomer

    Joined:
    Sep 16, 2017
    Messages:
    523
    Likes Received:
    240
    They're being purposefully vague, for whatever reason.
     
  20. TheAlSpark

    TheAlSpark Moderator
    Moderator Legend

    Joined:
    Feb 29, 2004
    Messages:
    20,814
    Likes Received:
    5,915
    Location:
    ಠ_ಠ
    For all the times she's gone on stage, you'd think we'd just have the physical measurements for Lisa's hand for easier size analyses. :wink2:
     
    Kej, Lightman, Ryan Smith and 6 others like this.