AMD: Navi Speculation, Rumours and Discussion [2019]

Discussion in 'Architecture and Products' started by Kaotik, Jan 2, 2019.

  1. Entropy

    Veteran

    Joined:
    Feb 8, 2002
    Messages:
    3,243
    Likes Received:
    1,244
    This is true.
My concern is with the “trickling down the stack”, because what has allowed this in the past is lithographic advances. But as those come ever more slowly, particularly for high-power chips, that trickling may simply dry up if the new technology doesn’t also offer increased efficiency.
There is nothing wrong with having part of the lighting optionally computed using raytracing for high-end PC graphics! But then again, if confined to that niche, its overall impact on real-time graphics in the industry will be modest.
     
  2. Cat Merc

    Newcomer

    Joined:
    May 14, 2017
    Messages:
    128
    Likes Received:
    114
    It's not, according to Robert Hallock of AMD:
     
    snarfbot, Kaotik and Lightman like this.
  3. OlegSH

    Regular Newcomer

    Joined:
    Jan 10, 2010
    Messages:
    410
    Likes Received:
    406
Yeah, sounds ridiculous. Saying they will release RT when it's free is like saying they will never release it :-D
     
  4. Ike Turner

    Veteran Regular

    Joined:
    Jul 30, 2005
    Messages:
    2,075
    Likes Received:
    2,210
Well, she literally never said that IIRC... so... yeah, W0lfram has his own way of "expressing" himself...
     
    Konan65, Lightman and no-X like this.
  5. Malo

    Malo Yak Mechanicum
    Legend Veteran Subscriber

    Joined:
    Feb 9, 2002
    Messages:
    7,860
    Likes Received:
    4,039
    Location:
    Pennsylvania
Y'all just need to stop replying to his nonsense.
     
    Picao84, sir doris, del42sa and 3 others like this.
  6. DmitryKo

    Regular

    Joined:
    Feb 26, 2002
    Messages:
    829
    Likes Received:
    878
    Location:
    55°38′33″ N, 37°28′37″ E
There could be a lot of reasons why AMD decided to skip the initial 'tier 1_0' hardware implementation. They could 1) be working on preliminary 'tier 2_0' specs which offer performance benefits, 2) be researching improved heterogeneous integration options that allow faster on-die memory and multi-die interconnects; or 3) just be waiting for game developers to learn the API and optimize their software paths. In any case, their first-generation implementation could actually be faster than the competition's current one.
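
For reference, this is roughly how an application gates its RT path on the reported DXR tier; a minimal C++ sketch against the public D3D12 API ('tier 1_0' is the only hardware tier defined in the SDK as of this writing, so a hypothetical 'tier 2_0' would presumably just extend the same enum):

#include <d3d12.h>

// Returns true if the device reports full DXR hardware support.
bool SupportsHardwareRaytracing(ID3D12Device* device)
{
    D3D12_FEATURE_DATA_D3D12_OPTIONS5 opts5 = {};
    if (FAILED(device->CheckFeatureSupport(D3D12_FEATURE_D3D12_OPTIONS5,
                                           &opts5, sizeof(opts5))))
        return false;
    // NOT_SUPPORTED also covers devices that only run the compute fallback layer.
    return opts5.RaytracingTier >= D3D12_RAYTRACING_TIER_1_0;
}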

One can interpret their comments about the need to 'get the ecosystem ready' as primarily relying on case 3). That said, AMD wouldn't really admit to cases 1) and 2); they have been very tight-lipped about their future plans recently, probably owing to multiple delays with Navi.
They probably already taped out a hardware raytracing implementation, but it's designed for a $200 mid-level APU in a $500 game console - so it would indeed be 'most basic' compared to $1000-plus GPUs which still struggle to provide acceptable performance levels.

    I'd rather take their mid-2020 implementation designed for a high-end desktop GPU.
     
    #1166 DmitryKo, Jul 3, 2019
    Last edited: Jul 4, 2019
  7. snarfbot

    Regular Newcomer

    Joined:
    Apr 23, 2007
    Messages:
    633
    Likes Received:
    203
Yea, it's definitely something I don't think I would personally use, preferring double the frame rate or a higher resolution. But going forward, yea, it would be a nice-to-have option, not necessarily a game changer, yet.
     
  8. Rootax

    Veteran Newcomer

    Joined:
    Jan 2, 2006
    Messages:
    1,634
    Likes Received:
    1,007
    Location:
    France
I wonder how much die area AMD's RT implementation will take. With Navi, which is a "small" chip on 7nm, it seems they have a hard time competing efficiency/power wise with Nvidia's big TU104 / TU106, which are still on 12nm (based on what they said, but I concede we need to wait for the reviews to be sure of that). 7nm won't save them if the implementation is not efficient...
     
  9. DavidGraham

    Veteran

    Joined:
    Dec 22, 2009
    Messages:
    3,363
    Likes Received:
    3,735
That assumes the competition would rest on their laurels and do nothing to improve their current RT solution, which likely won't happen, as NVIDIA will push their RT angle to the extremes.
And I would rather take a CPU from 2025 paired with a GPU from the same era, but we are talking about the here and now.
     
  10. Entropy

    Veteran

    Joined:
    Feb 8, 2002
    Messages:
    3,243
    Likes Received:
    1,244
    There are a number of different efficiency metrics.
Let's never forget that in the desktop market, by far the most significant is performance/$. AMD is pretty much competitive here, and differences will be small between the manufacturers unless one player in the duopoly makes a major push for market share.
If you're a manufacturer, you have reason to care about performance/mm2, since you pay for wafer starts. This is modified by different cost structures for different nodes, but even before Navi AMD did quite well. They of course have a distinct advantage in using a denser node, pitching their 251mm2 Navi against Nvidia's TU104 (RTX 2070 Super) at 545mm2 and TU106 (RTX 2070) at 445mm2.
To factor out the process and evaluate architectural efficiency, you would look at performance/gate instead, where the same chips have 10.3, 13.6 and 10.8 billion transistors respectively. AMD is definitely competitive.
And of course you have performance/W, which is a tricky one because it changes so drastically with frequency and voltage in the relevant intervals. This is also where the lack of test data for the RX 5700 makes comparisons difficult at the moment, but that will be rectified within days. It matters mostly at the extreme limit of cooling ability. For mid-range products, the current manufacturer scheme unfortunately seems to be to push the chips as far as they will go on 200W or so of power, which is possible to cool at reasonable cost.

Looking at the overall picture, and bearing in mind that independent test data is lacking, AMD and Nvidia seem to be within spitting distance of each other, apart from performance/mm2, which is mostly a manufacturing concern that isn't critical in the midrange. It will be interesting to see what Nvidia achieves once they move to finer lithography.
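
To put rough numbers on the density gap (figures as quoted above, so treat them as approximate public numbers, not authoritative), a quick back-of-the-envelope in C++:

#include <cstdio>

int main()
{
    // Die size (mm2) and transistor count (millions) as quoted above.
    struct Chip { const char* name; double mm2; double mtr; };
    const Chip chips[] = {
        { "Navi 10 (RX 5700 XT)",   251.0, 10300.0 },
        { "TU104 (RTX 2070 Super)", 545.0, 13600.0 },
        { "TU106 (RTX 2070)",       445.0, 10800.0 },
    };
    for (const Chip& c : chips)
        std::printf("%-24s %5.1f MTr/mm2\n", c.name, c.mtr / c.mm2);
    // ~41 MTr/mm2 on 7nm vs ~25 on 12nm: most of the per-mm2 advantage
    // is the node itself, which is why performance/gate is the fairer
    // architectural comparison.
    return 0;
}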
     
    milk, no-X, Tkumpathenurpahl and 2 others like this.
  11. Xbat

    Veteran Newcomer

    Joined:
    Jan 31, 2013
    Messages:
    1,534
    Likes Received:
    1,158
    Location:
    A farm in the middle of nowhere
Yeah, and I'd like a CPU and GPU combo from 2030. That's a bullshit comparison. Waiting one year doesn't equate to waiting five.
     
  12. DavidGraham

    Veteran

    Joined:
    Dec 22, 2009
    Messages:
    3,363
    Likes Received:
    3,735
Two years in this case (starting from 2018), likely more.
     
  13. Xbat

    Veteran Newcomer

    Joined:
    Jan 31, 2013
    Messages:
    1,534
    Likes Received:
    1,158
    Location:
    A farm in the middle of nowhere
Yes, from a hardware perspective. But was there any software really driving the need for RTX in 2018? Is there any now?
     
  14. Ethatron

    Regular Subscriber

    Joined:
    Jan 24, 2010
    Messages:
    874
    Likes Received:
    278
It looks like it does, indirectly, through occupancy considerations. My interpretation is that the compiler is constrained to splitting the 256-VGPR budget (4x64) between waves:
    256/1 = 256 (4x64, 1 wave)
    256/2 = 128 (4x32, 2 wave)
    256/3 = 84 (4x21, 3 wave)
    256/4 = 64 (4x16, 4 wave)
    256/5 = 48 (4x12, 5 wave)
    etc.
As you can see, there is no divisor between 1 and 2 that would yield 4x48 VGPRs. The compiler then decides to maximize register use within the occupancy bin.
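
A trivial sketch of that binning arithmetic (my reading of the numbers above, not anything from AMD's docs): the per-SIMD budget of 64 is divided by the wave count and multiplied back over the 4 SIMDs.

#include <cstdio>

int main()
{
    // Occupancy bins for a 256-VGPR budget split as 4x64:
    // each additional wave divides the per-SIMD budget of 64.
    for (int waves = 1; waves <= 8; ++waves)
        std::printf("%d wave(s): 4 x %2d = %3d VGPRs\n",
                    waves, 64 / waves, 4 * (64 / waves));
    return 0;
}

A shader that would be happy with, say, 192 VGPRs already costs you the 1-wave bin, so the compiler may as well use the full 256.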
     
  15. Rootax

    Veteran Newcomer

    Joined:
    Jan 2, 2006
    Messages:
    1,634
    Likes Received:
    1,007
    Location:
    France

    I really like your post, it's very informative.

But IMO, it's much simpler: Navi is a 250mm2 chip on 7nm at 200W+ for the XT version. Nvidia is doing bigger, faster, on a bigger node, in the same envelope (or less)...
     
    del42sa and pharma like this.
  16. JoeJ

    Regular Newcomer

    Joined:
    Apr 1, 2018
    Messages:
    990
    Likes Received:
    1,122
    We can make some assumptions from the AMD RT patent, and from what we can guess about NV:



AMD uses TMUs to process one iteration of the RT loop, which means the shader program issues an instruction to intersect the ray against one level of BVH box / triangle nodes per step. A compute shader would look like this (simplified):

queue.push(BVH_root_node)
while (!queue.empty())
{
    // TMU tests the ray against one node; box hits may push child
    // nodes back onto the queue (perhaps implemented using LDS memory)
    hit = TMU.intersect(ray, queue)
    // keep the nearest triangle hit by ray distance t, not by ID
    if (hit.type == triangle && hit.t < closestHit.t) closestHit = hit
}
store closestHit for later processing...

This means the shader is busy while raytracing, but there is also flexibility in programming (it could terminate if traversal takes too long, maybe really interesting things...)



On NV it more likely looks just like this:

    intersection_info = RT.Core.FindClosestIntersection(ray);

Which means the shader core likely becomes available to other pending tasks after this command (like hit point shading, or async compute, ...).
Also, we have no indication NV's RT cores would use the TMUs or share the cache to access the BVH.



The conclusion is that NV's RT is likely faster but takes more chip area. AMD likely again offers more general compute performance, which could compensate for this.
But it could also happen that AMD adds a FF unit to process the outer loop I have written above; the patent mentions this as optional. Still, fetching textures while raytracing would compromise perf more than on NV - maybe. (The patent mentions that the advantage of sharing TMUs / VGPRs is avoiding the need for specialized large buffers to hold BVH or ray payload data.)

It will become very interesting to compare performance, and to see what programming flexibility (if any) can add...


My bet (better said, my hope) is that the next logical step would be to make BVH generation more flexible.
For example, if they want to be compatible with mesh shaders, they just have to make this dynamic.
This would be awesome because it solves the LOD limitation. (I would not even care if BVH generation becomes FF too :) )

After that it would make sense to decrease ROPs and increase RT cores, up to the point where rasterization is implemented only with compute. (Texture filtering remains, ofc)

And only after that would I see a need for ray reordering. (Which I initially thought to be FF already now, and the assumed complexity was a main reason to question RTX.)
     
    #1176 JoeJ, Jul 4, 2019
    Last edited: Jul 4, 2019
    Rootax likes this.
  17. ToTTenTranz

    Legend Veteran Subscriber

    Joined:
    Jul 7, 2008
    Messages:
    11,302
    Likes Received:
    5,924
Here I'd say AMD did "well" in performance/mm^2 largely by sacrificing performance/watt a lot, and by investing in PCBs with higher-end voltage regulation components to push their graphics chips well beyond their ideal performance/watt curves. A fully enabled Polaris 11 has a 35W TDP in a MacBook Pro with ~85% of the performance of the desktop version that needs ~70W. A Vega 64 can be set to consume 180W at over 90% of its stock performance, but then it would perform consistently below a GTX 1080, and the marketing team couldn't have that happening.
But even then, Vega 10 had around 74% more transistors than GP104 (12.5B vs 7.2B), though that can partly be attributed to the fact that Vega 10 was designed for a multitude of loads (gaming, server, compute, etc.) and not just rasterization. And of course to the fact that AMD had been counting on Vega 10 reaching much higher clocks, probably closer to what Radeon VII hit on a new process.
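
Running those quoted figures through the math (the ~295W stock board power for the reference Vega 64 is the only number not cited above, so treat this as ballpark):

#include <cstdio>

int main()
{
    // Polaris 11: ~85% of desktop performance at 35W vs 100% at ~70W
    double p11 = (0.85 / 35.0) / (1.00 / 70.0);
    // Vega 64: ~90% of stock performance at 180W vs 100% at ~295W TBP
    double v64 = (0.90 / 180.0) / (1.00 / 295.0);
    std::printf("Polaris 11 mobile perf/W: %.2fx desktop\n", p11); // ~1.70x
    std::printf("Vega 64 tuned perf/W:     %.2fx stock\n", v64);   // ~1.48x
    return 0;
}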
     
  18. Entropy

    Veteran

    Joined:
    Feb 8, 2002
    Messages:
    3,243
    Likes Received:
    1,244
Well, that's the reality of the PC market, unfortunately. Reviewers have their part in this, I feel, since you often see "winners" declared even when the differences are minuscule, often with words like "dominates"/"crushes" and so on, even when describing what in actual gameplay would be imperceptible.
And that translates into market value. When another 15% performance is enough to shift your product into a different pricing bracket and correspondingly better margins, it's not surprising that the chips are pushed as far as the cooling allows.
I'm getting too old for the PC gaming market. It is geared towards the excitable youth.
     
  19. Kaotik

    Kaotik Drunk Member
    Legend

    Joined:
    Apr 16, 2003
    Messages:
    9,246
    Likes Received:
    3,191
    Location:
    Finland
It does improve performance, even if not by leaps and bounds, on high-end RTX cards too, so its benefits are not limited to weak hardware.
Considering the timeframes, it's pretty safe to assume that the consoles use next-gen RDNA instead of first-gen, and I think AMD already confirmed somewhere that 2nd-gen RDNA includes RT hardware (probably the TMU thing they patented).
     
    milk likes this.
  20. ToTTenTranz

    Legend Veteran Subscriber

    Joined:
    Jul 7, 2008
    Messages:
    11,302
    Likes Received:
    5,924
We could be looking at 3 different RTRT implementations: Sony, Microsoft Xbox, and PC AMD.
Though preferably it should be only one, for developers' sanity.
     