AMD: Navi Speculation, Rumours and Discussion [2019-2020]

Discussion in 'Architecture and Products' started by Kaotik, Jan 2, 2019.

Thread Status:
Not open for further replies.
  1. SimBy

    Regular

    Joined:
    Jun 21, 2008
    Messages:
    700
    Likes Received:
    391
    My bad, I thought we were comparing at 4K.
     
  2. I wouldn't look at the 5700XT's 4K results as a means to compare N22 with the 2080Ti. The former is clearly bandwidth limited in that case.

    At TPU, the difference between the two is 46% at 1440p and 35% at 1080p.

    Navi 22 doesn't need to be faster than the RTX3070, it just needs to be within 5-10% (biting at the heels) to leave the 3070 in an uncomfortable position, especially if it's significantly cheaper.
     
  3. Isn't the 256bit GDDR6 Navi 21 going head to head against an RTX 3080 with 320bit GDDR6X?

    Higher bandwidth effectiveness on RDNA2 should be the least surprising factor at this point.
     
    Lightman likes this.
  4. trinibwoy

    trinibwoy Meh
    Legend

    Joined:
    Mar 17, 2004
    Messages:
    12,059
    Likes Received:
    3,119
    Location:
    New York
    All we have is AMD's 4K teaser. I don't know how to extrapolate that to N22 class hardware.
     
    PSman1700 likes this.
  5. Erinyes

    Regular

    Joined:
    Mar 25, 2010
    Messages:
    808
    Likes Received:
    276
    And with 192 bit memory, it can offer 12 GB of VRAM against the 8 GB 3070, which is in a bit of a tight spot as it isn't an upgrade even after 4 years (compared to say GTX 1000 series or RX 400/500 series). I don't expect the 3070 to actually beat the 2080 Ti anyway, I'd expect it to be behind by 5-10%. It's gonna be a close fight.
    If anything, a 40CU part with a 192 bit interface should do even better than a 64/72/80 CU part with 256 bit.
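    For reference, a rough sketch of where the 12 GB figure comes from. It assumes one GDDR6 chip per 32-bit slice of the bus and 8 Gb or 16 Gb chips, and ignores clamshell setups; the function name is made up for illustration.

```python
def vram_capacity_gb(bus_width_bits, chip_density_gbit=16, chip_io_bits=32):
    """Capacity in GB with one GDDR6 chip per 32-bit slice of the bus."""
    chips = bus_width_bits // chip_io_bits
    return chips * chip_density_gbit / 8  # Gbit -> GB


for bus in (192, 256):
    small = vram_capacity_gb(bus, chip_density_gbit=8)
    large = vram_capacity_gb(bus, chip_density_gbit=16)
    print(f"{bus}-bit bus: {small:.0f} GB or {large:.0f} GB")
```

    So a 192 bit bus lands on 6 or 12 GB and a 256 bit bus on 8 or 16 GB under those assumptions.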
     
    Lightman likes this.
  6. troyan

    Regular

    Joined:
    Sep 1, 2015
    Messages:
    609
    Likes Received:
    1,142
    The 3080 has 70% more bandwidth and 48% more (boost) FP32 than the 3070, and it will be ~35% faster. I don't think that bandwidth is such a huge factor outside certain workloads like raytracing.
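    A quick way to read those numbers, as a sketch only: divide the relative performance by the relative resource. The percentages are the ones quoted in this post (the ~35% is an estimate, not a measurement) and the helper name is hypothetical.

```python
def scaling_efficiency(perf_gain, resource_gain):
    """Relative performance divided by relative resource (1.0 = perfect scaling)."""
    return (1 + perf_gain) / (1 + resource_gain)


perf_gain = 0.35  # estimated 3080 advantage over the 3070, from the post
print("vs FP32 (+48%):     ", round(scaling_efficiency(perf_gain, 0.48), 2))
print("vs bandwidth (+70%):", round(scaling_efficiency(perf_gain, 0.70), 2))
```

    Performance tracks neither resource one to one, and it tracks the bandwidth increase less closely than the FP32 increase.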
     
    PSman1700 likes this.
  7. nAo

    nAo Nutella Nutellae
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    4,400
    Likes Received:
    440
    Location:
    San Francisco
    Yup, but I was thinking about treelet leaves, not of the whole BVH. Regardless, it was just a way to say that there is not a strict need to read the BVH nodes twice
     
  8. trinibwoy

    trinibwoy Meh
    Legend

    Joined:
    Mar 17, 2004
    Messages:
    12,059
    Likes Received:
    3,119
    Location:
    New York
    Ok, it’s probably true that the shader doesn’t need to inspect the contents of the node in order to schedule it. But that doesn’t seem to be a notable benefit of shader based scheduling given it’s also the case for Nvidia’s fixed function approach.

    AMD’s patent calls for storing traversal state in registers and the texture cache. It would seem the shader is responsible for managing the traversal stack for each ray and that stack presumably lives in L0. I don’t see how you would avoid thrashing the cache if you try to do anything else alongside RT. Unless of course you have an “infinite” amount of cache :-o
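    To make the per-ray stack concrete, here is a minimal textbook-style sketch of stack-based BVH traversal. It is not AMD's patented scheme or any vendor's implementation; `Node`, `ray_hits_box` and `traverse` are hypothetical names, and the point is only that each ray carries its own stack whose peak depth determines how much state has to live somewhere close to the shader.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class Node:
    lo: Vec3                                        # AABB minimum corner
    hi: Vec3                                        # AABB maximum corner
    children: List["Node"] = field(default_factory=list)
    prims: List[int] = field(default_factory=list)  # primitive ids (leaves only)


def ray_hits_box(origin, direction, lo, hi):
    """Slab test: does the ray (t >= 0) overlap the box on every axis?"""
    t_near, t_far = 0.0, float("inf")
    for o, d, l, h in zip(origin, direction, lo, hi):
        if d == 0.0:
            # Ray is parallel to this slab: it must start inside it.
            if o < l or o > h:
                return False
            continue
        t0, t1 = (l - o) / d, (h - o) / d
        if t0 > t1:
            t0, t1 = t1, t0
        t_near, t_far = max(t_near, t0), min(t_far, t1)
    return t_near <= t_far


def traverse(root, origin, direction):
    """Return (candidate primitive ids, peak stack depth) for one ray."""
    stack = [root]          # the per-ray traversal stack
    peak_depth = 1
    candidates = []
    while stack:
        node = stack.pop()
        if not ray_hits_box(origin, direction, node.lo, node.hi):
            continue
        if node.prims:      # leaf: hand primitives to the intersection stage
            candidates.extend(node.prims)
        else:               # inner node: defer the children
            stack.extend(node.children)
            peak_depth = max(peak_depth, len(stack))
    return candidates, peak_depth


# Tiny two-leaf scene for illustration.
left = Node((0, 0, 0), (2, 1, 1), prims=[0])
right = Node((2, 0, 0), (4, 1, 1), prims=[1])
scene = Node((0, 0, 0), (4, 1, 1), children=[left, right])
print(traverse(scene, (-1.0, 0.5, 0.5), (1.0, 0.0, 0.0)))
```

    Multiply the peak depth by the number of rays in flight and you get a feel for the storage pressure the post is worried about.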
     
    Pete, PSman1700 and Lightman like this.
  9. trinibwoy

    trinibwoy Meh
    Legend

    Joined:
    Mar 17, 2004
    Messages:
    12,059
    Likes Received:
    3,119
    Location:
    New York
    Do better as in be less bandwidth bound?
     
    Pete and PSman1700 like this.
  10. Leoneazzurro5

    Regular

    Joined:
    Aug 18, 2020
    Messages:
    335
    Likes Received:
    348
    But how much are the GPUs bandwidth bound in reality? I mean, look at the projected performance level of the 3070 - same as 2080Ti even at 4K but with 73% of the latter's available bandwidth (and the same as the 5700XT). Rumors say that Navi21 will use 16 Gbps GDDR6 (about +14% bandwidth compared to 14 Gbps) and while I'm skeptical about this "magic cache" I am not so sure that everything about performance level can be described with the available bandwidth.
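    The bandwidth figures can be checked with a small sketch. Peak bandwidth is taken as bus width times per-pin data rate divided by 8; the bus widths and data rates below are the publicly listed specs (not stated in this thread), the 256 bit / 16 Gbps Navi 21 configuration is only the rumour discussed here, and the function name is made up.

```python
def bandwidth_gb_s(bus_width_bits, data_rate_gbps):
    """Peak memory bandwidth in GB/s from bus width and per-pin data rate."""
    return bus_width_bits * data_rate_gbps / 8


cards = {
    "RTX 2080 Ti": (352, 14),
    "RTX 3070": (256, 14),
    "RX 5700 XT": (256, 14),
    "Navi 21 (rumoured)": (256, 16),  # assumption: rumoured, not confirmed
}

for name, (bus, rate) in cards.items():
    print(f"{name}: {bandwidth_gb_s(bus, rate):.0f} GB/s")

ratio = bandwidth_gb_s(256, 14) / bandwidth_gb_s(352, 14)
gain = bandwidth_gb_s(256, 16) / bandwidth_gb_s(256, 14) - 1
print(f"3070 vs 2080 Ti: {ratio:.0%}, 16 vs 14 Gbps: +{gain:.1%}")
```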
     
    #3930 Leoneazzurro5, Oct 22, 2020
    Last edited: Oct 22, 2020
    BRiT and Jawed like this.
  11. iroboto

    iroboto Daft Funk
    Legend Subscriber

    Joined:
    Mar 6, 2014
    Messages:
    14,834
    Likes Received:
    18,634
    Location:
    The North
    Aren’t ROPs typically bandwidth limited except at lower precision? If you’re writing out a lot of buffers this could matter.
     
  12. Leoneazzurro5

    Regular

    Joined:
    Aug 18, 2020
    Messages:
    335
    Likes Received:
    348
    Yeah but in RDNA2 RBEs are not physically connected to RAM controllers, they are clients of L2 cache instead.
     
    Pete, w0lfram, Lightman and 1 other person like this.
  13. Erinyes

    Regular

    Joined:
    Mar 25, 2010
    Messages:
    808
    Likes Received:
    276
    Yes, the bandwidth to flops ratio should be the best for the 40CU part, then the 64,72 and 80CU parts respectively. Assuming that the rumoured bus widths of 192 bit and 256 bit are true of course.
    Agreed. And as we saw with the 5600XT, even with 25% less bandwidth, the performance hit was in the single-digit percentage range.
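    As a rough sketch of that ordering: with equal core clocks and equal memory speed (both assumptions), the bandwidth to flops ratio is proportional to bus bits per CU. The CU counts and bus widths are the rumoured ones from this thread, and the helper name is hypothetical.

```python
def bus_bits_per_cu(bus_width_bits, cu_count):
    """Bus width per CU, a stand-in for bandwidth per FLOP at equal clocks."""
    return bus_width_bits / cu_count


# (CU count, rumoured bus width in bits)
rumoured = [(40, 192), (64, 256), (72, 256), (80, 256)]

for cus, bus in rumoured:
    print(f"{cus} CU / {bus}-bit: {bus_bits_per_cu(bus, cus):.2f} bits per CU")
```

    The 40CU part comes out at 4.8 bits per CU against 3.2 for the 80CU part, so it has about 50% more bandwidth per unit of compute if the rumours hold.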
     
    Deleted member 13524 likes this.
  14. iroboto

    iroboto Daft Funk
    Legend Subscriber

    Joined:
    Mar 6, 2014
    Messages:
    14,834
    Likes Received:
    18,634
    Location:
    The North
    Curious, aren't all ROPs typically tied to caches, past and current gen?
    IIRC the difference with RDNA is that compute is now tied in with the L2 cache, whereas with GCN it went directly to the memory controller. But I think ROPs are unchanged.

    Ergo this older post by sebbbi:
    https://forum.beyond3d.com/posts/1934106/
    With respect to RDNA, however, it does look like they changed how the RBs access data.
     
    #3934 iroboto, Oct 22, 2020
    Last edited: Oct 22, 2020
    Pete likes this.
  15. Leoneazzurro5

    Regular

    Joined:
    Aug 18, 2020
    Messages:
    335
    Likes Received:
    348
    Well, we don't know if ROPs are unchanged. Someone pointed out the opposite here. But we will see. What I mean is that, being decoupled from the memory controllers by the L2 cache, the RBEs will be served first by the internal bandwidth, and external memory is accessed only if the data is not present in the internal cache. So yes, there will be some limitation due to bandwidth, but there are also techniques for lowering the bandwidth needs within certain limits (data compression, larger caches, and so on).
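    A very simplified sketch of that argument: if a fraction of requests hits in on-chip cache, only the misses reach external memory. The hit rates below are purely hypothetical, and the model ignores compression, write-back traffic and the cache's own bandwidth limit; function names are made up.

```python
def dram_traffic_gb_s(requested_gb_s, hit_rate):
    """External memory traffic left after the cache absorbs the hits."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return requested_gb_s * (1.0 - hit_rate)


def effective_bandwidth_gb_s(dram_gb_s, hit_rate):
    """Request bandwidth that a given DRAM bandwidth can sustain at this hit rate."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    if hit_rate == 1.0:
        return float("inf")
    return dram_gb_s / (1.0 - hit_rate)


for hit_rate in (0.0, 0.25, 0.5):  # hypothetical hit rates
    print(hit_rate, effective_bandwidth_gb_s(512, hit_rate))
```

    In this toy model a 512 GB/s bus behaves like 1024 GB/s at a 50% hit rate, which is why the hit rate matters more than the raw bus width.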
     
    Pete, Lightman, BRiT and 1 other person like this.
  16. trinibwoy

    trinibwoy Meh
    Legend

    Joined:
    Mar 17, 2004
    Messages:
    12,059
    Likes Received:
    3,119
    Location:
    New York
    You may be right that it’s more balanced. In terms of absolute performance though it’ll be really interesting to see where the chips fall.
     
    PSman1700 likes this.
  17. trinibwoy

    trinibwoy Meh
    Legend

    Joined:
    Mar 17, 2004
    Messages:
    12,059
    Likes Received:
    3,119
    Location:
    New York
    Given the streaming nature of graphics workloads I understand GPU caches are mostly helpful for spatial locality on reads. Do the RDNA L1 and L2 caches also buffer writes to render targets and UAVs, i.e. are those caches facilitating multiple writes before flushing to VRAM?
     
    Pete likes this.
  18. Leoneazzurro5

    Regular

    Joined:
    Aug 18, 2020
    Messages:
    335
    Likes Received:
    348
    This is a good question, I think we will have an answer on the 28th
     
  19. PizzaKoma

    Newcomer

    Joined:
    Apr 29, 2019
    Messages:
    51
    Likes Received:
    86
  20. DDH

    DDH
    Newcomer

    Joined:
    Jun 9, 2016
    Messages:
    36
    Likes Received:
    39
    The comparison would really only be relevant if they were both priced the same or very close
     
