AMD Radeon RDNA2 Navi (RX 6500, 6600, 6700, 6800, 6900 XT)

Discussion in 'Architecture and Products' started by BRiT, Oct 28, 2020.

  1. Nappe1

    Nappe1 lp0 On Fire!
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    1,532
    Likes Received:
    11
    Location:
    South east finland
    Long time, no see... :)

Does today's screen refresh hardware (RAMDACs are long gone, but that's where I'm coming from back here) still need a contiguous memory area to read from for screen refresh, or can it assemble the screen out of individual tiles?
     
    Pete likes this.
  2. pTmdfx

    Regular

    Joined:
    May 27, 2014
    Messages:
    417
    Likes Received:
    381
    RDNA inherited binning rasterizers from Vega, didn't it?

Moreover, Linux driver patches do seem to hint that memory pages can be marked with LLC No Allocate. So in theory, the driver can mark pages of resources to skip the LLC, either based on presets or, in a fancier way, based on live performance counters from the LLC.
     
    BRiT likes this.
  3. trinibwoy

    trinibwoy Meh
    Legend

    Joined:
    Mar 17, 2004
    Messages:
    12,059
    Likes Received:
    3,119
    Location:
    New York
    There are lots of other buffers to worry about too that are read multiple times each frame. I would think tiling is less helpful for chip wide L2/3 caches where there is no tile based affinity.
     
  4. marifire

    Newcomer

    Joined:
    May 13, 2007
    Messages:
    46
    Likes Received:
    41
  5. Malo

    Malo Yak Mechanicum
    Legend Subscriber

    Joined:
    Feb 9, 2002
    Messages:
    8,931
    Likes Received:
    5,533
    Location:
    Pennsylvania
    wow, Nappe1. Haven't seen you for years!
     
  6. chris1515

    Legend

    Joined:
    Jul 24, 2005
    Messages:
    7,158
    Likes Received:
    7,966
    Location:
    Barcelona Spain
  7. pjbliverpool

    pjbliverpool B3D Scallywag
    Legend

    Joined:
    May 8, 2005
    Messages:
    9,237
    Likes Received:
    4,260
    Location:
    Guess...
But what is a full-fledged implementation of DS? No-one knows yet; to date, details have been fairly light. Does it mandate that data transfers directly from SSD to GPU via P2P DMA without any involvement from the CPU, and that decompression of the IO stream is done on the GPU? Or is it (as Microsoft's commentary so far suggests) just an API for greatly lowering the system overhead of IO requests, with all that other stuff being a proprietary Nvidia (RTX-IO) solution?

I seriously hope it's the former, and the Anandtech commentary posted above gives us some hope in that regard, but if it were, I would have expected AMD to make that very clear in their own announcement given the amount of attention RTX-IO has received. The fact that they didn't has me worried. This would have been a very, very easy thing for them to shut down in the reveal, but they didn't. I hope I'm just being overly pessimistic.

I'd definitely disagree on this. Direct Storage is needed right now. Plenty of cross-platform games will be supporting it within weeks on the XSX, and if it were available on PC then those advantages would carry over for anyone that has an NVMe drive. The game doesn't have to make NVMe or even a regular SSD a requirement for those that have them to be able to take advantage of faster load times.

It's also worth noting that AC Valhalla does state an SSD as a requirement in all but the lowest of presets, and even there it recommends one. I expect other games to follow suit in short order.

    This is exactly where my thinking is right now. I was mostly settled on a 3070 before both launched but that performance and those prices have me thinking. That said, there are other factors to consider which we don't have enough information about yet to make a fully informed choice so I'm going to wait a little longer. Those being RT performance, Direct Storage capabilities (i.e. does RDNA 2 replicate what RTX-IO is doing), DLSS/AMD equivalent, and of course Nvidia's potentially very imminent 3070Ti.

I agree with the overall sentiment, but if this forces NV and Intel to address the same weakness in the PC's dGPU-based architecture then I'm all for it. The more our separate CPUs and GPUs with separate memory pools can act as if they are on a UMA (without the inherent drawbacks), the better for the PC platform as a whole.
     
    Cuthalu, pharma and BRiT like this.
  8. Leoneazzurro5

    Regular

    Joined:
    Aug 18, 2020
    Messages:
    335
    Likes Received:
    348
  9. pjbliverpool

    pjbliverpool B3D Scallywag
    Legend

    Joined:
    May 8, 2005
    Messages:
    9,237
    Likes Received:
    4,260
    Location:
    Guess...
  10. Nappe1

    Nappe1 lp0 On Fire!
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    1,532
    Likes Received:
    11
    Location:
    South east finland
yep, last time I posted here, it was 2013. Heck, in one of my last threads I talked about "framing memories" in my and my girlfriend's new rental flat... and that was 2011! Even though we got married and moved to our own apartment, the chips are still framed and on display on the wall. Too bad that few people nowadays know they are looking at a unicorn and its doo doo when they see them. :D
It has been almost two decades since my really "active" years. It was the 2002-2004 time frame when everything I was interested in went horribly, horribly wrong.

I did notice the launch of Iris Pro, but it was not enough to get me to come back. However, when there's mention of a remarkable amount of on-chip RAM in a graphics chip, you bet I am reading the information. :) I pretty quickly did the same maths as people have done here: 128 MiB on a 4096-bit bus, which is most likely divided into two 2048-bit wide parts, both serving four 32-bit wide external memory channels. I also started to wonder how they have designed the frame buffer writes and readouts so that they don't ruin the cache efficiency too much...

So, any idea if this is ArtX-style SRAM or perhaps Iris Pro-style eDRAM? I am betting on the first one, just because of AMD/ATI's history, and SRAM at least used to be easier to approach than eDRAM.
     
    Kej, tinokun, Rodéric and 4 others like this.
  11. tunafish

    Regular

    Joined:
    Aug 19, 2011
    Messages:
    627
    Likes Received:
    414
That's not the math. Most likely, it's split into 16 entirely separate cache slices, each of which serves 512 bits per cycle. It's instructive to note that it's fundamentally quite similar to the L3 in modern Zen CPUs, just (probably) with cache lines that are twice as long and with twice the bus width per slice.

    On GDDR6, the external memory channels are 16-bit wide, and there are two of them per chip. So one 8MB, 512-bit cache slice per memory channel.

    And yes, it's definitely SRAM. eDRAM is gone, it's not compatible with modern logic lithography.
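The slice arithmetic above can be sanity-checked in a few lines. This is just a sketch under the assumptions stated in the post (128 MiB total, 16 slices of 512 bits each, a 256-bit GDDR6 bus made of 16-bit channels); none of these figures are confirmed beyond the thread's speculation.

```python
# Sanity check of the speculated Infinity Cache slice layout:
# 128 MiB total, split into 16 slices, each serving 512 bits
# (64 bytes) per cycle, paired with GDDR6's 16-bit channels
# on an assumed 256-bit external bus.

TOTAL_CACHE_MIB = 128
NUM_SLICES = 16
BITS_PER_SLICE_PER_CYCLE = 512
EXTERNAL_BUS_BITS = 256
GDDR6_CHANNEL_BITS = 16  # two 16-bit channels per GDDR6 chip

slice_size_mib = TOTAL_CACHE_MIB // NUM_SLICES          # size of one slice
bits_per_cycle = NUM_SLICES * BITS_PER_SLICE_PER_CYCLE  # aggregate cache width
gddr6_channels = EXTERNAL_BUS_BITS // GDDR6_CHANNEL_BITS

print(slice_size_mib)   # 8  -> one 8 MiB slice per memory channel
print(bits_per_cycle)   # 8192
print(gddr6_channels)   # 16 -> matches the 16 slices, one per channel
```

With these assumed numbers the layout is self-consistent: 16 channels, 16 slices, 8 MiB and 512 bits per slice.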
     
    Lightman and Nappe1 like this.
  12. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    11,716
    Likes Received:
    2,137
    Location:
    London
    Just buy it. You'll be waiting a year+ before DirectStorage actually makes a difference to any games.

    How do you know NVidia isn't already doing this? Why would Intel be involved?
     
  13. Nappe1

    Nappe1 lp0 On Fire!
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    1,532
    Likes Received:
    11
    Location:
    South east finland
Thanks for the info... It makes perfect sense. Nevertheless, I am still interested in how they cope with the frame buffers...
Sorry for being soooo outdated, but as I asked in my previous post, how does screen refresh work nowadays? In RAMDAC days, you had to have a contiguous front buffer for the RAMDAC to read the scanlines from, sending out the analog RGB values which the electron tube then painted onto the screen. Do you still need such a thing, or can the rendered tiles be read straight out to the screen?
     
  14. Kaotik

    Kaotik Drunk Member
    Legend

    Joined:
    Apr 16, 2003
    Messages:
    10,245
    Likes Received:
    4,465
    Location:
    Finland
    You sure they've split it into (more than two) slices? I don't think many of the theorized uses for it would really like small slices like that. I'd put my money on either 64 or 32 MB slices.
     
  15. tunafish

    Regular

    Joined:
    Aug 19, 2011
    Messages:
    627
    Likes Received:
    414
Then you'd have to either deal with more than one access per slice per clock, or with very wide busses. It's much easier to just split the cache into slices that each conveniently serve one request per cycle.

It's hinted at in AMD's slide here:

    But they referred to the cache being based on what's used in Zen in the presentation, and Zen caches are split into 4MB slices (for Zen2 at least).

(edit) And this has no impact on use. All addresses are spread across the slices based on a few of the low bits of the address. Any client will access all of the slices evenly, except the ROPs, which are probably homed to a specific slice.
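The interleaving described above can be sketched as follows. The exact bit positions are an assumption for illustration: with 64-byte cache lines and 16 slices, the four address bits just above the line offset would select the slice, so consecutive lines rotate through all slices and any client's traffic spreads evenly.

```python
# Sketch of low-order-bit slice interleaving (assumed parameters:
# 64 B cache lines, 16 slices; slice chosen by the line-index bits).

LINE_BYTES = 64
NUM_SLICES = 16

def slice_index(addr: int) -> int:
    """Map a physical address to a cache slice."""
    return (addr // LINE_BYTES) % NUM_SLICES

# Consecutive cache lines land on consecutive slices:
lines = [slice_index(i * LINE_BYTES) for i in range(NUM_SLICES)]
print(lines)  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]
```

A linear walk through memory therefore touches every slice once per 1 KiB, which is what makes per-slice bandwidth add up regardless of which unit is doing the accessing.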
     
  16. pjbliverpool

    pjbliverpool B3D Scallywag
    Legend

    Joined:
    May 8, 2005
    Messages:
    9,237
    Likes Received:
    4,260
    Location:
    Guess...
I certainly hope it won't be that long before we see basic DS integration, but we know both RDNA2 and Ampere support that, so no worries there. If CPU bypass and GPU decompression are Nvidia exclusives, though, then yes, I agree we could be looking at those timescales before we see games using it. And in that case usage is likely to be limited anyway, so arguably it won't matter that much. Still, it'll make RDNA2 more attractive to me if it supports the same functionality as RTX-IO as a fundamental part of Direct Storage.

    I assumed since you need a 5000 series CPU and 500 series motherboard to make this work then there is some specific enablement required on the CPU/platform side.
     
    Cuthalu likes this.
  17. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    11,716
    Likes Received:
    2,137
    Location:
    London
    NVidia marketing designed to make stupid people think it's something more than DirectStorage.
     
    Lightman, Erinyes, Krteq and 4 others like this.
  18. arandomguy

    Regular Newcomer

    Joined:
    Jul 27, 2020
    Messages:
    256
    Likes Received:
    364
I'm actually not sure that is the case here. As someone commented earlier, DirectStorage does not seem to specify anything with respect to decompression.

For instance, for the XSX, Microsoft seems to deliberately separate out hardware decompression and DirectStorage as distinct parts of its "Velocity Architecture":

    https://news.xbox.com/en-us/2020/07/14/a-closer-look-at-xbox-velocity-architecture/

The way it's phrased in the DX blog also suggests that decompression is handled separately, while DirectStorage is just an API that is more efficient at handling data I/O requests - https://devblogs.microsoft.com/directx/directstorage-is-coming-to-pc/

Similarly, as phrased in the Ampere whitepaper, RTX I/O is their term for the GPU hardware decompression that DirectStorage can leverage.

Now presumably AMD will have some similar mechanism in place (as it doesn't seem like a hard challenge); it could just be that they haven't chosen to name it yet.
     
    pjbliverpool and pharma like this.
  19. manux

    Veteran

    Joined:
    Sep 7, 2002
    Messages:
    3,034
    Likes Received:
    2,276
    Location:
    Self Imposed Exhile
DirectStorage has to come with some set of recommended/supported algorithms. Otherwise there would have to be install-time compression to the format preferred by the user's system. This would potentially lead to all kinds of problems, such as making game engine development/optimization tricky or impossible, as the engine wouldn't know how the compressed data is laid out on disc. It would also lead to complications when updating games (decompress the game, update, compress again). We will get to know the algorithms used once Microsoft releases a beta of DirectStorage some time next year.
     
    #319 manux, Oct 29, 2020
    Last edited: Oct 29, 2020
  20. pjbliverpool

    pjbliverpool B3D Scallywag
    Legend

    Joined:
    May 8, 2005
    Messages:
    9,237
    Likes Received:
    4,260
    Location:
    Guess...
Stupid people and the entire gaming media prior to yesterday, it would seem. Unless I've missed a publication or Microsoft announcement that explains that RTX-IO specifically does nothing outside the base Direct Storage functionality. Because I've certainly seen plenty of articles suggesting the opposite.

Here's Microsoft's explanation of what Direct Storage is. They go into quite a bit of detail here, but there's no mention of GPU-based decompression or direct GPU-to-SSD data transfers:

    https://devblogs.microsoft.com/directx/directstorage-is-coming-to-pc/

    And here's Nvidia's explanation of RTX-IO:

    https://www.nvidia.com/en-gb/geforce/news/rtx-io-gpu-accelerated-storage-technology/

Note the plurality of APIs. This isn't a case of RTX-IO = Direct Storage re-branded; these are two separate technologies working in tandem.

    Here is what TechPowerUp thinks of it:

    https://www.techpowerup.com/271705/...stack-here-to-stay-until-cpu-core-counts-rise

    I guess even Digital Foundry would fall into the stupid category:

    To be clear, I'm not saying that direct data transfers from SSD to GPU along with GPU based decompression aren't a fundamental and mandatory requirement of Direct Storage support - I would be very happy if they are. But I am saying that none of the information we've had on it to date suggests that it is, and much of that information at least hints that it may not be.
     