"next generation RDNA architecture" with hardware accelerated ray tracing

Discussion in 'Architecture and Products' started by A1xLLcqAgt0qc2RyMz0y, Dec 13, 2019.

Thread Status:
Not open for further replies.
  1. A1xLLcqAgt0qc2RyMz0y

    Veteran Regular

    Joined:
    Feb 6, 2010
    Messages:
    1,456
    Likes Received:
    1,161
    https://www.tomshardware.com/news/xbox-one-series-x-revealed

    Any idea what "hardware accelerated ray tracing" AMD has actually implemented?
     
  2. BRiT

    BRiT Verified (╯°□°)╯
    Moderator Legend Alpha

    Joined:
    Feb 7, 2002
    Messages:
    16,054
    Likes Received:
    15,061
    Location:
    Cleveland
There are patents linked on these forums about AMD's RT implementation.
     
    Frenetic Pony likes this.
  3. Jay

    Jay
    Veteran Regular

    Joined:
    Aug 3, 2013
    Messages:
    2,738
    Likes Received:
    1,845
I believe there are also some links to MS patents related to RT.
     
    PSman1700 and Frenetic Pony like this.
  4. A1xLLcqAgt0qc2RyMz0y

    Veteran Regular

    Joined:
    Feb 6, 2010
    Messages:
    1,456
    Likes Received:
    1,161
Could you point out where to find these links?
     
  5. A1xLLcqAgt0qc2RyMz0y

    Veteran Regular

    Joined:
    Feb 6, 2010
    Messages:
    1,456
    Likes Received:
    1,161
    Are you saying that MS "gave" or "helped" AMD to implement Ray Tracing in their hardware by donating MS patents to AMD?
     
  6. Leovinus

    Newcomer

    Joined:
    May 31, 2019
    Messages:
    116
    Likes Received:
    48
    Location:
    Sweden
Here is one on AMD's TEXTURE PROCESSOR BASED RAY TRACING patent, which looks interesting as it doesn't rely on standalone dedicated hardware like nVidia's, but rather leaves it up to developers how many texture units should be dedicated to the purpose. A more unified and programmable approach, if I understand the implementation correctly.
     
    w0lfram likes this.
  7. Alexko

    Veteran Subscriber

    Joined:
    Aug 31, 2009
    Messages:
    4,515
    Likes Received:
    934
    For semi-custom designs, customers have the opportunity to provide their own IP blocks to be integrated into the final SoC, but they remain the owners of the IP, and AMD can't use it for other products (at least not without a separate agreement). That's not to say that this is what's happening for RT on the next Xbox, but it's a possibility.
     
    w0lfram, pharma, PSman1700 and 2 others like this.
  8. PSman1700

    Veteran Newcomer

    Joined:
    Mar 22, 2019
    Messages:
    2,766
    Likes Received:
    942
And maybe its VRS, too.
     
  9. DmitryKo

    Regular

    Joined:
    Feb 26, 2002
    Messages:
    805
    Likes Received:
    833
    Location:
    55°38′33″ N, 37°28′37″ E
    There is dedicated hardware for loading BVH tree nodes and testing ray intersections.
It reuses the control flow logic of the shader units as well as the TMUs' caches and memory buses.
It also makes it possible to bypass fixed-function intersection testing with shader code in some cases.

OTOH, it seems like the BVH tree is built on the host CPU, just like with NVidia RTX.
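A rough sketch of that split, with purely hypothetical names: the slab test below stands in for the fixed-function node intersection engine, while the traversal loop is the part that would live in shader code and could be customized.

```python
# Illustrative sketch of a hybrid BVH traversal: the box test plays the role
# of the fixed-function intersection engine, the loop is shader-controlled.
# All names and data layouts are hypothetical, not from the patent.

def ray_aabb_hit(origin, inv_dir, lo, hi):
    """Slab test: the kind of work offloaded to the TMU-adjacent unit.
    Returns True if the ray hits the axis-aligned box."""
    tmin, tmax = 0.0, float("inf")
    for o, d, l, h in zip(origin, inv_dir, lo, hi):
        t1, t2 = (l - o) * d, (h - o) * d
        tmin = max(tmin, min(t1, t2))
        tmax = min(tmax, max(t1, t2))
    return tmin <= tmax

def traverse(nodes, origin, direction):
    """Shader-side traversal loop with an explicit stack; leaves record
    candidate triangle ids instead of doing a real ray-triangle test."""
    inv_dir = tuple(1.0 / d if d != 0.0 else float("inf") for d in direction)
    stack, hits = [0], []
    while stack:
        node = nodes[stack.pop()]
        if not ray_aabb_hit(origin, inv_dir, node["lo"], node["hi"]):
            continue
        if "tri" in node:                 # leaf: record candidate triangle
            hits.append(node["tri"])
        else:                             # inner node: push children
            stack.extend(node["children"])
    return hits
```

Because the loop itself is ordinary code, it is the natural place to hang custom behavior (early-out policies, LOD decisions) while the box test stays fixed-function.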
     
    Alexko, w0lfram, Lightman and 3 others like this.
  10. Frenetic Pony

    Regular Newcomer

    Joined:
    Nov 12, 2011
    Messages:
    505
    Likes Received:
    221
The reuse seems neat, though there are too many mentions of "fixed function" for my taste. We already went through all that with rasterization, and the more programmable it became, the more devs were able to do. What if you want to do b-splines for hair? Or spheres for bounding? They take less time to rebuild the acceleration tree, and that's already a major bottleneck and active area of research.

Still, overall it's just speculation. Is this how AMD does it? Is it something completely different? Are there two different approaches for Sony and MS? Did MS do their own functional block design and have it included in the XSX (just typing that makes me think of middle-schoolers making an online name)? We'll find out, but what about the other things RDNA 2.0 will probably have? MS just updated their DirectX specs to include Mesh and Amplification shaders, I'd bet for reasons involving their console. Variable rate shading in RDNA 2 and both consoles is a given, but it'd be interesting to see if Mesh Shaders are absent from the PS5 (bad news for Sony AND consumers).
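On the sphere point: a ray-sphere test reduces to one quadratic in t, a few multiply-adds and a square root, which is part of why spheres are cheap bounding volumes to rebuild and retest. A minimal illustrative sketch (not from any patent or API):

```python
import math

def ray_sphere_hit(origin, direction, center, radius):
    """Ray vs. bounding sphere: solve |o + t*d - c|^2 = r^2 for t >= 0.
    Illustrative only; direction need not be normalized."""
    oc = tuple(o - c for o, c in zip(origin, center))
    a = sum(d * d for d in direction)
    b = 2.0 * sum(o * d for o, d in zip(oc, direction))
    c = sum(o * o for o in oc) - radius * radius
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return False                      # ray line misses the sphere
    # hit only counts if at least one intersection lies in front of the origin
    root = math.sqrt(disc)
    return (-b - root) / (2.0 * a) >= 0.0 or (-b + root) / (2.0 * a) >= 0.0
```

Rebuilding a sphere hierarchy after animation is also cheap, since a bounding sphere can be updated from child spheres without touching per-triangle data.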
     
    w0lfram and milk like this.
  11. JoeJ

    Regular Newcomer

    Joined:
    Apr 1, 2018
    Messages:
    981
    Likes Received:
    1,108
    Ahhm... what's the advantage of mesh shaders? I fail to understand.

My thinking is like this: I want displacement mapping, progressive meshes, or other LOD-related techniques requiring dynamic meshes.
But for that, isn't it more attractive to generate indices / vertices in compute, reuse them over multiple frames, and draw them indirectly?
With mesh shaders I'd need to do the same work every frame. That doesn't feel attractive, aside from the memory savings. What am I missing here?
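The cache-and-reuse pattern described above can be sketched abstractly; everything here is hypothetical, with a dict standing in for a persistent GPU buffer and a Python callable standing in for the expensive compute pass:

```python
# Sketch of "generate once in compute, reuse across frames" vs. regenerating
# per frame. Names are made up; the dict plays the role of persistent VRAM.

class GeometryCache:
    def __init__(self, generate):
        self.generate = generate      # expensive "compute shader" step
        self.buffers = {}             # persistent (mesh_id, lod) -> indices
        self.generated = 0            # how often we actually paid the cost

    def get_indices(self, mesh_id, lod):
        key = (mesh_id, lod)
        if key not in self.buffers:   # only rebuild when the LOD changes
            self.buffers[key] = self.generate(mesh_id, lod)
            self.generated += 1
        return self.buffers[key]
```

Over a hundred frames at a stable LOD this pays the generation cost once, whereas a mesh-shader-style path would redo the expansion every frame — which is exactly the trade-off being questioned.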
     
    milk likes this.
  12. Rootax

    Veteran Newcomer

    Joined:
    Jan 2, 2006
    Messages:
    1,530
    Likes Received:
    884
    Location:
    France
It's way too technical for me, but I saw a few devs on Twitter, like Sebastian Aaltonen, being very happy with mesh shaders (more explanations in the replies):
     
  13. JoeJ

    Regular Newcomer

    Joined:
    Apr 1, 2018
    Messages:
    981
    Likes Received:
    1,108
Yeah... getting rid of the VS and tessellation stuff is great. It always felt restrictive and I never liked any of it. But that's no game changer, and it does not address my argument about repetitive and redundant per-frame work.

    iq's comment is interesting: "It's a nice move towards removing more rasterization-specific hardware (input assembler) and move us all to pure compute and tracing. I like it, but it also means at some point drawing a triangle is going to require understanding groups, threads, barriers, and complex systems..."

To me it feels more like the opposite. Understanding groups / barriers etc. is the foundation of parallel programming and has to be understood, while isolated thread abstractions like VS, PS, GS and now RT always feel restricting and wrong to me.
And of course Mesh Shaders are incompatible with RT. Actually, his tracing argument is more in line with my proposal to cache mesh data in main memory, because RT requires this anyway.

So I still feel like I'm missing something, and I assume it would be no big issue if some future hardware lacked the feature.
     
    egoless likes this.
  14. Leovinus

    Newcomer

    Joined:
    May 31, 2019
    Messages:
    116
    Likes Received:
    48
    Location:
    Sweden
Excuse my wording. There are obviously fixed functions. What I meant to convey is that they are not divorced from the rest of the hardware, as is the case with nVidia. As I understand it, and granted I'm a layman, the fixed hardware is built into the TMUs themselves as a hybridised solution, offering a somewhat more unified approach than nVidia's standalone implementation. By that logic, any increase or decrease in TMU count would mean a linear increase or decrease in RT capability. Additionally, as the RT functions can be engaged or bypassed at will, it's up to the developer how much of the card's resources should be dedicated to RT, making for increased flexibility.

I.e. a card based on this architecture would have RT capabilities regardless of SKU and practicality (say, a low-power mobile chip). It would only ever have more, or less. As I understand the patent.
     
    PSman1700 likes this.
  15. DmitryKo

    Regular

    Joined:
    Feb 26, 2002
    Messages:
    805
    Likes Received:
    833
    Location:
    55°38′33″ N, 37°28′37″ E
Intersection testing does use the massive memory bandwidth and large caches of the fixed-function texture filtering units, but the patent does not indicate any particular ratio between TMUs and RT ALUs.
My reading is that the RT units are implemented as separate blocks on the same memory and control bus, so different SKUs may have arbitrary numbers of TMUs and RT units.
     
    Leovinus and BRiT like this.
  16. JoeJ

    Regular Newcomer

    Joined:
    Apr 1, 2018
    Messages:
    981
    Likes Received:
    1,108
I don't think so. RT 'blocks', or 'cores', take chip area and serve only one purpose. The developer can only adapt to what's given here.
But increased flexibility does come from the option of programmable traversal, which first-gen RTX probably lacks.
This probably allows e.g. stochastic continuous LOD transitions from simple discrete models, so we can scale to GPU power, performance targets, varying scene complexity, etc. Super important to make RT practical.
On the other hand, this blocks CUs from being used for other tasks during traversal. That could be solved with additional FF blocks that handle 'default' traversal; the patent does not rule such options out.
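One way to picture the stochastic LOD transition mentioned above (purely a sketch, not anything from the patent): pick a discrete LOD per ray with probability given by the fractional part of a continuous LOD value, so the transition dissolves across many rays instead of popping:

```python
import random

def pick_lod(continuous_lod, rng=random.random):
    """Stochastic LOD selection of the kind programmable traversal could
    enable: for a continuous LOD of 2.3, trace against LOD 3 with 30%
    probability and LOD 2 otherwise. Hypothetical sketch; the rng argument
    exists only to make the choice testable."""
    base = int(continuous_lod)
    frac = continuous_lod - base
    return base + (1 if rng() < frac else 0)
```

Averaged over many rays per pixel, the rendered result blends the two discrete models in proportion to the fractional LOD, which is what makes the transition continuous.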
     
    Leovinus likes this.
  17. DmitryKo

    Regular

    Joined:
    Feb 26, 2002
    Messages:
    805
    Likes Received:
    833
    Location:
    55°38′33″ N, 37°28′37″ E
    Basically, massively parallel processing with compute shader-like scheduling that is not limited by issue rates of fixed-function geometry hardware.
    https://forum.beyond3d.com/threads/direct3d-mesh-shaders.61286/

    Mesh Shaders do not run in the raytracing pipeline, they are in the traditional rasterization pipeline just like vertex shaders.

    As for meshlets, they're just lists of vertices/primitives which still resolve to individual triangles, and it's rather easy to convert them into traditional index buffers. So I guess it's not a problem to directly consume meshlets for BVH tree creation and ray-triangle look-ups.
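That flattening is straightforward to sketch; the meshlet layout below (a local vertex remap list plus meshlet-local triangle indices) is hypothetical but representative:

```python
# Sketch of resolving meshlets back into a traditional global index buffer:
# each meshlet holds a small list mapping local indices to global vertex ids,
# and triangles expressed in those local indices. Layout is illustrative.

def meshlets_to_index_buffer(meshlets):
    """Flatten meshlets by remapping each triangle's meshlet-local indices
    through the meshlet's vertex list, yielding one flat index buffer."""
    indices = []
    for m in meshlets:
        verts = m["vertices"]         # meshlet-local -> global vertex ids
        for a, b, c in m["triangles"]:
            indices += [verts[a], verts[b], verts[c]]
    return indices
```

Since the output is an ordinary triangle list, it could feed BVH construction directly, which is the point being made.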
     
    #17 DmitryKo, Dec 15, 2019
    Last edited: Dec 15, 2019
    Newguy, w0lfram, PSman1700 and 5 others like this.
  18. JoeJ

    Regular Newcomer

    Joined:
    Apr 1, 2018
    Messages:
    981
    Likes Received:
    1,108
Ah, thanks. I had missed the video of the presentation, and after watching it I can see it's a win in any case, even with RT or persistent LOD processing in mind.
    Pretty great :)

But what I'm really excited about is how close this is to the GPU driving its own work. They would only need to add support for wider workgroups, and then we could dispatch compute from compute and pass data efficiently through on-chip memory.
I could save at least half my bandwidth this way, and get rid of my small-workload problems, together with indirect dispatches that end up doing nothing.

    Guess this great future is not that far away... :D
     
  19. Frenetic Pony

    Regular Newcomer

    Joined:
    Nov 12, 2011
    Messages:
    505
    Likes Received:
    221
    I began writing, then realized I'd seen an MS blog post explaining this exact thing in far more detail than a forum post should even be: https://devblogs.microsoft.com/dire...on-shaders-reinventing-the-geometry-pipeline/

No digging through videos and tweets needed, if anyone wants it. Less sync and more polys are always cool.
     
    #19 Frenetic Pony, Dec 15, 2019
    Last edited: Dec 15, 2019
    JoeJ and BRiT like this.
  20. jlippo

    Veteran Regular

    Joined:
    Oct 7, 2004
    Messages:
    1,454
    Likes Received:
    588
    Location:
    Finland
There was also a nice video blog series on writing a simple Vulkan renderer that includes a mesh shader pipeline (from Arseny Kapoulkine, with sources on GitHub etc.).

    (Episode 5 starts with Mesh Shaders.)
     
    w0lfram, Lightman, CeeGee and 2 others like this.