AMD: Navi Speculation, Rumours and Discussion [2019-2020]

Discussion in 'Architecture and Products' started by Kaotik, Jan 2, 2019.

Thread Status:
Not open for further replies.
  1. DDH

    DDH
    Newcomer

    Joined:
    Jun 9, 2016
    Messages:
    36
    Likes Received:
    39
    He says something like "software based real time ray traced global illumination" around the 10:10 mark
     
  2. Scott_Arm

    Legend

    Joined:
    Jun 16, 2004
    Messages:
    15,134
    Likes Received:
    7,680
    I'm not doubling the power consumption of the memory. You're right about memory channels, but I'm also not claiming a linear doubling is exact. It's just an estimate. We don't even know how much power the GPU in the PS5 can draw. My guess is it's over 100W for just the GPU portion of the APU, not including external memory chips, the PS5 PCB, or any other components.
     
  3. Scott_Arm

    Legend

    Joined:
    Jun 16, 2004
    Messages:
    15,134
    Likes Received:
    7,680
    So add in 20 CUs (38% increase) and 300 MHz (17% increase), then add that 40-50W for the PCB, and you have a 72 CU 2.1 GHz RDNA2 GPU (mind you, with a bigger memory bus). Just naively you're going to be well over 260W-270W because of the clock increase.
     
    BRiT likes this.
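The scaling argument above can be written out as a back-of-envelope calculation. This is only an illustrative sketch: the base power figure, the 52 CU / 1.825 GHz starting point, and the voltage bump are assumptions for the sake of the arithmetic, not measured values, and the `scaled_power` helper is invented for this example.

```python
# Naive dynamic-power scaling: P ~ (number of units) * frequency * voltage^2.
# All input figures below are illustrative assumptions, not measurements.

def scaled_power(base_power_w, cu_ratio, freq_ratio, volt_ratio=1.0):
    """Scale a baseline GPU power figure by unit count, clock, and voltage."""
    return base_power_w * cu_ratio * freq_ratio * volt_ratio ** 2

# Hypothetical baseline: a 52 CU part at 1.825 GHz drawing ~160 W (assumed).
base = 160.0
cu_ratio = 72 / 52        # ~38% more CUs
freq_ratio = 2.1 / 1.825  # ~15% higher clock

# Same voltage: power grows roughly linearly with units and clock.
print(round(scaled_power(base, cu_ratio, freq_ratio)))        # prints 255

# If the higher clock also needs ~5% more voltage, the V^2 term bites.
print(round(scaled_power(base, cu_ratio, freq_ratio, 1.05)))  # prints 281
```

The point of the sketch is that even before any voltage increase, widening the chip and raising the clock together push the GPU-only figure well past the console baseline, which is the poster's argument in miniature.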
  4. Rootax

    Veteran

    Joined:
    Jan 2, 2006
    Messages:
    2,401
    Likes Received:
    1,845
    Location:
    France
    Even then, Nvidia wouldn't have released a new generation slower than (or only as fast as) the old one (even if it was the top of the line).
     
    pjbliverpool likes this.
  5. trinibwoy

    trinibwoy Meh
    Legend

    Joined:
    Mar 17, 2004
    Messages:
    12,059
    Likes Received:
    3,119
    Location:
    New York
    Even if that’s true they certainly would’ve aimed much higher than TU102’s 4608 units for any chance at the crown.
     
  6. DegustatoR

    Veteran

    Joined:
    Mar 12, 2002
    Messages:
    3,244
    Likes Received:
    3,408
    It's not that they would, it's that you'd get a lot less of a performance increase by making just a wider Turing, even with higher clocks. So it is possible that AMD expected that instead of what Ampere turned out to be. Still I wouldn't count on this. They both tend to know quite a lot about each other well in advance of each generation release.
     
  7. Rootax

    Veteran

    Joined:
    Jan 2, 2006
    Messages:
    2,401
    Likes Received:
    1,845
    Location:
    France
    Yeah, but don't forget the node jump too, even if it's not TSMC 7nm. I don't know, it seems like a really dumb thing if true...
     
    Lightman likes this.
  8. Erinyes

    Regular

    Joined:
    Mar 25, 2010
    Messages:
    808
    Likes Received:
    276
    But the XB1X consumed ~170W at the wall in Gears, and the XSX consumes ~210W. Unless you're suggesting the CPUs consume next to zero power, the APU budget shouldn't even be close to 200W.
    And I'm sure Lisa Su told you that personally?
    You mean you're actually taking data from consoles using an almost identical architecture, on an almost identical process, likely on a slightly worse silicon bin, and suggesting we extrapolate that to PC GPUs?? You can't do that! Because....reasons.
    Well since the RDNA2 HW seemingly cannot do RT concurrently, I'd say it shouldn't have much of an impact at all, if any.
     
    #3868 Erinyes, Oct 20, 2020
    Last edited: Oct 20, 2020
    Lightman likes this.
  9. chris1515

    Legend

    Joined:
    Jul 24, 2005
    Messages:
    7,157
    Likes Received:
    7,966
    Location:
    Barcelona Spain
    RDNA2 GPUs can do RT concurrently. They can do ray intersection and shading in parallel.
     
    DegustatoR likes this.
  10. Leoneazzurro5

    Regular

    Joined:
    Aug 18, 2020
    Messages:
    335
    Likes Received:
    348
    RDNA2 cannot do texturing concurrently with RT, AFAIK. It depends on how much power the RT units are drawing, which is not known.
     
    Lightman likes this.
  11. SimBy

    Regular

    Joined:
    Jun 21, 2008
    Messages:
    700
    Likes Received:
    391
    If you paid attention to the hints coming out of AMD (Wang) and Cerny (PS5), you would know that they designed RDNA2 as a 'multi gigahertz clock frequency' architecture. Increasing the clock speeds was literally one of the design goals. This whole narrative that Sony, and now AMD, pushed clock speeds up at the last second is honestly insulting.

     
  12. DegustatoR

    Veteran

    Joined:
    Mar 12, 2002
    Messages:
    3,244
    Likes Received:
    3,408
    2.0 and 2.4 GHz are both "multi gigahertz clock frequency", but the resulting power consumption of a chip at those clocks can be drastically different.
     
  13. CarstenS

    Legend Subscriber

    Joined:
    May 31, 2002
    Messages:
    5,800
    Likes Received:
    3,920
    Location:
    Germany
    I don't really know. Maybe they really just hit their power limits or some current limit at those points in time? I am no Furmark expert by any means.
     
  14. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,579
    Likes Received:
    4,799
    Location:
    Well within 3d
    As a texture-type instruction, we know that a BVH instruction cannot issue in parallel with a texture/vmem operation, but is that the same as them not working concurrently? Outside of those initial few cycles, it could be hundreds of cycles before the buses used by the BVH instruction are needed again. Are we sure a wavefront cannot issue a memory operation, or maybe another BVH instruction, in the meantime? Texture and memory ops can be issued freely until a waitcnt instruction is encountered and not enough of them have resolved.
     
    NightAntilli, Krteq and BRiT like this.
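The issue-versus-completion distinction in the post above can be sketched as a toy model: vector-memory ops (texture or BVH) share an issue port but don't block on completion, and only an explicit waitcnt stalls the wave. This is an illustrative Python mock-up, not real ISA behaviour; the `WaveSlot` class and its method names are invented for the example, and instruction names are only labels.

```python
# Toy model of RDNA-style vmcnt tracking (illustrative, not cycle-accurate).
# Vector-memory ops issue freely and bump an outstanding-op counter;
# the wave only stalls at an s_waitcnt whose threshold is not yet met.

class WaveSlot:
    def __init__(self):
        self.outstanding = 0   # in-flight vmem/texture/BVH ops ("vmcnt")
        self.stalled = False

    def issue_vmem(self, kind):
        # Issuing shares the port for one/few cycles but never waits
        # for the (potentially hundreds of cycles of) memory latency.
        self.outstanding += 1
        return f"issued {kind} (vmcnt={self.outstanding})"

    def s_waitcnt(self, vmcnt):
        # Stall only if more than `vmcnt` ops are still in flight.
        self.stalled = self.outstanding > vmcnt
        return self.stalled

    def complete_one(self):
        # A memory op returns; the counter drains toward zero.
        if self.outstanding:
            self.outstanding -= 1
        if self.outstanding == 0:
            self.stalled = False

wave = WaveSlot()
wave.issue_vmem("image_sample")             # texture op in flight
wave.issue_vmem("bvh_intersect")            # BVH op issued back-to-back
print(wave.s_waitcnt(0))                    # prints True: 2 ops outstanding
wave.complete_one()
wave.complete_one()
print(wave.s_waitcnt(0))                    # prints False: results are back
```

Under this model, "cannot issue in parallel" and "cannot work concurrently" are indeed different claims: both ops overlap in flight, and only the waitcnt serializes anything.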
  15. SimBy

    Regular

    Joined:
    Jun 21, 2008
    Messages:
    700
    Likes Received:
    391
    That's not what I was arguing. I said they didn't bump the clocks at the last second to compete with the 'mighty' Ampere. The clocks are this high by design and have absolutely nothing to do with Ampere. The same goes for the PS5 and XSX comparisons.
     
  16. Leoneazzurro5

    Regular

    Joined:
    Aug 18, 2020
    Messages:
    335
    Likes Received:
    348
    Possibly that is the case. But in any case, no one at the moment knows how much power these RT units draw, or their average utilization, so declaring that they will dramatically increase power usage is as wrong as saying they will not affect power usage at all.
     
    Erinyes likes this.
  17. Erinyes

    Regular

    Joined:
    Mar 25, 2010
    Messages:
    808
    Likes Received:
    276
    I think we all know that. However, claiming that AMD did not intend to clock at the x.xx GHz (rumoured) speed and only did so as a reaction to Nvidia is speculation at best.
    To clarify, I wasn't trying to suggest that some RT cannot be done concurrently at all, and perhaps I could have worded it better. I was mainly trying to point out that it probably doesn't have a significant impact on power consumption.
     
    Cuthalu likes this.
  18. DegustatoR

    Veteran

    Joined:
    Mar 12, 2002
    Messages:
    3,244
    Likes Received:
    3,408
    You doubt that AMD will adjust their clocks according to what NV has already launched?
    They can of course lower them just as well as raise them, but both possibilities are essentially a given at this point. They would be stupid not to.
     
    Cuthalu likes this.
  19. Erinyes

    Regular

    Joined:
    Mar 25, 2010
    Messages:
    808
    Likes Received:
    276
    Of course they probably would adjust the clocks based on what they've seen from NV, but by how much? You can't magically change the silicon, PCB, and cooler design overnight to dissipate 50-100W more. Testing, validation, production, and distribution take a lot longer than the approx. 4-6 weeks since Ampere has been out.
     
  20. pTmdfx

    Regular

    Joined:
    May 27, 2014
    Messages:
    416
    Likes Received:
    379
    Even normal load-stores are quite liberal — operations can be freely reordered, and only RF writeback is in program order. The texture load-store path has supported varying latency and a huge swarm of capabilities since GCN anyway; for example, address coalescing (or the lack thereof) can cause a load instruction to take a varying number of cycles to complete, even though multiple load instructions can be issued back-to-back. RDNA enhanced it further by adding a low-latency path bypassing the samplers, and RDNA 2's BVH intersection seems to be merely a (new) cherry on the "filtering/pre-processing" pie.

    ---

    Regarding the ongoing discussion about power usage, though, I don't see anything contentious... As the patent describes, the intersection engine is basically an alternative path to texture filtering, operating on packed BVH node data. Ray-box and ray-triangle testing are quite straightforward logic, so likely no "power drainage" is to be expected... At worst, the CU can issue a bunch of intersections, issue a vmcnt wait, and eventually clock-gate the ALU datapaths if no other kind of kernel is running in parallel.

    Nvidia presumably does the whole traversal process in fixed-function hardware, so one might argue that they could have an edge in power usage by potentially keeping the CU/SM off. But it is uncertain whether that matters with the prevalent use of async compute to fill gaps, and whether the actual saving makes a dent in the grand total of power consumption.
     
    w0lfram, dskneo and Jawed like this.