AMD: Navi Speculation, Rumours and Discussion [2019]

Discussion in 'Architecture and Products' started by Kaotik, Jan 2, 2019.

  1. del42sa

    Newcomer

    Joined:
    Jun 29, 2017
    Messages:
    172
    Likes Received:
    93
  2. pharma

    Veteran Regular

    Joined:
    Mar 29, 2004
    Messages:
    3,133
    Likes Received:
    1,792
    https://videocardz.com/newz/sapphire-registers-radeon-rx-5950-5900-xt-rx-5850-5800-xt-series-at-eec
     
    Lightman and BRiT like this.
  3. Kaotik

    Kaotik Drunk Member
    Legend

    Joined:
    Apr 16, 2003
    Messages:
    8,441
    Likes Received:
    2,159
    Location:
    Finland
    Just like Naples was released such a long time before Summit Ridge and Rome such a long time before Matisse? Oh wait...
     
    Cuthalu likes this.
  4. Bondrewd

    Regular Newcomer

    Joined:
    Sep 16, 2017
    Messages:
    556
    Likes Received:
    250
    Ehm, what?
    All 7+ stuff is 2020.
     
    Cuthalu likes this.
  5. BRiT

    BRiT (╯°□°)╯
    Moderator Legend Alpha

    Joined:
    Feb 7, 2002
    Messages:
    13,495
    Likes Received:
    10,361
    Location:
    Cleveland
    Are we talking Calendar year or Fiscal year on those slides?
     
  6. Kaotik

    Kaotik Drunk Member
    Legend

    Joined:
    Apr 16, 2003
    Messages:
    8,441
    Likes Received:
    2,159
    Location:
    Finland
    AMD's fiscal year is the same as the calendar year
     
    BRiT likes this.
  7. Ryan Smith

    Regular

    Joined:
    Mar 26, 2010
    Messages:
    621
    Likes Received:
    1,084
    Location:
    PCIe x16_1
    No inside info. But by adding 2021, if AMD takes until 2021 to deliver the hardware, they will say that they delivered it on time as promised. With vague roadmaps you should always take the most conservative interpretation, because that's what the vendor will take (otherwise it wouldn't be vague). Then you can be pleasantly surprised if they beat it.

    Also, GPU roadmaps are non-linear.
     
    Kej, DavidGraham, del42sa and 4 others like this.
  8. Entropy

    Veteran

    Joined:
    Feb 8, 2002
    Messages:
    3,106
    Likes Received:
    1,071
    For what it’s worth, I caught a tidbit straight from TSMC regarding their 5nm process. They were on track for volume production as previously announced, but added that the HP variant would be ready for volume production in the later part of next year.
    Of course, they said nothing about who or what that volume production was for.
     
    TheAlSpark likes this.
  9. Kaotik

    Kaotik Drunk Member
    Legend

    Joined:
    Apr 16, 2003
    Messages:
    8,441
    Likes Received:
    2,159
    Location:
    Finland
    About 99.99% certainly one of the mobile SoC manufacturers, just like MediaTek is first with 7nm+ (which is already in volume production)
     
    TheAlSpark and pharma like this.
  10. Entropy

    Veteran

    Joined:
    Feb 8, 2002
    Messages:
    3,106
    Likes Received:
    1,071
    No.
    They are the target market of the regular 5nm process that is already in risk production and that is scheduled for volume production start around March 2020. What was interesting here was the addition of a time for 5nm HP volume production, a process variant suitable for GPUs or desktop/server CPUs. There is a fairly limited number of companies interested in being on the leading edge adopting such a process.
     
    anexanhume likes this.
  11. anexanhume

    Veteran Regular

    Joined:
    Dec 5, 2011
    Messages:
    1,687
    Likes Received:
    915
    My guess is FPGAs, 4th gen EPYC, or AMD’s supercomputer custom parts. All high margin products. I believe 2021 was their target for the supercomputer.
     
    #1071 anexanhume, Jun 27, 2019
    Last edited: Jun 27, 2019
  12. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,189
    Likes Received:
    3,129
    Location:
    Well within 3d
    I ran across a Phoronix article about commits for GFX10 enablement changes for Linux drivers.
    https://www.phoronix.com/scan.php?page=news_item&px=RadeonSI-Navi-Merge-Pending
    NGG does make an entrance, and primitive shaders are mentioned.
    There's a fair bit of plumbing to support changes that can allow for shaders to be flagged as being compiled for NGG, with TES, VS, and GS among the types that can be flagged for being primitive shaders.
    There's been an additional merging of stages to support various shader types, as the geometry shader stage has been generalized enough to slot in as the vertex shader:

    https://gitlab.freedesktop.org/mesa...iffs#5b30d5f9f14cd7cb6cf8bc05f5869b422ec93c63
    (from si_shader.h)
    Code:
    * API shaders           VS | TCS | TES | GS |pass| PS
    * are compiled as:         |     |     |    |thru|
    *                          |     |     |    |    |
    * Only VS & PS:         VS |     |     |    |    | PS
    * GFX6     - with GS:   ES |     |     | GS | VS | PS
    *          - with tess: LS | HS  | VS  |    |    | PS
    *          - with both: LS | HS  | ES  | GS | VS | PS
    * GFX9     - with GS:   -> |     |     | GS | VS | PS
    *          - with tess: -> | HS  | VS  |    |    | PS
    *          - with both: -> | HS  | ->  | GS | VS | PS
    *                          |     |     |    |    |
    * NGG      - VS & PS:   GS |     |     |    |    | PS
    * (GFX10+) - with GS:   -> |     |     | GS |    | PS
    *          - with tess: -> | HS  | GS  |    |    | PS
    *          - with both: -> | HS  | ->  | GS |    | PS
    *
    * -> = merged with the next stage
    
    
    The number of stages in use doesn't necessarily change, based on what's enabled. In some places, there's a reduction of the last VS stage, though that was a pass-through in prior gens. Fully emulating certain features leverages the GDS and its ordering operations.
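The table above can be transcribed into a small lookup to count which hardware stages actually launch for each configuration. This is only a sketch of the quoted comment, not anything from Mesa itself: the dictionary keys, the function names, and the choice to apply the "Only VS & PS" row to both GFX6 and GFX9 are my own assumptions.

```python
# Columns follow the si_shader.h comment quoted above.
API_STAGES = ("VS", "TCS", "TES", "GS", "passthru", "PS")
MERGED = "->"  # "merged with the next stage" in the quoted legend

# Hardware stage each API stage is compiled as; None means the slot is unused.
# Keys such as "plain" and "both" are hypothetical labels for the table rows.
STAGE_MAP = {
    ("GFX6", "plain"): ("VS", None, None, None, None, "PS"),  # assumed to cover GFX6
    ("GFX9", "plain"): ("VS", None, None, None, None, "PS"),  # assumed to cover GFX9
    ("GFX6", "gs"):    ("ES", None, None, "GS", "VS", "PS"),
    ("GFX6", "tess"):  ("LS", "HS", "VS", None, None, "PS"),
    ("GFX6", "both"):  ("LS", "HS", "ES", "GS", "VS", "PS"),
    ("GFX9", "gs"):    (MERGED, None, None, "GS", "VS", "PS"),
    ("GFX9", "tess"):  (MERGED, "HS", "VS", None, None, "PS"),
    ("GFX9", "both"):  (MERGED, "HS", MERGED, "GS", "VS", "PS"),
    ("NGG", "plain"):  ("GS", None, None, None, None, "PS"),
    ("NGG", "gs"):     (MERGED, None, None, "GS", None, "PS"),
    ("NGG", "tess"):   (MERGED, "HS", "GS", None, None, "PS"),
    ("NGG", "both"):   (MERGED, "HS", MERGED, "GS", None, "PS"),
}


def hardware_stages(generation, config):
    """Return the hardware stages that launch, in pipeline order."""
    row = STAGE_MAP[(generation, config)]
    return [hw for hw in row if hw is not None and hw != MERGED]


def merged_api_stages(generation, config):
    """Return the API stages that are folded into the following stage."""
    row = STAGE_MAP[(generation, config)]
    return [api for api, hw in zip(API_STAGES, row) if hw == MERGED]


if __name__ == "__main__":
    for key in STAGE_MAP:
        print(key, hardware_stages(*key))
```

Read this way, the tessellation-only case launches three stages on both GFX9 and NGG (the last one is just labelled GS instead of VS), while the cases with a geometry shader lose the trailing pass-through VS.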

    In DSBR-related news, there are a few minor GFX10-specific changes in the code that was introduced with Vega.

    From some skimming, it also appears that there's at the very least a Navi 14:
    (si_texture.c)
    On another note, there's a new addition to the LLVM GFX10 bug feature list, dealing with branch offsets of 0x3f being unsafe in some way and requiring a compiler workaround.
    Given this seems to be one off from 64 and related to instruction fetch, and there's already a bug related to controlling the instruction prefetch, this may point to more intensive rework of the instruction pipeline's internals. Not sure how frequently this would come up, though the workarounds may be a bit kludgy.
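To make the shape of such a workaround concrete, here is a rough sketch of how a compiler pass might detect the problem offset and pad around it. It is not the LLVM implementation. The offset convention (dwords counted from the instruction after the branch), the 4-byte instruction size, and both function names are assumptions for illustration.

```python
UNSAFE_OFFSET = 0x3F  # the value named in the LLVM GFX10 bug list (63, one off from 64)


def branch_offset_dwords(branch_addr, target_addr, branch_bytes=4):
    """Offset in dwords from the end of the branch to its target.

    Assumption: the offset is relative to the instruction that follows
    the branch, and the branch itself occupies branch_bytes bytes.
    """
    return (target_addr - (branch_addr + branch_bytes)) // 4


def nops_needed(branch_addr, target_addr):
    """Return how many 4-byte no-ops to insert between branch and target.

    One no-op is enough to move a forward branch off the unsafe offset,
    since it shifts the target one dword further away.
    """
    if branch_offset_dwords(branch_addr, target_addr) == UNSAFE_OFFSET:
        return 1
    return 0


if __name__ == "__main__":
    unsafe_target = 4 + UNSAFE_OFFSET * 4  # branch at address 0
    print(nops_needed(0, unsafe_target), nops_needed(0, unsafe_target + 4))
```

A real pass would have to re-check neighbouring branches after padding, because inserting a no-op also changes their offsets, which is part of why such workarounds can get kludgy.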
     
    w0lfram, Digidi, TheAlSpark and 2 others like this.
  13. Entropy

    Veteran

    Joined:
    Feb 8, 2002
    Messages:
    3,106
    Likes Received:
    1,071
    The really nice thing about having a pure play foundry at the cutting edge of lithographic (and packaging) technology is that their processes are open to everyone. (Even Intel!) If your product will likely be profitable, then off you go. And if you aren’t able to leverage the technology on offer, maybe your competitor is.
     
  14. Digidi

    Newcomer

    Joined:
    Sep 1, 2015
    Messages:
    229
    Likes Received:
    99
    What's the difference from Nvidia's Turing? Are primitive shaders now always on? And what's the difference from Nvidia's mesh shaders?
     
  15. iMacmatician

    Regular

    Joined:
    Jul 24, 2010
    Messages:
    775
    Likes Received:
    202
    Could Apple use 5 nm HP?
     
  16. Kaotik

    Kaotik Drunk Member
    Legend

    Joined:
    Apr 16, 2003
    Messages:
    8,441
    Likes Received:
    2,159
    Location:
    Finland
    If they really plan to replace Intel with their own CPUs in MacBooks, maybe
     
  17. Bondrewd

    Regular Newcomer

    Joined:
    Sep 16, 2017
    Messages:
    556
    Likes Received:
    250
    Maybe for bigger Intel U-series replacement SoCs, and that's ? for now.
     
  18. PizzaKoma

    Newcomer

    Joined:
    Apr 29, 2019
    Messages:
    45
    Likes Received:
    72
    Malo, Dictator, rSkip and 7 others like this.
  19. Ike Turner

    Veteran Regular

    Joined:
    Jul 30, 2005
    Messages:
    1,884
    Likes Received:
    1,758
    Lightman and milk like this.
  20. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,189
    Likes Received:
    3,129
    Location:
    Well within 3d
    The API shaders at the top of the table are mapped to internal shader stages executed by the hardware. I haven't seen a similar listing for what is done internally by Nvidia.

    The shaders that are compiled as primitive shaders are flagged as being such, so the option to compile them as normal still exists. The automatic primitive shader concepts first discussed by AMD focused on culling, and the automatic path worked by using dependence analysis of a shader to extract operations in a vertex or other shader and place them in an early culling phase ahead of the rest of the shader. If for some reason the compiler could not separate the position calculations from the rest of the shader, it wouldn't be compiled as a primitive shader. If there was shader code that mixed usage of position and attribute data, or perhaps if there was a mode like transparency that prevented a lot of culling from working, that may be a reason for the compiler to skip the primitive shader path and avoid redundant work.
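As a toy illustration of that kind of dependence analysis (not AMD's compiler), a shader can be modelled as a graph of operations, with the early culling phase being the backward slice that feeds the position output. The graph format, the op names, and the 50% threshold for "position is too entangled with everything else" are all made up for the example.

```python
def backward_slice(deps, root):
    """Return every op that root depends on, directly or indirectly."""
    seen, stack = set(), [root]
    while stack:
        op = stack.pop()
        if op not in seen:
            seen.add(op)
            stack.extend(deps[op])
    return seen


def split_for_culling(deps, outputs, max_fraction=0.5):
    """Split a shader into (early, late) op sets, or return None.

    early feeds only the position output and can run before culling.
    late computes the remaining outputs for surviving primitives.
    None means position work covers too much of the shader (hypothetical
    threshold), so it would be compiled the normal way instead.
    """
    early = backward_slice(deps, "position")
    if len(early) / len(deps) > max_fraction:
        return None
    late = set()
    for out in outputs:
        if out != "position":
            late |= backward_slice(deps, out)
    return early, late


# Position depends only on the input position and a matrix multiply.
SEPARABLE = {
    "position": ["mvp_mul"], "mvp_mul": ["in_pos"], "in_pos": [],
    "color": ["lighting"], "lighting": ["in_normal", "in_pos"], "in_normal": [],
    "uv": ["in_uv"], "in_uv": [],
}

# Position and attributes share most of their work.
MIXED = {
    "position": ["blend"], "blend": ["in_pos", "in_normal"],
    "in_pos": [], "in_normal": [], "color": ["blend"],
}

if __name__ == "__main__":
    print(split_for_culling(SEPARABLE, ["position", "color", "uv"]))
    print(split_for_culling(MIXED, ["position", "color"]))
```

Note that `in_pos` lands in both phases of the separable shader, which is the redundant work in question: anything shared between position and attributes gets computed twice unless it is passed along.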

    It's not clear if this new iteration of NGG has added features versus the concepts introduced in Vega.
    If it's similar, then there are some differences from Nvidia's task and mesh shaders.
    Nvidia's path is explicitly separate from the standard geometry pipeline with tessellation and other shaders, with the general argument that outside of certain cases they are more effective. Mesh shading is heavily focused on getting good throughput and efficiency by optimizing the representation and reuse of static geometry. Task shaders can perform a level of decision making and advance culling by being able to vary things like what LOD model the mesh shaders will use, or how many mesh shaders will be launched. There's a more arbitrary set of primitive types that can be fed into that pipeline, and the process exposes a more direct way to control what threads are launched to provide the necessary processing.
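A very loose sketch of the task-stage decision described above, in plain Python rather than any real shading language: pick an LOD per object, then decide how many mesh shader groups to launch, or launch none at all. Every number and name here (the distance bands, the triangles per group, the cull distance) is hypothetical.

```python
import math


def task_stage(distance, lod_triangle_counts, tris_per_group=64,
               lod_band=100.0, cull_distance=1000.0):
    """Return (lod_index, mesh_groups_to_launch) for one object.

    lod_triangle_counts[i] is the triangle count of LOD i, finest first.
    An object beyond cull_distance launches no mesh work at all.
    """
    if distance > cull_distance:
        return None, 0  # advance culling: nothing is launched
    lod = min(int(distance // lod_band), len(lod_triangle_counts) - 1)
    groups = math.ceil(lod_triangle_counts[lod] / tris_per_group)
    return lod, groups


if __name__ == "__main__":
    lods = [4096, 1024, 256]
    for d in (50.0, 250.0, 5000.0):
        print(d, task_stage(d, lods))
```

The point is only that the amount of downstream work is chosen by the shader itself, which is the part primitive shaders, as described so far, don't do.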

    Primitive shaders exist within the standard geometry pipeline, which includes tessellation, vertex, and geometry shaders. It's not going to require balancing between pipeline types by the programmer. There's no mention of the sort of reuse or optimization of static geometry, which points to more work being done every frame despite much of it not changing.
    The decision-making of the shaders is more limited, since they are different ways of expressing the standard shader types. They can do the same things more efficiently or with more culling, not do different things like change what model is used or explicitly control the amount of thread expansion. The primitive types used seem to be more standard formats rather than a more generalized set of inputs.

    That doesn't rule out that there can be some overlap or changes going forward. Presentations on task and mesh shading mention the possibility of adding culling elements to mesh shaders similar in concept to what AMD proposes, and AMD alluded to possible future uses of primitive shaders that might allow for more complex behavior. Possibly, the more generalized GS stage may hint at things becoming more flexible as far as what kind of data is passed through the pipeline and how it is mapped in terms of threads in the shader array.


    This seems consistent with the BVH texture instructions mentioned in an earlier LLVM commit. I've speculated about what facets of pre-Navi architectures best mapped to the BVH traversal process, and it seemed to benefit from hardware that had its own independent scheduling and data paths that didn't work in lock-step with the SIMD execution path.
    At the time I wondered if either the shared scalar path or texturing path could be evolved to handle this, and each had certain features that might be useful depending on the level of programmability or raw performance.
    The texturing path already does a lot of automatic generation of memory accesses and internal scheduling for more complex filtering types, and already handles a memory-dominated workload.
    The scalar path was shared hardware in past GPUs, is associated with sequencing decisions for a thread, and had its own data path. However, it was more heavily used and needed more execution capability at the time, and with Navi it's become less separate.
     
    milk, DavidGraham, Prophecy2k and 5 others like this.
  • About Us

    Beyond3D has been around for over a decade and prides itself on being the best place on the web for in-depth, technically-driven discussion and analysis of 3D graphics hardware. If you love pixels and transistors, you've come to the right place!

    Beyond3D is proudly published by GPU Tools Ltd.