AMD Vega 10, Vega 11, Vega 12 and Vega 20 Rumors and Discussion

Discussion in 'Architecture and Products' started by Deleted member 13524, Sep 20, 2016.

  1. chavvdarrr

    chavvdarrr Veteran

    One year later, more power draw and less gaming performance per mm² ... people do expect something improved after so many PR events and announcements.
     
    A1xLLcqAgt0qc2RyMz0y likes this.
  2. CarstenS

    CarstenS Legend Subscriber

    sebbbi seems to think that the ROP/L2 rework can save a lot of cache flushes that were previously necessary [strike]specifically on deferred shading engines[/strike]: https://forum.beyond3d.com/posts/1987712/
    Note that he posted this before receiving his Vega FE - so it's a guess, however educated it may be.
     
  3. jacozz

    jacozz Newcomer

    Or maybe Vega is first and foremost built to compete with GP100, not necessarily GP102/GP104. In other words, built for AI, datacenter and render farm workloads. That's where the big money is anyway.

    Vega then needs to fight GP100, GP102 and GP104 with one chip. That's a tall order.
     
  4. jacozz

    jacozz Newcomer

    Could someone please explain why GCN, and apparently NCU, are limited to a maximum of 4 shader engines? What are the pros and cons of such a limited architecture?
     
  5. MDolenc

    MDolenc Regular

    That's true, and the same goes for the better DCC found in Polaris. But to see a benefit from this you'll need to pit Fury X against Vega FE in a situation where Fury X is bandwidth starved. In a game.
     
    ieldra likes this.
  6. CarstenS

    CarstenS Legend Subscriber

    Do frequent cache flushes not also place a heavier tax on maintaining a high GPU occupancy? IOW - more register-state in flight.
     
  7. Sniper Elite 4 at 4K seems to be such a case: a 38% core clock increase in Vega results in only 4% better performance at relatively low FPS numbers, and Fiji gets similar or better performance than Vega at the same clocks:

    [IMG]
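The two figures quoted in this post (a 38% clock increase, a 4% performance gain) can be turned into a scaling number. The following is a minimal sketch, assuming only those two percentages. The function names are hypothetical, and nothing here is measured data beyond what the post states.

```python
def scaling_efficiency(clock_gain: float, perf_gain: float) -> float:
    """Fraction of a relative clock increase that shows up as performance."""
    if clock_gain <= 0:
        raise ValueError("clock_gain must be positive")
    return perf_gain / clock_gain


def per_clock_ratio(clock_gain: float, perf_gain: float) -> float:
    """Performance per clock after the change, relative to before (1.0 = unchanged)."""
    return (1.0 + perf_gain) / (1.0 + clock_gain)


# Figures taken from the post: +38% core clock, +4% performance.
eff = scaling_efficiency(0.38, 0.04)
ratio = per_clock_ratio(0.38, 0.04)
print(f"scaling efficiency: {eff:.1%}, per-clock performance ratio: {ratio:.3f}")
```

A scaling efficiency far below 100% is what you would expect if something other than core clock (for example memory bandwidth) limits performance in this test, which is the point the post is making.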
     
  8. sebbbi

    sebbbi Veteran

    L2 cache flushes hurt even when you are not bandwidth bound. In this particular example case, the whole GPU needs to wait until the L2 cache flush is done before it can start executing the next shader. I would assume that frequent RT->texture transitions hurt Vega less than Fiji.

    DCC obviously also helps, since it allows skipping decompress operations (which stall the GPU for much longer than cache flushes do). Publicly available DCC documentation for GCN3/4/5, however, is pretty much non-existent. This is the only thing available: http://gpuopen.com/dcc-overview/. I would like to see a more detailed DCC document for AMD PC hardware in the future.
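The argument that flushes cost time even without a bandwidth limit can be illustrated with a toy model: if the whole GPU waits for each flush before the next shader starts, the stall time simply adds to the frame time. This is a sketch under that assumption only. The pass durations and the per-flush stall are made-up numbers, and the function name is hypothetical.

```python
def frame_time_ms(pass_times_ms, flush_stall_ms, flush_between_passes=True):
    """Toy serial model: passes run back to back, and each RT->texture
    transition between two passes optionally adds a full-GPU stall."""
    work = sum(pass_times_ms)
    transitions = max(len(pass_times_ms) - 1, 0)
    stalls = transitions * flush_stall_ms if flush_between_passes else 0.0
    return work + stalls


# Hypothetical frame: four render passes, 0.1 ms stall per L2 flush.
passes = [2.0, 3.0, 1.5, 4.0]
with_flush = frame_time_ms(passes, 0.1, flush_between_passes=True)
without_flush = frame_time_ms(passes, 0.1, flush_between_passes=False)
print(f"with flushes: {with_flush:.2f} ms, without: {without_flush:.2f} ms")
```

In this model the shader work is identical in both cases. The difference comes purely from serialization, so it shows up regardless of how much memory bandwidth is available.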
     
  9. Digidi

    Digidi Regular

    You don't need more than 4 shader engines. The best example is comparing Nvidia GP102 and GP104. If you look at the polygon output test of the Beyond3D suite, you see no difference between GP102 and GP104 when culling comes into play.
    http://www.pcgameshardware.de/Titan...hmark-Tuning-Overclocking-vs-1080-1206879/#a5

    http://techreport.com/review/31562/nvidia-geforce-gtx-1080-ti-graphics-card-reviewed/3

    So the limitation is not caused by the rasterizer. Culling is the issue. PC Games Hardware say in the article that they want to look into this behaviour, but they never published a follow-up!?

    Also, clock-normalized, Fiji doesn't look so bad there.
     
    Last edited: Jul 6, 2017
  10. ImSpartacus

    ImSpartacus Regular

    To be clear, I don't think we have any confirmation of a 4 shader engine limit on GCN/NCU except back in the Hawaii days with GCN 2 (1.1 in Anandtech terms).

    It's just that every AMD GPU since then has happened to have no more than 4 shader engines.

    Earlier this year, Anandtech commented/speculated on the potential shader engine limit (removal) with respect to Vega:

    "As some of our more astute readers may recall, when AMD launched the GCN 1.1 they mentioned that at the time, GCN could only scale out to 4 of what AMD called their Shader Engines; the logical workflow partitions within the GPU that bundled together a geometry engine, a rasterizer, CUs, and a set of ROPs. And when the GCN 1.2 Fiji GPU was launched, while AMD didn’t bring up this point again, they still held to a 4 shader engine design, presumably due to the fact that GCN 1.2 did not remove this limitation.

    But with Vega however, it looks like that limitation has finally gone away. AMD is teasing that Vega offers an improved load balancing mechanism, which pretty much directly hints that AMD can now efficiently distribute work over more than 4 engines. If so, this would represent a significant change in how the GCN architecture works under the hood, as work distribution is very much all about the “plumbing” of a GPU. Of the few details we do have here, AMD has told us that they are now capable of looking across draw calls and instances, to better split up work between the engines."

    [IMG]

    If I had to speculate as a naive layman, I would say that if there is some material amount of R&D necessary to remove that limitation, then maybe AMD is betting that they'll transition to MCM before they ever need to build a GPU with more than 4 shader engines. That is, we'll see something like a dual 64CU (or dual ~48CU, etc.) card soon enough that it doesn't make sense to waste precious R&D dollars to remove that limitation in the meantime.
     
    jacozz likes this.
  11. CarstenS

    CarstenS Legend Subscriber

    Fiji was reticle-size limited, IIRC, in the sense that they maxed out the die size such that the interposer could still be made using a single exposure instead of a double exposure.
     
    looncraz likes this.
  12. mczak

    mczak Veteran

    The primitive shader refers to vs+gs being executed as one shader (tessellation shader stages also get merged with others; with tessellation there's one shader pre-tessellation and one post-tessellation). This cannot be disabled in the driver, it has to be active at all times. (Potentially, with extensions exposing this, you could do some things more efficiently.)
    It can only make a difference if geometry and/or tessellation shaders are in use, however.
     
    DavidGraham likes this.
  13. Rys

    Rys Graphics @ AMD Moderator Veteran Alpha

    About the driver comment: it's normal and completely expected for there to be common code in a GPU driver that applies to some or all of the GPUs a driver supports, alongside the specifics for the GPU being driven. That's hopefully just a given. So it was just a guiding hand to not conflate any commonality with it running the driver for a different ASIC, and then reading things into that.

    I'd have said that regardless of working for AMD or not, since the above is true for all GPU vendors.

    I want to talk about Vega and RX here as much as everyone else since I'm a GPU enthusiast, but that's not in my wheelhouse (unless you're an NDA'd developer of course!), so I can't go into specifics.
     
  14. lanek

    lanek Veteran

    No problem, send me the NDA papers... I'll sign them right now... (I'm joking.)
     
  15. Clukos

    Clukos Bloodborne 2 when? Veteran

    Not sure if posted here:

     
    BacBeyond, Newguy, Cat Merc and 3 others like this.
  16. Malo

    Malo Yak Mechanicum Legend Subscriber

    Hence @Rys's comment regarding Vega supporting.... everything :) It also means there's a lot more parity between Nv and AMD now, which should mean developers can start targeting these features more.
     
  17. Alexko

    Alexko Veteran Subscriber

    Do you reckon you could convince the appropriate helmsman to drop by?
     
  18. MDolenc

    MDolenc Regular

    Sure, but again this is not likely something that happens automatically. There's "fast geometry shader" logic on the NV side as well, and that's something that has to be coded for specifically (via NvAPI, not standard D3D).

    Yes, but how do you spot this in an FPS number? :smile: From a game where you have no idea how many RT->texture transitions it's doing (or anything else for that matter). It's basically turning into a discussion about specifically targeted benchmarks just to show that Fury X != Vega FE, which I think is ridiculous on so many levels.
     
  19. sebbbi

    sebbbi Veteran

    Simple: the FPS number is higher when there are fewer stalls and flushes :)

    But I will wait until the Vega RX launch reviews. I am sure AMD will spill more details about their architectural changes regarding gaming then (ROP flushes = gaming).
     
  20. Genotypical

    Genotypical Newcomer

    Not how it was intended. If interpreted as a comment to an equal rather than to someone you can make demands of, it's more a casual "come on man, help us out here".

    And yes.
     