AMD Vega 10, Vega 11, Vega 12 and Vega 20 Rumors and Discussion

Discussion in 'Architecture and Products' started by Deleted member 13524, Sep 20, 2016.

  1. MDolenc

    Regular

    Joined:
    May 26, 2002
    Messages:
    696
    Likes Received:
    446
    Location:
    Slovenia
    There's been a patent some pages back about splitting up the 16 wide SIMD to save power.
    Found it.
     
    BRiT likes this.
  2. Malo

    Malo Yak Mechanicum
    Legend Subscriber

    Joined:
    Feb 9, 2002
    Messages:
    8,931
    Likes Received:
    5,533
    Location:
    Pennsylvania
    What would the drawbacks be since it wasn't done before?
     
  3. pTmdfx

    Regular

    Joined:
    May 27, 2014
    Messages:
    417
    Likes Received:
    381
    The alleged diagram probably means Vega has one 8-wide, two 4-wide and one 2-wide pipeline, which surprisingly pretty much follows the patent that was linked above. This gives a variable wavefront size from 8 wide to 32 wide, assuming the same 4-cycle cadence. But it also means each "NCU" would get only 18 lanes. So one might expect multiple of these NCUs to form a larger block that shares at least the LDS.
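A quick sketch of the arithmetic behind those numbers, assuming the lane mix really is 8 + 4 + 4 + 2 and that each pipeline keeps a 4-cycle cadence (both are readings of the alleged diagram, not confirmed hardware details; the names below are made up for illustration):

```python
# Hypothetical lane mix read off the alleged diagram:
# one 8-wide, two 4-wide and one 2-wide pipeline.
SIMD_WIDTHS = [8, 4, 4, 2]

# Assumed cycles per vector instruction (the GCN-style 4-cycle cadence).
CADENCE = 4


def wavefront_size(simd_width, cadence=CADENCE):
    """Work-items one pipeline covers if it loops `cadence` times."""
    return simd_width * cadence


total_lanes = sum(SIMD_WIDTHS)
sizes = sorted({wavefront_size(w) for w in SIMD_WIDTHS})

print("lanes per NCU:", total_lanes)
print("possible wavefront sizes:", sizes)
```

Under those assumptions the lane count comes out at 18 and the wavefront sizes span 8 to 32, which is where the figures in the post come from.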
     
    #503 pTmdfx, Jan 4, 2017
    Last edited: Jan 4, 2017

  4. Someone has taken a screenshot of the settings:

    http://videocardz.com/65343/amd-demos-star-wars-battlefront-on-ryzen-and-vega-at-ces2017

    FXAA, and the FOV is very small at 55º (which is why they're showing it in 3rd-person, I guess), so I think it'll be very hard to find a comparable test on the web.
     
    Lightman likes this.
  5. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,579
    Likes Received:
    4,799
    Location:
    Well within 3d
    Complexity is one drawback, in terms of steps in the execution loop, hardware dedicated to scheduling, register file allocation/access choices, and potentially peak throughput if following the patent diagram exactly (only 14 lanes in vector units, only one SIMD's worth in a CU).
    Wavefronts can dynamically change width based on branch divergence, and the patent admits there is cost and uncertainty in deciding whether or not to change SIMD allocation to react to it.

    I'm not clear if that diagram is legitimate. It's ambiguous in various aspects, and what it shows could be rather vanilla. However, more adventurous implementations (high-performance scalar, split files, shared access to files, dynamic detection of branch divergence, SIMD+issue unit sharing, weighing ALU versus memory-limited, turbo) can ramp complexity, the baseline power consumption, and the cost of mistakes.

    AMD may also be looking into other changes, such as how storage is allocated for a wavefront versus the worst-case upfront allocation currently done. Whether that meshes with what's here is unknown.

    The patent casts a decently wide net, with every parameter being physically or dynamically variable: number of scalar, high-performance scalar, and vector units, their actual widths versus partial gating, etc.
    That rather simplistic diagram leaves off most of the interesting elements, unless they aren't there. I kind of hope it's not a marketing slide, and is just someone trying to explain part of the idea, if only because the level of polish is a little too MS Paint, and that particular summary is one of the least interesting or differentiated of the implementations.

    I don't follow the portion about not wasting SIMD space in the variable scenario. The visual language still seems to indicate 4 independent SIMDs, but unless SIMD lanes are migratory or AMD has discovered an 8-4-2-4 pattern to wavefront coverage, I don't see how it saves SIMD space. The 18-lane thing doesn't quite fit unless quads stopped being a thing.

    Also, isn't Next Generation Compute Unit shortened to NGCU?
     
  6. pTmdfx

    Regular

    Joined:
    May 27, 2014
    Messages:
    417
    Likes Received:
    381
    Looks more like some kind of review guide or white paper if it is real. Slides are usually full of fanciness, aren't they?

    Some kind of migration or forwarding seems to be the case if it is real, as implied by "not wasting space" with variable-width SIMD. If it were just clock gating, it could simply say power saving in 16-lane SIMDs. This might also explain the smaller number of hardware lanes in an NCU (complexity in data path and instruction scheduling).

    Are quads still a concrete concept in the CU domain though? They are essentially four consecutive work-items.

    There is no obligation in forming abbreviations with all the first letters though. Next-generation Compute Unit stands for NCU as well.
     
    #506 pTmdfx, Jan 4, 2017
    Last edited: Jan 4, 2017
  7. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,579
    Likes Received:
    4,799
    Location:
    Well within 3d
    Maybe a review guide, although then I still hope not. That might be down to a personal bias against automotive analogies, however.
    AMD's whitepapers have been classier than that.

    There are some elements to the design that show optimizations for data swizzling between quads, and it's a reasonable expectation in a graphics context that a lot of work will be coming in a granularity of 4. A physically two-wide SIMD drawn in a similar position as a formerly independent 16-wide is creating a scenario where there's over-subscription when a quad needs to fit, or under-subscription if well-packed graphics wavefronts have to ignore it.

    The marketing may have been served well if that hyphen were added. That's more of a nitpick where I think it adds an iffy impression, like the MS-Paint level of the graphic in general.
     
    entity279 likes this.
  8. pTmdfx

    Regular

    Joined:
    May 27, 2014
    Messages:
    417
    Likes Received:
    381
    But the alleged diagram doesn't imply the instruction pipelining though. If the four-cycle lockstep execution is here to stay, that means at minimum the SIMD would be running an 8-wide wavefront, which fits two quads.
     
  9. hurleybird

    Newcomer

    Joined:
    Feb 22, 2012
    Messages:
    37
    Likes Received:
    7
    The semi truck illustration is just something someone made on Reddit anyway. The patent shows an 8 + 4 + 2 + 1 + 1 = 16 example configuration.

    I thought the same thing at first too, but apparently 55 is the default.
     
  10. pTmdfx

    Regular

    Joined:
    May 27, 2014
    Messages:
    417
    Likes Received:
    381
    Too bad. Still hope the variable SIMD thing would be in the real Vega. Less than 17 hours to go.
     
  11. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,579
    Likes Received:
    4,799
    Location:
    Well within 3d
    That would split a quad across clocks, which may not have been necessary before with operations that do work on a quad granularity like interpolation or the quad-swizzle DPP ops. Then there are some elements of the GPU's graphics hardware that work on quad granularity as well. They could be buffered, but that seemingly adds complexity just to be different.
    I would be curious as to whether the other wider SIMDs do the same thing, or does the CU throw a different execution loop just for one SIMD.
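To illustrate the quad-splitting concern, here is a small sketch. It assumes work-items are fed to a SIMD in order, `simd_width` items per clock, and that a quad is four consecutive work-items; the function names are hypothetical and nothing here reflects a confirmed Vega execution loop.

```python
QUAD = 4  # a quad is four consecutive work-items


def cycles_touched_by_quad(quad_index, simd_width):
    """Clock cycles in which the work-items of one quad are issued."""
    first = quad_index * QUAD
    return sorted({item // simd_width for item in range(first, first + QUAD)})


def quad_is_split(simd_width, wavefront_size):
    """True if any quad in the wavefront straddles more than one clock."""
    quads = wavefront_size // QUAD
    return any(len(cycles_touched_by_quad(q, simd_width)) > 1
               for q in range(quads))


for width in (16, 8, 4, 2):
    size = width * 4  # assumed 4-cycle cadence
    print(f"{width:2d}-wide SIMD, {size:2d}-wide wavefront, "
          f"quad split across clocks: {quad_is_split(width, size)}")
```

With this mapping only the 2-wide case spreads a quad over two clocks, which is the situation the post describes as needing buffering or a different execution loop.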

    That's good in my opinion, because I hope it's inaccurate enough to keep Vega interesting--just not too inaccurate.

    Although the diagram doesn't give two of those ALUs a vector register file to draw from.
     
  12. itsmydamnation

    Veteran

    Joined:
    Apr 29, 2007
    Messages:
    1,349
    Likes Received:
    470
    Location:
    Australia
    Do you people actually game? That is the standard vertical FOV for all DICE games since at least BF3, maybe even BF:BC2.
    You also have to consider the map when looking at other benchmarks; Endor is one of the more GPU-taxing maps.
     
    RootKit and Ike Turner like this.
  13. revan

    Newcomer

    Joined:
    Nov 9, 2007
    Messages:
    55
    Likes Received:
    18
    Location:
    look in the sunrise ..will find me
    I second that!
    Vertical Fov 55 and 85 Horizontal Fov is standard for SW Battlefront and the Endor map is the most demanding SW map (all that vegetation costs some fps)
    More than 60 fps will be very good, better than a GTX 1080 anyway...
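For what it's worth, the 55 vertical / 85 horizontal pairing is consistent with the usual conversion between the two at 16:9. A minimal sketch, assuming the standard perspective-projection relation (the game's actual implementation is not known here, and the function name is made up):

```python
import math


def horizontal_fov(vertical_fov_deg, aspect_ratio):
    """Horizontal FOV in degrees for a given vertical FOV and width/height ratio."""
    half_vertical = math.radians(vertical_fov_deg) / 2
    half_horizontal = math.atan(math.tan(half_vertical) * aspect_ratio)
    return math.degrees(2 * half_horizontal)


hfov = horizontal_fov(55, 16 / 9)
print(f"55 deg vertical at 16:9 is about {hfov:.1f} deg horizontal")
```

The result lands a little above 85 degrees, so the two numbers quoted above describe the same view.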
     
  14. Malo

    Malo Yak Mechanicum
    Legend Subscriber

    Joined:
    Feb 9, 2002
    Messages:
    8,931
    Likes Received:
    5,533
    Location:
    Pennsylvania
    Why would a FOV slider be for vertical FOV? That makes no sense.
     
    Gubbi likes this.
  15. Gipsel

    Veteran

    Joined:
    Jan 4, 2010
    Messages:
    1,620
    Likes Received:
    264
    Location:
    Hamburg, Germany
    I would actually prefer variable wavefront sizes realized by a variable amount of looping with a narrower SIMD (like vec4). Okay, it stays a bit more granular (if one keeps the latency=throughput=4 cycles one would get wavefront sizes of at least 16), but one could keep a lot of the other stuff intact. For the smaller wavefronts one needs relatively more scalar ALUs in the CU (optimally still one per 4 vALUs). But that should be a relatively small investment.
    One needs to increase the scheduling capacity per SP though, as each small vALU needs its own instructions. But it could work out in terms of power consumption as larger wavefronts should still dominate and one could gate the scheduling logic for 75% of the time for the old-fashioned 64-element wavefronts. In case of smaller 16- or 32-element wavefronts, the increased throughput (potentially factor 4) justifies the increased consumption of the scheduler.
    Being able to execute wavefronts of any size on any vALU (just over a variable amount of cycles) may avoid most of the problem of wavefront migration between different vALUs and register files. And it reduces the complexity of scheduling the workload to a set of different vALUs. Would appear as the more elegant solution to me.
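The numbers in this proposal can be sketched as follows, assuming a vec4 vALU, a minimum of 4 cycles per instruction (latency = throughput = 4), and a scheduler that only needs to be awake for one 4-cycle issue slot per instruction. The names are hypothetical and this is a back-of-the-envelope model, not a description of any real scheduler.

```python
VALU_WIDTH = 4   # assumed vec4 vALU
MIN_CYCLES = 4   # assumed latency = throughput = 4 cycles

# The smallest wavefront keeps the vALU busy for exactly MIN_CYCLES.
MIN_WAVEFRONT = VALU_WIDTH * MIN_CYCLES


def cycles_per_instruction(wavefront_size):
    """Cycles a vec4 vALU loops to cover one instruction of a wavefront."""
    return wavefront_size // VALU_WIDTH


def scheduler_idle_fraction(wavefront_size):
    """Share of time the per-vALU scheduler could be gated."""
    return 1 - MIN_CYCLES / cycles_per_instruction(wavefront_size)


def relative_issue_rate(wavefront_size):
    """Instructions issued per unit time, relative to a 64-wide wavefront."""
    return cycles_per_instruction(64) // cycles_per_instruction(wavefront_size)


for size in (16, 32, 64):
    print(size, cycles_per_instruction(size),
          scheduler_idle_fraction(size), relative_issue_rate(size))
```

In this model a 64-element wavefront leaves the scheduler idle 75% of the time, while a 16-element wavefront needs four times the issue rate, which matches the trade-off described in the post.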
     
  16. I played the beta demo extensively but now I just got it for the PS4 (15€ IIRC) to get access to the X-Wing VR demo.
    Didn't get the game during release because I thought it had ridiculously low value. I still do, even at 15€ with the VR demo, but I just had to try flying an X-Wing in VR.

    Regardless, it never crossed my mind that the FoV option shown in those settings was for vertical FoV. I've never seen a game with that setting before.
    What do they call horizontal FoV?
     
  17. itsmydamnation

    Veteran

    Joined:
    Apr 29, 2007
    Messages:
    1,349
    Likes Received:
    470
    Location:
    Australia
    I don't have SW:BF installed but here is BF4 and BF1

    [​IMG]
    [​IMG]


    edit:
    They only have one option to choose, called FOV; you have to hover over it for the actual description. I think it's based off things like Eyefinity, because I can game in 1- or 3-screen modes and not have to touch anything settings-wise. If you changed horizontal FOV, the game would be a mess each time I switched.
     
    #517 itsmydamnation, Jan 5, 2017
    Last edited: Jan 5, 2017
  18. Malo

    Malo Yak Mechanicum
    Legend Subscriber

    Joined:
    Feb 9, 2002
    Messages:
    8,931
    Likes Received:
    5,533
    Location:
    Pennsylvania
    OK, so it's keeping the aspect ratio and scaling the FOV for both vert+ and hor+ when you change the setting? First time I've seen FOV scale for both axes, weird.

    I think the point was though that 55 was the default setting and so irrelevant for the performance comparison as benchmarks would also be 55.
     
  19. iamw

    Newcomer

    Joined:
    Jul 20, 2010
    Messages:
    24
    Likes Received:
    46
  20. seahawk

    Regular

    Joined:
    May 18, 2004
    Messages:
    511
    Likes Received:
    141
    The truck graphic, if real, just seems to be a band-aid for a quite inefficient design to me. This would reduce neither peak nor idle power consumption, only the typical gaming power consumption, and even that would only be achieved by trading available processing power for lower power consumption. It makes sense for a design which is rarely able to use all SIMDs.
     
  • About Us

    Beyond3D has been around for over a decade and prides itself on being the best place on the web for in-depth, technically-driven discussion and analysis of 3D graphics hardware. If you love pixels and transistors, you've come to the right place!

    Beyond3D is proudly published by GPU Tools Ltd.