AMD Vega 10, Vega 11, Vega 12 and Vega 20 Rumors and Discussion

Discussion in 'Architecture and Products' started by ToTTenTranz, Sep 20, 2016.

  1. Digidi

    Newcomer

    Joined:
    Sep 1, 2015
    Messages:
    225
    Likes Received:
    97
    @Rys what are you measuring in the Beyond3D Suite's polygon output? The triangles handed over to the GPU, or the triangles the GPU actually draws?
     
  2. Rys

    Rys PowerVR
    Moderator Veteran Alpha

    Joined:
    Oct 9, 2003
    Messages:
    4,156
    Likes Received:
    1,433
    Location:
    Beyond3D HQ
    By default, PCGH present the results for the 100% culled test (using strips, so a modern GPU's peak throughput ideally) and the 50% culled test (using lists). In both cases the geometry is always fully submitted to the GPU with no host-side culling.
     
    Kej, BRiT, CarstenS and 1 other person like this.
  3. BacBeyond

    Newcomer

    Joined:
    Jun 29, 2017
    Messages:
    73
    Likes Received:
    43

    Great thanks!

    Latency and ALU look very good and improved vs Fiji.

    Polygons have a massive increase from 2.2k -> 6k but are still well shy of the 1080 at 11k (which appears higher than Titan XP???). At 50% culled it's about a 50% increase from 3.9k -> 5.9k, which is above Titan XP at 5.4k. At 1050 vs 1050, Vega is 87% faster at 100% culled, which is a good improvement over Fiji but appears way behind Pascal, though it seems like clocks are directly related to it since the 1080 is higher than Titan XP.
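
As a quick sanity check on the percentages above, here is a minimal sketch that recomputes the relative gains from the quoted figures. The variable names are made up for illustration, and the numbers are taken as-is from the post in whatever unit the suite reports. Note that the 2.2k -> 6k pair is not clock-normalised, so it includes Vega's clock advantage.

```python
def percent_change(old: float, new: float) -> float:
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0

# Figures quoted in the post (unit as reported by the suite; names are illustrative).
fiji_strip_100, vega_strip_100 = 2200, 6000  # 100% culled, strips
fiji_list_50, vega_list_50 = 3900, 5900      # 50% culled, lists

print(f"100% culled: {percent_change(fiji_strip_100, vega_strip_100):.0f}%")
print(f"50% culled:  {percent_change(fiji_list_50, vega_list_50):.0f}%")
```

The 50% culled case lands at roughly the "about 50%" stated in the post; the 100% culled gain is far larger than the clock-for-clock figure because the two cards run at different clocks.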

    Vega seems slower on texture fill than Fiji at the same clocks for some reason: 71k versus 89k on Fury X.

    Memory bandwidth does seem to be a huge killer though, as it's much lower than Fury X was for random textures and only ties it for black ones. It's way behind Pascal and is likely limiting it heavily in gaming.

    Can't wait to see RX results in a few weeks and see how much of this is going to change, or stay the same.
     
  4. Digidi

    Newcomer

    Joined:
    Sep 1, 2015
    Messages:
    225
    Likes Received:
    97
    Thank you for the answer Rys. If you use the drop-down menu on the left there are more results, like list with 0% culling. In the drop-down menu on the right you will find more GPUs.

    What I'm missing is strip with 0% culling. Or does that make no sense?
     
  5. Love_In_Rio

    Veteran

    Joined:
    Apr 21, 2004
    Messages:
    1,444
    Likes Received:
    108
    Yes, there is something iffy with the new pixel engine.
     
  6. Anarchist4000

    Veteran Regular

    Joined:
    May 8, 2004
    Messages:
    1,439
    Likes Received:
    359
    What if Vega doesn't have TMUs? With the 2xFP16(INT16?) and 4xINT8 they could be filtering with the shader cores. Then lower bandwidth and/or register pressure slowing things down. With everything seemingly programmable that makes sense. Could apply to ROPs as well. Still leaves the question of what's taking up all the space.

    This is probably the real killer. How exactly is this measured? Might be a weird measuring error due to Infinity. Tying a cluster to a particular channel might not align well to the testing.
     
    Heinrich4 and CarstenS like this.
  7. CarstenS

    Veteran Subscriber

    Joined:
    May 31, 2002
    Messages:
    4,797
    Likes Received:
    2,056
    Location:
    Germany
    The option is there though to make visible the other tests - as is the case with almost all other sub-tests. I've only culled a few texturing results which I felt show redundant information.

    AMD's wording was, IIRC, much more carefully chosen. Something along the lines of being able to work on 11 triangles concurrently, as per the footnotes of the slide deck - I vividly remember the discussion here. edit: looked it up: "Vega is designed to handle up to 11 triangles per clock with 4 geometry engines" and one of the common assumptions here was that this was because the four geometry engines could share information, so that vertices could form an adjacent polygon strip comprised of up to 11 triangles. And the product specs for Vega FE also mention 4 triangles per clock.
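
To put the two per-clock figures side by side, here is a rough sketch of the theoretical peak they would imply at the clocks mentioned in this thread (1050, 1269 and 1600 MHz). It assumes peak rate is simply triangles per clock times core clock; the function name is hypothetical. Comparing against the measured results only makes sense if the suite reports millions of triangles per second, which is an assumption here.

```python
def peak_triangle_rate_mtris(tris_per_clock: float, clock_mhz: float) -> float:
    """Theoretical peak in millions of triangles per second (tris/clk * MHz)."""
    return tris_per_clock * clock_mhz

# 4 tris/clk is the Vega FE product spec figure, 11 is the slide-deck footnote.
for clock_mhz in (1050, 1269, 1600):
    spec = peak_triangle_rate_mtris(4, clock_mhz)
    footnote = peak_triangle_rate_mtris(11, clock_mhz)
    print(f"{clock_mhz} MHz: 4/clk -> {spec:.0f} Mtri/s, 11/clk -> {footnote:.0f} Mtri/s")
```

Under those assumptions, 4 triangles per clock at 1600 MHz gives 6400 Mtri/s, while 11 per clock would give 17600 Mtri/s.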
     
    #2867 CarstenS, Jul 11, 2017
    Last edited: Jul 11, 2017
    Cat Merc, pharma and DavidGraham like this.
  8. AlBran

    AlBran Ferro-Fibrous
    Moderator Legend

    Joined:
    Feb 29, 2004
    Messages:
    20,658
    Likes Received:
    5,758
    Location:
    ಠ_ಠ
    Would HBM2's Legacy/PseudoChan modes have some impact here?
     
  9. silent_guy

    Veteran Subscriber

    Joined:
    Mar 7, 2006
    Messages:
    3,754
    Likes Received:
    1,379
    When looking at the FE clocked at 1050, the results are identical to Fury X. So while they have obviously improved clock speeds, the first order pipeline structure seems to be (unsurprisingly) unchanged.

    The random texture unit results are strange. This is essentially just another BW test, isn't it?
     
    #2869 silent_guy, Jul 11, 2017
    Last edited: Jul 11, 2017
  10. Digidi

    Newcomer

    Joined:
    Sep 1, 2015
    Messages:
    225
    Likes Received:
    97
    Is there also a polygon test with strip and 0% culling?
     
  11. DavidGraham

    Veteran

    Joined:
    Dec 22, 2009
    Messages:
    2,749
    Likes Received:
    2,516
    So, during PCGH testing, the card does indeed throttle down heavily (to about 1269MHz under heavy load). They also found that 1.2V is used by default for the 1600MHz stage. Thus I feel the discussion about other clocks is rather academic at this point. Fact of the matter is, Vega FE consumes so much power at sustained 1600MHz clocks that, in order not to throttle, it needs the Power Target unlocked, which increases power consumption even further.
    http://www.pcgameshardware.de/Vega-...elease-AMD-Radeon-Frontier-Edition-1232684/2/
     
  12. silent_guy

    Veteran Subscriber

    Joined:
    Mar 7, 2006
    Messages:
    3,754
    Likes Received:
    1,379
    To the extent that texture rate and fill BW impact gaming performance, these results invalidate the "it's a pro GPU, not a gaming GPU" argument, and steer the conclusion towards "there's a HW performance issue with a number of units", don't they?

    Which leaves the question whether or not it can be solved for the RX.
     
    kalelovil likes this.
  13. DavidGraham

    Veteran

    Joined:
    Dec 22, 2009
    Messages:
    2,749
    Likes Received:
    2,516
    Given your numbers, we at least know Vega achieves some of its goals in geometry processing (it's 80% faster than Fiji clock for clock in the 100% culling strip test), which defeats the Fiji drivers argument for good this time.
    AMD hasn't confirmed the number of TMUs yet. This should be straightforward information; supposedly it's the same number as Fiji, unless there are fewer, in which case that could explain the scores we are seeing.
     
  14. CarstenS

    Veteran Subscriber

    Joined:
    May 31, 2002
    Messages:
    4,797
    Likes Received:
    2,056
    Location:
    Germany
    Not that I know of. I have omitted nothing, except the mentioned redundant texture fillrate tests.
    edit 17.07.2017: Turns out, hidden deep in the script files, where one script calls tests off of the other, there indeed are a couple more tests hidden. Now it remains to be seen how useful their results are.
    Yes, compressible (one color) vs. basically non-compressible (rand.) textures.

    That's actually been done with Polaris already.
    No, they did not. I guess everyone automatically assumed the "quad TMU per CU" did not change. When I first saw the results and repeated re-runs showed the same numbers, I was pondering whether the texture cache might be ineffective or have issues. Nvidia did unite that with L1 data cache... maybe there's an unsolved contention here.
     
    #2874 CarstenS, Jul 11, 2017
    Last edited: Jul 17, 2017
    Digidi, pharma and DavidGraham like this.
  15. Anarchist4000

    Veteran Regular

    Joined:
    May 8, 2004
    Messages:
    1,439
    Likes Received:
    359
    I'll need to double check, but I recall from drivers the scalars used 5 of 16 registers for addressing each wave. That could be where the 11 triangles come from. Broadcasting limitation to VALUs perhaps?

    Did that test suite have any of the filtering tests from way back? Curious how well that aligns to the theoretical flops.
     
  16. CarstenS

    Veteran Subscriber

    Joined:
    May 31, 2002
    Messages:
    4,797
    Likes Received:
    2,056
    Location:
    Germany
    You lost me here.
     
  17. Anarchist4000

    Veteran Regular

    Joined:
    May 8, 2004
    Messages:
    1,439
    Likes Received:
    359
    I recall some tests from a while (long while) ago testing filtering on various texture formats. Depth, HDR, INT8 rates etc for point sampling, bi/trilinear, anisotropic, etc. Theory being the TMUs had better filtering capability than ALUs. Might confirm a lack of TMUs if the different rates align to the ALU ratios.
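
The comparison suggested here could look roughly like the sketch below. It assumes that, if filtering ran on the shader ALUs, per-format rates would follow the packed-math ratios mentioned earlier in the thread (4x INT8, 2x FP16, relative to FP32). The function name, the tolerance and the example rates are all hypothetical placeholders, not measurements.

```python
# Packed-math throughput relative to FP32 (from the 2xFP16 / 4xINT8 figures in the thread).
ALU_RATIOS = {"INT8": 4.0, "FP16": 2.0, "FP32": 1.0}

def matches_alu_ratios(rates: dict, tolerance: float = 0.15) -> bool:
    """True if measured per-format rates, normalised to FP32, sit within
    `tolerance` (relative) of the ALU packed-math ratios."""
    base = rates["FP32"]
    return all(
        abs(rates[fmt] / base - ratio) <= tolerance * ratio
        for fmt, ratio in ALU_RATIOS.items()
    )

# Placeholder numbers purely to show the call; not real results.
print(matches_alu_ratios({"INT8": 400.0, "FP16": 200.0, "FP32": 100.0}))
print(matches_alu_ratios({"INT8": 100.0, "FP16": 100.0, "FP32": 50.0}))
```

A match would only be a hint, not proof, since fixed-function TMUs can also show format-dependent rates.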
     
    CarstenS and Malo like this.
  18. Genotypical

    Newcomer

    Joined:
    Sep 25, 2015
    Messages:
    38
    Likes Received:
    11
    I think it might be pointless to discuss Vega FE as if it were a finished product and the current findings were representative.

    My theory is AMD put out Vega FE to do what it can currently do, for whatever reason. It was not ready for gaming. They didn't have whatever software supports most of the new gaming-relevant hardware ready for launch. This can include the BIOS.
     
  19. silent_guy

    Veteran Subscriber

    Joined:
    Mar 7, 2006
    Messages:
    3,754
    Likes Received:
    1,379
    Don't use these results, it's an old Polaris driver!

    That's not true if the texture units are held back by memory BW. As can be seen by the fact that the numbers are the same for FE clocked at 1050 and at 1600.
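
The argument here is that a unit limited by core clock should get faster as the clock goes up, and one limited by memory bandwidth should not. A minimal sketch of that check, with a hypothetical helper name and placeholder rates (only the 1050 and 1600 MHz clocks come from the thread):

```python
def clock_scaling_efficiency(rate_low: float, rate_high: float,
                             clock_low: float, clock_high: float) -> float:
    """Fraction of the core clock increase that shows up in the measured rate.
    1.0 means the result scales fully with clock, 0.0 means it does not scale."""
    clock_gain = clock_high / clock_low - 1.0
    rate_gain = rate_high / rate_low - 1.0
    return rate_gain / clock_gain

# Placeholder rates: identical results at both clocks, as described above.
print(clock_scaling_efficiency(100.0, 100.0, 1050, 1600))
```

A value near 0.0 points at something other than core clock, such as memory bandwidth, as the limiter; a value near 1.0 would mean the test is clock-bound.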

    It's an expensive, finished, released-to-market product. Even if you don't want to draw RX Vega conclusions, the results are still interesting on their own.

    You'd have a point if only fill rate were an issue: you could blame the not enabled tiler for that. Maybe.
    But that doesn't hold for the texture units. AFAIK, there are no tiler consequences there.
     
  20. CarstenS

    Veteran Subscriber

    Joined:
    May 31, 2002
    Messages:
    4,797
    Likes Received:
    2,056
    Location:
    Germany
    Alas, that's a rather ancient OpenGL test. Will see if I can run it tomorrow in the office. But IIRC the results have been... strange for a couple of other cards a few years back, so I stopped using it on a regular basis. I still don't see, however, how I can correlate certain filtering modes to ALUs. Unless the results between filtering modes differ wildly from the ones on Fiji/Polaris, which the ones tested with the modern B3D suite do not indicate.
     
    Digidi and Anarchist4000 like this.