AMD Vega Hardware Reviews

Discussion in 'Architecture and Products' started by ArkeoTP, Jun 30, 2017.

  1. ArkeoTP

    Newcomer

    Joined:
    Mar 9, 2017
    Messages:
    18
    Likes Received:
    22
    CarstenS and pharma like this.
  2. AnomalousEntity

    Newcomer

    Joined:
    Jun 6, 2016
    Messages:
    38
    Likes Received:
    25
    Location:
    Silicon Valley
    So ALUs do math in INT8 cause my output format is 8-bit. Have you ever written a pixel shader?
     
  3. Clukos

    Clukos Bloodborne 2 when?
    Veteran Newcomer

    Joined:
    Jun 25, 2014
    Messages:
    4,462
    Likes Received:
    3,793
  4. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,122
    Likes Received:
    2,873
    Location:
    Well within 3d
    SIMD units make up a portion of the pipeline that instructions are issued to, so they are part of it in the sense that there needs to be something that executes instructions in a pipeline.
    A SIMD instruction is still an instruction, but the FLOP count per instruction is not directly related to IPC. If that were the case, Intel's Knights Landing core would be considered as having higher IPC than the desktop x86 cores.

    The SI portion of SIMD is Single Instruction, and IPC would be more concerned with what happens in terms of the instruction stream than the MD portion, which can be scaled horizontally within a pipeline's execution stage without disrupting how the pipeline handles code flow, instruction issue, hazards, or stall conditions.
    A major motivation for having SIMD at all is that it amortizes the expensive hardware concerned with IPC over more data.

    I'm trying to find more examples of where AMD used the term IPC for GCN besides Vega. You can find any number of architectural descriptions for IPC for Zen and other CPU cores, although those are superscalar cores that actively work to extract utilization out of one instruction stream.

    GCN has for generations defined a ceiling IPC of 1, with any gains found in multithreaded throughput or measures to avoid stalls that would drive instruction issue below 1. Most of the marketing has been about utilization of the hardware and some token measures for single-threaded performance. That Vega's marketing made such a pointed reference to IPC this time around has more implications in part because that's not what GCN has been about.

    There are other measurements, such as throughput and utilization of peak that can capture the sort of performance GCN has targeted without invoking IPC and all that it brings up.
     
    silent_guy, Cyan and ieldra like this.
  5. AlBran

    AlBran Ferro-Fibrous
    Moderator Legend

    Joined:
    Feb 29, 2004
    Messages:
    20,726
    Likes Received:
    5,819
    Location:
    ಠ_ಠ
    I thought they needed to use one of the AA options to even enable async.
     
  6. ArkeoTP

    Newcomer

    Joined:
    Mar 9, 2017
    Messages:
    18
    Likes Received:
    22
    Unless it was changed in an update, async was only active when using either no AA or TSSAA. Any of the other AA options disable async.
     
    pharma and ieldra like this.
  7. bdmosky

    Newcomer

    Joined:
    Jul 31, 2002
    Messages:
    167
    Likes Received:
    22
  8. silent_guy

    Veteran Subscriber

    Joined:
    Mar 7, 2006
    Messages:
    3,754
    Likes Received:
    1,379
    Even if they equalized clocks, there are probably differences in memory timings between HBM1 and HBM2 that are bigger than the current 5% difference in clock speeds.

    The current settings are sufficient for the purpose of this benchmark.
     
  9. AlBran

    AlBran Ferro-Fibrous
    Moderator Legend

    Joined:
    Feb 29, 2004
    Messages:
    20,726
    Likes Received:
    5,819
    Location:
    ಠ_ಠ
    Thanks. Curious they wouldn't want to enable TSSAA anyway.
     
  10. Clukos

    Clukos Bloodborne 2 when?
    Veteran Newcomer

    Joined:
    Jun 25, 2014
    Messages:
    4,462
    Likes Received:
    3,793
    A fair number of PC gamers don't really grasp what temporal AA is and equate it to "that's a console thing, where's my MSAA!!!" :)

    I just cringe when I see people running SLI/CF setups (or a high-end single GPU) and turning off temporal AA at 4K because "you don't need AA at that resolution anyway".
     
  11. Cyan

    Cyan orange
    Legend Veteran

    Joined:
    Apr 24, 2007
    Messages:
    8,572
    Likes Received:
    2,292
    We shall see how everything unfolds. For now, as long as Vega gives me 4K60 in all my games I'd be happy to get it. That's AMD's minimum goal with Vega after all: 4K60. I've saved 400€ for now to get a 4K-capable GPU in the future, but I am not in a hurry.
     
  12. ArkeoTP

    Newcomer

    Joined:
    Mar 9, 2017
    Messages:
    18
    Likes Received:
    22
    But you may want to turn off TAA more often than not with a mGPU setup as temporal techniques aren't AFR friendly and can cause problems with scaling and frame pacing.

    But yeah, you always need more AA*
    Until you reach the limits of irresponsibility with something like 32xS HSAA and 8x SGSSAA combined but at that point, your 9 year old game is running at 10 fps on a 1080 Ti with 10+ gigs of memory usage and you're probably doing it 'cause you're bored

    Increasing spatial resolution helps everything somewhat but it's not the solution to end all solutions. I have a friend who dislikes playing BF4 because the temporal aliasing in that game is really bothersome even at 4K with 200% res scaling. Easy to say that he was delighted by the addition of TAA to BF1.

    I definitely know people like this who bash modern analytical and temporal methods because they commit the unholy PC sin of introducing blur to the image. Common signs include championing SMAA 1x and not realising a screenshot has blurry AA until someone points out it has blurry AA.

    Wait, we're going way too offtopic, aren't we? :p

    Perhaps I should create a separate thread for pure appreciation of anti-aliasing and what it has done for us.
     
    Cat Merc likes this.
  13. BacBeyond

    Newcomer

    Joined:
    Jun 29, 2017
    Messages:
    73
    Likes Received:
    43
    How can you tell what clocks it was running at?
     
  14. BacBeyond

    Newcomer

    Joined:
    Jun 29, 2017
    Messages:
    73
    Likes Received:
    43
  15. gamervivek

    Regular Newcomer

    Joined:
    Sep 13, 2008
    Messages:
    715
    Likes Received:
    220
    Location:
    india
    Does he know the person(s) running these benchmarks? The top score is at only 1630 MHz and the results have a 15% spread, which is too high for an overclock unless AMD has eked out another clock speed bump. Otherwise the other cards were running substantially below 1630 MHz, in which case a 1630 MHz Vega is better than the stock 1080.

    The two different benchmarks I've seen were on different CPUs, so that might easily affect a 720p benchmark.
     
  16. hkultala

    Regular

    Joined:
    May 22, 2002
    Messages:
    284
    Likes Received:
    6
    Location:
    Herwood, Tampere, Finland
    IPC == Instructions Per Cycle. But per thread or per core(CU)?

    Each GCN core(CU) can fetch multiple instructions per cycle (from different threads).
    It can also issue multiple instructions in the same clock cycle.

    So, per core, the IPC can be >1.

    Per thread it's limited to 1, due to only fetching a single instruction per thread.
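
The per-thread versus per-CU distinction above can be sketched with a toy counter. This is only an illustration: the wave count and issue counts are made-up values, not GCN measurements, and the model simply assumes each wave issues at most one instruction per cycle while the CU may issue from several waves in the same cycle.

```python
# Toy model of per-wave vs per-CU IPC.
# All numbers are hypothetical illustration values, not GCN measurements.

def ipc(instructions_issued: int, cycles: int) -> float:
    """Instructions per cycle over an observation window."""
    return instructions_issued / cycles

cycles = 100
# Hypothetical: three waves that issued on 100, 80 and 60 of the 100 cycles.
issued_per_wave = [100, 80, 60]

per_wave_ipc = [ipc(n, cycles) for n in issued_per_wave]
per_cu_ipc = ipc(sum(issued_per_wave), cycles)

print(per_wave_ipc)  # no entry exceeds 1.0
print(per_cu_ipc)    # sum over waves, so it can exceed 1.0
```

Which of the two figures deserves the name "IPC" is exactly what the thread is debating; the sketch only shows that they are different quantities.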
     
  17. CarstenS

    Veteran Subscriber

    Joined:
    May 31, 2002
    Messages:
    4,798
    Likes Received:
    2,056
    Location:
    Germany
    Productivity meaning SPECviewperf 12.1? Or did I miss more of the "serious" tests?

    FWIW, on our testing bench, we're basically tying their Fury X results. On that same bench, a FirePro W9100 scores 78.79 in SNX-02, the test whose result GamersNexus uses to assert Vega's vertex superiority over Fiji.
     
    #137 CarstenS, Jul 6, 2017
    Last edited: Jul 6, 2017
    Cat Merc and Malo like this.
  18. sebbbi

    Veteran

    Joined:
    Nov 14, 2007
    Messages:
    2,924
    Likes Received:
    5,288
    Location:
    Helsinki, Finland
    GCN instructions are 64 wide and are executed in 4 cycles on a 16-wide SIMD, so maximum IPC is 1/4 per lane. A CU has four SIMDs, but these are independent (each executes a different set of waves). If you want a CU to execute 64 instructions (= 128 flops) per clock, you need to have four waves running on the CU (one per SIMD). This is 10% of the SIMD occupancy (max 10 waves per SIMD to hide latency). Fortunately, all common instructions have a latency of 1, so a single wave per SIMD is actually enough to fully utilize the SIMD... assuming of course that there are no memory operations (including groupshared memory). GCN doesn't need high occupancy to fill the pipelines; it needs high occupancy to hide memory latency.
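
The arithmetic in this post can be written out as a short sketch. The widths and counts are the ones quoted above; the constant names are just illustrative, and counting 2 flops per lane operation assumes each operation is a fused multiply-add, which is how the 64 to 128 conversion appears to be made.

```python
# Figures quoted in the post; names are illustrative only.
WAVE_WIDTH = 64          # work-items covered by one GCN instruction
SIMD_WIDTH = 16          # lanes per SIMD
SIMDS_PER_CU = 4         # independent SIMDs in one CU
MAX_WAVES_PER_SIMD = 10  # maximum occupancy per SIMD
FLOPS_PER_LANE_OP = 2    # assumption: each lane operation is an FMA

# One wave instruction needs 64 / 16 = 4 passes through a 16-wide SIMD.
cycles_per_instruction = WAVE_WIDTH // SIMD_WIDTH

# So a single wave retires at most one instruction every 4 clocks.
peak_ipc = 1 / cycles_per_instruction

# With one wave on each SIMD, the CU completes this many lane operations per clock.
lane_ops_per_clock = SIMD_WIDTH * SIMDS_PER_CU
flops_per_clock = lane_ops_per_clock * FLOPS_PER_LANE_OP

# One wave per SIMD out of a maximum of ten.
occupancy_needed = 1 / MAX_WAVES_PER_SIMD

print(cycles_per_instruction, peak_ipc)     # 4 0.25
print(lane_ops_per_clock, flops_per_clock)  # 64 128
print(occupancy_needed)                     # 0.1
```

The last figure is the point of the post: 10% occupancy is already enough to keep the ALUs busy, and the remaining wave slots exist to cover memory latency.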
     
    T1beriu, Lightman and BRiT like this.
  19. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,122
    Likes Received:
    2,873
    Location:
    Well within 3d
    As Sebbi noted, it's 1/4 for vector utilization. There are specific cases where the instruction buffer can churn through at 1, but those skip the rest of the pipeline. I was thinking in terms of what it logically appears as to the software, but IPC is more of a statement about what the implementation is actually doing. I must need more caffeine if I'm lapsing on that concept.

    Not in the way the term IPC has been specifically used. It's effectively 1/4 per stream of execution through the pipeline. There are other figures for instruction throughput, but watering down the definition of IPC generally only happens when marketing needs to hide something.
     
    Malo likes this.
  20. leoneazzurro

    Regular

    Joined:
    Nov 3, 2005
    Messages:
    518
    Likes Received:
    25
    Location:
    Rome, Italy
    I think the term "IPC" is improperly used here. I think for most people the right word would be "efficiency", that is, (effective calculations per cycle)/(maximum theoretical instructions per cycle).
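
A minimal sketch of that ratio, assuming both figures are expressed in the same unit (for example flops per clock). The function name and the sample numbers are hypothetical and only illustrate the calculation.

```python
def efficiency(achieved_per_cycle: float, peak_per_cycle: float) -> float:
    """Fraction of the theoretical per-cycle peak that is actually achieved."""
    if peak_per_cycle <= 0:
        raise ValueError("peak_per_cycle must be positive")
    return achieved_per_cycle / peak_per_cycle

# Hypothetical example: a CU with a 128 flops/clock peak sustaining 96 flops/clock.
example = efficiency(96, 128)
print(example)  # 0.75
```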
     
    Cat Merc, ArkeoTP and Ryan Smith like this.