AMD Vega 10, Vega 11, Vega 12 and Vega 20 Rumors and Discussion

Discussion in 'Architecture and Products' started by ToTTenTranz, Sep 20, 2016.

  1. Ethatron

    Regular Subscriber

    Joined:
    Jan 24, 2010
    Messages:
    859
    Likes Received:
    262
    I made a diff of the screenshots in Photoshop, and I see no edge AA. Regardless, even if we found one clear, indisputable MSAA edge, I would say the result doesn't remotely deserve to have the feature called "MSAA on".
     
    Grall likes this.
  2. Malo

    Malo Yak Mechanicum
    Legend Veteran Subscriber

    Joined:
    Feb 9, 2002
    Messages:
    7,032
    Likes Received:
    3,104
    Location:
    Pennsylvania
    The next step would be to compare it on Nvidia, or even a 480.
     
    BRiT likes this.
  3. DavidGraham

    Veteran

    Joined:
    Dec 22, 2009
    Messages:
    2,788
    Likes Received:
    2,593
    I will do a comparison tomorrow on my 1070, after I get back home.
     
    pharma and BRiT like this.
  4. dirtyb1t

    Newcomer

    Joined:
    Aug 28, 2017
    Messages:
    31
    Likes Received:
    27
    My first post ^_^.
    Considering that Vega is a compute-first platform and is being judged as such, where is the full run of OpenCL, rendering, and deep learning benchmarks?
    I would also like to dig much deeper into what exactly HBCC is and what its capabilities are. Is anyone aware whether HBCC is exposed to compute APIs? What makes it different from pinned memory in system memory that gets DMA'd back and forth, which is available on Nvidia consumer cards (1050-1080)? Going through the various flavors of cards on Radeon's website (Pro, Instinct, and RX Vega), one feature gimped on the RX Vegas stands out: only the Pro/Instinct cards list RDMA functionality. It is not listed for the $1,000 Vega Frontier Edition or the RX Vega 56/64. So, for compute, what makes RX Vega (based on a compute-first architecture) a standout card vs. a 1070/1080/1080 Ti? I don't see it, but I would love to dive into the technical details if someone does.

    I feel like Vega as an architecture is being advertised on the strength of the top-of-the-line Pro cards' features, whereas those features don't translate across the product line.
    I mean, will RX Vega even have AMD DirectGMA (allowing two cards within the same chassis to communicate and coordinate)? Correct me if I'm wrong, but even consumer-line GTX cards have this functionality...
    Radeon is marketing Vega as a compute-first platform, but when I compare RX Vega (56/64) to Nvidia's GTX cards I keep finding fewer features, not feature parity.

    Hope this is a good first post and hope I can contribute something to this community in the times ahead.
     
    Rootax likes this.
  5. DavidGraham

    Veteran

    Joined:
    Dec 22, 2009
    Messages:
    2,788
    Likes Received:
    2,593
    Ok, I ran the game on my 1070, using the internal benchmark at 1080p and 1440p, DX11, max settings with temporal AA:

    Performance at 1080p:
    0x MSAA: 60 fps
    2x MSAA: 45 fps
    4x MSAA: 32 fps
    8x MSAA: 16 fps

    Performance at 1440p:
    0x MSAA: 42 fps
    2x MSAA: 30 fps
    4x MSAA: 20 fps
    8x MSAA: 9 fps

    Sample @1440p, 8x MSAA, no TAA:
    [image]


    Sample @1440p, 8x MSAA + TAA:
    [image]
     
    #4025 DavidGraham, Sep 4, 2017
    Last edited: Sep 4, 2017
    Kej, Lightman, T1beriu and 3 others like this.
  6. sebbbi

    Veteran

    Joined:
    Nov 14, 2007
    Messages:
    2,924
    Likes Received:
    5,288
    Location:
    Helsinki, Finland
    Also it's worth noting that depth-only fill rate can be different from MSAA sample rate. For example, Xbox 360 has 2x depth-only fill rate but 4x MSAA sample rate. Thus on Xbox 360 you can enable 4xMSAA with no loss of fill rate, but you don't get 4x depth-only fill rate for shadow map rendering. You can, however, render shadow maps at 2x2 lower resolution with 4xMSAA, but that gives you a jittered result (as MSAA samples aren't on an ordered grid). A similar trick can be used on modern consoles, and is especially handy with a custom MSAA sampling pattern (ordered grid). However, rendering at 2x2 lower res with 4xMSAA makes the pixel shading quads 2x2 larger, because each quad is constructed from the same sample of a 2x2 pixel neighborhood (not 2x2 samples of one pixel). Thus quads become 2x2 larger in screen space, which results in more quad overshading with small triangles. Also, the memory layout with MSAA differs from a 2x2 higher resolution image (this is both a negative and a positive, depending on your use case).
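    The sample-count arithmetic above can be made concrete with a quick sketch (my own illustration in Python, not from the post): a shadow map rendered at 2x2 lower resolution with 4xMSAA rasterizes exactly as many depth samples as the full-resolution map without MSAA; only the sample positions differ.

```python
def depth_samples(width, height, msaa):
    # Total depth samples rasterized for a depth-only render target.
    return width * height * msaa

# Hypothetical 2048x2048 shadow map vs. 1024x1024 with 4xMSAA.
full_res = depth_samples(2048, 2048, 1)
quarter_res_4x = depth_samples(1024, 1024, 4)

assert full_res == quarter_res_4x  # same sample count, jittered positions
```

    The quad-shading cost isn't captured here: at 2x2 lower resolution, each 2x2 pixel-shading quad covers a 4x4 footprint in full-resolution terms, which is where the extra overshading on small triangles comes from.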

    Vega has lots of changes to rasterization. If there's an MSAA performance bottleneck, it is hard to know the exact reason. MSAA increases render target bandwidth and footprint. It might be that the new tiled rasterizer isn't as efficient with MSAA (smaller tiles fit). It might be that the new ROP caches under L2 are a liability (high MSAA trashes the L2). It might be that the new DCC modes offer a lower MSAA compression rate (and this cost overwhelms the advantage of being able to sample compressed data without a decompress pass). Or it might simply be that the game tested performs lots of extra work when MSAA is enabled. MSAA doesn't only affect the geometry rendering pass anymore: with deferred rendering and a modern HDR post-processing pipeline, MSAA is much more complex to support. Efficient MSAA is the developer's responsibility now, and not all games get it right. MSAA is often added as a brute-force ultra PC setting for people who own top-tier GPUs. Do we have results from multiple games (both forward rendered and deferred), or is this result from a single game?
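    To put rough numbers on the bandwidth/footprint point, here is a back-of-the-envelope sketch (my own, in Python). It ignores color/depth compression such as DCC, which is exactly the variable in question, so treat it as an uncompressed upper bound:

```python
def rt_footprint_mib(width, height, samples, bytes_per_sample):
    # Uncompressed render-target footprint in MiB. Real GPUs shrink
    # this with lossless compression, so this is an upper bound.
    return width * height * samples * bytes_per_sample / (1024 * 1024)

# A single 1440p RGBA8 color target at increasing MSAA levels:
footprints = {s: rt_footprint_mib(2560, 1440, s, 4) for s in (1, 2, 4, 8)}
# ~14 MiB at 1x, ~28 MiB at 2x, ~56 MiB at 4x, ~112.5 MiB at 8x
```

    Multiply by the number of G-buffer targets in a deferred renderer and the footprint (and the bandwidth to fill and resolve it) grows quickly, which is why compression efficiency at high sample counts matters so much.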
     
    3dcgi, BRiT, Grall and 5 others like this.
  7. Arzachel

    Newcomer

    Joined:
    Jul 23, 2013
    Messages:
    27
    Likes Received:
    21
    GPU rumour/clickbait sites are as old as consumer GPUs. Instead of ranting about straw-millennials, maybe get a fidget spinner to calm your nerves :D
     
  8. sebbbi

    Veteran

    Joined:
    Nov 14, 2007
    Messages:
    2,924
    Likes Received:
    5,288
    Location:
    Helsinki, Finland
    Vega has some advantages, but none of them show up in current games or current software. It's hard to predict how much these new features will matter in the near future (while Vega is still relevant).

    For gaming, the biggest change in Vega is full DirectX feature level 12_1 support. Both Intel and Nvidia had already released two GPU generations with these features. These features allow implementing more efficient algorithms for many rendering problems. Now that AMD also supports them, we can expect more game devs to start adding support to their games. AMD has slightly higher DX12 feature support than Nvidia in some aspects: resource heap tier 2 (vs 1), conservative rasterization tier 3 (vs 2), and support for stencil ref export from the pixel shader. My prediction, however, is that none of these slight advantages will have an actual impact during Vega's lifetime. But the common 12_1 features (supported by all IHVs) will certainly be used in some games, and that's when Vega will prove highly superior to previous AMD cards. There are also some other new gaming-related features (such as primitive shaders) that are not yet exposed to developers. Whether these features will be used during Vega's lifetime (on PC) remains to be seen. Evolution of PC graphics APIs hasn't been that fast lately, and I don't expect big changes that benefit only one IHV. Consoles could of course be a good proving ground, but it will take some time until Vega-based GPUs reach consoles.

    For compute, Vega has double-rate fp16 support. This is handy for some compute problems, such as deep learning. Nvidia's consumer cards don't have this feature (it's limited to the professional P100 and V100, plus mobile). Vega also has a brand-new CPU<->GPU memory paging system. The GPU loads data on demand from DDR4 to HBM2 at fine granularity, when a page is accessed. This allows an 8 GB Vega to behave like a GPU with a much larger memory pool, assuming the memory accesses are sparse. This is a common assumption in games and also in much professional software. Usually games only access a tiny fraction of their total GPU memory per frame, and the accessed data changes slowly (smooth camera movement and smooth animation). This could be a big game changer if games start to require more than 8 GB of memory soon. If an 8 GB Vega behaves like a 16 GB card with this tech, it could greatly extend the lifetime of the card. The same is of course true for many compute problems. However, Nvidia Pascal also has a similar GPU virtual memory system for CUDA applications, though their tech doesn't work for games. CUDA has an additional page-hint API to improve the efficiency of the automated paging system: the developer can tell the system to preload data that is going to be accessed soon. Whether the automated paging system will be relevant during Vega's lifetime depends on the software and games you are running. The Xbox One X console with 12 GB of memory (and a 24 GB devkit) is coming out very soon. Maybe we will need more than 8 GB on PC soon. Let's get back to this topic then.
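    The double-rate fp16 point is easy to quantify. A peak-throughput sketch (my own, in Python; the 4096 shader ALUs and ~1.55 GHz boost clock are the commonly quoted Vega 64 figures, used here as assumptions):

```python
def peak_tflops(alus, clock_ghz, rate_multiplier):
    # Peak throughput: ALUs * 2 ops/clock (an FMA counts as a multiply
    # plus an add) * rate multiplier * clock, in TFLOPS.
    return alus * 2 * rate_multiplier * clock_ghz / 1000

vega64_fp32 = peak_tflops(4096, 1.546, 1)  # ~12.7 TFLOPS
vega64_fp16 = peak_tflops(4096, 1.546, 2)  # ~25.3 TFLOPS via packed math
```

    The doubling comes from packing two fp16 operands into each 32-bit lane, so it only helps workloads (like deep learning inference) that tolerate the reduced precision.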

    Currently the appeal of Vega is mostly forward-looking. There's lots of new tech. AMD was lagging behind the competition before; now they have a GPU with lots of new goodies, but no software uses them yet. It remains to be seen whether AMD's low PC market share is a roadblock to developers adopting these techniques, or whether AMD's presence in the consoles helps them reach high enough adoption for their new features.
     
    #4028 sebbbi, Sep 4, 2017
    Last edited: Sep 4, 2017
    Kej, Cat Merc, Aaron Elfassy and 12 others like this.
  9. dirtyb1t

    Newcomer

    Joined:
    Aug 28, 2017
    Messages:
    31
    Likes Received:
    27
    Wow sebbbi, thank you for the detailed reply!
    Really looking forward to probing this feature on the compute side once it becomes more accessible in software. This really is a big deal, so I hope Radeon does it justice and allows full raw access, and I hope the performance numbers match!

    So, is the big difference between Vega and Nvidia that Radeon allows this paging system to work in games? Otherwise, what is this technically? Regular pinned memory with paging/page management? Or is this more like the DirectGMA feature that was found exclusively on FirePro cards? I hope this becomes clearer on Sept 13/14 when the Pro SSG/Instinct cards launch. I'm really at a loss as to why these heavily marketed features haven't gotten a proper technical walkthrough from Radeon to make clear what this hardware is and what it is capable of.

    Yeah, I'm on the development side, so all I need is proper drivers/APIs/access. That said, as you note, a good amount of software/drivers/tools needs to be available before this can begin.
    Please provide any detail about HBCC that you can. I really want to understand what's behind the marketing hype ASAP!
     
  10. sebbbi

    Veteran

    Joined:
    Nov 14, 2007
    Messages:
    2,924
    Likes Received:
    5,288
    Location:
    Helsinki, Finland
    I am an ex-Ubisoft senior rendering lead, and I have been working with virtual texturing systems since the Xbox 360. This is one of the key areas of my interest; we almost beat id Software's Rage to being the first fully virtual-textured game. I am really interested in these new hardware virtual memory paging systems. It is mind blowing how quickly GPU data sets have grown recently. One additional texture mip level increases the memory footprint by 4x, but the highest mip level is only accessed by the GPU when the texture is very close to the screen, and usually only a small part of the texture is accessed at the highest level. Paging this data in on demand will be even more important in the future. GPU memory needs to get faster to keep up with the computational increases, but for the memory to be fast enough it needs to be close to the chip, which means the memory size can't scale up infinitely. To support larger data sets, you need a multi-tier memory system. A big chunk of DDR for storage plus a small, fast HBM pool is a perfect solution for games, as long as you can page data from DDR to HBM on demand at low latency. Intel has already done this with the MCDRAM-based cache on their Xeon Phi processors, and Nvidia has the P100 and V100 with hardware virtual memory paging. I am thrilled that this is happening in the consumer space so quickly. Nvidia's solution is CUDA-centric and geared towards professional usage; I don't know whether their hardware could support automated paging in OpenGL, DirectX and Vulkan, or whether it only supports the CUDA memory model (which doesn't use resource descriptors to abstract resources). AMD has a product in the consumer space right now that works with existing software, and that's really good news. This kind of tech seems to work in practice for gaming too.

    We still don't have good benchmarks to analyze how well AMD's HBCC works in games, mostly because there are still no games that need more than 8 GB of GPU memory. Maybe there will be some 4 GB Vega 11 models (AMD's lower-end Vega chip) released soon, allowing testing in more memory-constrained scenarios before games become demanding enough to overcommit the 8 GB Vega.
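    The "4x per mip level" point from the post above is worth working through, since it is the whole motivation for paging texture data on demand. A sketch (my own, in Python):

```python
def mip_chain_bytes(base_width, base_height, bytes_per_texel):
    # Footprint of a full mip chain; each level has 1/4 the texels
    # of the one above it, down to 1x1.
    total, w, h = 0, base_width, base_height
    while True:
        total += w * h * bytes_per_texel
        if w == 1 and h == 1:
            break
        w, h = max(1, w // 2), max(1, h // 2)
    return total

full = mip_chain_bytes(4096, 4096, 4)  # full chain, top mip resident
tail = mip_chain_bytes(2048, 2048, 4)  # same chain without the top mip
saved = 1 - tail / full                # ~0.75: the top level alone is ~3/4
```

    So keeping just the top mip of every texture paged out until it is actually sampled saves roughly three quarters of the texture footprint, which is exactly the sparse-access pattern HBCC is designed to exploit.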
     
  11. itsmydamnation

    Veteran Regular

    Joined:
    Apr 29, 2007
    Messages:
    1,298
    Likes Received:
    396
    Location:
    Australia
    I am not at home right now, but I tested MSAA in Heaven 4.0 with Vega, and 8x only sees about a 30% drop vs. no AA. It also doesn't seem to be hammered by tessellation like older AMD products were: forcing the tessellation factor to 2x makes only a very small difference vs. application preference.
     
    #4031 itsmydamnation, Sep 4, 2017
    Last edited: Sep 4, 2017
    Kej, digitalwanderer, Alexko and 3 others like this.
  12. Rootax

    Veteran Newcomer

    Joined:
    Jan 2, 2006
    Messages:
    1,179
    Likes Received:
    581
    Location:
    France
    I'm so f... weak. I told myself "My Fury X is good enough, even if some games I love are not very happy with only 4 GB"... Then I got a very good deal on an air-cooled Vega FE (approx. the same price as an RX 56, around 450 euros, brand new, never opened), so I took it and ordered an EK waterblock... New drivers for the FE are supposed to be out on September 13, with the ability to run RX drivers too (the presentation is not very clear on whether it's only for pro cards: https://pro.radeon.com/en-us/announ...tion-vega-based-radeon-professional-graphics/), so I guess I'm in... I'll have to drain my loop, but... Oh, and I hope the RX drivers will see the 16 GB of VRAM.
     
  13. Rasterizer

    Newcomer

    Joined:
    Aug 4, 2017
    Messages:
    29
    Likes Received:
    9
    Thanks for the comments, though they are still a fair bit beyond my level of understanding of this stuff. I understand that there can be lots of reasons MSAA performance could be bottlenecked, but I was in a sense asking the opposite question: IF a GPU has a rasterization-rate bottleneck, would such a bottleneck in turn be likely to tank MSAA performance?

    I would be very eager to see whatever results you can contribute for Vega in Unigine Heaven 4.0 at differing levels of MSAA and tessellation, whenever you have time. Here are the results I've already seen, which appear to correlate with Vega's underwhelming performance in known tessellation-heavy games like GTA V and Watch Dogs 2, as well as with Vega's MSAA issues:
    [image]
     
  14. Anarchist4000

    Veteran Regular

    Joined:
    May 8, 2004
    Messages:
    1,439
    Likes Received:
    359
    There isn't much to it from a programming perspective. Think of it like programming for L3 on a CPU: it should happen transparently, just a matter of system configuration and perhaps some cache priming.

    Pinned memory with improved granularity and page management. Being able to intelligently evict pages is significant. It works in games because it's transparent, and it is more efficient depending on how it ties in with the MMUs for moving data. It should be able to request a page from the memory controller without involving a process; the controller would see it as another CPU accessing data, so less latency and overhead.

    DirectGMA was a direct transfer between devices. With linked adapters and some configuration it should still be there. How it interacts with HBCC will depend on the implementation, as HBCC-managed VRAM probably shouldn't be mapped directly. Resources should be backed by system memory/SSG/SAN, but with VRAM partitioned so part of it may be mapped. Think 2 GB of VRAM (framebuffer, shared, etc.) with a 6 GB HBCC victim cache.

    More documentation would be helpful, but I think AMD is still tweaking the implementation, judging by the Linux commits. At the very least there is added overhead with smaller page sizes, as huge pages yielded significant (~10%) improvements on Linux.
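    The page-size overhead is straightforward to illustrate (my own sketch in Python; the 6 GiB figure echoes the hypothetical HBCC victim-cache split above):

```python
def pages_needed(working_set_bytes, page_bytes):
    # Pages required to map a working set: each page is a separate
    # entry to track and a separate potential fault to service.
    return -(-working_set_bytes // page_bytes)  # ceiling division

working_set = 6 * 1024**3  # hypothetical 6 GiB of HBCC-paged data
small_pages = pages_needed(working_set, 4 * 1024)    # 4 KiB pages
huge_pages = pages_needed(working_set, 2 * 1024**2)  # 2 MiB huge pages
ratio = small_pages // huge_pages  # 512x fewer pages to manage
```

    Fewer pages means fewer page-table entries, fewer TLB misses, and fewer fault round-trips, which is a plausible source of the ~10% huge-page gains; the flip side is coarser eviction granularity, so more data moves per fault.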
     
  15. ToTTenTranz

    Legend Veteran Subscriber

    Joined:
    Jul 7, 2008
    Messages:
    9,996
    Likes Received:
    4,570
    And here I thought I had made a good deal on my RX V64...
     
    CarstenS likes this.
  16. DavidGraham

    Veteran

    Joined:
    Dec 22, 2009
    Messages:
    2,788
    Likes Received:
    2,593
    Vega still shows early signs of CPU limitation:

    http://www.eurogamer.net/articles/digitalfoundry-2017-what-does-it-take-to-run-destiny-2-at-1080p60
     
  17. gamervivek

    Regular Newcomer

    Joined:
    Sep 13, 2008
    Messages:
    715
    Likes Received:
    220
    Location:
    india
    #4037 gamervivek, Sep 4, 2017
    Last edited: Sep 4, 2017
  18. Malo

    Malo Yak Mechanicum
    Legend Veteran Subscriber

    Joined:
    Feb 9, 2002
    Messages:
    7,032
    Likes Received:
    3,104
    Location:
    Pennsylvania
    Well, I doubt AMD have done any serious work on their DX11 driver just for Vega. I don't see any reason why this would have changed.
     
  19. gamervivek

    Regular Newcomer

    Joined:
    Sep 13, 2008
    Messages:
    715
    Likes Received:
    220
    Location:
    india
    Vega does do better than you'd expect under a CPU bottleneck; TPU has the Fury X inching closer to the Vega cards as the resolution increases, rather than the other way round.
     
  20. roybotnik

    Newcomer

    Joined:
    Jul 12, 2017
    Messages:
    18
    Likes Received:
    14
    That's a steal! Very nice :). Long-term, it shouldn't be any different from having an RX card, just with the option of using pro drivers.

    Those new pro drivers can't come soon enough. I bought my FE at launch, and after 2 months the only options other than the launch-day drivers are the buggy beta drivers, which are missing the 'gaming mode' toggle... which is crucial, since WattMan is pretty much a necessity at the moment.
     
  • About Us

    Beyond3D has been around for over a decade and prides itself on being the best place on the web for in-depth, technically-driven discussion and analysis of 3D graphics hardware. If you love pixels and transistors, you've come to the right place!

    Beyond3D is proudly published by GPU Tools Ltd.