AMD Vega 10, Vega 11, Vega 12 and Vega 20 Rumors and Discussion

Discussion in 'Architecture and Products' started by ToTTenTranz, Sep 20, 2016.

  1. Gipsel

    Veteran

    Joined:
    Jan 4, 2010
    Messages:
    1,620
    Likes Received:
    264
    Location:
    Hamburg, Germany
    And the 8 polygons/clock come from where, when the slide says 11? I'm not aware that anybody from AMD mentioned 8 prims/clock. Every tech site also just came up with their own ideas (if they didn't just show the slides). AMD actually provided only very little information about what is really going on. And I'm pretty sure quite a few didn't get the provided information really straight, even after talking with guys like Mike Mantor.
    As said before, when Mantor says that "with the right knowledge you can discard game based primitives at an incredible rate", he isn't talking about the "traditional" geometry pipeline. He is talking about the software culling approach with a primitive shader, which should be able to be even faster, given the size of the shader array (it should actually be able to be in the same order of magnitude as the discard speed for fragments/pixels [potentially several tens per clock as peak]). Therefore, these statements don't give a hint of how fast Vega will be with existing software (i.e. without culling in a primitive shader).
    No it doesn't. Just consider the ~20-25% higher clockspeed of the RX480 when comparing it with the R9-290 or FuryX and it doesn't show an effect at all in this test.
    This whitepaper was cited here already. It really only makes a case for a pretty specific scenario; it didn't claim to increase the geometry throughput in a more fundamental way, as the Vega slides are doing.
    They are only involved if someone uses the new possibilities with the primitive shader. Otherwise, it shouldn't play a role. The mentioned limit of 11 prims/clock should not apply for a shader based solution (at least not as peak, but if you have a huge amount of unique geometry with some vertex attributes, you run into memory bandwidth constraints ;)).
    Equal performance isn't 2x the performance (especially considering the clock speed advantage of the RX480). :lol:
     
    Lightman likes this.
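The bandwidth remark in the post above can be put into rough numbers. This is only a back-of-the-envelope sketch: the clock speed, the bytes per vertex and the number of unique vertices per triangle are all assumptions picked for illustration, not figures from AMD or from the thread.

```python
def geometry_bandwidth_gb_s(prims_per_clock, clock_ghz, verts_per_prim, bytes_per_vertex):
    """Memory traffic in GB/s needed to fetch unique vertex data at the given primitive rate."""
    prims_per_second = prims_per_clock * clock_ghz * 1e9
    return prims_per_second * verts_per_prim * bytes_per_vertex / 1e9

# Assumed inputs (hypothetical): 1.5 GHz clock, one unique vertex per triangle,
# 32 bytes of position plus attributes per vertex.
needed = geometry_bandwidth_gb_s(prims_per_clock=11, clock_ghz=1.5,
                                 verts_per_prim=1.0, bytes_per_vertex=32)
print(f"{needed:.0f} GB/s of vertex traffic at 11 prims/clock")
```

Under these assumptions the vertex fetch alone would need hundreds of GB/s, which is why a peak prims/clock figure is hard to sustain with large amounts of unique geometry.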
  2. CSI PC

    Veteran Newcomer

    Joined:
    Sep 2, 2015
    Messages:
    2,050
    Likes Received:
    844
    You are deciding to say Ryan at PCPer is being deliberately misleading and exaggerating regarding his section on the polygons/clock and the Primitive Shader.
    I did the quote to point out Ryan was told directly; have you seen any other article using the wording as part of being informed in the preview "with the right knowledge"?
    Anyway, other articles mention up to x2.75 (giving 11 polygons/clock) and some also mention that it requires coding/API support, so the 2x figure fits the theory of it being wrapped by the driver, and I cannot see how Ryan would make that up.
    Earlier he was quoted specifically regarding the Primitive Shader and wrapped in the driver performance, this was by Razor in response to my post where I mentioned reading about it having requirements in a few places.
    I know of journalists that had direct conversations with Raja/Mike/Scott pertaining to CES/Preview.
     
    #822 CSI PC, Jan 16, 2017
    Last edited: Jan 16, 2017
    Razor1 likes this.
  3. pTmdfx

    Newcomer

    Joined:
    May 27, 2014
    Messages:
    249
    Likes Received:
    129
    One can already cull triangles in shaders before they hit the primitive assembly "to achieve incredible rate", and Gipsel has been making this point since the very beginning. Frostbite even had a presentation that was all about culling at GDC 2016. This does not clash with the fact that AMD claims their four geometry engines can handle up to 11 primitives per clock, which is hardware fixed-function.

    Nothing is wrong with the words of PCPer or TechReport — neither of them claimed what is being inferred.
     
    ToTTenTranz and Gipsel like this.
  4. CSI PC

    Veteran Newcomer

    Joined:
    Sep 2, 2015
    Messages:
    2,050
    Likes Received:
    844
    I know, but this was a continuation of what Ryan at PCPer said earlier and was quoted on, specifically Razor's response to one of my posts.
    I did not expect those to be forgotten already, anyway I edited my post to provide more clarity.
    To quote again the aspect on the Primitive Shader by Ryan @ PCPer.
    The 1st part I quoted just recently was to show that he was informed directly "with the right knowledge".
     
    #824 CSI PC, Jan 16, 2017
    Last edited: Jan 16, 2017
  5. pTmdfx

    Newcomer

    Joined:
    May 27, 2014
    Messages:
    249
    Likes Received:
    129
    That section contradicts what AMD claimed at the very beginning: up to 2x versus over 2x. Moreover, "with the right knowledge" in that context was clearly referring to the use of the primitive shader, not to Ryan being told "with the right knowledge".

    Edit: The "directness", on the other hand, is something you inferred. "told it to me" is a very generic context. I could very well sit in the venue's corner, and in this context AMD was still "telling me stuff".

    Punctuation matters.
     
    #825 pTmdfx, Jan 16, 2017
    Last edited: Jan 16, 2017
    ToTTenTranz, Silent_Buddha and Gipsel like this.
  6. CSI PC

    Veteran Newcomer

    Joined:
    Sep 2, 2015
    Messages:
    2,050
    Likes Received:
    844
    Just on the point of the Primitive Shader: it is what controls the performance of up to x2.75 with regard to the 11 polygons/clock, but importantly, various sources mention that it needs API coding or an SDK to achieve that.
    That is how you get over 2x in the slides, while up to 2x is when it is wrapped by the driver (going by Ryan's article, from my POV). The performance gains are all theory anyway, and it will be interesting to see real results from the wrapped solution in a game.

    And fine, even better: if you accept that "with the right knowledge" refers to the primitive shader, you also need to consider the "as AMD told it to me" that goes with that sentence. His point has validation, and he is the only one to mention the capability of being wrapped and the 2x figure.
     
    #826 CSI PC, Jan 16, 2017
    Last edited: Jan 17, 2017
  7. Razor1

    Veteran

    Joined:
    Jul 24, 2004
    Messages:
    4,232
    Likes Received:
    749
    Location:
    NY, NY
    Let's put an end to this. It's obvious that only with primitive shaders can AMD attain triangle culling at the levels they are talking about, period. There is no question about it; two different AMD employees have stated this to two or more websites, so why is anyone questioning that? Ask AMD why they are saying what they are saying. We don't know how many geometry engines Vega has, nor do we know anything else about it, but looking at Doom's performance, is anyone underwhelmed? I sure am. Shouldn't Doom Vulkan be a game where AMD cards have 15% more performance than equal nV cards? So going by that......
     
    pharma likes this.
  8. DavidGraham

    Veteran

    Joined:
    Dec 22, 2009
    Messages:
    2,778
    Likes Received:
    2,565
    So you are saying Polaris has no architectural advantages over Fiji?

    Who said Polaris has 2X the general performance of Fiji? We are discussing Geometry performance here, Polaris is delivering these numbers because of the mentioned Geometry enhancements.

    That remains one theory though, one that is backed up by logic and nothing else, which doesn't make it any more plausible than the other theory (which is also logical and is backed by AMD's own hints). Until we have solid, concrete info, I find both theories valid at this point.
     
    Razor1 and Malo like this.
  9. firstminion

    Newcomer

    Joined:
    Aug 7, 2013
    Messages:
    217
    Likes Received:
    46
    It's not obvious, the data is sparse and second-hand.
     
    ToTTenTranz and Gipsel like this.
  10. CSI PC

    Veteran Newcomer

    Joined:
    Sep 2, 2015
    Messages:
    2,050
    Likes Received:
    844
    Some other info.
    Regarding one article that had clarification from Scott Wasson:
    Some may not know this (more so 2nd half of the 1st sentence) so thought it worth posting.
    Cheers
     
    sebbbi and Razor1 like this.
  11. Razor1

    Veteran

    Joined:
    Jul 24, 2004
    Messages:
    4,232
    Likes Received:
    749
    Location:
    NY, NY
    #831 Razor1, Jan 17, 2017
    Last edited: Jan 17, 2017
  12. Lightman

    Veteran Subscriber

    Joined:
    Jun 9, 2008
    Messages:
    1,804
    Likes Received:
    475
    Location:
    Torquay, UK
    It isn't to me, as there is more to a GPU than performance in one or even a few games. Besides, Vulkan will not improve AMD's performance as much in 4K Ultra as it would at lower resolutions.
    One more point to make is that AMD themselves said in one of their interviews that they are not putting any effort into optimizing the OpenGL driver for Doom, as they have Vulkan. You can then turn the tables and say that Vulkan for AMD = OpenGL for nVidia.

    We simply need more data points and can't assume that what AMD is showcasing is representative of final performance. We all know how 'easy' it is for one GPU manufacturer to steal launch thunder by just updating the BIOS before release and tweaking it slightly one way or another. It would be kind of suicidal to show final performance a few months before release, no matter if good or bad. For all we know, this could be a full VEGA chip or a severely castrated one.

    Same really goes for Zen or any other pre-production sample.
     
    pharma and Razor1 like this.
  13. Razor1

    Veteran

    Joined:
    Jul 24, 2004
    Messages:
    4,232
    Likes Received:
    749
    Location:
    NY, NY

    Well, I agree about more data points, we definitely need that, but at this point I think it's pretty safe to assume AMD is only going to show us the best they can. We saw that with Polaris, with performance and perf/watt for both P10 and P11 (6 months and 8 months prior to launch), saw that with Fiji, and with all other prior launches. Unless they are gracing us with a new view of their marketing lol, I have doubts.
     
    Lightman and pharma like this.
  14. Putas

    Regular Newcomer

    Joined:
    Nov 7, 2004
    Messages:
    392
    Likes Received:
    59
    Surely you know we were in the context of geometry performance. But why not go there: when it comes to numbers, it only has an advantage in bandwidth utilization; by any other measure it could be merely a die shrink.
     
  15. seahawk

    Regular

    Joined:
    May 18, 2004
    Messages:
    511
    Likes Received:
    141
    The only thing we know is that peak performance is 2.75X up. I guess it is safe to say that this will require primitive shaders. How much Vega will do in other scenarios is unclear.
     
    #835 seahawk, Jan 17, 2017
    Last edited: Jan 17, 2017
    CarstenS likes this.
  16. CarstenS

    Veteran Subscriber

    Joined:
    May 31, 2002
    Messages:
    4,798
    Likes Received:
    2,056
    Location:
    Germany
    pharma likes this.
  17. CarstenS

    Veteran Subscriber

    Joined:
    May 31, 2002
    Messages:
    4,798
    Likes Received:
    2,056
    Location:
    Germany
    Sorry for double-posting, but it just occurred to me that AMD actually did NOT say Vega was designed with four geometry engines, only that in Vega, four geometry engines could handle 11 polygons. Vega's physical implementation could have more than those 4, if they're trying to understate a characteristic for once. A little far-fetched, I know, but still possible.

    The text of the slide says:
    New Programmable Geometry Pipeline
    Over 2X peak throughput per clock


    The respective footnote (full):
    Geometry throughput slide: Data based on AMD Engineering design of Vega. Radeon R9 Fury X has 4 geometry engines and a peak of 4 polygons per clock. Vega is designed to handle up to 11 polygons per clock with 4 geometry engines. This represents an increase of 2.6x. VG-3
     
    #837 CarstenS, Jan 17, 2017
    Last edited: Jan 17, 2017
    firstminion, Razor1 and Lightman like this.
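The figures in the quoted footnote can be checked with a few lines of arithmetic. This is only a sketch of that check; the two peak rates are taken from the footnote as quoted above, and the variable names are made up for illustration.

```python
# Peak polygons per clock, as given in the slide footnote quoted in the thread.
FURY_X_PEAK = 4
VEGA_PEAK = 11

def speedup(new_rate, old_rate):
    """Return the throughput ratio new/old."""
    return new_rate / old_rate

ratio = speedup(VEGA_PEAK, FURY_X_PEAK)
# Polygons per clock that a 2.6x increase over Fury X would correspond to.
implied_rate = FURY_X_PEAK * 2.6

print(f"11 / 4 = {ratio:.2f}x")
print(f"4 * 2.6 = {implied_rate:.1f} polygons per clock")
```

The ratio of the two quoted rates is 2.75, which matches the x2.75 figure used elsewhere in the thread, while the footnote itself states 2.6x.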
  18. CSI PC

    Veteran Newcomer

    Joined:
    Sep 2, 2015
    Messages:
    2,050
    Likes Received:
    844
    Yep which is why I am wondering if we will see that change with Vega 20 with regards to the number of geometry engines, and a way to increase the CUs/ALUs/etc or just for better load-balance as it will also have 1/2 FP64.
    Maybe it will be a technical production milestone in Vega 20 towards Navi *shrug*.
    Cheers
     
  19. seahawk

    Regular

    Joined:
    May 18, 2004
    Messages:
    511
    Likes Received:
    141
    Could mean 4 for Vega 11 and maybe 8 for Vega 10. But could also mean 4 for Vega 10 and 2 for Vega 11.
     
  20. Gipsel

    Veteran

    Joined:
    Jan 4, 2010
    Messages:
    1,620
    Likes Received:
    264
    Location:
    Hamburg, Germany
    Nobody said anything about deliberately misleading anyone. But in my experience it is pretty common (in all fields, i.e. I'm talking in general terms now) that when someone tries to convey some bits of information without explaining the whole context in detail (which one often tends to do, because it is clear to oneself), the audience will try to place the information in some context, but not necessarily the correct one. ;)
    As said before, a fixed limit of 11 primitives/clock doesn't make much sense for a shader based solution, which (when fed "with the right knowledge" from a developer) could be discarding "primitives at an incredible rate", which would be even higher than the stated 11 primitives/clock. pTmdfx already linked the Frostbite presentation from the last GDC. There they state with a smiley that they are sure the devs present in the room can come up with code to discard triangles significantly faster with the shader array on the XB1 than the fixed function hardware can handle them (indicating that it is a low barrier). And Vega 10 has more than 5 times as many shader resources as the XB1. As I said: a shader based solution can shoot for discarding tens of triangles per clock as peak, not just 11. And the concrete number of course depends on the specific implementation, so giving such a concrete number doesn't make much sense in the first place.
    And reading up a bit on how they are doing this in the Frostbite engine, they basically use a compute shader which processes and culls the geometry. The processed geometry data is then fed into some passthrough shader in the graphics pipeline. What Mike Mantor mentioned was that the primitive shader now gets more flexibility in what data it can access and process, much like a compute shader (i.e. it is not limited to the functionality of the traditional shader stages). And it can be bound to the graphics pipeline as a replacement for the passthrough shader stages that just read the data created by the compute shader from memory, which potentially increases the efficiency of the whole process. That would be a proper context for the stuff Mike Mantor said.

    Actually, in my opinion the second one doesn't make sense, as it is backed up neither by logic nor by AMD's slides. It is pretty clear that shader based culling can potentially handle more than 11 triangles per clock as a peak number. And strictly speaking, giving a concrete number for shader based culling is somewhat pointless anyway, as it depends on details of the implementation (the shader and by extension the whole engine) and not on the hardware.
    Edit:
    On the other hand, the information on the slides most likely pertains to the fixed function pipeline (it doesn't make much sense otherwise). And there AMD explicitly stated 11 triangles per clock with 4 geometry engines (see CarstenS' quote above). A shader based solution would be basically independent of the number of geometry engines. ;)
     
    #840 Gipsel, Jan 17, 2017
    Last edited: Jan 17, 2017
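To make the idea of culling triangles in a shader more concrete, here is a minimal per-triangle sketch. It is not AMD's primitive shader and not the Frostbite code. It only shows the kind of test such a culling pass could run for each triangle, assuming 2D screen-space vertices, a y-up coordinate system and counter-clockwise front faces. The function names are hypothetical.

```python
def signed_area_2x(a, b, c):
    """Twice the signed screen-space area of triangle (a, b, c); positive if counter-clockwise."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])

def cull(triangles):
    """Keep only front-facing (counter-clockwise), non-degenerate triangles."""
    return [t for t in triangles if signed_area_2x(*t) > 0]

tris = [
    ((0, 0), (1, 0), (0, 1)),  # counter-clockwise: kept
    ((0, 0), (0, 1), (1, 0)),  # clockwise (back-facing): discarded
    ((0, 0), (1, 1), (2, 2)),  # zero area (degenerate): discarded
]
survivors = cull(tris)
print(len(survivors), "of", len(tris), "triangles survive")
```

On a GPU this test would run for many triangles in parallel across the shader array, which is why the achievable discard rate depends on the shader code and the engine rather than on a fixed number of geometry engines.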
