Remember I quoted Ryan at PCPer, who stated it can be wrapped as part of the driver package and that this would in theory provide a performance boost; this came as part of the section where it seems he was given additional information by AMD.
No.
It is a (conservative) estimate of the potential additional benefit of using a primitive shader compared to the traditional pipeline. He explicitly said so. It is very likely that this is a comparison of Vega with a primitive shader vs. Vega without one. Later in the interview, he says that Vega also offers a geometry throughput uplift over previous generations without using a primitive shader (while refusing to quantify it or provide specifics).
Edit:
The better part of the video for your point would actually be the section starting at the 36 minute mark, where he is confronted directly with the footnote mentioning the 11 triangles per clock.
He admitted to having been unaware of that footnote and struggled a bit, saying he thinks it does not apply to a specific product and is an example of what Vega could do in a configuration with 4 geometry engines (reinforcing a bit the point CarstenS was making [that AMD didn't exactly say that Vega10 has 4 geometry engines; it could be a different number], if it was not just hedging on his side). He then came back to one of the "talking points" of the Vega reveal, the primitive shaders (giving some credence to the idea that this number really pertains to that), but basically said "it's difficult" to explain how one arrives at the number of 11 triangles per clock (allegedly realistic and possibly taking into account multiple constraints like memory bandwidth [I mentioned that before]).
So maybe he was not fully briefed about what exactly is on the slides and was not willing to reveal any specifics. Or someone at AMD pulled some shaky number out of the air with the help of some half-baked rules of thumb and put it on that slide. As explained, neither that number nor putting it on that slide makes much sense in that case, as it would be somewhat fundamentally flawed.
I appreciate that you do not put much weight on what he said, due to a different opinion on its context, but he was absolutely correct when he also stated that, to be fully utilised, it must be accessed through API coding/SDK, and this is backed up in the interview Razor linked, as Scott mentions it there.
Listening to Scott, the 11 polygons figure is a theoretical maximum when/if using 4 Geometry Engines, because he goes on to say "realistically"; it sort of reminds me of Async Compute and the difference between theoretical maximum and real world.
He did say "I think that is an example and not a specific product detail about the number of Geometry Engines in a particular chip". That could be construed to mean either that there are multiple designs (one being the traditional 4 Geometry Engines), with this information to come out later, or that all the various chips have more than 4 Geometry Engines.
I think we will see multiple types: a standard version still with 4 Geometry Engines, as the 1st one is meant/rumoured to have 4096 cores and the same CU count (known from released info on FP32/clock). If you can increase the Geometry Engines (and associated functions), then it makes sense to also increase the CU count and all that entails, as the limitation has been removed.
Cheers