AMD Vega 10, Vega 11, Vega 12 and Vega 20 Rumors and Discussion

no-X · Jan 15, 2017

Razor1 said:
I can show you videos of AMD employees saying these things

Please, show us.

Razor1 · Jan 15, 2017

no-X said:
Please, show us.

Sorry, it wasn't a video, it was the slides: one slide stated over 2x the geometry throughput per clock.

Guess what, what did they say about polaris?

2x right?

I highly doubt devs are going to get more than 2x out of it without doing a good deal of work.

Same storyline, different game, man; that is all it is.

Just like the high-bandwidth cache: it tries to give a reason why 8 GB is enough versus 12 GB, although 8 GB is more than enough right now anyway.

Smoke and mirrors.

I am almost certain that primitive shaders are going to be used elsewhere (consoles), where they will give a lot more benefit than on PCs. That was the purpose they were introduced for, not this, because it sure looks like a good deal of work for programmers, which they never needed to worry about on the other IHVs' hardware.

firstminion · Jan 15, 2017

Don't say something you can't back up.

Razor1 · Jan 15, 2017

firstminion said:
Don't say something you can't back up.

Looking at the end notes they are comparing it to Fiji.

Yes I can back up what I'm saying.

Is Techreport incorrect as well?

http://techreport.com/review/31224/the-curtain-comes-up-on-amd-vega-architecture/2

To accommodate developers' increasing appetite for migrating geometry work to compute shaders, AMD is introducing a more programmable geometry pipeline stage in Vega that will run a new type of shader it calls a primitive shader. According to AMD corporate fellow Mike Mantor, primitive shaders will have "the same access that a compute shader would have to coordinate how you bring work into the shader." Mantor also says that primitive shaders will give developers access to all the data they need to effectively process geometry, as well.

This is AMD stating primitive shaders are needed to get the best culling performance of tris.

Infinisearch · Jan 15, 2017

Razor1 said:
This is AMD stating primitive shaders are needed to get the best culling performance of tris.

IIRC enabling a geometry shader, even if it is null, reduces GPU performance by quite a bit... I think that quote is stating that this is the problem they fixed with primitive shaders, not necessarily culling.

Razor1 · Jan 15, 2017

Infinisearch said:
IIRC enabling a geometry shader, even if it is null, reduces GPU performance by quite a bit... I think that quote is stating that this is the problem they fixed with primitive shaders, not necessarily culling.

Same article just above.

"the same access that a compute shader would have to coordinate how you bring work into the shader." Mantor also says that primitive shaders will give developers access to all the data they need to effectively process geometry, as well.

AMD thinks this sort of access will ultimately allow primitives to be discarded at a very high rate. Interestingly, Mantor expects that programmable pipeline stages like this one will ultimately replace fixed-function hardware on the graphics card. For now, the primitive shader is the next step in that direction.
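
For readers wondering what "discarding primitives" in a programmable stage amounts to, here is a minimal CPU-side sketch in Python. It is not AMD's primitive shader interface (nothing about that API is described in the article); the function names, the counter-clockwise front-face convention and the screen size are all assumptions made for illustration. It only shows the kind of per-triangle test (back-facing, zero-area, fully off-screen) that such a stage could run before handing work to the rasterizer.

```python
from typing import List, Tuple

Vec2 = Tuple[float, float]
Triangle = Tuple[Vec2, Vec2, Vec2]


def signed_area2(tri: Triangle) -> float:
    """Twice the signed area of a 2D triangle (sign encodes winding order)."""
    (x0, y0), (x1, y1), (x2, y2) = tri
    return (x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0)


def off_screen(tri: Triangle, width: float, height: float) -> bool:
    """True if the triangle's bounding box lies entirely outside the viewport."""
    xs = [p[0] for p in tri]
    ys = [p[1] for p in tri]
    return max(xs) < 0 or min(xs) > width or max(ys) < 0 or min(ys) > height


def cull(tris: List[Triangle], width: float, height: float) -> List[Triangle]:
    """Keep triangles with positive signed area (assumed front-facing here)
    that are at least partly inside the viewport; discard the rest."""
    return [t for t in tris
            if signed_area2(t) > 0 and not off_screen(t, width, height)]


# Hypothetical input: one visible triangle and three that should be discarded.
tris: List[Triangle] = [
    ((0, 0), (10, 0), (0, 10)),            # positive area, on screen: kept
    ((0, 0), (0, 10), (10, 0)),            # opposite winding: culled
    ((0, 0), (5, 5), (10, 10)),            # zero area (degenerate): culled
    ((-30, -30), (-10, -30), (-30, -10)),  # entirely off screen: culled
]
visible = cull(tris, 1920, 1080)
print(f"{len(visible)} of {len(tris)} triangles survive culling")
```

Each test touches only the three vertices of a single triangle, so the tests are independent of one another; this independence is what makes the work suitable for a wide shader array.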

Gipsel · Jan 15, 2017

Razor1 said:
nV has no such limit, its culling is limited by the rops its got. They have fixed function units to do all of this type of work.

I'm pretty sure triangle culling is not done or limited by the ROPs. And as I was hinting, who says AMD isn't implementing geometry and pixel pipelines which can mimic some features of nV's solution? Vega definitely brings some changes in that area.

Razor1 said:
What I think is happening is AMD is using its shader array through primitive shaders and when the pressure gets too high on the shader array for doing such work, triangle culling count is decreased thus why the hard limit of 11 tris per clock culling.

Right, you think. I think your description doesn't make too much sense. If you do stuff in a shader, you are basically free to do it the way you want (and potentially also exceed the cull rates the fixed-function hardware is capable of).

Razor1 said:
The calculations for what is culled have to be done in the shader array,

Not necessarily.

Razor1 said:
AMD doesn't seem to have fixed function units to do this work.

You know that from what?

Razor1 said:
We know Polaris and prior don't have it, and Vega, they are saying primitive shaders have to be used,

Nobody said that. Primitive shaders can be used to offer added flexibility. That's what I got from the slides and statements.

Razor1 said:
Too late to add something in when they saw nV's advantages,

When was Fermi released? If my memory serves me, it must be almost 7 years ago, isn't it?

Razor1 said:
AMD couldn't do the way nV is doing it because I think fundamentally will change the GCN architecture too much

The GCN core architecture isn't really affected by that.

Razor1 said:
Looking at the end notes they are comparing it to Fiji.

Yes I can back up what I'm saying.

Is Techreport incorrect as well?

http://techreport.com/review/31224/the-curtain-comes-up-on-amd-vega-architecture/2

This is AMD stating primitive shaders are needed to get the best culling performance of tris.

That's not what is written by techreport. You must be reading something else. And the slide doesn't even mention "primitive shaders".

Razor1 · Jan 15, 2017

Gipsel said:
I'm pretty sure triangle culling is not done or limited by the ROPs. And as I was hinting, who says AMD isn't implementing geometry and pixel pipelines which can mimic some features of nV's solution? Vega definitely brings some changes in that area.
Right, you think. I think your description doesn't make too much sense. If you do stuff in a shader, you are basically free to do it the way you want (and potentially also exceed the cull rates the fixed-function hardware is capable of).
Not necessarily.
You know that from what?
Nobody said that. Primitive shaders can be used to offer added flexibility. That's what I got from the slides and statements.
When was Fermi released? If my memory serves me, it must be almost 7 years ago, isn't it?
The GCN core architecture isn't really affected by that.
That's not what is written by techreport. You must be reading something else. And the slide doesn't even mention "primitive shaders".

Techreport got their info from AMD; it was an AMD rep that told them what they stated.
I have posted links to Techreport's article, and I have quoted the AMD statements in it.

Mantor is an AMD rep, think he is a technical manager?

So why would someone like that tell Techreport, or anyone for that matter, if it wasn't the case?

Gipsel · Jan 15, 2017

Razor1 said:
Techreport got their info from AMD; it was an AMD rep that told them what they stated.
I have posted links to Techreport's article, and I have quoted the AMD statements in it.

You are (mis)interpreting the stuff said there. They didn't say the things you claim they said.

Edit:
And your coloring of the Techreport quote shows me that you didn't get the meaning of what was said:
"Mantor also says that primitive shaders will give developers access to all the data they need to effectively process geometry, as well." (with your division between colors).
This sentence doesn't mean that the shader has access to all data and that they need this shader to process geometry effectively. The phrase "all the data they need" belongs together: the data access is needed so that the shader can work more efficiently than without it (i.e. than in the current pipeline with multiple shader stages).

Razor1 · Jan 15, 2017

Gipsel said:
You are (mis)interpreting the stuff said there. They didn't say the things you claim they said.

what

"the same access that a compute shader would have to coordinate how you bring work into the shader." Mantor also says that primitive shaders will give developers access to all the data they need to effectively process geometry, as well.

AMD thinks this sort of access will ultimately allow primitives to be discarded at a very high rate. Interestingly, Mantor expects that programmable pipeline stages like this one will ultimately replace fixed-function hardware on the graphics card. For now, the primitive shader is the next step in that direction.

They didn't say they need the primitive shaders to help with culling performance? What is that in red?

What is in blue?

They are saying it, people have been reporting as such too.

Why else do they have slides saying greater than 2x geometry throughput over Fiji? 2x is what it will have, and that is the same as Polaris to begin with; the rest of the performance will have to come from the devs. 2.6 is the max, that is it.

Gipsel · Jan 15, 2017

Razor1 said:
They didn't say they need the primitive shaders to help with culling performance? What is that in red?

Indeed, they didn't say that there. They give a general outlook on how graphics processing will develop: gradually less and less fixed function and more programmable stuff. It's just a continuation of the trend of the last decades and was also stated by others on numerous occasions.
Btw., do you remember Intel's talks about a software rasterizer using the power of the large shader array to allow higher performance than a limited-size block of fixed-function hardware? It's going in roughly the same direction: the added flexibility of using programmable hardware for the purpose at hand may allow a higher-performance solution than sticking to sets of FF hardware, which basically put hard limits on the amount of work done and also on the way it is done. So they envision a future where a programmable geometry pipeline can offer an even higher throughput than what the traditional way of doing it allows (without saying how high this throughput is).

Razor1 said:
What is in blue?

As I said in the edit above, it allows more flexible use of shaders for geometry stuff. It doesn't say that one has to use it (for instance, to reach 11 triangles per clock or something; a shader can potentially discard triangles at an even higher rate). There is a trend of moving large parts of the rendering to compute shaders. Endorsing this and enabling it to interact better with the remaining fixed-function stuff appears only logical.

Razor1 said:
They are saying it, people have been reporting as such too.

Again, nobody said what you are claiming.

Razor1 said:
Why else do they have slides saying greater than 2x geometry throughput over Fiji,

Because 11/4>2?
Or because some internal benchmark shows this?

Razor1 said:
2x is what it will have, and that is the same as Polaris to begin with; the rest of the performance will have to come from the devs. 2.6 is the max, that is it.

You are distorting the facts here. Polaris has significant advantages in some limited circumstances. AMD never claimed 2x across the board. AMD's claims for Vega are stronger. And the obviously reworked (fixed-function!) geometry and pixel pipelines give some credibility to that. The added flexibility of the primitive shaders comes on top of that.
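
The "11/4 > 2" arithmetic above can be spelled out as a small Python sketch. The two figures are the peak per-clock rates quoted in this thread (4 primitives per clock for a four-engine chip such as Fiji, "up to 11" for Vega); they are slide numbers rather than measurements, and the constant and function names are made up for illustration.

```python
from fractions import Fraction

# Peak rates as quoted in the thread (assumed values, not benchmark results).
FIJI_PRIMS_PER_CLOCK = 4    # 4 geometry engines x 1 primitive per clock each
VEGA_PRIMS_PER_CLOCK = 11   # "up to 11" primitives per clock from the Vega slide


def peak_ratio(new_rate: int, old_rate: int) -> Fraction:
    """Return new_rate / old_rate as an exact fraction."""
    return Fraction(new_rate, old_rate)


ratio = peak_ratio(VEGA_PRIMS_PER_CLOCK, FIJI_PRIMS_PER_CLOCK)
print(f"Vega/Fiji peak ratio: {float(ratio):.2f}x")
print("consistent with an 'over 2x' claim:", ratio > 2)
```

This only shows that the quoted peak figures are compatible with an "over 2x" slide; it says nothing about which part of the pipeline delivers the rate or what real workloads achieve.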

Infinisearch · Jan 15, 2017

Razor1 said:
Same article just above.

Fair enough, but I still doubt culling improvements are exclusive to primitive shaders. It would be a mistake to depend on something that needs explicit software support... especially something that needs to be exposed to developers through proprietary extensions.

Razor1 · Jan 15, 2017

Gipsel said:
Indeed, they didn't say that there. They give a general outlook on how graphics processing will develop: gradually less and less fixed function and more programmable stuff. It's just a continuation of the trend of the last decades and was also stated by others on numerous occasions.
Btw., do you remember Intel's talks about a software rasterizer using the power of the large shader array to allow higher performance than a limited-size block of fixed-function hardware? It's going in roughly the same direction: the added flexibility of using programmable hardware for the purpose at hand may allow a higher-performance solution than sticking to sets of FF hardware, which basically put hard limits on the amount of work done and also on the way it is done. So they envision a future where a programmable geometry pipeline can offer an even higher throughput than what the traditional way of doing it allows (without saying how high this throughput is).
As I said in the edit above, it allows more flexible use of shaders for geometry stuff. It doesn't say that one has to use it (for instance, to reach 11 triangles per clock or something; a shader can potentially discard triangles at an even higher rate). There is a trend of moving large parts of the rendering to compute shaders. Endorsing this and enabling it to interact better with the remaining fixed-function stuff appears only logical.
Again, nobody said what you are claiming.
Because 11/4>2?
Or because some internal benchmark shows this?
You are distorting the facts here. Polaris has significant advantages in some limited circumstances. AMD never claimed 2x across the board. AMD's claims for Vega are stronger. And the obviously reworked (fixed-function!) geometry and pixel pipelines give some credibility to that. The added flexibility of the primitive shaders comes on top of that.

I suggest you look at Polaris's white paper, they stated something like 3.x times more

They were talking about batman and tessellation.

Edit::

Here ya go

http://radeon.wpengine.netdna-cdn.c...is-Architecture-Whitepaper-Final-08042016.pdf

Pages 7-10

Razor1 · Jan 15, 2017

Infinisearch said:
Fair enough, but I still doubt culling improvements are exclusive to primitive shaders. It would be a mistake to depend on something that needs explicit software support... especially something that needs to be exposed to developers through proprietary extensions.

Oh yeah, I expect it to have the 2x improvement; the rest is going to come from developers, which still puts it way behind what's already out there though.

Gipsel · Jan 15, 2017

Razor1 said:
I suggest you look at Polaris's white paper, they stated something like 3.x times more

They were talking about batman and tessellation.

As I said: significant advantages in some limited scenarios.

firstminion · Jan 15, 2017

Razor1 said:
Looking at the end notes they are comparing it to Fiji.

Yes I can back up what I'm saying.

Is Techreport incorrect as well?

http://techreport.com/review/31224/the-curtain-comes-up-on-amd-vega-architecture/2

This is AMD stating primitive shaders are needed to get the best culling performance of tris.

What I got from that slide is: 4 GE can now handle up to 11 poly and then you could use actual shaders (on CU...) to go over that. From your other posts, Mantor hopes to remove the GE altogether in the future.

Razor1 · Jan 16, 2017

firstminion said:
What I got from that slide is: 4 GE can now handle up to 11 poly and then you could use actual shaders (on CU...) to go over that. From your other posts, Mantor hopes to remove the GE altogether in the future.

Nah, it's 2x over and then extra, so 4 tris per clock for Fiji
goes to 8 tris, and then to get to 11, primitive shaders are needed. That's why in that same slide deck they say over 2x the throughput. And the footnotes say over Fiji.

It's pretty much the same increase Polaris has.

Keep this in mind though: as a chip gets wider, it should be able to show bigger improvements, so in absolute terms Vega should show more than Polaris did when it comes to actual application performance.

Gipsel · Jan 16, 2017

Razor1 said:
Nah, it's 2x over and then extra, so 4 tris per clock for Fiji
goes to 8 tris, and then to get to 11, primitive shaders are needed. That's why in that same slide deck they say over 2x the throughput. And the footnotes say over Fiji.

It's pretty much the same increase Polaris has.

Please show me a benchmark where Polaris has 2x the throughput of Fiji, Hawaii or even Tonga (the smallest chip with 4 engines). You will have to go to some special scenarios to achieve that. AMD never claimed >2x geometry throughput for Polaris in such a general way as they are doing with Vega (with Polaris it's basically limited to very small triangles [20 triangles per pixel or something in that range] and tessellation, and even then it never exceeds 1 prim/clock per engine, i.e. 4 per clock on Polaris 10). So I would be very careful with the claim that Vega will basically offer the same as Polaris in that regard.
And the claim that primitive shaders are needed for the stated 11 prims/clock is something you just made up. It is nowhere to be found in the slides or statements from AMD.

CSI PC · Jan 16, 2017

Razor1 said:
Techreport got their info from AMD; it was an AMD rep that told them what they stated.
I have posted links to Techreport's article, and I have quoted the AMD statements in it.

Mantor is an AMD rep, think he is a technical manager?

So why would someone like that tell Techreport, or anyone for that matter, if it wasn't the case?

Also, to add: Scott Wasson was giving clarifications to various journalists after the preview, and I'm pretty sure he clarified the primitive shader to some of them.
Cheers

Razor1 · Jan 16, 2017

Gipsel said:
Please show me a benchmark where Polaris has 2x the throughput of Fiji, Hawaii or even Tonga (the smallest chip with 4 engines). You will have to go to some special scenarios to achieve that. AMD never claimed >2x geometry throughput for Polaris in such a general way as they are doing with Vega (with Polaris it's basically limited to very small triangles [20 triangles per pixel or something in that range] and tessellation, and even then it never exceeds 1 prim/clock per engine, i.e. 4 per clock on Polaris 10). So I would be very careful with the claim that Vega will basically offer the same as Polaris in that regard.
And the claim that primitive shaders are needed for the stated 11 prims/clock is something you just made up. It is nowhere to be found in the slides or statements from AMD.

In synthetic tests, Polaris gets 2x the geometry throughput of previous-gen cards, if I'm not mistaken.