AMD Vega 10, Vega 11, Vega 12 and Vega 20 Rumors and Discussion

You also need a product on the shelves to sell. AMD's DX12 lead was effectively meaningless because they couldn't capitalize on it before Nvidia caught up with better support. AMD's feature set can be great on paper, but if Nvidia brute-forces things at a quicker rate, then I don't see how Nvidia isn't still ahead at that point.

Vega's superior (DX12) feature set doesn't help them. Maxwell and Pascal support everything, too. Pushing, for example, Tiled Resources Tier 3 will put pre-Vega GCN at a disadvantage...

To me, Vega looks like Nvidia's G80: the most advanced architecture at the DX feature level, but only available at the high end.
 
Why would Vega be late when they skipped releasing a high end Polaris?

So it's all conjecture at this point.
Shouldn't have any effect. The driver work will be for the architecture, and Polaris exists. Different product tiers are comparatively simple. As an example, I'm suggesting they cut the board development costs of a 490 in favor of Threadripper.

Thank you - even though I'd love to see and follow that informed discussion myself, I hope you will relay important updates to us normal people here. :)
Nothing like that, just not quite meeting my standards to share. Sent you a link though, but Gipsel's link should cover it. Bit light on hard evidence atm.

If Vega is on-time and its drivers are almost ready but just need another month of development, then why did AMD release Vega today instead of just waiting a month to do it properly and without the PR nightmare?
There are uses beyond graphics AMD likely feels are important.
 
I'm not sure I can follow his assertion here. He obviously comes from a Vulkan background, where maxPerStageDescriptorUniformBuffers is limited to 12 on Nvidia hardware, as far as I could google quickly. In DirectX, however, Wikipedia and Microsoft both cite a limit of 14 - which is in place for Tier 2 already, itself a prerequisite for DX12 FL12_0. So either Nvidia is already emulating that (and seemingly doing OK with respect to performance), or the limit of 12 is some kind of artificial driver restriction for Vulkan.

I remain not totally convinced.

Nothing like that, just not quite meeting my standards to share. Sent you a link though, but Gipsel's link should cover it. Bit light on hard evidence atm.
Thanks for the PM link! :)
 
What non-graphical uses both don't require decently working drivers and are important enough to justify the PR nightmare and the tainting of the public's perception of Vega?
All the deep learning and HPC stuff on Linux with the ROCm stack that would also involve Epyc, Instinct, SSG, etc. The really high margin stuff that may also sell their CPUs that AMD has been chasing.

While they could have disabled graphics, there may still have been a demand from game devs to start testing with what exists. Some devs may have access to different drivers or capabilities under NDA.
 
If Vega is on-time and its drivers are almost ready but just need another month of development, then why did AMD release Vega today instead of just waiting a month to do it properly and without the PR nightmare?
Cuz that's not how AMD's PR works :LOL:
 
Vega's superior (DX12) feature set doesn't help them. Maxwell and Pascal support everything, too. Pushing, for example, Tiled Resources Tier 3 will put pre-Vega GCN at a disadvantage...
Supporting a feature is one thing; actually gaining performance and programming benefits from it is another. The obvious example is the increasing use of asynchronous tasks, which GCN excels at, Maxwell actually loses performance on, and Pascal sees only small gains from. Having the driver return flags for a feature while emulating it through brute force will just mean developers won't use it.
 
All the deep learning and HPC stuff on Linux with the ROCm stack that would also involve Epyc, Instinct, SSG, etc. The really high margin stuff that may also sell their CPUs that AMD has been chasing.
Currently those areas belong to Nvidia.
It makes no sense to damage the card's gaming reputation (its bread and butter) because of areas where AMD isn't yet present anyway, and all because they really didn't want to simply wait another month for decent drivers. That's a flawed proposition.
 
Currently those areas belong to Nvidia.
It makes no sense to damage the card's gaming reputation (its bread and butter) because of areas where AMD isn't yet present anyway, and all because they really didn't want to simply wait another month for decent drivers. That's a flawed proposition.

Time isn't on their side when it comes to breaking into those markets.
 
Albeit it has to be said that the "shade once" part (i.e., HSR) cannot work with the UAV counter used in this code (as that's certainly a side effect which would change the rendered result). But that should work independently of the actual binning part, I'd hope...
Haven't looked at the test code, but I believe the key things that need to be considered are the following:
1. Does the test have depth test enabled?
2. If yes, does it have [earlydepthstencil] set? (don't know the OpenGL equivalent)
3. If yes, does it render from front to back or back to front?

1. If depth test is not enabled, the UAV write must be executed for every pixel (atomic or not, it doesn't matter). In this case, the hardware can't do any HSR and it can't skip any pixels because of ordering (later draw covering existing pixels doesn't allow skipping anything). It could still do tiling (locality optimization), but DirectX guarantees that the ROP writes will be done in order, meaning that the latest drawn triangle will remain (but all must be drawn because of the UAV write). The end result is that you want to maintain the draw order, but you don't need to maintain it globally (per tile is enough). UAV doesn't change that. ROV doesn't change it either (it only provides local ordering).

2. Conceptually, the DirectX pipeline has the depth test at the end. The GPU is allowed to execute the pixel shader for every pixel and then discard results when the ROPs combine the pixels together. Some old GPUs did exactly that. The depth test, however, allows the GPU to skip some pixel shader invocations, but a UAV write complicates things. The programmer can force the depth test before pixel shader invocation by using the [earlydepthstencil] attribute. This allows the GPU to skip pixels that fail the depth test before the pixel shader runs, and makes it possible to skip UAV writes based on the depth test (including atomic counters). However, draw order still matters...

3. Even if [earlydepthstencil] is enabled, the GPU might still do UAV writes (including atomic counter adds) because of draw order. The early depth test culls nothing if geometry is drawn back to front. Public documentation about [earlydepthstencil] is non-existent. I don't remember exactly whether it has the same strict ordering rules as the late depth test and blending (must be done in triangle submit order), or whether it can bend the rules in some cases. The problem is that some depth test modes (greater, less) are order dependent. The result would be non-deterministic when triangles overlap exactly if submit order isn't respected (= flickering). IIRC AMD has an OpenGL extension to disable this ordering guarantee (you get a slight perf boost).
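The three cases above can be sketched with a toy per-pixel simulation. This is plain Python, not real hardware or driver behavior; the LESS depth comparison, the depth values, and the function name are assumptions for illustration only:

```python
# Hypothetical sketch: count how many UAV atomic adds a single pixel sees
# under the scenarios described above, assuming a DirectX-style LESS depth
# comparison and a depth buffer cleared to infinity.

def shade_pixel(draw_depths, depth_test=True, early=True):
    """Return the number of UAV atomic adds executed for one pixel.

    draw_depths: per-triangle depth values at this pixel, in submit order.
    depth_test:  whether depth testing is enabled at all.
    early:       whether [earlydepthstencil] forces the test before the PS.
    """
    stored = float("inf")  # depth buffer value for this pixel
    uav_writes = 0
    for z in draw_depths:
        passes = z < stored  # LESS comparison
        if not depth_test:
            # Case 1: no depth test -> the pixel shader (and its UAV write)
            # runs for every covering triangle, in submit order.
            uav_writes += 1
        elif early:
            # Cases 2/3: the early depth test may skip the shader invocation
            # entirely, so the UAV write is skipped on a failed test.
            if passes:
                uav_writes += 1
                stored = z
        else:
            # Late depth test: the shader (and its UAV write) always runs;
            # only the ROP output is discarded on a failed test.
            uav_writes += 1
            if passes:
                stored = z
    return uav_writes

front_to_back = [0.1, 0.5, 0.9]  # nearest triangle submitted first
back_to_front = [0.9, 0.5, 0.1]  # nearest triangle submitted last

print(shade_pixel(front_to_back, depth_test=False))              # 3
print(shade_pixel(front_to_back, depth_test=True, early=False))  # 3
print(shade_pixel(front_to_back, depth_test=True, early=True))   # 1
print(shade_pixel(back_to_front, depth_test=True, early=True))   # 3
```

Note how even with [earlydepthstencil], back-to-front submission culls nothing (every triangle passes the LESS test in turn), which is exactly why draw order, not just the attribute, decides whether the atomic counter writes can be skipped.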

Before drawing conclusions, we need to understand whether HSR is allowed for this test application. Tiling itself shouldn't be a problem, as UAV order is not guaranteed (and even ROVs only guarantee order per pixel == only local tiled order matters, global ordering not required).
 
Workstation performance is as bad as gaming performance. Getting beaten by a GTX 1080 with Quadro drivers doesn't look better...
The link you posted shows 10 graphs comparing Vega FE and P5000 results. Vega FE is faster than the P5000 in half of them, so both cards are very comparable. Try comparing price/performance.
 
Before drawing conclusions, we need to understand whether HSR is allowed for this test application. Tiling itself shouldn't be a problem, as UAV order is not guaranteed (and even ROVs only guarantee order per pixel == only local tiled order matters, global ordering not required).
Exactly - that was more or less the general conclusion: tiling/binning should work in that test (if it doesn't, something is being too conservative and disabling it because of the UAV), while HSR should be disabled.
 