AMD: Speculation, Rumors, and Discussion (Archive)

The lack of perspective here is remarkable.

If we had been complaining about tessellation or excess geometry being underutilized back in 2011, I wonder what the argument would have been.
The main difference being that nvidia hard-wires sub-pixel triangles into HairWorks and other GameWorks effects, which serves only to cripple performance on AMD hardware and on their own Kepler/Fermi cards, so we're not even talking about IQ-enhancing features anymore.

Now we have async compute, which brings unprecedented performance boosts to the overwhelming majority of the console+PC market, has been acclaimed as such by pretty much all high-profile developers (except for Epic, because Tim and Jen are bros4life?), and is bringing major performance boosts to four-year-old cards, but it's somehow a bad thing.


Yet you bring up an article that is well known to be incorrect in its assumptions about tessellation in Crysis?
I would rather have better mesh quality right now, by the way, and no, that isn't through tessellation but through actual mesh detail. I've never liked having one texel of a texture stretched over multiple pixels of the monitor; it just makes normal maps and lighting look ugly.

I never stated it was bad, it's just not the be-all and end-all of performance! How hard is that to grasp? You think Pascal can't do it, you think nV is hiding something. It's the same two guys here in this thread talking about async and its glories, and then saying it can't be done on Pascal in this https://forum.beyond3d.com/threads/no-dx12-software-is-suitable-for-benchmarking-spawn.58013/page-8 thread... Wow, come on, can you be any more transparent? And yet we are 100% sure Pascal can do it without the performance penalty Maxwell had. MDolenc's test has shown this, the same test we used back when this whole async thing was blown up into ridiculous marketing endeavors.
 
I hope the images aren't blocked the way the site is.

[Photos: AMD press shots, 'waiting' shots, and partner RX 480 cards from Sapphire, ASUS, MSI, XFX, PowerColor, HIS, and Gigabyte.]
 
So Vulkan/DX12/Metal are now "Mantle in disguise, wreaking havoc on non-AMD hardware". Gotcha... can't wait to see those goalposts move once Nvidia gets its shit together in the near future and the disguised Mantle suddenly becomes cool. It's like being on NeoGAF in here lately.
Yes, it's a pretty straightforward logical equation. When high-level optimizations beat low-level optimizations on a given piece of hardware, it means something is wrong with the low-level optimizations, or they're just a facade. Do you have another explanation for this? Please enlighten us.

And yes, "causing havoc" meaning degrading performance without increasing image quality in the slightest. DX10 was famous for this; it ended up despised and ignored, and was quickly replaced by DX10.1 and then DX11.
 
Love the ASUS info sheet in front of their RX 480 with the wrong values against the fields for 'Base Clock' and 'Boost Clock', and also the use of the term CUDA cores :)

EDIT: What's a semi-passive fan?
What? You mean I won't get a card with baseclocks @ PCI Express 3.0 and boostclocks @ OpenGL 4.5?
That's it, I'm cancelling my Strix RX 480 preorder! :runaway:
 
[image: 04_amd_msi_02.jpg]


What? You mean I won't get a card with baseclocks @ PCI Express 3.0 and boostclocks @ OpenGL 4.5?
That's it, I'm cancelling my Strix RX 480 preorder! :runaway:

If we trust Asus, you'll also be getting a new G-Sync-capable GPU soon... Looks like they decided to economise on the marketing material and use the same sheet for all brands and GPUs.


It's a bit cheap of Asus.
[image: 02_amd_asus_02.jpg]
 
Yes, it's a pretty straightforward logical equation. When high-level optimizations beat low-level optimizations on a given piece of hardware, it means something is wrong with the low-level optimizations, or they're just a facade. Do you have another explanation for this? Please enlighten us.

And yes, "causing havoc" meaning degrading performance without increasing image quality in the slightest. DX10 was famous for this; it ended up despised and ignored, and was quickly replaced by DX10.1 and then DX11.
It's quite simple really: AMD aimed at Mantle & co. from the dawn of GCN, and hoped it would be good enough for DX11 & co.
NVIDIA meanwhile did everything they could to get DX11 & co. perfect, and in the process forgot to think forward about what Mantle & co. could bring to the table.
Changing to (partially) software scheduling brought great power savings for NVIDIA, but it also lost some flexibility compared to Fermi & GCN, to my limited understanding, and there are probably other elements stacking on top of that, too.
 
It's quite simple really: AMD aimed at Mantle & co. from the dawn of GCN, and hoped it would be good enough for DX11 & co.
NVIDIA meanwhile did everything they could to get DX11 & co. perfect, and in the process forgot to think forward about what Mantle & co. could bring to the table.
That's all good and beautiful in a fairy-tale story, where good deeds are rewarded and bad ones are punished. It doesn't work like that in the tech world. There is no thinking forward when low-level access yields worse performance than high-level access. Low-level access means adapting to the hardware at a deeper level, circumventing weaknesses and exploiting strengths; it should yield better fps. High-level access, on the other hand, shouldn't do that, and thus should be the bearer of worse fps. We have the situation reversed here. If an alien looked at this, he would think DX11 and OGL are the low-level APIs.
Changing to (partially) software scheduling brought great power savings for NVIDIA, but it also lost some flexibility compared to Fermi & GCN, to my limited understanding, and there are probably other elements stacking on top of that, too.
Losing flexibility means a higher reliance on hand-written low-level code, not the other way around. NVIDIA having better performance with the abstracted, automated code means they have more flexibility than you think.
 
So Vulkan/DX12/Metal are now "Mantle in disguise, wreaking havoc on non-AMD hardware". Gotcha... can't wait to see those goalposts move once Nvidia gets its shit together in the near future and the disguised Mantle suddenly becomes cool. It's like being on NeoGAF in here lately.

Yes, the goalpost movers in this forum have been hard at work since the RX 480 came out, and it has gotten significantly worse since all these async compute implementations started showing up. It's like they're feeling threatened somehow.
Just the fact that a thread was created with the sole intent of badmouthing something AMD claimed over a year ago about driver optimizations for VRAM savings, when there was already a thread dedicated to that on the second page of this forum, tells you just as much.



Yes, it's a pretty straightforward logical equation. When high-level optimizations beat low-level optimizations on a given piece of hardware, it means something is wrong with the low-level optimizations, or they're just a facade. Do you have another explanation for this?
Your IHV has been a lot more successful at convincing developers to use proprietary high-level tools that intentionally cripple performance on the competitor's hardware (and on their own previous-generation hardware) than it could ever be by simply sticking to standards.
That's all the explanation needed.
 
Bringing the conversation back to the AMD 480.
It's great that it gets such a performance boost with Vulkan in Doom without noticeably increasing power consumption; that has a nice effect on the perf/watt calculations (the caveat being that the context is limited to one API).
It would be nice to see figures in comparison with previous models such as the 380/390/390X/Fury, but so far I have only read one analysis looking at power draw under OpenGL versus Vulkan, and that covered just the 480.
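Just to spell out the arithmetic behind that perf/watt point, here's a throwaway sketch. The numbers in it are placeholders purely for illustration, not figures from any actual review:

```python
# Placeholder numbers purely for illustration; not measured data.
def perf_per_watt(avg_fps, avg_watts):
    return avg_fps / avg_watts

opengl_ppw = perf_per_watt(avg_fps=100.0, avg_watts=160.0)  # hypothetical OpenGL run
vulkan_ppw = perf_per_watt(avg_fps=125.0, avg_watts=165.0)  # hypothetical Vulkan run

print(f"OpenGL: {opengl_ppw:.2f} fps/W")
print(f"Vulkan: {vulkan_ppw:.2f} fps/W ({vulkan_ppw / opengl_ppw - 1:.0%} better)")
```

The point being that if the fps gain is large and the power delta is small, perf/watt rises almost in step with the fps gain.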

Cheers
 
Last edited:
Changing to (partially) software scheduling brought great power savings for NVIDIA, but it also lost some flexibility compared to Fermi & GCN, to my limited understanding, and there are probably other elements stacking on top of that, too.
What does scheduling instructions within an SM have to do with scheduling draw/dispatch commands (which I presume is what you mean by aiming in the async compute direction)? We have been over this: just because both contain the word "schedule" doesn't mean the two are connected in any way.
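To make that distinction concrete, here is a toy sketch of the two unrelated "schedulers" being conflated (my own illustration, not how any real driver or GPU front end is built): the command processor picks which queued draw/dispatch gets launched next, while the per-SM warp scheduler picks which resident warp issues its next instruction.

```python
from collections import deque

# Toy illustration only: not real hardware or driver behaviour.

class CommandProcessor:
    """Front end: picks which queued draw/dispatch gets launched next."""
    def __init__(self):
        self.graphics = deque()   # graphics queue (draws)
        self.compute = deque()    # compute queue (dispatches)

    def next_command(self):
        # Naive round-robin between queues; async compute lives at this level.
        for queue in (self.graphics, self.compute):
            if queue:
                return queue.popleft()
        return None

class WarpScheduler:
    """Inside one SM: picks which resident warp issues its next instruction."""
    def __init__(self, warps):
        self.warps = warps        # list of (warp_id, is_ready) pairs

    def issue(self):
        for warp_id, is_ready in self.warps:
            if is_ready:
                return warp_id    # issue one instruction from this warp
        return None               # everything is stalled this cycle

# The two loops never consult each other: moving dependence checks out of a
# hardware scoreboard and into the instruction stream changes WarpScheduler,
# while async compute is entirely a CommandProcessor-level concern.
```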
 
Your IHV has been a lot more successful at convincing developers to use proprietary high-level tools that intentionally cripple performance on the competitor's hardware (and on their own previous-generation hardware) than it could ever be by simply sticking to standards.
That's all the explanation needed.
If you were capable of having an adult technical discussion instead of being an oversensitive bully who skirts around facts, you'd know that proprietary effects are just that: effects that increase visual quality while also costing performance on both vendors' hardware, and they never claimed otherwise. They never claimed to be standards that improve performance while delivering that improvement selectively. Or, worse yet, that cause fps degradation without any increase in visual quality and then claim to be doing so with close-to-the-metal access!
 
Are the X-TFLOP AMD cards performing closer to the X-TFLOP NV cards in the Vulkan/Doom comparison?

e.g. is a 6 TF AMD GPU performing the way you'd expect a 6 TF NV GPU to?

Depends on which Nvidia architecture you're comparing to.
Performance scales very close to linearly with compute capability across pretty much all GCN cards (except for multi-GPU solutions, which are still not supported in the game for either vendor).
As for Nvidia: Pascal is well above in perf/FLOP, Maxwell 2 is above but within ~10%, and Kepler is a complete crapfest where a GTX 780 Ti dips below a cut-down Pitcairn.

[benchmark chart: CAcEAiK.jpg]
 
http://videocardz.com/62250/amd-vega10-and-vega11-gpus-spotted-in-opencl-driver

More sightings of Vega

The SI, CI, VI and GFX stand for GPU generations. The latest, yet unreleased architecture is GFX9, which includes Greenland, Raven1X, Vega10 and Vega 11. For quite some time Greenland was rumored to be just another codename for Vega10, but since it’s listed separately, we should assume that Greenland is something else, probably an integrated graphics chip.
 
Depends on which Nvidia architecture you're comparing to.
Performance scales very close to linearly with compute capability across pretty much all GCN cards (except for multi-GPU solutions, which are still not supported in the game for either vendor).
As for Nvidia: Pascal is well above in perf/FLOP, Maxwell 2 is above but within ~10%, and Kepler is a complete crapfest where a GTX 780 Ti dips below a cut-down Pitcairn.

[benchmark chart: CAcEAiK.jpg]
The benchmark above was done with async off (SMAA was enabled).
http://gamegpu.com/action-/-fps-/-tps/doom-api-vulkan-test-gpu
[screenshot: DOOMx64_2016_07_12_23_27_48_126.jpg]
 
It's quite simple really: AMD aimed at Mantle & co. from the dawn of GCN, and hoped it would be good enough for DX11 & co.
NVIDIA meanwhile did everything they could to get DX11 & co. perfect, and in the process forgot to think forward about what Mantle & co. could bring to the table.
The process for putting together an industry standard API can be a protracted one, with differing logjams and compromises that generally do not get commented upon by the stakeholders after the fact.

One possible interpretation of the Mantle situation is that whatever ongoing efforts into the successor APIs there were had come to an impasse, and Mantle served as a way to break the impasse by putting an actual lower-level API into the market and drawing in developers.

Nvidia may have seen that a lower-level API would be useful in the future, however it was also coming from a different place where it also had some notable positions of advantage with what was already in place. The trend was that Nvidia was leading in driver resources and devrel, so it could get more of those benefits without ripping out its investments.
It's also possible that what Nvidia wanted, if it wanted a lower-level API at all, differed more from what the other major stakeholders wanted. Mantle, or something resembling it, would not have been the only way of going about things.

At least some of the performance increases with Vulkan versus OpenGL for AMD look to be examples where AMD was notably underperforming with similarly positioned Nvidia GPUs, so Vulkan's benefit is at least in part that it's bypassing a decent chunk of the traditional AMD driver performance tax, or at least inflicting some of the software immaturity and weak optimization penalties on competing silicon that had gained an insurmountable lead on the old APIs.
AMD's situation is such that even if Nvidia does eventually rectify these problems (there's enough money, talent, and inertia to figure something out), it could still be a win if the cost of keeping up on the thick APIs was becoming too high for the weaker competitor. At least if AMD becomes second-best at Vulkan and the like, the reduced load on AMD might make that affordable.

Changing to (partially) software scheduling brought great power savings for NVIDIA, but it also lost some flexibility compared to Fermi & GCN, to my limited understanding, and there are probably other elements stacking on top of that, too.
The primary scheduling change that comes to mind for Nvidia is the shift of dependence checking for ALU instructions when going from Fermi to Kepler, taking out an extra layer of hardware monitoring and encoding the information in the instruction stream.
That's below the level of the APIs, and not relevant for asynchronous compute or special operations exposed with intrinsics.
Poorly optimized instruction streams for this can affect the number of stall cycles due to dependences, leading to less effective use of ALU resources.
However, the general case should not present an insurmountable challenge, and there is little evidence that Nvidia is suffering in terms of getting performance per hardware FLOP. It's not like Fermi is even showing up for this particular fight, and Kepler to Maxwell to Pascal shows that they have stayed with this without showing a plateau effect due to this particular architectural feature.

It's not like AMD's instruction scheduling is actually flexible beyond constraining its execution loop so that dependences must resolve prior to the next issue cycle. The impact that has on anything else that needs to fit into that execution loop, or on what changes can be made to it, might be evidenced by what just happened to the RX 480's power ceiling. Hardware interlocks or decent static dependence checking are not the biggest problems out there.
AMD's choice is also not a panacea, as this forum is rife with examples of and commentary on how poor AMD's instruction generation is, particularly in the PC space, and on its non-presence in a lot of the compute space. If Nvidia is having problems with just its GPUs needing hints about which instructions depend on others within a handful of cycles, other pitfalls apparently make up for it on the other side.
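To make the "encoding it in the instruction stream" idea concrete, here is a toy model of compiler-inserted stall counts (my own sketch of the general idea, not NVIDIA's actual control-word format): the compiler already knows each operation's latency, so it can work out how long issue must wait before the next dependent instruction, and the hardware simply counts down instead of scoreboarding registers.

```python
# Toy model of static dependence handling: the compiler precomputes stall
# counts so the hardware never has to check register dependences itself.
# Illustration of the general idea only, not any real ISA encoding.

def schedule_with_stalls(instructions, latency):
    """instructions: list of (dest_reg, src_regs); latency: cycles per op."""
    ready_at = {}          # register -> cycle when its value becomes available
    cycle = 0
    schedule = []
    for dest, srcs in instructions:
        # Stall until every source register has been produced.
        issue_cycle = max([cycle] + [ready_at.get(r, 0) for r in srcs])
        stall = issue_cycle - cycle
        schedule.append((issue_cycle, stall, dest, srcs))
        ready_at[dest] = issue_cycle + latency
        cycle = issue_cycle + 1
    return schedule

# r2 = f(r0, r1); r3 = f(r2); r4 = f(r0)  -- the second op must wait on r2
for issue, stall, dest, srcs in schedule_with_stalls(
        [("r2", ["r0", "r1"]), ("r3", ["r2"]), ("r4", ["r0"])], latency=4):
    print(f"cycle {issue}: {dest} <- f({', '.join(srcs)}) (stalled {stall})")
```

A poor static schedule simply shows up as larger stall counts, i.e. idle ALU cycles, which is the "less effective use of ALU resources" mentioned above.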
 
Are the X-TFLOP AMD cards performing closer to the X-TFLOP NV cards in the Vulkan/Doom comparison?

e.g. is a 6 TF AMD GPU performing the way you'd expect a 6 TF NV GPU to?
If the numbers here are to be believed https://www.computerbase.de/2016-07...md-nvidia/#diagramm-doom-mit-vulkan-2560-1440 then the 980 Ti, which has 1.1x the TFLOPs of a 390, is about 1.13 times faster at 1080p and 1.1 times faster at 1440p. However, the Fury X is only 1.43 times a 390 and 1.3 times a 980 Ti, rather than 1.68x and 1.52x, respectively.
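Spelled out, the normalisation in that comparison looks like this (the ratios are the ones quoted above from the ComputerBase 1080p numbers; the rest is just arithmetic):

```python
# (relative FPS, relative TFLOPs), both normalised to the R9 390, using the
# ratios quoted above for the ComputerBase Doom/Vulkan results at 1080p.
cards = {
    "R9 390":     (1.00, 1.00),
    "GTX 980 Ti": (1.13, 1.10),
    "Fury X":     (1.43, 1.68),
}

for name, (rel_fps, rel_tflops) in cards.items():
    ratio = rel_fps / rel_tflops
    print(f"{name}: {ratio:.2f}x the 390's performance per TFLOP")
```

Which works out to roughly 1.03x for the 980 Ti and 0.85x for the Fury X, i.e. the Fury X is the one falling well short of its paper FLOPs in this title.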
 