AMD: Speculation, Rumors, and Discussion (Archive)

So we now officially live in an era when games will receive significant changes even months after release (multiple DX12 updates on Rise of the Tomb Raider, Vulkan update on Doom) and completely distinct rendering paths for each vendor.
Shader intrinsics are available on DX12 as well. And on NV hardware (though I don't think NV has equivalents for all of them).
Should make reviewers' lives much easier. :smile:

Wasn't it already the same? Especially on OpenGL... and with the vendor-specific optimizations in DirectX games (what we could call "render paths")...

Remember, when Mantle was shown, Nvidia's presentation (with Carmack) of their OpenGL extension set, claiming they were already able to do the same thing, especially regarding CPU overhead? (Sadly for them, that was a bit before Vulkan was shown.)

And I agree this could give reviewers some big headaches... but I think that path was opened a while ago already (and not necessarily for the better).
 
AMD's GPUOpen page on DCC also indicates that making a target shader-readable does impact what the compressor can do. There seems to be a subset of compression modes, aligned with what the CUs can read, that is not as compact as what the color pipeline can produce when working independently.
 
http://www.eurogamer.net/articles/d...n-patch-shows-game-changing-performance-gains
In a tech interview with Digital Foundry (due to be published in full this weekend), the id team talk about the advantages of Vulkan and the potential of async compute in particular.

"Yes, async compute will be extensively used on the PC Vulkan version running on AMD hardware," lead programmer Billy Khan tells us. "Vulkan allows us to finally code much more to the 'metal'. The thick driver layer is eliminated with Vulkan, which will give significant performance improvements that were not achievable on OpenGL or DX."

This is seriously looking like async on Nvidia GPUs is going to be an afterthought (or that both parties are scrambling to come up with something that doesn't result in a performance hit on NV's HW)...
 
Wasn't it already the same? Especially on OpenGL... and with the vendor-specific optimizations in DirectX games (what we could call "render paths")...

Remember, when Mantle was shown, Nvidia's presentation (with Carmack) of their OpenGL extension set, claiming they were already able to do the same thing, especially regarding CPU overhead? (Sadly for them, that was a bit before Vulkan was shown.)

And I agree this could give reviewers some big headaches... but I think that path was opened way before this (and not necessarily for the better).
I honestly don't think this was done on this scale before. It's the "patch Thursday" or "episode everything" attitude. People used to ask themselves "what GPU should I buy to run some game (current or upcoming)?" I guess a lot of people are now questioning what to buy to be safe for BF 1. Can you give me an answer on what to buy for Doom 2016? It's all in the patches that may or may not come.
And then there's the fact that even standard API usage is being ifdef'ed out on NV ("async compute"). It makes me seriously wonder what would happen if one faked vendor/device IDs on NV. But you'd have to implement AMD AGS as well...
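
For context, here is a minimal sketch of the kind of vendor/device ID check a PC title can branch on, assuming the standard DXGI headers. Only the DXGI calls are real; the printed path names are purely illustrative.

[code]
// Minimal sketch: enumerate the primary DXGI adapter and branch on its PCI
// vendor ID, which is what a vendor-specific render path typically keys off.
#include <dxgi.h>
#include <cstdio>
#pragma comment(lib, "dxgi.lib")

int main()
{
    IDXGIFactory* factory = nullptr;
    if (FAILED(CreateDXGIFactory(__uuidof(IDXGIFactory), (void**)&factory)))
        return 1;

    IDXGIAdapter* adapter = nullptr;
    if (SUCCEEDED(factory->EnumAdapters(0, &adapter)))
    {
        DXGI_ADAPTER_DESC desc = {};
        adapter->GetDesc(&desc);

        switch (desc.VendorId)  // well-known PCI vendor IDs
        {
        case 0x1002: printf("AMD path (async compute on, AGS intrinsics)\n");       break;
        case 0x10DE: printf("NVIDIA path (async compute possibly ifdef'ed out)\n"); break;
        case 0x8086: printf("Intel path\n");                                        break;
        default:     printf("Generic path\n");                                      break;
        }
        adapter->Release();
    }
    factory->Release();
    return 0;
}
[/code]

Faking the ID at this level would only flip the branch; anything behind it that calls into AGS would still need the AGS entry points present, which is exactly the problem noted above.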
 
http://www.eurogamer.net/articles/d...n-patch-shows-game-changing-performance-gains


This is seriously looking like async on Nvidia GPUs is going to be an afterthought (or that both parties are scrambling to come up with something that doesn't result in a performance hit on NV's HW)...

Does it really matter when they are already way in the lead?

Async compute doesn't mean much when your hardware is behind by a mile to begin with. AMD should improve their base performance first before asking developers to spend extra time and resources to improve performance on their hardware. I would rather have developers spend money on design, artwork, and other things and not worry about IHV-specific optimizations, wouldn't you?

That is the problem with the current situation: you have GameWorks, then you have async, then you have GPUOpen. Guess what, one of these curtails cost and effort but is optimized for only one IHV, the second increases cost and can be done for both IHVs via different paths (so double or more than double the work depending on the IHVs' gens), and the third just isn't up to par with the first yet, so it will cost developers money and time, which cuts into actual game development. And let's not forget shader intrinsic functions, which will also be IHV-specific. What is going on here, are we actually de-evolving the reason graphics APIs were created in the first place?

None of these things are ideal, but each IHV will push their stuff like always.
 
That doesn't make sense, Asynchronous Compute is a standard feature of DirectX and Vulkan. It's not IHV specific, merely that the design of one GPU is much better at it than the other.
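
To illustrate the "standard feature" point, here is a minimal sketch, assuming the stock Vulkan 1.0 headers and a VkPhysicalDevice obtained the usual way, of how an application finds a compute-capable queue family that is separate from graphics. The API exposes this uniformly on every vendor; how well the hardware actually overlaps work from the two queues is where the designs differ.

[code]
#include <vulkan/vulkan.h>
#include <vector>

// Returns the index of a queue family that supports compute but not graphics
// (a "dedicated" async compute queue), or -1 if the device exposes none.
// The VkPhysicalDevice is assumed to come from vkEnumeratePhysicalDevices.
int findAsyncComputeQueueFamily(VkPhysicalDevice gpu)
{
    uint32_t count = 0;
    vkGetPhysicalDeviceQueueFamilyProperties(gpu, &count, nullptr);
    std::vector<VkQueueFamilyProperties> families(count);
    vkGetPhysicalDeviceQueueFamilyProperties(gpu, &count, families.data());

    for (uint32_t i = 0; i < count; ++i)
    {
        const VkQueueFlags flags = families[i].queueFlags;
        if ((flags & VK_QUEUE_COMPUTE_BIT) && !(flags & VK_QUEUE_GRAPHICS_BIT))
            return static_cast<int>(i);  // compute-only family: async compute candidate
    }
    return -1;  // fall back to sharing the graphics queue
}
[/code]

Whether submitting work to that second queue actually gains anything is exactly the per-architecture question being argued below.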


Async has to be programmed for each gen and each IHV in different ways, because there is no way for the program to know the best way to use the underutilized portions of the GPU, and this changes across different GPUs since you don't have the same number of cores, the same number of fixed-function units, etc.
 
Async has to be programmed for each gen and each IHV in different ways, because there is no way for the program to know the best way to use the underutilized portions of the GPU
Have the game engine run an internal benchmark where it varies a set of parameters suited to different levels of hardware, to automatically determine a decent level of utilization. It might not be as good as hand-tuning the code for every single card out there, but surely there are ways to make this less than a completely manual process...
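
As a rough sketch of that idea: the parameter sweep itself is trivial to write; everything below apart from the standard <chrono> timing is a hypothetical placeholder (renderFrameWithComputeBatch stands in for whatever parameter the engine would actually vary).

[code]
#include <chrono>
#include <cstdio>

// Hypothetical engine hook: render one test frame using 'batchSize' async
// compute batches (or whatever parameter is being tuned) and wait for the GPU.
// The empty body is a placeholder so the sketch compiles.
void renderFrameWithComputeBatch(int /*batchSize*/) { /* submit work here */ }

// Sweep a handful of candidate settings at startup and keep the fastest one.
int autoTuneComputeBatch()
{
    const int candidates[] = { 0, 1, 2, 4, 8 };  // 0 = async compute disabled
    const int warmup = 8, measured = 32;

    int    best   = candidates[0];
    double bestMs = 1e30;

    for (int batch : candidates)
    {
        for (int i = 0; i < warmup; ++i)         // let clocks and caches settle
            renderFrameWithComputeBatch(batch);

        auto t0 = std::chrono::steady_clock::now();
        for (int i = 0; i < measured; ++i)
            renderFrameWithComputeBatch(batch);
        auto t1 = std::chrono::steady_clock::now();

        double ms = std::chrono::duration<double, std::milli>(t1 - t0).count() / measured;
        printf("batch=%d  %.2f ms/frame\n", batch, ms);
        if (ms < bestMs) { bestMs = ms; best = batch; }
    }
    return best;  // use this setting for the rest of the session
}

int main()
{
    printf("chosen batch setting: %d\n", autoTuneComputeBatch());
    return 0;
}
[/code]

The obvious caveat, raised in the reply below, is whether the gain is worth the startup cost of running such a sweep at all.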
 
It doesn't work that way either; it's like the engine looking for potential bottlenecks in a frame, it just doesn't work. It's simple: it has to be done by hand by the programmer while they are looking at their utilization numbers. It's what CUDA programmers do when they make something, they have to keep that in the back of their minds to get the best utilization; if they don't, well, you will run into problems later on. Of course, with graphics the programmer doesn't have that kind of control over the graphics portion of the pipeline (they can guesstimate, though).

Actually, yeah, you can do some of what you're saying to a degree, but the resources needed to do it would just outweigh the benefit of the performance gain. I.e., the CPU processing to do something like that in software just doesn't pay off in the end.
 
Async has to be programmed for each gen and each IHV in different ways, because there is no way for the program to know the best way to use the underutilized portions of the GPU, and this changes across different GPUs since you don't have the same number of cores, the same number of fixed-function units, etc.

By that logic OpenCL is also bad as it needs manual tuning to get maximum performance out of different architectures. :rolleyes:
When was more control necessarily bad? We should've stuck to DX11 according to your logic.
 
By that logic OpenCL is also bad as it needs manual tuning to get maximum performance out of different architectures. :rolleyes:
Well, that's true. In many cases you cannot write a single OpenCL kernel and believe that you will get maximum performance out of all architectures. You need to optimize for different vendors and different architectures.
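
For what it's worth, a minimal sketch of what that host-side tuning often looks like, assuming the standard OpenCL headers: query the device, then pick a work-group size and build options per vendor/architecture. The specific numbers and options below are illustrative placeholders, not recommendations.

[code]
#define CL_TARGET_OPENCL_VERSION 120
#include <CL/cl.h>
#include <cstring>
#include <cstdio>

// Pick a work-group size and compiler options based on which vendor's device
// we ended up on. Real tuning would also consider local memory size, CU/SM
// counts, etc.; the values below are placeholders for illustration.
void pickTuningForDevice(cl_device_id dev, size_t* wgSize, const char** buildOpts)
{
    char vendor[256] = {};
    clGetDeviceInfo(dev, CL_DEVICE_VENDOR, sizeof(vendor), vendor, nullptr);

    size_t maxWg = 0;
    clGetDeviceInfo(dev, CL_DEVICE_MAX_WORK_GROUP_SIZE, sizeof(maxWg), &maxWg, nullptr);

    if (strstr(vendor, "Advanced Micro Devices") || strstr(vendor, "AMD"))
    {
        *wgSize    = 64;                           // one GCN wavefront
        *buildOpts = "-cl-std=CL1.2 -cl-mad-enable";
    }
    else if (strstr(vendor, "NVIDIA"))
    {
        *wgSize    = 32;                           // one warp
        *buildOpts = "-cl-std=CL1.2 -cl-mad-enable";
    }
    else
    {
        *wgSize    = 16;                           // conservative default
        *buildOpts = "-cl-std=CL1.2";
    }

    if (*wgSize > maxWg) *wgSize = maxWg;          // never exceed the device limit
    printf("vendor: %s  work-group size: %zu\n", vendor, *wgSize);
}
[/code]

In practice you often end up with per-architecture kernel variants as well, not just launch parameters, which is the point being made above.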
 
By that logic OpenCL is also bad as it needs manual tuning to get maximum performance out of different architectures. :rolleyes:
When was more control necessarily bad? We should've stuck to DX11 according to your logic.


When you have a single queue or a single problem where you have complete control, that is much easier to deal with than when you don't have control over the graphics pipeline. Isn't this easy to understand? I thought this was B3D, not just another hardware forum.

Look at it this way: no matter what you do, if you want maximum performance out of any hardware you have to hand-tune, correct? What do you think ASM programming back in the day was all about? If you wanted that extra 10-20% you could get it, because the compiler couldn't do it for you; as compilers got better, the programmer didn't need to worry about it anymore, or as much, and the resources needed to get those last few extra percent weren't worth it.

In compute programming, everything that is done is extremely similar when it's broken down. But compute and graphics are quite different from each other. Things don't just go hunky-dory, here it is, all one nice package. It doesn't happen that way.

By my logic, screw consoles, lol, go back to when the PC was the top end and the consoles were the bottom feeders. PCs have the horsepower to make consoles look like a second grader who just got his grapes stomped on..... but we won't see the true potential of what PC cards can really do until years later.

Base utilization of hardware: use what you have before you make others spend resources on making things better for yourself. It's a short-term goal that never pans out, a band-aid, if you will, for the real problem. And when, even with it, your hardware doesn't really end up any better overall, since the work is specific to each game, you end up losing more that way.

And this is why it's only PR and marketing for AMD. It's easy to see through, and it's not going to garner market share. Use that time, energy, and money on making hardware that can compete without all the fancy extras, without making developers jump through hoops to get that extra bit when it's not needed.

If you look at nV's DX11 performance, guess what: all that performance AMD didn't get hurt AMD BADLY. Was it the base architecture that created that? Yeah, and it wasn't something AMD could have missed, since the problem was known for how long? And AMD knowing it before us, since it's their hardware, is even more likely. Putting their eggs in one basket, namely next-gen APIs, and making a mess of things for devs in the short term because of their shortcomings doesn't do anything for the industry as a whole. It's just the same attitude any company has when they are down: they need to do something to keep the stink off of them.

Look at what nV did with the FX series: guess where their dev initiative came from. Yep, when they had bad hardware, they pushed devs to use "special" features and programs they created....... "special" being the operative word, I think, lol.

In the end it didn't help nV, because you had a dev here or there doing something, but as a whole most dev teams just didn't care for it. And this is where GameWorks comes from; hopefully GPUOpen will follow suit, but looking at AMD's initiatives and how they have failed in the past, just don't put your hopes on it.
 
Does it really matter when they are already way in the lead?

Async compute doesn't mean much when your hardware is behind by a mile to begin with. AMD should improve their base performance first before asking developers to spend extra time and resources to improve performance on their hardware. I would rather have developers spend money on design, artwork, and other things and not worry about IHV-specific optimizations, wouldn't you?

That is the problem with the current situation: you have GameWorks, then you have async, then you have GPUOpen. Guess what, one of these curtails cost and effort but is optimized for only one IHV, the second increases cost and can be done for both IHVs via different paths (so double or more than double the work depending on the IHVs' gens), and the third just isn't up to par with the first yet, so it will cost developers money and time, which cuts into actual game development. And let's not forget shader intrinsic functions, which will also be IHV-specific. What is going on here, are we actually de-evolving the reason graphics APIs were created in the first place?

None of these things are ideal, but each IHV will push their stuff like always.

Here's the problem with that logic.

The work for async compute, and shader intrinsics as well, is already done. The developers HAVE to do it because both major consoles have it.

I.e., it is no extra work for the developers to implement since they have to do it anyway.

PC exclusive developers may be able to get away with not using it, but multi-platform developers (the majority of AAA developers) will have to use it or risk falling behind their competition on consoles. Dx12 and Vulkan makes it far easier to port your console code (which uses Async compute, shader intrinsics, etc.) to PC.

Not using async compute would need more work than using async compute for the majority of AAA developers going forward.

In other words, developers have to spend more time making an Nvidia specific rendering path on PC than they do porting over the AMD specific rendering path from console. Either that or they port over their code and just hope that Nvidia properly supports Vulkan and Dx12.

Regards,
SB
 
Shader intrinsics and async compute shaders don't translate over to PCs, as we have seen in the past EVEN on AMD hardware..... It just doesn't work well outside of the same chip, or a similarly configured chip, going from consoles to PCs. So devs are stuck with a mid-range GPU optimization path and still have to do more work for the other chips of the same gen, the smaller or larger versions.

This is what the problem has been and will continue to be, as long as there is no hardware implementation of utilization analysis, if that can even be done on current nodes.
 
[Image: AMD GPU architecture roadmap 2015-2018, labeling Vega as a high-end architecture]

http://www.3dcenter.org/news/neue-amd-roadmap-bezeichnet-vega-als-highend-architektur
 
Shader intrinsics and async compute shaders don't translate over to PCs, as we have seen in the past EVEN on AMD hardware..... It just doesn't work well outside of the same chip, or a similarly configured chip, going from consoles to PCs. So devs are stuck with a mid-range GPU optimization path and still have to do more work for the other chips of the same gen, the smaller or larger versions.

This is what the problem has been and will continue to be, as long as there is no hardware implementation of utilization analysis, if that can even be done on current nodes.

I wonder how much of that is developers having to do an Nvidia-specific path and then just going with that instead of using the console rendering path, especially if those developers are closely tied to Nvidia. In that case you'd lose many of the optimizations that exist in the console versions.

iD appear to have decided to use as much of the console rendering path as possible for their Vulkan rendering path for Doom. And it shows.

At worst, there is no performance gain on Nvidia hardware. At console resolutions and settings, however, there are gains on Nvidia hardware. And on hardware that uses GCN, the same as consoles, the gains are massive and universal.

I.e., they didn't have to do special GCN 1.0, 1.1, 1.2, 1.4, etc. rendering paths with special tweaks, despite the consoles only being GCN 1.0-1.1. Everything from the 7xxx series up to the newest RX 4xx series benefits quite significantly at all resolutions and on all hardware.

It's relatively clear that Dx12 and Vulkan are primed to radically change how PC graphics are done at the AAA level. How long it takes for developers to come to grips with that is the variable. I'm sure Nvidia-partnered developers will be heavily encouraged not to use the console code. But id has shown that if you do, there's potential for great performance gains on AMD and possibly even Nvidia hardware.

Regards,
SB
 
Shader intrinsics and async compute shaders don't translate over to PCs, as we have seen in the past EVEN on AMD hardware..... It just doesn't work well outside of the same chip, or a similarly configured chip, going from consoles to PCs. So devs are stuck with a mid-range GPU optimization path and still have to do more work for the other chips of the same gen, the smaller or larger versions.

Asynchronous compute can be pretty variable across GPUs with differing graphics and compute organizations, but at least within AMD's offered intrinsics there should be more continuity. These expose GCN wavefront instructions, whose context is an individual compute unit and its fairly consistent parameters.
 
Well, that's true. In many cases you cannot write a single OpenCL kernel and believe that you will get maximum performance out of all architectures. You need to optimize for different vendors and different architectures.

So it's the fault of OpenCL for being low-level, providing you more control and allowing you to tune for different architectures? Do you want languages like RenderScript to take over where you have less control? Then you can never get 100% performance out of any architecture. Take your pick.
 
Well, that's true. In many cases you cannot write a single OpenCL kernel and believe that you will get maximum performance out of all architectures. You need to optimize for different vendors and different architectures.

You can't even write a single Dx9 or Dx11 rendering path and believe you get maximum performance out of all architectures. You'll get generally good performance across a wide range of hardware, but you'll never get maximum performance.

Regards,
SB
 