AMD GPU gets better with 'age'

You mention a CPU-demanding emulator, but then bring up GPU driver limitations? Do you have some repeatable, demonstrable examples of this? What emulators are we talking about? On what operating system?
 
If you follow the history of those games, NVIDIA was losing all of those battles until Mantle was shown to help. Then, as if by magic, NVIDIA whips out some new drivers to "fix" the problems that they purportedly never had, because their drivers were always less CPU dependent? Care to look up the definition of "revisionist history"?
Actually no, you've got it mixed up. AMD dropped the ball on DX11 because of Mantle, which allowed NV to focus its resources on maintaining solid DX11 all over. Now Mantle is defunct and AMD is left with scraps of both APIs.

Interestingly enough, the R7 260X (and 270X) both end up matching (or beating) the 750 Ti in the lowest-CPU case, and then exceeding the 750 Ti in the higher-CPU case. So somehow this proves that AMD's driver is "heavier" than NVIDIA's? The only game where AMD loses (on both cards) is Call of Duty: Advanced Warfare. One data point does not a trend make.
Are you reading the same article? Both suffered and did worse compared to the 750 Ti, with stutters or lower fps! In fact the situation is even more pronounced once you switch to higher cards, like the R9 280, 285 and 960.
Oh hey, look, it's Star Swarm again.
Duh, no, not just Star Swarm; also the 3DMark API overhead test!
Nevertheless, I still find that AMD's drivers will perform better in the "general case", where NVIDIA couldn't be bothered to focus on a specific game or engine.
Actually that doesn't make any sense. The one with the bigger resources can maintain support for both popular and unpopular titles; the one with fewer resources will only focus on the popular, which actually fits the description of AMD precisely. Very old titles? Obscure titles? Old cards? You are on your own; bug fixes and performance improvements are few and far between.
 
Actually no, you've got it mixed up. AMD dropped the ball on DX11 because of Mantle, which allowed NV to focus its resources on maintaining solid DX11 all over. Now Mantle is defunct and AMD is left with scraps of both APIs.
Specifically because of Mantle, NV chose to optimize exactly the specific gam... err, synthetic benchmarks that were being used as a showcase. Star Swarm was the de facto example, because NV's performance in that synthetic engine test was bunk until they recognized the loss of internet cred over it. And then here comes the NV PR machine: "Tada! Now Star Swarm is totally awesome, don't worry about our shite DX11 performance on that game before AMD dragged us through the mud with it..."

Oh, but DX11 was obviously A: their top priority and B: NV was the clear leader before all of that, right? I beg to differ. The "337.xx" driver was going to be their DX11 big bang, because up to that point they really had made no noise about this at all. Do you not remember these slides? http://wccftech.com/nvidias-dx11-driver-337xx-arrive-april/

Now, here's a pressing ask of your memory: Do you actually remember what came of those drivers? Of course, it will depend on what games you believe NV focused on. Here's one review:
http://www.extremetech.com/gaming/1...-shouldnt-trust-manufacturer-provided-numbers
Here's another:
http://www.tomshardware.com/news/nvidia-geforce-337.50-driver-benchmarks,26473.html

I'm sorry, were you telling me that NV focused hardcore on DX11 all over? Let's see, three percent (give or take) for Tomb Raider, two percent for BF4, Heaven can suck it, but BOOM 40% FOR STAR SWARM! I'm glad that REAL GAMES were so focused upon! All over, in fact, right?

Are you reading the same article? Both suffered and did worse compared to the 750 Ti, with stutters or lower fps! In fact the situation is even more pronounced once you switch to higher cards, like the R9 280, 285 and 960.
Are you serious? Let me quote the article directly:
However, on this weaker CPU, the GTX 750 Ti didn't fare so well either. It held its overall frame-rate level better, at the expense of additional stutter compared to the R7 260X.

It appears your memory is failing, not mine, to be blunt. Further, if NV's driver is so much lighter, why is it that in every Core i3 case (except two; I'll mention those in the next paragraph) the 260X and the 750 Ti framerates are within 2fps of each other? And to be clear, that 2fps is not always ahead of AMD, nor is it always behind AMD. Where is this claimed lightweightness of the NV driver? If it were lighter, it would be performing BETTER, would it not?

Not much to prove your point there, sorry.

I also made sure to mention the two general exceptions to these cases: Far Cry 4, in which the 260X framerate starts on the i3 CPU ahead of the 750 Ti by 4fps (versus the 2fps I quoted upstream), and Call of Duty, where AMD consistently loses regardless. In no case did NV's "lightweight driver" net it a better framerate on the lower CPU, but a singular game optimization path did. How does that jibe with your version of this story?

Duh, no, not just Star Swarm; also the 3DMark API overhead test!
Wow, 3DMark, in a situation where NV was going to get bad draw-call press thanks to Mantle, because Mantle is now included... Gee, I bet I didn't see that optimization coming, in precisely the same way they did it with Star Swarm! :rolleyes: But wait, I thought their focus was all over DX11; why does it seem oddly confined only to synthetic tests?

Actually that doesn't make any sense. The one with the bigger resources can maintain support for both popular and unpopular titles
You would think that, but you'd be wrong. Remember the Tomb Raider debacle? Not sure how they didn't see that one coming. Deus Ex: Human Revolution comes to mind, which was a "smaller" title in the grand scheme of things and, so far as I'm aware, still isn't fixed. How about The Forest? Small indie title, lag problems galore with NVIDIA cards. I don't think they ever actually fixed that one; the driver fails to move the clock speed to the "3D clocks", even though the game is plainly running in 3D.

Hey, remember that time when NVIDIA released a driver that burned up some of their customers' cards? That was badass. So much for lighter weight! :nope:

Additional Clarity:
I can only imagine the "AMD FANBOY" response I might get after the above. On the contrary, I'm simply an enthusiast of all video cards, but I'm not an enthusiast of bias hawked as truth. Just to make sure my position on video cards is crystal clear:
The next gen after the 980 will be my next video card purchase, and I'm going back to Team Green after four iterations of Team Red. LET'S MAKE IT HAPPEN, NV!!
 
Star Swarm is a bad example because it was coded in a way that nobody codes their games, since performance is terrible in D3D (far too many draw calls). And NVIDIA was still able to bring DX11 performance up to par in spite of this.
 
I agree with your summation; Star Swarm was specifically written as a pathological worst case for D3D. NVIDIA obviously did some real, actual work to make their driver work so very well in this case.

Despite that effort, it did not translate into "real games" -- likely because no other games are such pathological corner cases for draw calls.
 
I agree with your summation; Star Swarm was specifically written as a pathological worst case for D3D. NVIDIA obviously did some real, actual work to make their driver work so very well in this case.

Despite that effort, it did not translate into "real games" -- likely because no other games are such pathological corner cases for draw calls.
It will be interesting to see how Ashes of the Singularity turns out. It is using the Nitrous engine, and while it isn't pathological, Stardock/Oxide are not going easy on the draw calls there either.
 
Specifically because of Mantle, NV chose to optimize exactly the specific gam... err, synthetic benchmarks that were being used as a showcase. Star Swarm was the de facto example, because NV's performance in that synthetic engine test was bunk until they recognized the loss of internet cred over it. And then here comes the NV PR machine: "Tada! Now Star Swarm is totally awesome, don't worry about our shite DX11 performance on that game before AMD dragged us through the mud with it..."
OK, so what? I don't see anything bad here; the improvements they made on that front translated into other titles as well!
Are you serious? Let me quote the article directly:
However, on this weaker CPU, the GTX 750 Ti didn't fare so well either. It held its overall frame-rate level better, at the expense of additional stutter compared to the R7 260X.
If you are going to do some quoting then don't cherry-pick it; please also quote the many statements in that article that referred to the better performance of NV GPUs with low-end CPUs. Please also remember that this specific quote you gave relates to the performance of the A10-5800K (an abysmally weak CPU) and not the Core i3. You also completely ignored the situation with the 280X, 280 and 960.

DigitalFoundry:
Comparing AMD and Nvidia performance on the high-end quad-core i7 with the middle-of-the-road dual-core i3 processor. The results are stark. Both 260X and 270X lose a good chunk of their performance, while Nvidia's GTX 750 Ti is barely affected. The situation is even more of an issue on the mainstream enthusiast 1080p section on the next page. There's no reason why a Core i3 shouldn't be able to power an R9 280, for example, as the Nvidia cards work fine with this CPU. However, the performance hit is even more substantial there.

Wow, 3DMark, in a situation where NV was going to get bad draw-call press thanks to Mantle, because Mantle is now included... Gee, I bet I didn't see that optimization coming, in precisely the same way they did it with Star Swarm! :rolleyes: But wait, I thought their focus was all over DX11; why does it seem oddly confined only to synthetic tests?
You asked for multiple cases and examples; I gave you several games and two synthetic tests, and you waved them goodbye and then asked for proof. You are not getting any if you are not willing.

You would think that, but you'd be wrong. Remember the Tomb Raider debacle? Not sure how they didn't see that one coming. Deus Ex: Human Revolution comes to mind, which was a "smaller" title in the grand scheme of things and, so far as I'm aware, still isn't fixed. How about The Forest? Small indie title, lag problems galore with NVIDIA cards. I don't think they ever actually fixed that one; the driver fails to move the clock speed to the "3D clocks", even though the game is plainly running in 3D.
Again, you are not making any sense. I don't need to waste my time or yours listing all the cases in which AMD has failed to deliver good performance to this day. Again, larger resources mean better coverage of the entire gaming spectrum, not vice versa.

Hey, remember that time when NVIDIA released a driver that burned up some of their customers' cards? That was badass. So much for lighter weight! :nope:
I see, this isn't a discussion about overhead now, it's a bragging contest.
 
The game performance of the 750 and the 260/270 was identical on the i3 processors. The end. If NVIDIA were actually better on lower processors, they would've scored better benchmarks. They didn't.

Why do you keep replying, when you can read that result just like I can?

Since you want to bring up selective quoting, why don't you selectively quote this:
Further, if NV's driver is so much lighter, why is it that in every Core i3 case (except two; I'll mention those in the next paragraph) the 260X and the 750 Ti framerates are within 2fps of each other? And to be clear, that 2fps is not always ahead of AMD, nor is it always behind AMD. Where is this claimed lightweightness of the NV driver? If it were lighter, it would be performing BETTER, would it not?

Not much to prove your point there, sorry.

I also made sure to mention the two general exceptions to these cases: Far Cry 4, in which the 260X framerate starts on the i3 CPU ahead of the 750 Ti by 4fps (versus the 2fps I quoted upstream), and Call of Duty, where AMD consistently loses regardless. In no case did NV's "lightweight driver" net it a better framerate on the lower CPU, but a singular game optimization path did. How does that jibe with your version of this story?
 
Has anyone tried Mantle Star Swarm vs. NVIDIA's updated DX11 Star Swarm using GPUs of the same range (750 Ti and R9 270), together with a really slow CPU like a dual-module Bulldozer at 2GHz?
 
The 750 Ti seems to be mostly compared to the 260 in the "initial" reviews that have come up. Later reviews seem more optimistic about pitting the 750 against the 270, if only because the price has come down on the 270 enough to make the comparison viable.
 
As the article notes, these benchmarks only isolate one very specific performance part. Can somebody help me out putting this in perspective?

In the DX11 benchmark, they get about 1M draw calls per second, going up to 8M with DX12.
That's an impressive speedup, of course, but I'm actually surprised by the 1M. In various GDC and other presentations, they always talked about keeping it below 3000 or so per frame to keep overhead in check. 1M/60fps would be about 16K draw calls per frame. That doesn't sound so bad to me?

And the logical follow up question: on the i5, it goes from 1M to 8M for Nvidia and from 1M to 13M for AMD. Is this expected to make meaningful differences in practice, or is 8M/s already way past the point where game engines are going to be, even with DX12?
 
As the article notes, these benchmarks only isolate one very specific performance part. Can somebody help me out putting this in perspective?

In the DX11 benchmark, they get about 1M draw calls per second, going up to 8M with DX12.
That's an impressive speedup, of course, but I'm actually surprised by the 1M. In various GDC and other presentations, they always talked about keeping it below 3000 or so per frame to keep overhead in check. 1M/60fps would be about 16K draw calls per frame. That doesn't sound so bad to me?

And the logical follow up question: on the i5, it goes from 1M to 8M for Nvidia and from 1M to 13M for AMD. Is this expected to make meaningful differences in practice, or is 8M/s already way past the point where game engines are going to be, even with DX12?

I don't have much insight to offer, but I can point out that this demo is fairly simple, and each individual draw call should be relatively cheap, since it just draws an object without any apparent advanced shading or features (texturing, shadows, bump mapping, reflections, ambient occlusion, etc.). Besides, making draw calls is the only thing it does, as opposed to games which, well, have to run an actual game.

So I'd say that the ability to just barely make 16K simple draw calls per frame at 60FPS without really doing anything else is pretty consistent with the 3K figure given for games as a guideline, especially since I think AMD mentioned that really good developers can push it to 5K or so.
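
If it helps put those per-second figures into per-frame terms, here's a quick back-of-the-envelope sketch (purely illustrative, and it assumes, as above, that the benchmark reports draw calls per second):

```python
# Convert the quoted per-second draw call rates into per-frame figures at 60 fps.
# Assumes the 3DMark API overhead numbers are draw calls per second (as implied above).
FPS = 60

rates_per_second = {
    "DX11": 1_000_000,             # ~1M draw calls/s quoted above
    "DX12 (NVIDIA, i5)": 8_000_000,
    "DX12 (AMD, i5)": 13_000_000,
}

for api, rate in rates_per_second.items():
    print(f"{api}: ~{rate / FPS:,.0f} draw calls per frame")

# DX11: ~16,667 | DX12 (NVIDIA): ~133,333 | DX12 (AMD): ~216,667
```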
 
And the logical follow up question: on the i5, it goes from 1M to 8M for Nvidia and from 1M to 13M for AMD. Is this expected to make meaningful differences in practice, or is 8M/s already way past the point where game engines are going to be, even with DX12?
After a certain point it's more about reducing CPU utilization at a given draw call load per frame than it is about having more draws per frame. Also, more draws per frame would mean fewer GPU stalls waiting for work - draw calls with small amounts of triangles.
 
After a certain point it's more about reducing CPU utilization at a given draw call load per frame than it is about having more draws per frame. Also, more draws per frame would mean fewer GPU stalls waiting for work - draw calls with small amounts of triangles.
OK, I was mostly looking at it from a pure GPU point of view, which is probably too limited. Right now, I assume that a lot of games are mostly GPU limited, not CPU limited, in which case moving to DX12 won't be a major improvement. That's only for straight ports though, with the same number of draw calls. Over time, things will shift towards more draw calls because they can. But where's the boundary where, as a game writer, you still optimize for fewer draw calls for better performance, and where more draw calls are just the natural way of doing things?
 
Well, I did say after a certain point; it's just that that point is now a lot more variable than before. More draw calls per frame means more freedom to submit geometry how you want without worrying about staying within a draw call budget, which means some things that were off limits for certain scene densities/types before aren't anymore, and submitting smaller batches of triangles per call becomes feasible.
 
In various GDC and other presentations, they always talked about keeping it below 3000 or so per frame to keep overhead in check. 1M/60fps would be about 16K draw calls per frame. That doesn't sound so bad to me?
16K is possible in DX11 if you don't do many state changes between the draw calls. However, 16K isn't that much once you consider the fact that this is your total draw call budget, not the number of visible objects you can render at 60 fps.

Example:
- On average every object has two different materials (2 draw calls for the g-buffer)
- You use cascaded shadow mapping. On average every object gets rendered to 1.5 cascades (because of the overlap).
- Approximately one shadow mapped local light affects every location. Thus on average each object gets rendered to one local light shadow map.

Total (average) draw calls per object = 2 + 1.5 + 1 = 4.5
Total number of visible objects (at 60 fps) = 16K / 4.5 ≈ 3.5K

In Trials Evolution (a 60 fps Xbox 360 game) we achieved 10K draw calls per frame (= 600K per second). The background was almost empty (we had a 2 kilometer view range), because 10K is not enough to fill the background with objects when you have fully dynamic lighting (no baked shadows at all). Shadow maps took almost 70% of our draw call budget. Games will use more dynamic shadow-casting light sources in the future, meaning that even 100K draws per frame (6M per second) is not going to be enough.
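
For anyone who wants to play with the numbers, here's a minimal sketch of the budget arithmetic above (the per-object averages are just the ones from the example, not universal figures):

```python
# Rough draw call budget sketch using the averages from the example above.
gbuffer_draws = 2.0         # two materials per object on average (g-buffer pass)
shadow_cascade_draws = 1.5  # cascaded shadow map overlap
local_light_draws = 1.0     # one shadow-mapped local light per location on average

draws_per_object = gbuffer_draws + shadow_cascade_draws + local_light_draws  # 4.5

budget_per_frame = 16_000   # ~1M draw calls/s at 60 fps
visible_objects = budget_per_frame / draws_per_object
print(f"~{visible_objects:,.0f} visible objects per frame at 60 fps")  # ~3.5K
```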
 
Also, more draws per frame would mean fewer GPU stalls waiting for work - draw calls with small amounts of triangles.
It depends. GPUs are bad at very small draw calls. AMD's current sweet spot seems to be around 256 vertices per draw call (~256 primitives). Anything below that slows down your triangle rate.

For example, the 290X runs at 1 GHz and processes 4 primitives per clock. This is 4 gigaprimitives per second. To achieve the best possible triangle rate you should not render more than 4G / 256 / 60 ≈ 260K draws per frame (at 60 fps). This is slightly above the 3DMark results (they achieved 477K at 30 fps).

Divide this 260K by the 4.5 (from my last post), and you get ~57K visible objects (at 60 fps). This is a big improvement over current games, but not unreasonably high. Oxide said that they improved Star Swarm performance by additional batching. Our results also show that (GPU-driven) batching allows you to go as small as 64 vertices per object (and still achieve close to 100% primitive unit utilization). This will allow you to render a roughly 4x higher object count (1 million at 60 fps) on AMD GPUs.
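
Again, just as a sketch of that primitive-rate math (the 4 primitives/clock at 1 GHz and ~256-vertex sweet spot figures are the 290X example from above, not general limits):

```python
# Primitive-rate-limited draw call count, using the 290X example above.
prims_per_second = 4 * 1_000_000_000  # 4 primitives/clock at 1 GHz
verts_per_draw = 256                  # suggested sweet spot (~256 primitives per draw)
fps = 60

max_draws_per_frame = prims_per_second / verts_per_draw / fps
print(f"~{max_draws_per_frame:,.0f} draws per frame")             # ~260K

draws_per_object = 4.5                # from the earlier budget example
print(f"~{max_draws_per_frame / draws_per_object:,.0f} objects")  # ~57K visible objects
```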
 