No DX12 Software is Suitable for Benchmarking *spawn*

Huh, SLI 1080s don't get a benefit in the game... Driver error?

DX11 path is too recent to be considered in the latest drivers and the generic AFR method probably doesn't work (though that website is so shady/weird/cringy I wouldn't be surprised if they never even tried).
Same goes for Crossfire.

UWP doesn't support SLI yet, does it?
Those results refer to the Steam version.





BTW, compared to the DX11 path, AMD cards gain a substantial amount of performance when using DX12, whereas Nvidia Maxwell stays flat or loses performance:

3yfWk2H.jpg



As always, Maxwell users should steer clear of the new APIs.
 
So empirical data isn't evidence? We've had graphs of memory usage and framerates from a variety of cards and sources. We've had videos of assets slowly loading in. There was ample evidence to make the claim, and the outcome was obvious to any informed user. It's not difficult to put the pieces together when the results show exactly what one would expect given the scenario.


What we have (from Digital Foundry, GameGPU, ComputerBase, and benchmark.pl):
  • VRAM measurement from the Fury X: only 3.5GB of its VRAM is in use @1080p, yet the card still trails the 4GB Polaris GPUs. (GameGPU)
  • ComputerBase reports no VRAM bottleneck at 4GB, testing with High settings @1080p and Medium @1440p and 4K. The Fury X still trails Polaris, this time by a big margin. (ComputerBase)
  • 970 leading the 390 by a significant margin. (DF, ComputerBase)
  • 480 massively leading the 390. (DF, ComputerBase)
  • 1060 leading the 480. (ComputerBase, DF, benchmark.pl)
  • 480 8GB leading a 470 4GB by just 8% @1440p Ultra. (benchmark.pl)
  • 780Ti 3GB slightly leading the 290X 4GB @1080p Ultra. (GameGPU)



What you proposed:
Two bottlenecks: a VRAM one and a geometry one. The VRAM bottleneck explains how Polaris trumps Fury, and it starts at 4GB. The geometry bottleneck explains NV leading AMD significantly despite VRAM deficiencies in some situations. You also based your argument on the 1060 3GB having worse fps than the 470 4GB. That is a valid concern, although this data point has not been confirmed or tested properly; it is just a YouTube video with performance bars added in. If we indulge this point, it only means that there is a VRAM bottleneck that starts @3GB, which is normal behavior in most modern games.


What we don't have:
We don't know whether the 480 and 390 used in most of the previous benchmarks were the 8GB or the 4GB variety.
However, we can infer that ComputerBase is testing the 8GB version of the 480, because it is comparing it to the 6GB version of the 1060.
Still, this makes little difference: the 8GB 480 leads the 4GB card only slightly, which goes against a VRAM bottleneck starting @4GB. So does the 970 (3.5GB) leading the 390, and the 780Ti 3GB leading the 4GB 290X.

So until we have more data points, it's obvious the VRAM bottleneck is not playing any major role in the performance metrics we have today.
 
Take a closer look, iro. This is the Steam version, and other cards get benefits from SLI.
Sadly, the developers stated that the Steam version doesn't support SLI or CF.
http://www.dsogaming.com/news/quant...-be-dx11-only-will-not-support-mgpus-systems/
BTW, compared to the DX11 path, AMD cards gain a substantial amount of performance when using DX12, whereas Nvidia Maxwell stays flat or loses performance:
These results are from an old build of the game, which was later patched to fix Maxwell performance. A direct head-to-head comparison on the latest build is needed. Some of the visual options were also optimized; for example, you can now turn the scaler off to run the game at native resolution.

EDIT: Some DX12 results from the latest DX12 build:
TITAN-X-345-76.jpg

http://www.hardwarecanucks.com/foru...vidia-titan-x-12gb-performance-review-14.html
 
That one actually shows similar scaling compared to the one ToTTenTranz linked. The only card that is roughly the same is the Fury X. Everything else is either different or stock (980/970) versus overclocked (980/970 superclocked). One has Titan X - Pascal while the other has Titan X - Maxwell 2.

So it doesn't really prove that things significantly improved for Nvidia cards relative to AMD cards in Quantum Break. All it shows is that Maxwell/Maxwell 2 cards are worse than GCN cards, which in turn are worse than Pascal cards, in the DX12 version of Quantum Break, which doesn't even do much to take advantage of DX12.

Regards,
SB
 
That one actually shows similar scaling compared to the one ToTTenTranz linked. The only card that is roughly the same is the Fury X. Everything else is either different or stock (980/970) versus overclocked (980/970 superclocked). One has Titan X - Pascal while the other has Titan X - Maxwell 2.
What has changed is the relative performance of the 980Ti. TweakTown found the Maxwell Titan X (which is faster than the 980Ti) slower than the Fury X with the old build. In the new build (Hardware Canucks) the 980Ti is much closer to the Fury X than before, which means that the Maxwell Titan X is closer still.

So it doesn't really prove that things significantly improved for Nvidia cards relative to AMD cards in Quantum Break.
True, it changed only slightly, but it changed nonetheless.

All it shows is that Maxwell/Maxwell 2 cards are worse than GCN cards, which in turn are worse than Pascal cards, in the DX12 version of Quantum Break, which doesn't even do much to take advantage of DX12.
Agreed, but again we need a head-to-head comparison here (same scenes, DX11 vs DX12). We need to know whether DX12 on AMD is faster than DX11 on NV, or vice versa.

On another note, overclock3d tested the 8GB version of the RX 480 and found the same trends as before:

28132656313l.jpg


http://www.overclock3d.net/reviews/gpu_displays/forza_horizon_3_pc_performance_review/8
 
Where is the empirical data? I don't see memory usage numbers, do you?
Included in the link to the first GameGPU benchmarks. There are clearly numbers provided showing how much VRAM a card would like to use.

So until we have more data points, it's obvious the VRAM bottleneck is not playing any major role in the performance metrics we have today.
So long as losing 66% of your framerate and getting texture popping isn't major I'm sure you're right here. So please continue to cite numbers that can't actually support your claim as evidence. I'm not seeing a whole lot of other reasons a 3GB and 6GB 1060 would have that large of a difference and not exhibit the same issues. And no the bottleneck won't start at the provided memory capacity, that just means it's present at that capacity.
 
It doesn't show you how much a 1060 3GB would be using at the settings the review was done at. You can't say anything definitively. Is that very hard to understand? I have stated that it could be, but you can't tell me it is 100% without having more data. There is no empirical data to draw that conclusion from, because you are comparing different settings across different reviews, with different methods.

And that is the same thing David stated.

So until we have more data points, it's obvious the VRAM bottleneck is not playing any major role in the performance metrics we have today.
 
It doesn't show you how much a 1060 3GB would be using at the settings the review was done at.
The capacity of the card should have no bearing on what gets drawn in a scene, excluding dynamic effects. Sure it can vary a bit between scenes, but as I said it seems likely <4GB would have an impact. Which is precisely what the first benchmark posted was testing. Average or max FPS won't show the issue either, unless it's really limited.
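
To illustrate why an average FPS figure can hide this kind of hitching, here is a minimal Python sketch. It assumes you have a per-frame log of frame times in milliseconds; the function name and the numbers are made up for illustration and are not taken from any of the reviews cited in this thread.

```python
def frame_stats(frame_times_ms):
    """Summarize a list of per-frame render times given in milliseconds."""
    if not frame_times_ms:
        raise ValueError("need at least one frame time")
    n = len(frame_times_ms)
    # Average FPS over the whole run: frames divided by elapsed seconds.
    avg_fps = 1000.0 * n / sum(frame_times_ms)
    ordered = sorted(frame_times_ms)
    # Nearest-rank 99th percentile: the frame time that 99% of frames do not exceed.
    rank = max(1, -(-99 * n // 100))  # integer ceil(0.99 * n)
    return {
        "avg_fps": avg_fps,
        "p99_frame_ms": ordered[rank - 1],
        "worst_frame_ms": ordered[-1],
    }


# Hypothetical run: 95 smooth frames at 10 ms plus 5 hitches at 100 ms.
stats = frame_stats([10.0] * 95 + [100.0] * 5)
print(stats)
```

In this made-up run the average still lands near 69 fps, while the 99th percentile frame time is 100 ms, which is the part a streaming or capacity problem would show up in.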
 
Included in the link to the first GameGPU benchmarks. There are clearly numbers provided showing how much VRAM a card would like to use.
In that link a Fury X is only consuming 3.5GB of its 4GB @1080p while still trailing Polaris GPUs, which is hardly a VRAM bottleneck at all.

So long as losing 66% of your framerate and getting texture popping isn't major I'm sure you're right here. So please continue to cite numbers that can't actually support your claim as evidence. I'm not seeing a whole lot of other reasons a 3GB and 6GB 1060 would have that large of a difference and not exhibit the same issues.
Again, the 3GB 1060 is irrelevant here; 3GB cards often hit VRAM limitations these days. What's relevant is the behavior of 4GB cards, which show no sign of being VRAM limited in this title. I have given you ample examples, the last of which is this test from overclock3d:
http://www.overclock3d.net/reviews/gpu_displays/forza_horizon_3_pc_performance_review/8
 
Just one thing about Forza: the game is plagued by some deep problems that I hope can be corrected. The first should be easy to fix: the v-sync problem (on or off, it reverts to 30fps if the framerate drops under 59.9fps, so what you need there is a complete graph of frametimes, especially with old and small GPUs). The second is that the files that need to be decompressed and streamed are AES-256 encrypted, which means permanently high CPU usage, and this can cause stutter, especially when you run the game at high fps. Basically, the faster the game files are needed (high fps), the higher the CPU usage, and depending on which CPU you have, the worse the stutter fest (the GPU has to wait for the files to be decompressed). This also explains the 100% CPU usage peaks reported when the game runs at high fps...

Don't ask me why they used this type of encryption.

In addition, some are reporting a serious memory leak (VRAM usage seems to increase steadily over time, something that may not show up in quick benchmarks, but that doesn't mean it has no impact).

I wouldn't read too much into the current numbers, as those two problems alone (and sadly they are not the only ones) can dramatically change the results.
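
As a rough illustration of why a v-sync drop to 30fps makes averages misleading, here is a small Python sketch of classic double-buffered v-sync on a 60Hz display. This is an assumption about the general mechanism, not a confirmed description of what Forza does internally; the function name and numbers are hypothetical.

```python
import math


def vsynced_fps(render_ms, refresh_hz=60.0):
    """Displayed frame rate when every frame must wait for the next refresh."""
    refresh_ms = 1000.0 / refresh_hz
    # A frame that misses a refresh is held until the following one, so the
    # displayed interval is a whole number of refresh periods. The small
    # epsilon guards against floating-point noise at exact multiples.
    intervals = max(1, math.ceil(render_ms / refresh_ms - 1e-9))
    return refresh_hz / intervals


for render_ms in (16.0, 17.0, 34.0):
    print(render_ms, "ms ->", vsynced_fps(render_ms), "fps")
```

Under that assumption a card that renders in 17 ms (about 59fps uncapped) is displayed at 30fps, so two cards that are nearly identical without v-sync can look far apart with it, which is why a frametime graph tells you more than a single number.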
 
The capacity of the card should have no bearing on what gets drawn in a scene, excluding dynamic effects. Sure it can vary a bit between scenes, but as I said it seems likely <4GB would have an impact. Which is precisely what the first benchmark posted was testing. Average or max FPS won't show the issue either, unless it's really limited.


Drivers can have a huge effect on this. Why do you think the Fury X doesn't get memory bottlenecked in certain games that we know use more than 4GB on other cards? As much as 50% more, too.

This is why you can't just say what you just stated, too many different things across the board to narrow it down like that.
 
Yes, but it was a general trend with AMD pre-Polaris: their cards generally use way less memory than their Nvidia counterparts (could be 500MB to 1GB at best)...

Completely off-topic, but games use way more memory than they could and should. Let's not forget that some engines try to fill the memory until it is completely full (a total aberration with modern GPUs).
 
LOL, old problem: it doesn't matter how much RAM you have, it will always get filled up, hehe.

Well, most games don't use more memory just because it's there. If we take older games, like 5-year-old games, they won't do it.
 
In that link a Fury X is only consuming 3.5GB of its 4GB @1080p while still trailing Polaris GPUs, which is hardly a VRAM bottleneck at all.
That doesn't necessarily mean it's not limited. DX12 memory management being app dependent means it won't likely be perfectly utilized. They may have opted to stream all textures or 500MB is reserved. Keeping space for new allocations or swapping around data wouldn't be unreasonable. It's possible, even likely 3.5GB is close to the amount of data required to render a frame. That won't account for moving around and streaming in new data however. The Fury with HBM and all that bandwidth isn't a great example as we've seen before how it's designed to stream everything. It has so much bandwidth it can easily swap resources around with minimal performance impact.

Again, the 3GB 1060 is irrelevant here; 3GB cards often hit VRAM limitations these days. What's relevant is the behavior of 4GB cards, which show no sign of being VRAM limited in this title. I have given you ample examples, the last of which is this test from overclock3d:
It's relevant because it seems to indicate a point where capacity is likely an issue. I'm not trying to fault the 1060, it's just the 3GB model provides an interesting correlation with a 6GB model. Same as the 470/480. Those are the obvious examples where the architecture is nearly identical with different memory capacities. Given enough VRAM, a dev with DX12/Vulkan could simply load all assets and make bundles. Shouldn't be a whole lot of hitching in that situation and memory is a lot easier to manage when not constrained.
 
Even in DX11 games that use a lot of memory the Fury X had issues; you're going to be limited by PCI-E bandwidth and latency. AFAIK AMD releases game-specific profiles in its drivers for major titles; I'm thinking of that new Mirror's Edge game that had hilarious VRAM requirements at max texture settings. I found this in the meantime:

FC4-MinFrameRate.png


me-catalyst-bench-1080-hyper.png
 