No DX12 Software is Suitable for Benchmarking *spawn*

That still depends on the definition of "dynamic". If it's tessellating by distance and resolution, it should be reasonably consistent and is the preferred method. If it's reducing tessellation based on framerate, then yeah, it won't work.
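To make the distinction concrete, here's a rough sketch of the two schemes in C++; the function names and constants are mine, not anything from the game:

```cpp
#include <algorithm>

// Distance/resolution-based tessellation: deterministic for a given scene and
// settings, so two benchmark runs produce the same geometry load.
float TessFactorByDistance(float distToCamera, float screenHeightPx)
{
    const float kBase = 64.0f; // illustrative base factor
    float f = kBase * (screenHeightPx / 1080.0f) / std::max(distToCamera, 1.0f);
    return std::clamp(f, 1.0f, 64.0f);
}

// Framerate-based tessellation: a feedback loop, so slower cards silently get
// less geometry to draw -- which is exactly why it breaks benchmarking.
float TessFactorByFramerate(float prevFactor, float lastFrameMs, float targetMs)
{
    float f = prevFactor * (targetMs / lastFrameMs); // cut detail when over budget
    return std::clamp(f, 1.0f, 64.0f);
}
```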

If it's like Forza 6 Apex, which I think it is, it's turning every knob you could think of to hit the target frame rate, including internal resolution.
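For what it's worth, a minimal sketch of the kind of frame-rate-driven controller that also scales internal resolution; purely illustrative, with made-up thresholds:

```cpp
#include <algorithm>

// Shrink the internal resolution when over the frame budget, grow it back
// when comfortably under; one of the "knobs" such a system can turn.
struct DynamicResScaler {
    float scale = 1.0f; // fraction of native internal resolution

    void Update(float lastFrameMs, float targetMs = 1000.0f / 60.0f)
    {
        if (lastFrameMs > targetMs * 1.05f)
            scale *= 0.95f;                 // over budget: drop resolution
        else if (lastFrameMs < targetMs * 0.90f)
            scale *= 1.02f;                 // headroom: claw resolution back
        scale = std::clamp(scale, 0.5f, 1.0f);
    }
};
```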
 
ComputerBase's results are up. They tested the game through the jungle area at night (a challenging scene), and they confirm GameGPU's results: the Fury X is worse than the RX 480, and the 1060 is better than both. They also state memory is not the limiting factor in this title.
https://www.computerbase.de/2016-09...abschnitt_benchmarks_von_full_hd_bis_ultra_hd

What does the paragraph at the top of the page (page 2) say? Using Google Translate, it almost seems to imply VSync does not work as you would expect.
 
ComputerBase's results are up. They tested the game through the jungle area at night (a challenging scene), and they confirm GameGPU's results: the Fury X is worse than the RX 480, and the 1060 is better than both. They also state memory is not the limiting factor in this title.
https://www.computerbase.de/2016-09...abschnitt_benchmarks_von_full_hd_bis_ultra_hd

Couldn't work it out from there, as my language skills are shamefully non-existent, but where do we find out that's an 8GB 480?
 
ComputerBase's results are up. They tested the game through the jungle area at night (a challenging scene), and they confirm GameGPU's results: the Fury X is worse than the RX 480, and the 1060 is better than both. They also state memory is not the limiting factor in this title.
https://www.computerbase.de/2016-09...abschnitt_benchmarks_von_full_hd_bis_ultra_hd
And they provide absolutely zero evidence to support this conclusion. The only evidence presented shows that VSync rules out any meaningful performance comparisons and that memory may indeed be a significant factor. Go figure: the Fury X with 4GB is performing worse than the 480 that allegedly has 8GB. And the article even says the 480 and 1060 are going head to head. The 390 is struggling, but again, no idea what the specifications of that card are.

If it's like Forza 6 Apex, which I think it is, it's turning every knob you could think of to hit the target frame rate, including internal resolution.
This was specific to geometry, so I doubt it's changing settings that widely.
 
...

This was specific to geometry, so I doubt it's changing settings that widely.

I think the "Dynamic Optimization" setting has a fairly broad scope, similar to what it was in Apex. The "Dynamic Geometry Quality" option sounds much as your first description, although the game says it affects draw distance as well as detail (not just detail by draw distance).
 
What does the paragraph at the top of the page (page 2) say? Using Google Translate, it almost seems to imply VSync does not work as you would expect.
--
It's basically saying that normal Fraps/PerfMon-based fps measurements are not valid because of UWP behaviour, so they ditch fps for the whole test and instead report, for each run, the number of frames that cannot keep the 60 fps Forza requires for smooth gameplay. They say that as soon as the system cannot keep 60 fps, it drops to 30 even with VSync turned off, and that this is occurring with basically all their tested cards.
--
^^ That's not a literal translation but my best effort at breaking their paragraphs into two short sentences. Does that make sense to you? On the page before that, they state that the fps alternates between 60 and 30 when the system cannot keep up and that, despite VSync being off, UWP still keeps frames from tearing. They propose switching to 30 fps in the menu in order to at least have a consistent framerate.
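If I understand their metric right, it amounts to counting frames that miss the 16.7 ms budget rather than averaging fps. A sketch of that counting, assuming a frame-time log (this is my reading, not their published tooling):

```cpp
#include <cstddef>
#include <vector>

// Count the frames in a run that exceed the budget 60 fps allows (~16.7 ms).
std::size_t FramesMissing60(const std::vector<double>& frameTimesMs)
{
    const double budgetMs = 1000.0 / 60.0;
    std::size_t missed = 0;
    for (double t : frameTimesMs)
        if (t > budgetMs)
            ++missed;
    return missed;
}
```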
 
That would certainly explain the reports of "hitching" being universal across machines of all different specs. So do you think this is an issue with the game, or was MS just jerking us around about unlocking frame rates in UWP? I can't find an article of anyone actually testing UWP post-Anniversary Update.
 
I would need to check with my colleagues doing the testing on Forza, but judging from their curses, I guess we're experiencing similar things. Though I did not specifically test for it, I strongly suspect that UWP has not been fixed, because unlike with Doom, where only the GeForce cards showed strong hitching with Vulkan, in Forza both GeForce and Radeon are affected.
 
--
It's basically saying that normal Fraps/PerfMon-based fps measurements are not valid because of UWP behaviour, so they ditch fps for the whole test and instead report, for each run, the number of frames that cannot keep the 60 fps Forza requires for smooth gameplay. They say that as soon as the system cannot keep 60 fps, it drops to 30 even with VSync turned off, and that this is occurring with basically all their tested cards.
--
^^ That's not a literal translation but my best effort at breaking their paragraphs into two short sentences. Does that make sense to you? On the page before that, they state that the fps alternates between 60 and 30 when the system cannot keep up and that, despite VSync being off, UWP still keeps frames from tearing. They propose switching to 30 fps in the menu in order to at least have a consistent framerate.


Does the game support any kind of Adaptive Sync, either?
Because if it does, the results that fall within the monitor's variable refresh range should be considered valid.
 
The talk of dropping to 30 fps rather than ever landing just below 60 fps seems to be the exact opposite of what Digital Foundry has experienced. In their 8.5-minute video they go through several different settings and show the fps fluctuating.
 
And they provide absolutely zero evidence to support this conclusion. The only evidence presented shows that VSync rules out any meaningful performance comparisons and that memory may indeed be a significant factor. Go figure: the Fury X with 4GB is performing worse than the 480 that allegedly has 8GB. The 390 is struggling, but again, no idea what the specifications of that card are.
See also the DF results here, with the 480 performing almost double the 390. Even the GTX 970 is leading it strongly (the card that is most likely to suffer in VRAM-limited situations).
And the article even says the 480 and 1060 are going head to head.
They are not (the 480 dropped more fps) and it is hitching more; that was also apparent in the DF testing.
--
They say that as soon as the system cannot keep 60 fps, it drops to 30 even with VSync turned off, and that this is occurring with basically all their tested cards.
I suspect they hit a bug where a form of VSync is always kept on no matter the user's settings; we already have many other people who can get variable fps above and below 60.

The game has many settings that can be put on automatic dynamic adjustment, and the setting for the fps target needs to be carefully chosen, as it offers 30, 60, 30 with VSync, 60 with VSync, and unlocked. I suspect the first two options are bugged and VSync remains engaged. They should have chosen unlocked fps as their target.
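A stuck VSync would also explain the hard 60-to-30 jumps: with double buffering at a 60 Hz refresh, a frame that misses one refresh waits for the next, so presentation snaps to multiples of 16.7 ms. A toy illustration (my numbers, not measured data):

```cpp
#include <cmath>
#include <cstdio>

// With VSync and double buffering, presentation rounds up to the next
// refresh interval, so "slightly too slow" becomes exactly half rate.
double PresentedIntervalMs(double renderMs, double refreshMs = 1000.0 / 60.0)
{
    return std::ceil(renderMs / refreshMs) * refreshMs;
}

int main()
{
    std::printf("16.0 ms render -> %.1f ms shown (60 fps)\n", PresentedIntervalMs(16.0));
    std::printf("17.0 ms render -> %.1f ms shown (30 fps)\n", PresentedIntervalMs(17.0));
}
```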
 
See also the DF results here, with the 480 performing almost double the 390. Even the GTX 970 is leading it strongly (the card that is most likely to suffer in VRAM-limited situations).
That clearly explains why a 6GB 1060 significantly outperforms a 3GB 1060. Or are you now going to argue the architectures are significantly different, beyond a few fewer cores and some bandwidth? A game can have more than one bottleneck simultaneously; it can stack all of them up, for that matter. A 970 could push more geometry than a 390, so it's not surprising it's ahead given a sufficient memory pool. Only in some scenes was the 480 approaching double the 390; most of the time it was maybe 50% ahead, if that. This would be geometry capability, but it is still only relevant given playable framerates, which that first benchmark lacked.

They are not (the 480 dropped more fps) and it is hitching more; that was also apparent in the DF testing.
0.8% more frames with hitching is your evidence? The DF tests showed similar hitching with a Titan XP, as well as hitching pointed out by many others here, regardless of vendor.
 
What type of renderer does that game use, to start with? Deferred or forward? And as usual, a performance profile for a couple of frames would help in drawing proper conclusions, rather than just wildly guessing.

If we are playing wildly guessing, I would assume that we are observing cache thrashing in the L2 cache, due to poor locality when the number of wavefronts in flight exceeds a certain threshold. The smaller the ratio between shaders in flight and L2 size, the better the scaling. During which phase of the render pipeline? We would need a profile to tell that.
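Back-of-the-envelope for that guess, with loudly made-up numbers (an assumed 2MB L2 and an assumed per-wavefront working set; nothing here is measured from this game):

```cpp
#include <cstdio>

int main()
{
    const double l2Bytes = 2.0 * 1024 * 1024;      // assumed L2 size
    const double bytesPerWavefront = 24.0 * 1024;  // assumed working set each
    const int counts[] = {32, 64, 96, 128};
    for (int wavefronts : counts) {
        double demand = wavefronts * bytesPerWavefront;
        std::printf("%3d wavefronts in flight: %.2fx of L2 %s\n",
                    wavefronts, demand / l2Bytes,
                    demand > l2Bytes ? "(thrashing likely)" : "(fits)");
    }
}
```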

Ignoring the potential >4GB VRAM usage for now, we don't know the size of the actual working set per frame. All we do know is that up to 6GB can be used in a single scene, but not necessarily in the same view volume.
 
That clearly explains why a 6GB 1060 significantly outperforms a 3GB 1060. Or are you now going to argue the architectures are significantly different, beyond a few fewer cores and some bandwidth? A game can have more than one bottleneck simultaneously; it can stack all of them up, for that matter. A 970 could push more geometry than a 390, so it's not surprising it's ahead given a sufficient memory pool. Only in some scenes was the 480 approaching double the 390; most of the time it was maybe 50% ahead, if that. This would be geometry capability, but it is still only relevant given playable framerates, which that first benchmark lacked.


0.8% more frames with hitching is your evidence? The DF tests showed similar hitching with a Titan XP, as well as hitching pointed out by many others here, regardless of vendor.

Bottlenecks really don't "stack up". It depends on when a bottleneck is hit in the graphics pipeline; that will determine what is affected and how. If there is a memory or bandwidth issue, that will override all other bottlenecks, as it sits fairly early in the pipeline stages, and any part of the GPU that becomes a bottleneck further down the pipeline would have to overcome the initial slowdown from the bandwidth issue. So, for instance, you might not see a geometry bottleneck if the bandwidth bottleneck slows things down to a crawl. If you have bottlenecks within the same frame at different points, they still don't really "stack up", because, as I stated before, the rendering of the different parts of the frame will exhibit the same thing, and that is what you get as the end result.
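In code, the "don't stack up" argument is that an overlapped pipeline's frame time is governed by the slowest stage, a max rather than a sum (the stage times below are invented for illustration):

```cpp
#include <algorithm>
#include <cstdio>

int main()
{
    const double geometryMs = 4.0, shadingMs = 9.0, bandwidthMs = 6.0;
    // Stages overlapping across the frame: the slowest one sets the pace.
    double overlapped = std::max({geometryMs, shadingMs, bandwidthMs});
    std::printf("overlapped pipeline: ~%.1f ms (slowest stage wins)\n", overlapped);
}
```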

So in instances where bandwidth over the PCIe is a problem, the amount of streaming, and how it's done, should be staggered enough that it doesn't slow down the rest of the GPU pipeline. Similar to how you hide latency, engines are made nowadays to minimize the slowdown from streaming textures and geometry over the PCIe.

If you notice, in the video they specifically dropped certain effects that seem to affect polygon throughput, and that helped the 390 quite a bit.

Concerning this game, where did you see 1060 3GB benchmarks? Was it in the video? I might have missed it if it was.
 
Bottlenecks really don't "stack up".

They do. Vertex processing limit hit = slow, tessellation data-storage size limit = more slow, slow geometry shader = very much slow, pixel shader interpolator space limit = damn ugly slow, tons of texture reads = damn-is-this-slow-for-real slow, and when they all arrive with the same slowness at the ROPs and want to squeeze 64 pixels out at once = absolutely useless performance slow.
 
What type of renderer does that game use, to start with? Deferred or forward?

The previous game used clustered forward+, so presumably it's an evolution of that at least, if not the same.
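For readers unfamiliar with the term, the gist of "clustered" forward shading: the view frustum is chopped into a 3D grid of cells, lights are binned per cell, and each pixel shades only its cell's light list. A generic sketch; the grid size and depth slicing below are typical choices, not anything confirmed about this engine:

```cpp
#include <cmath>
#include <cstdint>

// Map a pixel position plus view-space depth to a cluster cell index.
// Logarithmic depth slices are a common choice for clustered shading.
uint32_t ClusterIndex(float px, float py, float viewZ,
                      float width, float height, float zNear, float zFar)
{
    const int gx = 16, gy = 9, gz = 24; // assumed cluster grid dimensions
    int cx = static_cast<int>(px / width * gx);
    int cy = static_cast<int>(py / height * gy);
    int cz = static_cast<int>(std::log(viewZ / zNear) /
                              std::log(zFar / zNear) * gz);
    return static_cast<uint32_t>(cx + gx * (cy + gy * cz));
}
```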
 
They do. Vertex processing limit hit = slow, tessellation data-storage size limit = more slow, slow geometry shader = very much slow, pixel shader interpolator space limit = damn ugly slow, tons of texture reads = damn-is-this-slow-for-real slow, and when they all arrive with the same slowness at the ROPs and want to squeeze 64 pixels out at once = absolutely useless performance slow.


Well, whatever is slowest is where it's going to be; it's not really stacking up.

It's just the consequence of having to feed things through a certain way. Let's say the geometry shader slows you down, but you have a fast pixel shader. Well, you really aren't going to go faster because you have the geometry shader to worry about. If it was the other way around, well, now you have to worry about the pixel shader. If everything is slow in the pipeline, lol, well, someone did something wrong ;)
 
Bottlenecks really don't "stack up". It depends on when a bottleneck is hit in the graphics pipeline; that will determine what is affected and how.
With a low-level API it's possible; you just need really bad pipelines. Think deferred rendering where you need to run all the geometry, then any compute work, then start texturing. The CUs start idle, then the geometry engines, and then you hit the ROPs at the same time the next frame possibly starts using the geometry engines for its prepass.
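Putting both views side by side with invented stage times: serialized stages add up, while overlapped ones hide behind the slowest:

```cpp
#include <algorithm>
#include <cstdio>

int main()
{
    const double geometryMs = 4.0, computeMs = 5.0, texturingMs = 7.0;
    double serialized = geometryMs + computeMs + texturingMs;            // costs stack
    double overlapped = std::max({geometryMs, computeMs, texturingMs});  // slowest wins
    std::printf("serialized: %.1f ms vs overlapped: %.1f ms\n", serialized, overlapped);
}
```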

Concerning this game, where did you see 1060 3GB benchmarks? Was it in the video? I might have missed it if it was.
Graphs start around 4:25.
Not the best, but I couldn't turn up a bunch of low-memory benchmarks. It's a 3GB 1060 vs a 4GB 470. Minimum FPS is what I'd consider key for memory issues, and it gets worse as resolution increases. While not directly comparable, I don't recall the 6GB card getting hit that hard on minimum framerate. The 6GB 1060 and 8GB 480 don't seem to share those framerate regressions; their performance drops as you would expect with higher resolutions.
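For clarity on what "minimum FPS" means here, a sketch of how such a figure is typically derived from a frame-time log (my example numbers, not any reviewer's data):

```cpp
#include <algorithm>
#include <cstdio>
#include <vector>

int main()
{
    // A hypothetical run: mostly 60 fps frames with two long stalls.
    std::vector<double> frameMs = {16.2, 16.5, 16.4, 41.0, 16.3, 16.6, 35.5, 16.4};
    double worst = *std::max_element(frameMs.begin(), frameMs.end());
    std::printf("minimum fps = 1000 / worst frame = %.1f fps\n", 1000.0 / worst);
}
```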

Keep in mind that for the original benchmark results posted, most of the cards had <4GB of memory, the exceptions being the 1080, 1070, 1060, and 980 Ti, which were all leading. No 8GB AMD cards were shown.
 