Fable Legends DX12 Benchmark

So the Fable Legends DX12 Benchmark is out: http://www.anandtech.com/show/9659/fable-legends-directx-12-benchmark-analysis

And, quite usefully, it has what is essentially a GPU profiler in the results.

xJfEq4N.png


I know that "Dynamic Global Illumination" is handled via compute shader in this title. It's interesting to note which IHV wins what. Clearly, for compute shader performance, Nvidia's high-end GPU has the advantage. Same with GBuffer creation, which I would assume is geometry bound.

For pixel operations, on the other hand, we can see AMD has the advantage, e.g. Transparency/Effects, Dynamic Lighting, and Post Processing. That is unsurprising, as AMD GPUs have scaled better with resolution to begin with.

Another interesting thing to note is the scaling between GPUs on the same, or rather similar, architecture:
dy3EqEC.png


Here we see the 980 Ti scaling better relative to the 980 than the Fury X scales relative to the 390X.

Since it's hard to see, I'll just post the numbers (in ms):

GBuffer creation: 4.4/6.4 for Ti/980, 6.5/7.0 for Fury X/390X.
Dynamic lighting: 7.1/9.6 for Ti/980, 6.7/7.5 for Fury X/390X.
Compute: 1.2/1.9 for Ti/980, 2.2/3.0 for Fury X/390X.
Transparency: 6.0/8.0 for Ti/980, 5.4/7.0 for Fury X/390X.

GI is probably fighting for resources, so it's irrelevant here.

As we can see, for GBuffer creation the Ti is about 45% faster than the 980, while the Fury X is only about 8% faster than the 390X. For Dynamic Lighting the Ti is 35% faster, the Fury X about 12% faster. Compute is almost 60% faster on the Ti, but only 36% faster on the Fury X. Transparency is 33% faster on the Ti and 30% faster on the Fury X.
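For anyone who wants to redo the arithmetic, here is a rough Python sketch. It assumes "scales X% better" means the slower card's pass time divided by the faster card's, minus one; that definition is my reading of the figures, not something the benchmark reports. The names `times_ms`, `speedup_percent` and `scaling_table` are made up for this example, and the times are the ones quoted in this post.

```python
# Per-pass GPU times in milliseconds, copied from the numbers quoted in the post.
# Each tuple is (faster card, slower card) within one vendor's lineup.
times_ms = {
    "GBuffer creation": {"980 Ti vs 980": (4.4, 6.4), "Fury X vs 390X": (6.5, 7.0)},
    "Dynamic lighting": {"980 Ti vs 980": (7.1, 9.6), "Fury X vs 390X": (6.7, 7.5)},
    "Compute":          {"980 Ti vs 980": (1.2, 1.9), "Fury X vs 390X": (2.2, 3.0)},
    "Transparency":     {"980 Ti vs 980": (6.0, 8.0), "Fury X vs 390X": (5.4, 7.0)},
}


def speedup_percent(fast_ms, slow_ms):
    """Percent by which the faster card beats the slower one on a pass.

    Assumed definition: (slow time / fast time - 1) * 100.
    """
    return (slow_ms / fast_ms - 1.0) * 100.0


def scaling_table(times):
    """Return {pass: {pairing: speedup rounded to the nearest percent}}."""
    return {
        pass_name: {
            pairing: round(speedup_percent(fast, slow))
            for pairing, (fast, slow) in pairings.items()
        }
        for pass_name, pairings in times.items()
    }


if __name__ == "__main__":
    for pass_name, pairings in scaling_table(times_ms).items():
        for pairing, pct in pairings.items():
            print(f"{pass_name:18s} {pairing:15s} {pct:3d}% faster")
```

Results are rounded to the nearest percent, so a figure or two may land a point or so away from the percentages quoted in the text.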

From this we can see that, as some proposed, the Fury X is heavily constrained by certain bottlenecks. It scales for transparency just fine, as does the 980 Ti, but other passes barely scale at all, at least at 4K in this benchmark, while the 980 Ti scales quite well across the board.

This quite possibly bodes well for AMD's 4xx series, assuming they get back to properly balanced designs like the 290/390 series rather than the Fury.
 
That kind of profiling is actually a really interesting idea for a benchmark. It looks a lot more useful for PC players who want to tweak their settings to get best performance. It's an interesting way for a gamer to eyeball a bottleneck and adjust their graphics settings appropriately if the options are available.

On top of that, it's way more insightful for comparing GPUs for reviews, which in the end helps consumers.
 
I am not sure what to make of it when individual times differ between reviews by almost 20% (Dynamic GI on a 980 Ti). These might not be really good data points.
 
And the interesting thing is that the benchmark is locked down completely. So the only thing that can vary is the hardware. There are no settings for us to tinker with that might influence the results.:eek:

It could very well just be that the benchmark doesn't have good sub-1ms resolution.
 
Can someone explain the scaling between the GTX 980 (GM204) and GTX 980 Ti (GM200)? And the scaling with the GTX 960 (GM206)? Especially with Dynamic GI the scaling is pretty broken: GM204 and GM200 achieve very similar performance, while GM206 falls off dramatically. The same goes for Hawaii/Fiji and Tonga. IMO there are some major driver issues.
 
The FuryX varies by pretty much the exact same percentage in dynamic GI between the two tests, so no mystery there. Just a different CPU/Temperature/whatever.

Dynamic GI in this title is handled via compute, and thus in all likelihood async compute. That means it doesn't have a fixed amount of resources or time scheduled to it. Instead, async compute takes up whatever resources are temporarily free while the GPU waits for other things to get done. It's basically like how a CPU will run a different thread on its hardware if the thread it was running is stalled waiting for some data.
 
I know, and I absolutely agree. In a normal game benchmark, where scenes differ, performance differs, and maybe even drivers differ, I wouldn't have said anything, but here... :) But yeah, it shows that there's something quite variable between different systems, even inside a microsecond.

First you "blame" it on the CPU, which could probably make sense given the alleged Nvidia allocation of CPU resources for that, and then you state it's the same on Radeons, which should handle it in hardware. I'm not sure now if your point is valid.

FWIW, I just picked the topmost value from Anand for comparison; it just happened to be GI and the 980 Ti.

Edit: strike that. Dynamic GI pretty much stays within half the variation (~8.5%) for the Fury X between Anand and Extreme. So your point about the CPU being the culprit might have merit! OTOH, though, the R9 380 (Tonga Pro) should have async compute as well, yet its GI performance tanks to R7 370 levels, which it shouldn't.
 