No DX12 Software is Suitable for Benchmarking *spawn*

And then panic and cower in fear at the sight of the mighty German language.
IMO, that is a major problem if the site is not in your native language. Google does a poor job of translating (and graphical functionality, i.e. dropdowns, stops working), which makes some translated sites a mess.
Cater to native speakers or go international for clicks?
 
One of the Turn10 devs came out and said they limited a significant amount of cpu workload to core0, in order to prevent fluctuations in controller input latency that could come about from more multithreaded code.
I suppose it's easier to multithread when you have fixed hardware.

Hello everyone,

Some users may notice that the game utilizes nearly 100% of one of their processor cores. This is expected behavior; we intentionally run in this manner so we can react as fast as possible in order to minimize input latency. Users on power-constrained devices, such as laptops and tablets, might want to use a Performance Target of “30 FPS (V-SYNC),” which will reduce processor usage and minimize power consumption.

https://forums.forzamotorsport.net/...tilizes-only-one-thread-core.aspx#post_769942
 
Regarding the single core being filled up:
That was also the case with Forza Horizon 3 until a patch came out. Strange.
Now, if that loaded core happens to be the same as a core chosen by the driver to do its magic, I can imagine weird stuff happens. OTOH, it's highly unlikely that in all of CB's benchmark runs those two cores (Forza and driver) were incidentally the same one. Except, it happened more often and the other runs seemed to be outliers.

@CarstenS
Wtf-tech has a high reach; if you look at the comment section, most of the time they have about 5000 comments or more. I think a lot of people click on the link to Computerbase, which appears in the first sentence of the article. And they posted a link under every picture. Guru3d didn't do that. At the end of the day I think Computerbase got more clicks on their side because of WTF-tech.

But I also wish that more people would go to the original sites and show respect for their work.
I was really thinking about whether or not I should reply to the OT (which I started, sorry again), but being sometimes in the same situation as Computerbase is right now, I can say that referred visits from that site are sparse. Which makes perfect sense of course, when you think about how they basically rip off all the content, leaving no reason to keep the hand alive that feeds them. Locusts anyone?

It would be completely different if they, and others, did their own story, described the situation, made their analysis and borrowed one illustrative diagram, leaving their readers with enough curiosity to click on the source link. That would channel visits, give and take, as it should be. An older internet saying goes along the lines of "Do what you do best and link to the rest".

And speaking of German (or other languages): Yes, Google Translate leaves much to be desired. But I am completely unsure how it is better to read someone else's interpretation of Google Translate. To add to that, apart from the introductory paragraph, that site's story did nothing more than rehash the percentages of the respective cards which, ironically enough, are a feature of Computerbase's diagrams (mouse over - 100% and such) that they could not copy with their rip-off shots.
 
Now, if that loaded core happens to be the same as a core chosen by the driver to do its magic, I can imagine weird stuff happens. OTOH, it's highly unlikely that in all of CB's benchmark runs those two cores (Forza and driver) were incidentally the same one. Except, it happened more often and the other runs seemed to be outliers.
From NV DX12 Do's And Don'ts:
Don’ts
  • Don’t rely on the driver to parallelize any Direct3D12 works in driver threads
    • On DX11 the driver does farm off asynchronous tasks to driver worker threads where possible – this doesn’t happen anymore under DX12
Not sure at the moment where I read that, but while with DX11 the driver may create its own threads and shift work to them (which NV does extensively), in DX12 the driver stays in the same application thread. So yes, we have seen NV cards burning through a lot more CPU threads on fast CPUs while still performing better on low end CPUs. Here, however, everything is crammed into one thread.
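
To make the point concrete: if the driver no longer spreads work over its own threads, the application has to do the splitting itself. Below is a minimal sketch of that structure only, not of any real engine or of the D3D12 API. `record_command_list` and `build_frame` are hypothetical names, and the "command lists" are just lists of strings. Python threads also do not run this kind of work in parallel because of the interpreter lock, so this shows the shape of the idea rather than a speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def record_command_list(draw_calls):
    """Hypothetical stand-in for recording one command list on a worker thread."""
    return [f"draw({d})" for d in draw_calls]


def build_frame(draw_calls, workers=4):
    """Split a frame's draw calls into chunks and record each chunk on its own worker."""
    # Ceiling division, so every draw call lands in exactly one chunk.
    chunk = max(1, -(-len(draw_calls) // workers))
    chunks = [draw_calls[i:i + chunk] for i in range(0, len(draw_calls), chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in chunk order, so submission order stays deterministic.
        return list(pool.map(record_command_list, chunks))


frame = build_frame(list(range(10)), workers=4)
print(len(frame), "command lists recorded")
```

The application thread decides how the work is divided and in which order the results are submitted; nothing happens behind its back.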
 
So we should be moving towards 5 GHz+ 2c2t i3 CPUs for DX12 gaming, for lazy devs that relied on Nvidia doing all the multi-threading work.

Many other games, including many racing games don't have an issue with input latency and running on 4+ cores.
 
One of the Turn10 devs came out and said they limited a significant amount of cpu workload to core0, in order to prevent fluctuations in controller input latency that could come about from more multithreaded code.
I suppose it's easier to multithread when you have fixed hardware.



https://forums.forzamotorsport.net/...tilizes-only-one-thread-core.aspx#post_769942
From NV DX12 Do's And Don'ts:

Not sure at the moment where I read that, but while with DX11 the driver may create its own threads and shift work to them (which NV does extensively), in DX12 the driver stays in the same application thread. So yes, we have seen NV cards burning through a lot more CPU threads on fast CPUs while still performing better on low end CPUs. Here, however, everything is crammed into one thread.

This makes me think that perhaps the NV driver is running into the thing I notice on my system with Dx11 titles, where all 4 of my cores are pegged at 100% and I've got multiple applications running which each require CPU time. The NV driver appears to not deal with this as well as the AMD driver. Once the NV Dx11 driver can't find spare CPU time on spare cores it really slows down.

Dx12 getting limited to a single thread for the NV driver, and the developer not explicitly farming out work to other threads, might be putting it into the same sort of situation.

I don't know how things work in the driver world, but is it perhaps that the AMD Dx11 driver being less accomplished at spreading work among multiple threads has made AMD more aggressive in making sure enough CPU time is reserved on a single core? Thus making it less impacted when spare CPU time is harder to come by?

Regards,
SB
 
Fun fact: Rise of the Tomb Raider does not run faster in DX12 vs. DX11 on an i5-7600K, but very much so on i7-7700K.
Out of curiosity, what about the frame times? Computerbase's update today had Intel (higher clocks) with better average FPS and AMD (more cores) with better frametimes. That finding is somewhat odd, as higher FPS and better frame times should coincide. That may need to be taken into account when comparing DX11 to DX12.
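
For what it's worth, the two metrics can diverge on paper. Here is a minimal sketch with made-up frame-time traces (not Computerbase's data): one trace is fast most of the time but has a few long hitches, the other is slower but even. `average_fps` and `percentile` are illustrative helper names, and the percentile uses the simple nearest-rank method.

```python
def average_fps(frame_times_ms):
    """Average FPS over a run: number of frames divided by total time in seconds."""
    return 1000.0 * len(frame_times_ms) / sum(frame_times_ms)


def percentile(values, pct):
    """Nearest-rank percentile; pct is an integer from 1 to 100."""
    ordered = sorted(values)
    rank = max(1, (pct * len(ordered) + 99) // 100)  # integer ceiling division
    return ordered[rank - 1]


# Made-up traces in milliseconds per frame, for illustration only.
spiky = [8.0] * 95 + [40.0] * 5   # mostly fast, with five long hitches
steady = [11.0] * 100             # slower on average, but perfectly even

for name, trace in (("spiky", spiky), ("steady", steady)):
    print(f"{name}: {average_fps(trace):.1f} avg FPS, "
          f"99th percentile frame time {percentile(trace, 99):.1f} ms")
```

The spiky trace wins on average FPS while losing clearly on the 99th percentile frame time, so a result like the one described is not contradictory in itself.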

Dx12 getting limited to a single thread for the NV driver, and the developer not explicitly farming out work to other threads, might be putting it into the same sort of situation.
Forza Motorsport 7 is not limited to running on one core. There seems to have been a miscommunication along the way. “Forza Motorsport 7” uses as many cores as are available on whatever system it runs on, whether that is a 4- to 16-core PC or the 7 cores available on Xbox One.
 
Forza Motorsport 7 is not limited to running on one core. There seems to have been a miscommunication along the way. “Forza Motorsport 7” uses as many cores as are available on whatever system it runs on, whether that is a 4- to 16-core PC or the 7 cores available on Xbox One.

It may use many cores, as many other games have claimed in the past, but it only hammers 1 or 2 of them. On my system the game only hammers 1 core. In fact this new statement doesn't contradict the old statement; they are complementary: the game uses many cores but focuses its effort on just 1.

Fun fact: Rise of the Tomb Raider does not run faster in DX12 vs. DX11 on an i5-7600K, but very much so on i7-7700K.
It doesn't run any faster on my 3770 either, maybe something to do with more cores/threads? What settings are you using and what game levels?
 
720p, mostly max detail settings apart from pure graphics effects. I test in Geothermal Valley, a manual runthrough.

I did record frametimes, but have not yet prepared the graphs. There's one very noticeable hiccup in DX12 mode only though, which is most pronounced on Skylake X, less pronounced on Threadripper and subjectively least felt on Kaby Lake (X) and similar quads/hexes.
 
On my system the game only hammers 1 core. In fact this new statement doesn't contradict the old statement; they are complementary: the game uses many cores but focuses its effort on just 1.
It's not really hammering one core as it's not doing work, it's checking for work as the original explanation intended. It's quite literally a loop calling getinput() as quickly as possible. Throw a sleep or wait into that thread and I'd be surprised if any core was over 50%. They farm out the work to all available cores and have one checking for updates. That's the very reason they released that clarification. The CPU side looks extremely well done best I can tell. The hitch is the async likely causing Nvidia difficulties keeping utilization high. At the very least I'd say there is a strong argument GCN is getting fed ideally.
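
A minimal sketch of the difference between polling flat out and polling with a short sleep, assuming nothing about the game's actual code. `get_input` here is a hypothetical stand-in that does no real work, and `poll` reports how much CPU time the loop burned relative to wall time.

```python
import time


def get_input():
    """Hypothetical stand-in for an input query; it returns no events."""
    return None


def poll(duration_s, sleep_s=0.0):
    """Call get_input() in a loop for duration_s seconds of wall time.

    Returns (polls, cpu_fraction), where cpu_fraction is CPU time / wall time.
    """
    polls = 0
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    while time.perf_counter() - wall_start < duration_s:
        get_input()
        polls += 1
        if sleep_s > 0:
            time.sleep(sleep_s)  # yield the core between polls
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return polls, cpu / wall


busy_polls, busy_cpu = poll(0.2)
lazy_polls, lazy_cpu = poll(0.2, sleep_s=0.001)
print(f"busy:  {busy_polls} polls, {busy_cpu:.0%} of one core")
print(f"sleep: {lazy_polls} polls, {lazy_cpu:.0%} of one core")
```

The busy loop shows up as a nearly pegged core even though each iteration does almost nothing; the sleeping loop polls far less often and leaves the core mostly idle. The trade-off is that every sleep adds up to its own length (plus scheduler jitter) to the time before an input is seen.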
 
It's not really hammering one core as it's not doing work, it's checking for work as the original explanation intended. It's quite literally a loop calling getinput() as quickly as possible. Throw a sleep or wait into that thread and I'd be surprised if any core was over 50%. They farm out the work to all available cores and have one checking for updates. That's the very reason they released that clarification. The CPU side looks extremely well done best I can tell. The hitch is the async likely causing Nvidia difficulties keeping utilization high. At the very least I'd say there is a strong argument GCN is getting fed ideally.
But async is about threading on the GPU and internal utilisation of the GPU's units, not about preparing frames from the CPU side. NV having a diff / inferior async implementation should not affect its CPU overhead.
Isn't Forza 7 using the 11_0 feature set and running mainly on 2 cores only?
If that is the case, then they are doing the port completely incorrectly. It is D3D12 according to Afterburner. Also, DX11 does not inherently mean the game has to be poorly threaded; some of the best threaded engines out there, which scale into the 200 fps range on modern CPUs on PC, are DX11.
I think you got it backwards.
Many cores with lower single-threaded performance will gain more than fewer cores with higher single-threaded performance.
You may want to read the review and contextualise it all. The XB1 is running on 6 Jaguar cores, running at 1.75 GHz.

The Computerbase review is running a 6 core / 12 thread Intel CPU at 4.3 GHz. Not only does this CPU beat Jaguar clock for clock, it has 6 more available threads and is clocked more than 2x higher. It not achieving 2x the CPU performance in a CPU limited scenario on NV hardware (or even on AMD hardware for that matter) is frankly embarrassing for a DX12 engine. A DX12 engine from an MS first party dev.
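
Back-of-the-envelope version of that comparison, using only the figures quoted in this post. It deliberately ignores per-clock (IPC) differences and the extra SMT threads, both of which would only widen the gap, so treat it as a rough lower bound rather than a measurement.

```python
# Figures as quoted in the post; IPC and SMT gains are deliberately ignored.
XB1_CORES, XB1_GHZ = 6, 1.75
PC_CORES, PC_GHZ = 6, 4.3

clock_ratio = PC_GHZ / XB1_GHZ
core_ghz_ratio = (PC_CORES * PC_GHZ) / (XB1_CORES * XB1_GHZ)

print(f"clock ratio: {clock_ratio:.2f}x")
print(f"aggregate core-GHz ratio: {core_ghz_ratio:.2f}x")
```

On clocks alone the desktop CPU comes out at roughly 2.46x.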
 
If that is the case, then they are doing the port completely incorrectly. It is D3D12 according to Afterburner. Also, DX11 does not inherently mean the game has to be poorly threaded; some of the best threaded engines out there, which scale into the 200 fps range on modern CPUs on PC, are DX11.
I didn't say DX11. It's using DX12 but the 11_0 feature set.
 
Lighting is gorgeous, as expected. Damn wish I had a GPU capable of 200% scaling

2160p-200--pcgh.png
 
I wish I had not sold my toy Slave-One at the flea market as a kid, not knowing what they'd be worth in the future. :D
 
Why can Vega reach 350 fps in 720p while the Titan Xp only reaches 240 fps?

A limit in Pascal's front end? It would be nice to see how the gtx 1700 behaves in 720p.

And also, why does Vega fall behind the Titan Xp at 4K? Workload distribution?
 