Nvidia's 3000 Series RTX GPU [3090s with different memory capacity]

@sir doris If you go back to the BFV video with the old i7, the guy “fixed” his performance regression with the 1660 Ti by overclocking his RAM from 1866 to 2133 and tightening the primary and secondary timings. Maybe he just saved enough clock cycles by lowering latency to give his GPU more room to run, or maybe Nvidia's drivers themselves are actually sensitive to cache and memory latency.
 
@sir doris Igor also doesn’t measure CPU usage in any way, and instead tries to infer performance. He also guesses that it has something to do with how Nvidia's drivers manage asynchronous compute, but I’m not really clear on how he got there without actually doing some kind of profiling. I’m not sure BFV's DX11 path can even do asynchronous compute, yet the issue is visible there.
 
Note the CPU use


At Epic settings, with the GPUs pushed to 100%, CPU use on the Nvidia setup is always higher. At low competitive settings, CPU use on the Nvidia setup is again always higher.

A Far Cry Primal comparison between the Fury X and the 980 Ti at the other end of the spectrum: 4K resolution with high settings, a GPU-bound scenario (<50 fps on both), and the Fury X shows higher fps and lower CPU usage at the same time.

 
@gamervivek The more I look for tests, the more I find ones that show AMD using less CPU for about the same performance, even if only slightly.

This Far Cry 5 test uses an 8700K with a 1080 and a Vega 64. Vega is a little bit faster and uses just a tiny bit less CPU in this test.

Here's Rage 2 running a little bit better on AMD with about the same CPU usage. I guess this is the Avalanche engine? Whatever the Just Cause team uses.
 
@sir doris Igor also doesn’t measure CPU usage in any way, and instead tries to infer performance. He also guesses that it has something to do with how Nvidia's drivers manage asynchronous compute, but I’m not really clear on how he got there without actually doing some kind of profiling. I’m not sure BFV's DX11 path can even do asynchronous compute, yet the issue is visible there.

In fairness to Igor, he admits he doesn't really have the time at the moment to investigate this further with the detail that's required, so he is basically just throwing out suppositions at this point.

It is interesting, though, that he's observing actual noticeable artifacts as a result of this (albeit in a manufactured situation). Again, not the hard data we want, but at least in his brief experience this is producing behaviour that affects the presentation beyond just dropped frames and reduced CPU headroom:

Whether it was Horizon Zero Dawn or Watch Dogs Legion, whenever the FPS dropped on the GeForce (especially in the measurements with only 2 cores), the slow pop-in of content, delayed loading of textures, or errors with lighting and shadows were less severe on the Radeon than on the GeForce. This is also an indicator that the pipeline was simply stalling (bubbles) and that the multi-threading on the GPU side was not really optimal. This is supported by the fact that the percentage gaps between the two cards stay the same when increasing the core count and reducing the CPU limit (see page two). I see the problem less with the CPU and more with how the pipelines are processed on the GPU. A limiting CPU only makes the behaviour more obvious, but it is not the real cause.
 
@gamervivek The more I look for tests, the more I find ones that show AMD using less CPU for about the same performance, even if only slightly.

Here's Rage 2 running a little bit better on AMD with about the same CPU usage. I guess this is the Avalanche engine? Whatever the Just Cause team uses.
It's a little bit better in overall fps in Rage, but what stood out to me in particular was the consistency of the frametime graph in ms: the Radeon is noticeably more stable. There are no massive drops on the GeForce, but it's overall a far more jagged line.
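
For anyone wanting to put a number on that kind of jaggedness instead of eyeballing the graph, a rough sketch like this works on a PresentMon-style frametime CSV. The column name and file names are assumptions; adjust them to whatever your capture tool actually writes out.

Code:
# Rough sketch: quantify frametime consistency from a PresentMon-style CSV.
# The "MsBetweenPresents" column and the file names are assumptions.
import csv
import statistics

def frametime_stats(path, column="MsBetweenPresents"):
    with open(path, newline="") as f:
        frametimes = [float(row[column]) for row in csv.DictReader(f)]
    frametimes.sort()
    return {
        "avg_ms": statistics.mean(frametimes),
        "stdev_ms": statistics.pstdev(frametimes),  # lower = smoother line
        "p99_ms": frametimes[int(0.99 * (len(frametimes) - 1))],  # worst 1% of frames
    }

print(frametime_stats("radeon_rage2.csv"))   # hypothetical capture files
print(frametime_stats("geforce_rage2.csv"))

Standard deviation and the 99th percentile capture exactly the "jagged vs. stable line" difference even when the average fps looks similar.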
 
In other news, NVIDIA broke their own Ethereum limiter on the GeForce RTX 3060 with the new GeForce 470.05 beta drivers.
The drivers are available from NVIDIA's developer site (registration required) and apparently also via Windows Update for Windows Insider users.

https://www.computerbase.de/2021-03/mining-bremse-geforce-rtx-3060-umgehbar-eth-hashrate/
https://www.hardwareluxx.de/index.p...e-der-geforce-rtx-3060-umgangen-2-update.html
https://pc.watch.impress.co.jp/docs/news/yajiuma/1312085.html
https://videocardz.com/newz/pc-watch-geforce-rtx-3060-ethereum-mining-restrictions-have-been-broken
 
In other news, NVIDIA broke their own Ethereum limiter on the GeForce RTX 3060 with the new GeForce 470.05 beta drivers.
The drivers are available from NVIDIA's developer site (registration required) and apparently also via Windows Update for Windows Insider users.

https://www.computerbase.de/2021-03/mining-bremse-geforce-rtx-3060-umgehbar-eth-hashrate/
https://www.hardwareluxx.de/index.p...e-der-geforce-rtx-3060-umgangen-2-update.html
https://pc.watch.impress.co.jp/docs/news/yajiuma/1312085.html
https://videocardz.com/newz/pc-watch-geforce-rtx-3060-ethereum-mining-restrictions-have-been-broken
I'm not sure that wasn't an intentional change:

 
Tried Control - DX11 is faster than DX12, and it has better multi-threading too... like WoW.
I guess developers either ignore nVidia's DX11 driver and live with a slower DX12 path, or they go the "max out every core" route to "optimize" the DX12 path.
I recently played through Control on a GTX 1080 in DirectX 12 mode because the level load times were much faster there. Rendering performance didn't seem noticeably different in either mode.

Though I've read that the texture LOD changes might be faster in DirectX 11; in DirectX 12 you see low-resolution textures being swapped in quite often.
 
I recently played through Control on a GTX 1080 in DirectX 12 mode because the level load times were much faster there. Rendering performance didn't seem noticeably different in either mode.

Though I've read that the texture LOD changes might be faster in DirectX 11; in DirectX 12 you see low-resolution textures being swapped in quite often.

Texture pop-in was definitely lessened on my GTX 1660 in DX11 compared to DX12, but there's actually a mod which fixes the texture pop-in completely in that mode:

https://community.pcgamingwiki.com/files/file/2035-control-blurry-textures-fix/

I never see texture pop-in with it installed, even when running from an HDD. It's even less than on a PS5.

DX12, though, has always been a disaster for me in this regard; the mod does nothing for it.
 
PCGH tests CPU-limited performance.

They conclude that, when focusing on low-level API games, AMD is noticeably faster.
Three Vulkan games there show a different result though: the 3090 is faster than the 6900 XT.
That's 3 out of 11 of the "low-level API" games. NV is generally faster in DX11 too.
The frametime graphs aren't that different either, with AC Valhalla being the only game noticeably better on the 6900 XT.
Again, it's not clear what the issue is here, or whether it's even CPU-related rather than API-related or app-specific.
 
Three Vulkan games there show a different result though: the 3090 is faster than the 6900 XT.
That's 3 out of 11 of the "low-level API" games. NV is generally faster in DX11 too.
The frametime graphs aren't that different either, with AC Valhalla being the only game noticeably better on the 6900 XT.
Again, it's not clear what the issue is here, or whether it's even CPU-related rather than API-related or app-specific.
I do agree that it's clear we are seeing symptoms of something, but the tests so far haven't been sufficient to pin down the cause.
We are still very much in the diagnosis stage; I hope to see more tests and information released.
 
I would still like to see a straightforward CPU utilization plot with equalized performance (in-game frame limiter). It should be pretty easy to see which uses more CPU time.

It could easily be something that's only a consideration in DX12, for example.
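
Something along these lines would already give a first-order answer. It's only a sketch assuming psutil is installed, with a hypothetical process name; the point is that both cards run with the same in-game frame cap, so the CPU time per frame becomes directly comparable.

Code:
# Rough sketch: log a game's process CPU time while it runs at a fixed,
# frame-limited fps, so driver CPU cost per frame can be compared across GPUs.
# Requires psutil; the process name, cap and duration are hypothetical.
import time
import psutil

GAME_EXE = "bfv.exe"    # hypothetical process name
FPS_CAP = 60            # set the same in-game frame limiter on both setups
DURATION_S = 120        # capture window

proc = next(p for p in psutil.process_iter(["name"])
            if p.info["name"] and p.info["name"].lower() == GAME_EXE)

start_cpu = sum(proc.cpu_times()[:2])   # user + system seconds
start_wall = time.time()
time.sleep(DURATION_S)
used_cpu = sum(proc.cpu_times()[:2]) - start_cpu
elapsed = time.time() - start_wall

frames = FPS_CAP * elapsed              # only valid if the cap is actually held
print(f"CPU time: {used_cpu:.1f}s over {elapsed:.1f}s "
      f"({100 * used_cpu / elapsed:.1f}% of one core)")
print(f"~{1000 * used_cpu / frames:.2f} ms of CPU time per frame at {FPS_CAP} fps")

It measures the game process as a whole, which conveniently includes the user-mode driver threads since those live inside the game's process (kernel-side work won't all show up here, though).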
 
I would still like to see a straightforward CPU utilization plot with equalized performance (in-game frame limiter). It should be pretty easy to see which uses more CPU time.

It could easily be something that's only a consideration in DX12, for example.
It needs to be more than that, I think. We should think about how to profile the CPU time per frame to see where it is being spent on different systems and in different games.
 
It needs to be more than that, I think. We should think about how to profile the CPU time per frame to see where it is being spent on different systems and in different games.

I have zero expectation of any reviewer touching a profiler, and profiling drivers may actually be difficult. Windows Performance Toolkit, maybe?
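
If anyone does want to try, the capture itself is easy to script; here's a rough sketch around WPR. The profile name and flags are my assumptions about the stock tool (check wpr -profiles locally), it needs an elevated prompt, and the resulting .etl opens in Windows Performance Analyzer, where CPU samples can be grouped by module to separate the game's own code from the vendor's user-mode driver DLLs.

Code:
# Rough sketch: wrap a benchmark run in a Windows Performance Recorder capture.
# Profile name and flags are assumptions about the stock wpr tool; run elevated.
import subprocess
import time

CAPTURE_SECONDS = 60

subprocess.run(["wpr", "-start", "GeneralProfile", "-filemode"], check=True)
try:
    time.sleep(CAPTURE_SECONDS)   # run the benchmark scene during this window
finally:
    subprocess.run(["wpr", "-stop", "driver_overhead.etl"], check=True)

print("Open driver_overhead.etl in Windows Performance Analyzer (wpa.exe)")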
 
I do agree that its clear we are seeing symptoms of something, but the tests have not yet been sufficient to lead to a conclusion of the cause quite yet.
We are still very much still in the diagnosis stage, hope to see more tests and information released.

It has been visible since the first DX12 games: DX12 on nVidia hardware doesn't work right. For example, Hitman 1 in DX12 is slower than DX11 on my 4C/8T + 2060 notebook in a CPU-limited scenario. There is software (API) overhead involved in getting nVidia GPUs running, and without proper multi-threading the nVidia DX11 driver is just superior to DX12...
 