With all the sh*t storm we are about to see regarding Microsoft Store/WDDM 2.0 - Windows composition engine/VSYNC,
has anyone tested the very latest version (with the recent update applied) of Ashes of the Singularity with both G-Sync and FreeSync?
Going back a while, I thought FreeSync could only operate correctly with exclusive full-screen applications, and I wonder whether G-Sync will also be screwed by the requirements Windows Store DX12 games impose.
I meant to post a while back, but reading Ryan's latest article on PCPer and the one at Guru3D, this is going to throw some spanners into PC game development, let alone Crossfire and SLI support.
http://www.pcper.com/reviews/Genera...up-Ashes-Singularity-DX12-and-Microsoft-Store
I get that bad feeling that Microsoft is creating another palm-to-face moment for the PC gaming environment; I wonder if this debacle is something else Phil Spencer will need to take control of and put back on track, as we saw with the Xbox One project.
Cheers
Nothing substantial. But it was repeatable behaviour, not singular events. Might help to look at it with GPUView or similar. If only I had time... Any ideas on the negative gains for the Fury X, Carsten? The overflow seems awkward, since that should affect all of them, no?
Isn't this kinda expected, and hasn't it been pointed at in this thread already? Running compute and graphics concurrently makes both queues compete for resources, most notably bandwidth. As the amount of work increases, so does the amount of L1/L2 evictions. The HBM interface is clocked far slower than GDDR5 interfaces are and is thus able to service fewer transactions.
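Quick back-of-the-envelope on the eviction angle (my numbers, not anything measured in this thread): taking the 2 MB-per-buffer figure that comes up later for the 64k-particle case (32 bytes per particle) and assuming the commonly quoted 2 MiB L2 on Fiji and 1 MiB on Hawaii/Grenada, the combined graphics + compute working set outgrows the L2 well before 64k particles:

```python
# Back-of-the-envelope: does the combined graphics + compute working set still
# fit in L2 as the particle count grows?  Assumptions (not from the thread):
#   - 32 bytes per particle (matches the 2 MB buffer at 64k particles quoted later)
#   - 2 MiB L2 on Fiji (Fury X), 1 MiB L2 on Hawaii/Grenada (390X)
BYTES_PER_PARTICLE = 32
L2_SIZES = {"Fiji (Fury X)": 2 * 1024**2, "Hawaii (390X)": 1 * 1024**2}

for particles in (16_384, 32_768, 65_536):
    # graphics queue renders from one buffer, compute queue reads another of equal size
    working_set = 2 * particles * BYTES_PER_PARTICLE
    line = f"{particles:>6} particles: working set {working_set / 1024**2:.1f} MiB"
    for gpu, l2 in L2_SIZES.items():
        verdict = "fits" if working_set <= l2 else "spills"
        line += f" | {verdict} vs {gpu} L2 ({l2 // 1024**2} MiB)"
    print(line)
```

So by 64k particles the two buffers together are well past either L2, which at least lines up with the eviction argument; whether that's the actual limiter is a separate question.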
Recommended specs for the DX12 game Gears of War: Ultimate Edition:
Ideal -> Nvidia GeForce GTX 980 Ti / AMD Radeon R9 390X
Recommended -> Nvidia GeForce GTX 970 / AMD Radeon R9 290X
Min -> Nvidia GeForce GTX 650 Ti / AMD Radeon R7 260X
Those are only estimates, but it's still interesting which cards are grouped together by the devs. DX12, savior of AMD?
Given the minimal differences between the R9 390X and R9 290X, I'd rather guess the tiers are based on available video memory (8/6 GiB at the top, 4/3.5+0.5 GiB in the middle and 1/2 GiB at the bottom). But that's just a wild guess.
Sounds sound - I had not thought about the comparatively low clock speed of the HBM.
What you’re watching is the Radeon Fury running the Gears of War: Ultimate Edition benchmark on my capable Intel test bench, at 1440p with High quality settings. These settings include FXAA and Ambient Occlusion. You’re also seeing horrendous hitching and stuttering, and some visual corruption thrown in for good measure, making the game completely unplayable on an excellent $500 graphics card.
AMD’s Radeon Fury X and Radeon 380 also choked when switching quality to High and running at 1440p or higher.
Surely the performance gets even worse as you make your way down the Radeon product stack, right? Oddly enough, no. I tested an Asus Strix R7 370 under the same demanding 4K benchmark, and it turned in only a 13% lower average framerate. Crucially, no stuttering or artifacting was present.
The Radeon R9 390X is just fine, achieving double the framerate of the more expensive Fury and Nano cards at High Quality/4K.
Most of the shit storm in the past 2-3 days springs from all types of folks mixing up the forced composition (i.e. the lack of exclusive full screen) for Microsoft Store apps with the entirely unrelated move from Direct Flip to Immediate Flip that AMD has performed.
Don't forget the last update: GCN3's poor DX12 performance got even worse.
http://www.forbes.com/sites/jasonev...-disaster-for-amd-radeon-gamers/#33c0e9857e7e
GDDR5 has a burst length of 8 per transaction. For the 390X with data at 6 Gb/s, that is 0.75 Gtxn/s per channel.
Across 16 such channels, that is 12 Gtxn/s.
Fiji would be running 32 channels, with 1 Gb/s yielding 0.5 Gtxn/s per channel; with 2x as many channels, that gives 16 Gtxn/s.
Is there a particular restriction in mind, such as an access pattern that does not adequately stripe across all channels, or a GPU/DRAM bottleneck?
There's a stripe over all the channels, so that's still 0.75 Gtxn/s on 44 CUs vs. 0.5 Gtxn/s on 64 CUs. In this particular case, at 64k particles the graphics queue is rendering from a 2 MB buffer and the compute queue is reading from another 2 MB buffer, so that definitely blows past the cache sizes. And since it's an n-body problem, it needs the entire 2 MB buffer for each of the 64k particles.
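For anyone who wants to poke at the numbers, here's the same per-channel arithmetic spelled out in a quick script. The only figure I'm adding is HBM's burst length of 2, which is what turns 1 Gb/s per pin into the 0.5 Gtxn/s per channel above; treat it as a sketch, not gospel:

```python
# Per-channel transaction rates, reproducing the figures above.
#   GDDR5 (390X): 6 Gb/s per pin, burst length 8, 16 x 32-bit channels
#   HBM   (Fiji): 1 Gb/s per pin, burst length 2 (assumed), 32 x 128-bit channels
def gtxn_per_channel(pin_rate_gbps: float, burst_length: int) -> float:
    """Billions of transactions per second a single channel can issue."""
    return pin_rate_gbps / burst_length

gddr5 = gtxn_per_channel(6.0, 8)   # 0.75 Gtxn/s per channel
hbm   = gtxn_per_channel(1.0, 2)   # 0.50 Gtxn/s per channel

print(f"390X: {gddr5:.2f} Gtxn/s/channel x 16 channels = {gddr5 * 16:.0f} Gtxn/s")
print(f"Fiji: {hbm:.2f} Gtxn/s/channel x 32 channels = {hbm * 32:.0f} Gtxn/s")
```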
Using the single-channel figure means the GPU cannot use all channels?
About the FreeSync/V-Sync issue: I think Guru3D explained that AMD has a driver fix for it that will be released soon for Ashes. I wouldn't read too much into this situation right now.
Isn't this kinda expected, and hasn't it been pointed at in this thread already? Running compute and graphics concurrently makes both queues compete for resources, most notably bandwidth. As the amount of work increases, so does the amount of L1/L2 evictions.
This could be the case.
The HBM interface is clocked far slower than GDDR5 interfaces are and is thus able to service fewer transactions.
That doesn't make a lot of sense: ignoring refresh, HBM is theoretically able to saturate the bus (just like GDDR5), so if HBM has higher absolute bandwidth, it can also service more transactions.
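Rough sanity check with the usual public specs (my figures, not from the post): both interfaces move 32 bytes per burst, so transaction throughput scales directly with total bandwidth, and Fiji's 512 GB/s vs the 390X's 384 GB/s works out to more transactions, not fewer:

```python
# Compare peak bandwidth and transaction throughput (public spec figures):
#   R9 390X: 512-bit GDDR5, 6 Gb/s per pin, 32-bit channels, burst length 8
#   Fury X : 4096-bit HBM,  1 Gb/s per pin, 128-bit channels, burst length 2
def peak_bw_gb_s(bus_width_bits: int, pin_rate_gbps: float) -> float:
    return bus_width_bits * pin_rate_gbps / 8   # GB/s

def bytes_per_txn(channel_width_bits: int, burst_length: int) -> int:
    return channel_width_bits * burst_length // 8

for name, bus, rate, chan, bl in [("R9 390X", 512, 6.0, 32, 8),
                                  ("Fury X ", 4096, 1.0, 128, 2)]:
    bw = peak_bw_gb_s(bus, rate)    # 384 GB/s and 512 GB/s
    txn = bytes_per_txn(chan, bl)   # 32 bytes per burst for both
    print(f"{name}: {bw:.0f} GB/s, {txn} B/txn -> {bw / txn:.0f} Gtxn/s peak")
```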