I'm talking about GPU, not CPU.
I think quite a bit of it is that a lot of workloads that use the GPU (browsers in particular) also do quite a bit of CPU work, so balancing power across the whole SoC probably doesn't come out in favour of high GPU frequencies. So the GPU gets used, but unless vendors rethink how they balance power it's never going to work out the way we'd expect over here in GPU land.
Are you guys planning on reviewing the Lumia 950 XL? From my experience using both (950 XL & 6P), the Lumia seems to have the best implementation of the S810 to date in terms of thermal dissipation (don't know about throttling). It also has the best video recording capabilities, with the ability to shoot at 4K/30fps with OIS (always on) + digital OIS (optional). The file bitrate is 52Mbps (!) with stereo audio recorded at 42Hz. It also natively supports playback of HEVC/H.265 videos.
It doesn't look like we're receiving a review sample.
Too bad, would love to see something like dxcapsviewer on a DX12 Qualcomm device. Sorry for being off-topic!
Unfortunately as of right now Windows 10 Mobile only features DX11 support AFAICS on the 950/950XL.
The issue is that I'm currently not aware of any SoC that employs DVFS policies that could even respond to super fine-grained high loads of the kind you'd see in browsers or similar use-cases. Most SoCs out there switch frequency on a ~100ms sampling rate, and GPUs have mostly always had step-wise policies, so it's always going to take a continuous load 200-300ms to trigger the highest frequencies. At that point you'd need user-space optimizations for QoS on the GPU frequency, and AFAIK only Samsung does stuff like that, and even there they never request the highest frequencies.
Changing clocks costs a lot of power, whether it's CPU or GPU. Changing them every 10-20ms is catastrophic for power.
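As a toy illustration of the step-wise ramp-up described above — the frequency ladder and the 100ms sampling period below are made-up illustrative values, not any particular vendor's:

```python
# Toy model of a step-wise GPU DVFS governor: one frequency level
# per sampling period under sustained load. All numbers are
# hypothetical assumptions for illustration only.

FREQ_LADDER_MHZ = [180, 305, 450, 600]   # hypothetical operating points
SAMPLE_PERIOD_MS = 100                   # hypothetical sampling rate

def ms_to_reach_top(start_level=0):
    """Time for a continuous high load to climb the ladder,
    stepping up one level per sampling period."""
    level, elapsed = start_level, 0
    while level < len(FREQ_LADDER_MHZ) - 1:
        elapsed += SAMPLE_PERIOD_MS      # wait for the next sample
        level += 1                       # step-wise: one level per sample
    return elapsed

print(ms_to_reach_top())  # 300 ms from the lowest to the highest level
```

With four levels and 100ms sampling, a sustained load needs 300ms to hit the top frequency — consistent with the 200-300ms figure quoted above, and far longer than the bursty loads a browser generates.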
I think that's highly dependent on the SoC and PMIC platform.
source: my own experiments
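As a back-of-envelope sketch of why frequent switching hurts: if every frequency change burns some fixed energy (PLL relock, regulator ramp), the overhead scales with the switch rate. The 50 µJ per-switch figure below is purely hypothetical; real values depend on the SoC and PMIC.

```python
# Back-of-envelope model of DVFS switch overhead. The per-switch
# energy is a made-up illustrative figure, not a measured value.

E_SWITCH_UJ = 50.0      # hypothetical energy per frequency change, in µJ

def switch_overhead_mw(switch_period_ms):
    """Average power spent purely on clock changes, in mW."""
    switches_per_s = 1000.0 / switch_period_ms
    return switches_per_s * E_SWITCH_UJ / 1000.0  # µJ/s -> mW

print(switch_overhead_mw(10))    # 5.0 mW when toggling every 10 ms
print(switch_overhead_mw(100))   # 0.5 mW at a 100 ms rate
```

Whatever the real per-switch energy is, moving from a 100ms to a 10ms switching cadence multiplies that fixed overhead tenfold.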
Sure, I should say that's what I measured on the 5X/6P (but I seem to recall it's also true on Krait for CPUs).
The 808/810 are hardly representative chipsets and should best be forgotten.
I'm assuming PLLs, as it's certainly not any sort of software DVFS cost. The experiment is really simple: get the system into an unloaded state, switch to the userspace governor, lock a CPU benchmark (Dhrystone or something absurdly simple) onto a single core, run a script on another core that alternates between two different clocks, and measure power/perf.
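A rough sketch of that clock-alternating script. The cpufreq sysfs path is the typical Linux location, but the exact node names and the two frequencies below are illustrative assumptions — valid values vary by kernel and SoC, and writing the node needs root plus the userspace governor active:

```python
import time

# Typical Linux cpufreq node for the userspace governor's target speed;
# the exact path and the valid frequencies depend on the kernel and SoC.
SETSPEED = "/sys/devices/system/cpu/cpu{cpu}/cpufreq/scaling_setspeed"

def toggle_freqs(cpu, freq_a_khz, freq_b_khz, period_s, n_switches,
                 write=None):
    """Alternate one core between two frequencies while the benchmark
    runs pinned to another core; power is measured externally.
    `write` is injectable so the schedule can be dry-run in a test."""
    if write is None:
        path = SETSPEED.format(cpu=cpu)
        def write(khz):
            with open(path, "w") as f:
                f.write("%d\n" % khz)
    applied = []
    for i in range(n_switches):
        khz = freq_a_khz if i % 2 == 0 else freq_b_khz
        write(khz)                # request the next frequency
        applied.append(khz)
        time.sleep(period_s)      # hold it for one period
    return applied

# Dry run: record the schedule instead of touching sysfs.
log = []
toggle_freqs(0, 384000, 960000, period_s=0, n_switches=4, write=log.append)
print(log)  # [384000, 960000, 384000, 960000]
```

Comparing measured power/perf at different toggle periods against a fixed-frequency baseline isolates the energy cost of the switch itself.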
Do you actually mean the power cost of the PLL, the regulator switch, software latency overhead affecting efficiency, or again the result of a too-high sampling rate causing DVFS to over-scale the clocks and needlessly run at lower-efficiency points? If it's the latter, then I would argue it's just an issue of bad scaling logic.
HiSilicon, among others, runs their big cores at a 10ms sampling rate on ondemand, and Samsung does 20ms with a 20ms timer slack on interactive.
Not at this point - the device is 1.5 years old by now. In the Kirin 950 they switched to interactive at least.
Also I honestly can't believe anyone uses ondemand at this point.