For further details about these features, please check the Metal 2 documentation. A11 makes many significant performance improvements to the GPU. It has up to 2X math performance when it comes to tasks for computer vision, image processing, and machine learning. But that is not the only area of improvement for performance. Let us review the improved performance and capabilities of A11 GPU. We doubled F16 math and texture filtering rate per clock cycle when compared to A10 GPU.
Please note: On A11, using F16 data types in your shaders when possible makes a much larger performance difference.
Could you run the GFXBench low-level and high-level offscreen tests with Low Power Mode on? Since the iPhone 6s, Apple SoCs throttle a lot on graphics. For example, the iPhone 7 Plus is stable at only 67% of its initial score in the Manhattan 3.1 long-term benchmark. With Low Power Mode on, the iPhone 7 Plus will produce a stable result from the beginning.
Finally I've bought an iPhone8, so benchmark iteration time will be shorter.
I'm still struggling with stable autodetect. Threads are jumping between cores like crazy.
Repo is updated with WIP commit, just in case.
~6 seconds measurement time should be enough even with high thread count (otherwise zeroes are displayed)
"id" is a core signature based on issue width.
Updated numbers for # of threads (1-6):
2376*
2304 2304
2304 2304 1680
2304 2304 1572 1572
2304 2304 1572 1572 1572
2304 2304 1572 1572 1572 1572
(measured on iPhone8, iOS 11.1)
* Single thread frequency result is significantly affected by measurement loop iteration count (CALIB_REPEAT)
2376 +-0 variability with ~1/2 msec loop
2385 +-5 variability with ~1/16 msec loop
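A minimal sketch of how a frequency figure and its variability could be derived from a timed loop. This is not the code from the repo: it assumes the measurement loop is a chain of dependent single-cycle instructions, so the number of cycles executed is known in advance, and `estimate_mhz` and `summarize` are hypothetical names. How the "+-" figure above was actually computed is not stated; half the min-max range is just one possible reading.

```python
from statistics import mean


def estimate_mhz(cycles: int, elapsed_s: float) -> float:
    """Clock estimate in MHz from a known cycle count and a measured wall time."""
    return cycles / elapsed_s / 1e6


def summarize(samples_mhz):
    """Return (mean, half of the min-max range) for repeated estimates."""
    centre = mean(samples_mhz)
    spread = (max(samples_mhz) - min(samples_mhz)) / 2
    return centre, spread


if __name__ == "__main__":
    # Illustrative inputs only: 1,188,000 cycles timed at 0.5 msec.
    print(estimate_mhz(1_188_000, 0.0005))
    print(summarize([2380.0, 2385.0, 2390.0]))
```

A shorter loop means a shorter timed interval, so timer granularity and scheduling noise take a larger share of each sample. That would be consistent with the larger variability reported for the ~1/16 msec loop.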
Yeah, should be eDP or DP. See the A10X teardown at iFixit. In an Apple TV they just hook up the SoC to an external DP-to-HDMI TX IC.
Bit surprised there is a USB PHY on it as well, with its 5V I/O.
And from the iFixit teardown, there might be a PCIe lane somewhere as well...
From Ashraf Eassa at the Motley Fool:
Ashraf Eassa said: I am told that the A11 CPU has ARM32 capability but it’s disabled. Next core goes totally 64-bit apparently.
Interesting rumour, but hard to verify.
CPU DasherX Benchmark latest update.
iPhone X A11 GPU FP32 looks weaker than A9 GPU but near FP16 A9X GPU
Seems like this is bullshit benchmark.
For example they show 297 GFlops for A11.
With NEON, each A11 monsoon core can theoretically achieve 57 GFlops.
I think they just multiplied the single-core measured result by 6.
The only adequate peak FLOPS benchmarking utility in App Store is vfpbench.
http://dench.flatlib.jp/app/vfpbench
Measured numbers are a bit lower: 51 GFlops single core and 197 GFlops for six cores, with code like
Code:
op q0, q12, q13
op q1, q12, q13
op q2, q12, q13
op q3, q12, q13
op q4, q12, q13
op q5, q12, q13
op q6, q12, q13
op q7, q12, q13
op q8, q12, q13
op q9, q12, q13
op q10, q12, q13
op q11, q12, q13
where op is fmla.4s
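With the 6-core layout the clock frequency of Monsoon is 3% less, so each Monsoon core delivers about 49.5 GFlops instead of 51.
So we have 197 - 2*49.5 = 98 GFlops for all 4 Mistral cores.
98000 (MFlops) / 4 (cores) / 1572 (MHz) = 15.58 Flops per cycle.
That means each Mistral has two 128-bit pipelines (2 pipelines x 4 lanes x 2 Flops per FMA = 16 Flops per cycle). Too good, given its size.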
So, I don't believe A11 GPU FP32 is slower.
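The arithmetic in this post can be re-checked with a short script. It is only a sketch: the measured figures are the ones quoted above, the function names are made up for illustration, and the three 128-bit FMA pipelines for Monsoon are an assumption chosen because they reproduce the 57 GFlops theoretical figure, not something stated in the post.

```python
LANES_FP32 = 4        # a 128-bit NEON register holds four FP32 values
FLOPS_PER_FMA = 2     # a fused multiply-add counts as two Flops


def peak_gflops(mhz: float, pipes: int) -> float:
    """Theoretical FP32 peak for one core with `pipes` 128-bit FMA pipelines."""
    return mhz * pipes * LANES_FP32 * FLOPS_PER_FMA / 1000.0


def mistral_flops_per_cycle(single_monsoon=51.0, six_core_total=197.0,
                            monsoon_clock_drop=0.03, mistral_mhz=1572.0):
    """Back out per-cycle throughput of one Mistral core from the measured totals."""
    monsoon_each = single_monsoon * (1.0 - monsoon_clock_drop)   # about 49.5 GFlops
    mistral_total = six_core_total - 2 * monsoon_each            # about 98 GFlops
    per_core_mflops = mistral_total * 1000.0 / 4                 # four Mistral cores
    return per_core_mflops / mistral_mhz


if __name__ == "__main__":
    print(peak_gflops(2376, pipes=3))   # Monsoon, assuming 3 pipelines
    fpc = mistral_flops_per_cycle()
    print(fpc, fpc / (LANES_FP32 * FLOPS_PER_FMA))   # Flops/cycle, implied pipelines
```

The implied pipeline count comes out just under 2, which matches the two 128-bit pipelines estimated above.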