CLPeak (Compute GFLOPS with OpenCL on Android)

mfaisalkemal

I found an Android benchmark application on GitHub. You can download it from the Play Store.

Description
A synthetic benchmarking tool to measure peak capabilities of OpenCL devices. It only measures the peak metrics that can be achieved using vector operations and does not represent a real-world use case.
OnePlus 3T result from the author's GitHub:
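As a rough illustration of what a peak figure like GFLOPS means, the sketch below turns an operation count and an elapsed time into a throughput number. The function name, the work-item counts, and the convention of counting one multiply-add as two floating-point operations are assumptions made for illustration; they are not taken from the clpeak source.

```python
def gflops(work_items: int, flops_per_item: int, seconds: float) -> float:
    """Throughput in GFLOPS for a kernel that ran `work_items` work-items,
    each doing `flops_per_item` floating-point operations, in `seconds`."""
    if seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return work_items * flops_per_item / seconds / 1e9


# Hypothetical run (numbers invented for the example):
# 1,048,576 work-items, 2048 multiply-adds each, counted as
# 2 FLOPs per multiply-add, finishing in 20 ms.
example = gflops(work_items=1 << 20, flops_per_item=2048 * 2, seconds=0.020)
print(f"{example:.2f} GFLOPS")
```

A peak number obtained this way only says how fast the device can issue arithmetic when nothing else gets in the way, which is why the description warns that it does not represent a real-world use case.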
Platform: QUALCOMM Snapdragon(TM)
Device: QUALCOMM Adreno(TM)
Driver version : OpenCL 2.0 QUALCOMM build: commit #6ff34ae changeid #I0ac3940325 Date: 09/23/16 Fri Local Branch: Remote Branch: refs/tags/AU_LINUX_ANDROID_LA.HB.1.3.2.06.00.01.214.261 Compiler E031.31.00.01 (Android)
Compute units : 4
Clock frequency : 1 MHz

Global memory bandwidth (GBPS)
float : 15.28
float2 : 11.46
float4 : 16.31
float8 : 20.20
float16 : 20.43

Single-precision compute (GFLOPS)
float : 249.85
float2 : 249.99
float4 : 237.66
float8 : 263.18
float16 : 202.15

half-precision compute (GFLOPS)
half : 260.10
half2 : 391.85
half4 : 383.45
half8 : 270.69
half16 : 202.19

No double precision support! Skipped

Integer compute (GIOPS)
int : 52.21
int2 : 55.46
int4 : 71.81
int8 : 69.74
int16 : 69.22

Transfer bandwidth (GBPS)
enqueueWriteBuffer : 18.17
enqueueReadBuffer : 8.72
enqueueMapBuffer(for read) : 2839.08
memcpy from mapped ptr : 8.45
enqueueUnmap(after write) : 816.78
memcpy to mapped ptr : 9.77

Kernel launch latency : 296.08 us

Please post your results here :)
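For anyone who wants to compare the numbers posted in this thread, here is a minimal parsing sketch. It assumes the text layout seen in the results here (a section title ending in a unit in parentheses, then `label : value` lines, with a blank line closing each section). The function name `parse_clpeak` and the `"general"` bucket are made up for this example and are not part of the tool.

```python
import re

# "label : number" with an optional trailing unit such as "us" or "MHz"
ENTRY = re.compile(r"^(.+?)\s*:\s*(\d+(?:\.\d+)?)(?:\s+\S+)?$")
# Section title ending in a unit in parentheses, e.g. "... (GFLOPS)"
HEADER = re.compile(r"^(.+?)\s*\((\w+)\)$")


def parse_clpeak(text: str) -> dict:
    """Parse pasted benchmark text into {section: {label: float}}."""
    results = {}
    section = "general"
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            # A blank line ends the current section.
            section = "general"
            continue
        entry = ENTRY.match(line)
        if entry:
            results.setdefault(section, {})[entry.group(1)] = float(entry.group(2))
        elif ":" not in line and HEADER.match(line):
            section = line
        # Anything else (platform, driver string, notices) is ignored.
    return results


# Excerpt of the OnePlus 3T result quoted above.
sample = """
Compute units : 4
Clock frequency : 1 MHz

Single-precision compute (GFLOPS)
float : 249.85
float8 : 263.18

No double precision support! Skipped

Kernel launch latency : 296.08 us
"""

parsed = parse_clpeak(sample)
best_sp = max(parsed["Single-precision compute (GFLOPS)"].values())
print(best_sp)
```

Values that sit outside any section, such as the compute unit count and the kernel launch latency, end up under `"general"`.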
 
OnePlus 5:

Platform: QUALCOMM Snapdragon(TM)
Device: QUALCOMM Adreno(TM)
Driver version : OpenCL 2.0 QUALCOMM build: commit #8209866 changeid #I528a81912f Date: 04/01/17 Sat Local Branch: Remote Branch: Compiler E031.33.00.03 (Android)
Compute units : 4
Clock frequency : 1 MHz

Global memory bandwidth (GBPS)
float : 17.82
float2 : 19.30
float4 : 19.43
float8 : 20.38
float16 : 20.45

Single-precision compute (GFLOPS)
float : 294.65
float2 : 285.81
float4 : 311.02
float8 : 265.02
float16 : 308.34

half-precision compute (GFLOPS)
half : 570.72
half2 : 539.62
half4 : 610.79
half8 : 314.82
half16 : 313.73

No double precision support! Skipped

Integer compute (GIOPS)
int : 65.98
int2 : 68.05
int4 : 80.68
int8 : 79.25
int16 : 77.79

Transfer bandwidth (GBPS)
enqueueWriteBuffer : 8.71
enqueueReadBuffer : 8.98
enqueueMapBuffer(for read) : 3228.33
memcpy from mapped ptr : 8.99
enqueueUnmap(after write) : 1016.22
memcpy to mapped ptr : 8.97

Kernel launch latency : 214.91 us
 
ZTE Axon 7 with Snapdragon 820:

Platform: QUALCOMM Snapdragon(TM)
Device: QUALCOMM Adreno(TM)
Driver version : OpenCL 2.0 QUALCOMM build: commit #7f9221e changeid #I45b30eba69 Date: 01/24/17 Tue Local Branch: Remote Branch: Compiler E031.31.00.03 (Android)
Compute units : 4
Clock frequency : 1 MHz

Global memory bandwidth (GBPS)
float : 13.43
float2 : 10.43
float4 : 14.39
float8 : 17.86
float16 : 18.08

Single-precision compute (GFLOPS)
float : 236.04
float2 : 236.81
float4 : 226.37
float8 : 250.93
float16 : 192.25

half-precision compute (GFLOPS)
half : 246.33
half2 : 370.75
half4 : 365.02
half8 : 259.32
half16 : 191.22

No double precision support! Skipped

Integer compute (GIOPS)
int : 49.29
int2 : 52.46
int4 : 68.24
int8 : 66.10
int16 : 65.75

Transfer bandwidth (GBPS)
enqueueWriteBuffer : 15.51
enqueueReadBuffer : 7.98
enqueueMapBuffer(for read) : 547.27
memcpy from mapped ptr : 8.46
enqueueUnmap(after write) : 741.74
memcpy to mapped ptr : 8.73

Kernel launch latency : 362.67 us
 
Xiaomi Redmi 4X with Snapdragon 435:

Platform: QUALCOMM Snapdragon(TM)
Device: QUALCOMM Adreno(TM)
Driver version : OpenCL 2.0 QUALCOMM build: commit #710c145 changeid #Iebe23be877 Date: 08/23/16 Tue Local Branch: Remote Branch: Compiler E031.29.02.03 (Android)
Compute units : 1
Clock frequency : 1 MHz

Global memory bandwidth (GBPS)
float : 2.65
float2 : 2.27
float4 : 3.30
float8 : 3.17
float16 : 2.58

Single-precision compute (GFLOPS)
float : 14.04
float2 : 22.32
float4 : 20.35
float8 : 22.00
float16 : 19.89

half-precision compute (GFLOPS)
half : 26.25
half2 : 37.41
half4 : 38.15
half8 : 22.04
half16 : 20.00

No double precision support! Skipped

Integer compute (GIOPS)
int : 5.09
int2 : 5.39
int4 : 3.95
int8 : 3.82
int16 : 3.62

Transfer bandwidth (GBPS)
enqueueWriteBuffer : 5.78
enqueueReadBuffer : 1.87
enqueueMapBuffer(for read) : 101.15
memcpy from mapped ptr : 1.81
enqueueUnmap(after write) : 104.83
memcpy to mapped ptr : 2.01

Kernel launch latency : 222.19 us
 
Xiaomi Redmi Note 3 Pro with Snapdragon 650:

Platform: QUALCOMM Snapdragon(TM)
Device: QUALCOMM Adreno(TM)
Driver version : OpenCL 2.0 QUALCOMM build: commit #a7823f5 changeid #I59a6815413 Date: 09/23/16 Fri Local Branch: mybranch22028469 Remote Branch: quic/LA.BR.1.3.3_rb2.26 Compiler E031.29.00.01 (Android)
Compute units : 2
Clock frequency : 1 MHz

Global memory bandwidth (GBPS)
float : 7.06
float2 : 4.38
float4 : 6.53
float8 : 9.09
float16 : 8.90

Single-precision compute (GFLOPS)
float : 39.75
float2 : 42.01
float4 : 49.62
float8 : 55.32
float16 : 52.29

half-precision compute (GFLOPS)
half : 75.18
half2 : 111.50
half4 : 102.55
half8 : 58.87
half16 : 52.57

No double precision support! Skipped

Integer compute (GIOPS)
int : 12.00
int2 : 13.70
int4 : 10.34
int8 : 10.38
int16 : 9.83

Transfer bandwidth (GBPS)
enqueueWriteBuffer : 6.42
enqueueReadBuffer : 2.12
enqueueMapBuffer(for read) : 239.08
memcpy from mapped ptr : 2.13
enqueueUnmap(after write) : 461.07
memcpy to mapped ptr : 2.16

Kernel launch latency : 186.45 us
 