Android benchmark. Looking for testers

codedivine

The current tests for FP performance benchmarking on Android (such as Linpack for Android) are IMO not very accurate. Many such tests are written using Java, and are not properly tuned to use the hardware.

Thus, I have implemented my own small benchmark using the NDK. I have tested it on a Snapdragon S3 based device already, and need testers for other systems. The test is in its early stages and I am still experimenting. It is currently single-threaded, but I will be adding multi-threaded modes as well as more comprehensive tests.

If you are interested, let me know and I will provide you with a self-signed APK. It will run on any device running ICS or higher. If you are worried about the security of a non-market APK, note that the APK does not need internet access or SD card access. In fact, it does not ask for any special permissions at all. All you get is an application that has a "Run" button. You run it, and it shows the result on screen.
 
Well, I already got enough testers on another forum within a few minutes :oops:.
So testing is closed for now. Thanks for reading!
 
As a reference point, I got about 1750 megaflops on my Snapdragon S3 dual-core based device. Let me know if you have any questions. :D
 
HTC One X International Tegra 3 AP33 ICS, 4 threads:
3376.0 MFlops

Asus Transformer Pad TF300TG Tegra 3 T30L ICS, 4 threads:
3374.0 MFlops
 
so we can see why the compiler sucks?

Why are you assuming the compiler sucks? Were you expecting to reach 100% theoretical peak? The test is not doing only ALU computations. From my webpage:

It performs a fp64 matrix multiply (hence MM). It is a fully multi-threaded benchmark written using the NDK in C++, and performs a tiled matrix multiply with multiple tile sizes and reports the best performance.

You are not going to get peak. And the search space I am searching over is also not the best one (yet). My benchmark is not totally ideal and I know that. It was meant as a quick hack that is still a substantially better indicator than the current benches used by the blogosphere, such as "Linpack for Android".
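For readers unfamiliar with the approach described above, here is a minimal single-threaded sketch of a tiled fp64 matrix multiply that tries several tile sizes and keeps the best rate. It is not the benchmark's actual code (the source has not been published); the names `tiled_matmul` and `best_mflops`, the loop order, and the use of 2·n³ as the operation count are all assumptions made for illustration. It blocks for cache only and does no register-level tiling.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical helper: C += A * B for row-major n x n matrices, processed in
// tile x tile blocks so each block's working set is reused while it is hot.
void tiled_matmul(const std::vector<double>& a, const std::vector<double>& b,
                  std::vector<double>& c, std::size_t n, std::size_t tile) {
    assert(tile > 0);  // a zero tile size would never advance the loops
    for (std::size_t ii = 0; ii < n; ii += tile)
        for (std::size_t kk = 0; kk < n; kk += tile)
            for (std::size_t jj = 0; jj < n; jj += tile) {
                // Clamp block bounds so n need not be a multiple of tile.
                const std::size_t i_end = std::min(ii + tile, n);
                const std::size_t k_end = std::min(kk + tile, n);
                const std::size_t j_end = std::min(jj + tile, n);
                for (std::size_t i = ii; i < i_end; ++i)
                    for (std::size_t k = kk; k < k_end; ++k) {
                        const double aik = a[i * n + k];
                        for (std::size_t j = jj; j < j_end; ++j)
                            c[i * n + j] += aik * b[k * n + j];
                    }
            }
}

// Hypothetical helper: time one multiply per candidate tile size and return
// the best rate in MFLOPS, counting 2*n^3 operations per multiply.
double best_mflops(std::size_t n, const std::vector<std::size_t>& tiles) {
    const std::vector<double> a(n * n, 1.0), b(n * n, 2.0);
    double best = 0.0;
    for (std::size_t tile : tiles) {
        std::vector<double> c(n * n, 0.0);
        const auto t0 = std::chrono::steady_clock::now();
        tiled_matmul(a, b, c, n, tile);
        const auto t1 = std::chrono::steady_clock::now();
        volatile double sink = c[0];  // keep the result observable
        (void)sink;
        const double secs = std::chrono::duration<double>(t1 - t0).count();
        if (secs > 0.0) {
            const double flops = 2.0 * static_cast<double>(n) * n * n;
            best = std::max(best, flops / secs / 1e6);
        }
    }
    return best;
}
```

Calling `best_mflops(512, {8, 16, 32, 64})` would sweep four tile sizes on a 512 x 512 problem. A real benchmark would likely repeat each measurement and split the outer loop across threads.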
 
I'm not expecting you to reach peak, but a well written algorithm with a half decent compiler should be able to reach pretty damn close in matrix multiplication on a CPU. Far closer than you're reaching. Cortex-A9 should be able to issue a 64-bit FMAC in 2 cycles (and 64-bit FADD in 1 cycle), and you should have enough registers to cover its latency for a matrix multiplication kernel. Particularly if you don't compile it for the 16 register variant. No idea about Snapdragon, but I wasn't under the impression that it was any worse.

Nonetheless, why not post the source and disassembly listing so we can decide for ourselves how the compiler's doing (and how you're doing)? No one else who makes benchmarks does such a thing. Are you really going to deprive us of this rare opportunity?
 
A9 issues DP instructions every other cycle, including fadd.

EDIT : This was wrong, double fadd only needs one cycle. The TRM is correct.
 
Thanks Exophase and Laurent. I am looking into it. I am now testing a better version which is already giving more than 2x the performance of v1.1 reported here.
The issue was that I was not tiling properly for optimal usage of the register file.

Re source code: I do aim to push the source code out at some point, but not for now.
 
TF201 Tegra3 1.3 Ghz 4 threads - 1881 Mflops
TF201 Tegra3 1.6 Ghz 4 threads - 2250 Mflops
GSIII Exynos 4412 1.4 Ghz 4 threads - 2189 Mflops
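
To set these figures against the issue rate quoted earlier in the thread (one 64-bit FMAC every 2 cycles on Cortex-A9), here is a rough back-of-the-envelope sketch. It assumes each FMAC counts as 2 floating-point operations and that all four cores sustain that rate at the stated clock. Both are simplifications, and the function names are hypothetical.

```cpp
// Assumption from the thread: one fp64 FMAC issues every `cycles_per_fmac`
// cycles per core, and each FMAC counts as 2 floating-point operations.
double peak_mflops(double clock_mhz, int cores, double cycles_per_fmac = 2.0) {
    const double flops_per_cycle = 2.0 / cycles_per_fmac;
    return clock_mhz * flops_per_cycle * cores;
}

// Fraction of the estimated peak that a measured result achieves.
double fraction_of_peak(double measured_mflops, double peak) {
    return measured_mflops / peak;
}
```

Under those assumptions, `peak_mflops(1300.0, 4)` gives 5200 MFlops, so the 1881 MFlops TF201 result at 1.3 GHz would be roughly 36% of the estimate.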
 
How hard is it to run this? Wanted to try making my mother run it on her fairly cheap device, but she didn't get it. Something like seeing 1,2,4 and a green button next to one, that didn't seem to do anything ...
 
TF201 Tegra3 1.3 Ghz 4 threads - 1881 Mflops
TF201 Tegra3 1.6 Ghz 4 threads - 2250 Mflops
GSIII Exynos 4412 1.4 Ghz 4 threads - 2189 Mflops

Thanks :D

How hard is it to run this? Wanted to try making my mother run it on her fairly cheap device, but she didn't get it. Something like seeing 1,2,4 and a green button next to one, that didn't seem to do anything ...

Well, the "1, 2, 4" is a dialog for choosing the number of threads. I was having trouble correctly detecting the number of supported threads on every phone, so I made it a user dialog.

After you select the number of threads, you press the "Run" button. These are the only two user inputs.
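
On the thread-count detection mentioned above, one standard C++ option is `std::thread::hardware_concurrency()`, which is allowed to return 0 when the count cannot be determined and, on some devices, may reflect only the cores that are currently online. The sketch below falls back to a default in that case and builds a power-of-two menu like the "1, 2, 4" dialog. It is not the app's code; `detect_threads` and `thread_choices` are hypothetical names, and the power-of-two menu is an assumption.

```cpp
#include <thread>
#include <vector>

// Hypothetical helper: detected hardware thread count, or `fallback` when the
// runtime reports 0 (meaning "unknown" per the C++ standard).
unsigned detect_threads(unsigned fallback = 1) {
    const unsigned n = std::thread::hardware_concurrency();
    return n != 0 ? n : fallback;
}

// Hypothetical helper: power-of-two choices (1, 2, 4, ...) up to max_threads.
std::vector<unsigned> thread_choices(unsigned max_threads) {
    std::vector<unsigned> choices;
    for (unsigned c = 1; c <= max_threads; c *= 2) {
        choices.push_back(c);
        if (c > max_threads / 2) break;  // stop before c *= 2 could overflow
    }
    return choices;
}
```

For example, `thread_choices(detect_threads())` on a quad-core device that reports all four cores would yield 1, 2 and 4. Keeping a manual override, as the app does, still seems sensible given how unreliable detection can be.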

However, it does take a while to run. On low-end phones, you should wait for, say, 10 minutes before it produces a result. Unfortunately, I currently don't display a progress bar, only a message saying "Running" for the entire run. So that's probably why your mom thought the app was not doing anything after that.

Also, make sure the phone supports the ARMv7 ISA. The benchmark does not support, say, ARM11 processors (well, I can compile for them, but there were a few bugs on a few phones, related to loading the correct ISA code, that I wanted to avoid). The app might just crash or force close if you try to run it on ARM11.
 