| # Bionic Benchmarks |
| |
| [TOC] |
| |
| ## libc benchmarks (bionic-benchmarks) |
| |
| `bionic-benchmarks` is a command line tool for measuring the runtimes of libc functions. It is built |
| on top of [Google Benchmark](https://github.com/google/benchmark) with some additions to organize |
| tests into suites. |
| |
| ### Device benchmarks |
| |
| $ mmma bionic/benchmarks |
| $ adb root |
| $ adb sync data |
| $ adb shell /data/benchmarktest/bionic-benchmarks/bionic-benchmarks |
| $ adb shell /data/benchmarktest64/bionic-benchmarks/bionic-benchmarks |
| |
| By default, `bionic-benchmarks` runs all of the benchmarks in alphabetical order. Pass |
| `--benchmark_filter=getpid` to run just the benchmarks with "getpid" in their name. |
| |
| Note that we also build _static_ benchmark binaries. |
| They're useful for testing on devices running different versions of Android, or running non-Android OSes. |
| Those binaries are called `bionic-benchmarks-static` instead. |
| Copy from `out/target/product/<device>/symbols/data/benchmarktest64/bionic-benchmarks-static` instead of |
| `out/target/product/<device>/data/benchmarktest64/bionic-benchmarks-static` if you want symbols for perf(1). |
| |
| ### Host benchmarks |
| |
| See the `benchmarks/run-on-host.sh` script. The host benchmarks can be run with 32-bit or 64-bit |
| Bionic, or the host glibc. |
| |
| ### XML suites |
| |
| Suites are stored in the `suites/` directory and can be chosen with the command line flag |
| `--bionic_xml`. |
| |
| To choose a specific XML file, use the `--bionic_xml=FILE.XML` option. By default, this option |
| searches for the XML file in the `suites/` directory. If the file doesn't exist in that directory, |
| it is resolved relative to the current directory. If the option specifies the full path to an |
| XML file, such as `/data/nativetest/suites/example.xml`, that path is used as-is. |
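The lookup order described above can be sketched as a small shell function. This is only an illustration of the documented behavior, not the tool's own code: the function name `resolve_suite` and its second argument (the location of the `suites/` directory) are hypothetical.

```shell
# Sketch of the documented --bionic_xml lookup order (hypothetical helper).
#   $1: value given to --bionic_xml
#   $2: location of the suites/ directory (an assumption for this sketch)
resolve_suite() {
  xml="$1"
  suites_dir="$2"
  case "$xml" in
    /*)
      # A full path is used as-is.
      printf '%s\n' "$xml"
      ;;
    *)
      if [ -e "$suites_dir/$xml" ]; then
        # Found in the suites/ directory.
        printf '%s\n' "$suites_dir/$xml"
      else
        # Otherwise fall back to the current directory.
        printf '%s\n' "$PWD/$xml"
      fi
      ;;
  esac
}
```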
| |
| If no XML file is specified through the command-line option, the default is to use `suites/full.xml`. |
| However, for the host benchmarks built against glibc (`bionic-benchmarks-glibc`), the default is |
| to use `suites/host.xml`. |
| |
| ### XML suite format |
| |
| The format for a benchmark is: |
| |
| ``` |
| <fn> |
| <name>BM_sample_benchmark</name> |
| <cpu><optional_cpu_to_lock></cpu> |
| <iterations><optional_iterations_to_run></iterations> |
| <args><space separated list of function args|shorthand></args> |
| </fn> |
| ``` |
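As an illustration of the format, the sketch below writes a small suite containing two entries. The benchmark names and arguments are the ones used as examples elsewhere in this document; the file name `my_suite.xml` is hypothetical, and the assumption that a suite file is simply a sequence of `<fn>` elements should be checked against the files in the `suites/` directory. The optional `<cpu>` and `<iterations>` elements are left out.

```shell
# Write a hypothetical two-entry suite file into the current directory.
# Assumption: a suite is a plain sequence of <fn> elements.
cat > my_suite.xml <<'EOF'
<fn>
  <name>BM_string_memcpy</name>
  <args>AT_COMMON_SIZES</args>
</fn>
<fn>
  <name>BM_string_memcmp</name>
  <args>32 8 8</args>
</fn>
EOF
```

Such a file could then be copied to the device and selected by its full path, e.g. `--bionic_xml=/data/local/tmp/my_suite.xml` (the destination path is only an example).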
| |
| XML-specified values for iterations and cpu take precedence over those specified on the command |
| line via `--bionic_iterations` and `--bionic_cpu`, respectively. |
| |
| To make small changes in runs, you can also schedule benchmarks by passing in their name and a |
| space-separated list of arguments via the `--bionic_extra` command line flag, e.g. |
| `--bionic_extra="BM_string_memcpy AT_COMMON_SIZES"` or `--bionic_extra="BM_string_memcmp 32 8 8"`. |
| |
| Note that a benchmark will run normally if more arguments than it needs are passed in, but it will |
| fail with a segfault if too few are passed in. |
| |
| ### Shorthand |
| |
| For the sake of brevity, multiple runs can be scheduled in one XML element by putting one of the |
| following in the args field: |
| |
| NUM_PROPS |
| MATH_COMMON |
| AT_ALIGNED_<ONE|TWO>BUF |
| AT_<any power of two between 2 and 16384>_ALIGNED_<ONE|TWO>BUF |
| AT_COMMON_SIZES |
| |
| Definitions for these can be found in `bionic_benchmarks.cpp`, and example usages can be found in |
| the `suites/` directory. |
| |
| ### Unit Tests |
| |
| `bionic-benchmarks` also has its own set of unit tests, which can be run from the binary in |
| `/data/nativetest[64]/bionic-benchmarks-tests`. |
| |
| ## Process startup time (bionic-spawn-benchmarks) |
| |
| The `spawn/` subdirectory has a few benchmarks measuring the time used to start simple programs |
| (e.g. Toybox's `true` and `sh -c true`). Run it on a device like so: |
| |
| m bionic-spawn-benchmarks |
| adb root |
| adb sync data |
| adb shell /data/benchmarktest/bionic-spawn-benchmarks/bionic-spawn-benchmarks |
| adb shell /data/benchmarktest64/bionic-spawn-benchmarks/bionic-spawn-benchmarks |
| |
| Google Benchmark reports both a real-time figure ("Time") and a CPU usage figure. For these |
| benchmarks, the CPU measurement only counts time spent in the thread calling `posix_spawn`, not that |
| spent in the spawned process. The real-time figure is probably more useful, and it is the figure used to |
| determine the iteration count. |
| |
| Locking the CPU frequency seems to improve the results of these benchmarks significantly, and it |
| reduces variability. |
| |
| ## Google Benchmark notes |
| |
| ### Repetitions |
| |
| Google Benchmark uses two settings to control how many times to run each benchmark, "iterations" and |
| "repetitions". By default, the repetition count is one. Google Benchmark runs the benchmark a few |
| times to determine a sufficiently large iteration count. |
| |
| Google Benchmark can optionally repeat a benchmark run and report statistics (median, mean, |
| standard deviation) for the runs. To do so, pass the `--benchmark_repetitions` option, e.g.: |
| |
| # ./bionic-benchmarks --benchmark_filter=BM_stdlib_strtoll --benchmark_repetitions=4 |
| ... |
| ------------------------------------------------------------------- |
| Benchmark Time CPU Iterations |
| ------------------------------------------------------------------- |
| BM_stdlib_strtoll 27.7 ns 27.7 ns 25290525 |
| BM_stdlib_strtoll 27.7 ns 27.7 ns 25290525 |
| BM_stdlib_strtoll 27.7 ns 27.7 ns 25290525 |
| BM_stdlib_strtoll 27.8 ns 27.7 ns 25290525 |
| BM_stdlib_strtoll_mean 27.7 ns 27.7 ns 4 |
| BM_stdlib_strtoll_median 27.7 ns 27.7 ns 4 |
| BM_stdlib_strtoll_stddev 0.023 ns 0.023 ns 4 |
| |
| There are 4 runs, each with 25290525 iterations. Measurements for the individual runs can be |
| suppressed if they aren't needed: |
| |
| # ./bionic-benchmarks --benchmark_filter=BM_stdlib_strtoll --benchmark_repetitions=4 --benchmark_report_aggregates_only |
| ... |
| ------------------------------------------------------------------- |
| Benchmark Time CPU Iterations |
| ------------------------------------------------------------------- |
| BM_stdlib_strtoll_mean 27.8 ns 27.7 ns 4 |
| BM_stdlib_strtoll_median 27.7 ns 27.7 ns 4 |
| BM_stdlib_strtoll_stddev 0.043 ns 0.043 ns 4 |
| |
| ### CPU frequencies |
| |
| To get consistent results between runs, it can sometimes be helpful to restrict a benchmark to |
| specific cores, or to lock cores at specific frequencies. Some phones have a big.LITTLE core setup, |
| or at least allow some cores to run at higher frequencies than others. |
| |
| A core can be selected for `bionic-benchmarks` using the `--bionic_cpu` option or using the |
| `taskset` utility. For example, a Pixel 3 device has 4 Kryo 385 Silver cores followed by 4 Gold cores: |
| |
| blueline:/ # /data/benchmarktest64/bionic-benchmarks/bionic-benchmarks --benchmark_filter=BM_stdlib_strtoll --bionic_cpu=0 |
| ... |
| ------------------------------------------------------------ |
| Benchmark Time CPU Iterations |
| ------------------------------------------------------------ |
| BM_stdlib_strtoll 64.2 ns 63.6 ns 11017493 |
| |
| blueline:/ # /data/benchmarktest64/bionic-benchmarks/bionic-benchmarks --benchmark_filter=BM_stdlib_strtoll --bionic_cpu=4 |
| ... |
| ------------------------------------------------------------ |
| Benchmark Time CPU Iterations |
| ------------------------------------------------------------ |
| BM_stdlib_strtoll 21.8 ns 21.7 ns 33167103 |
| |
| A similar result can be achieved using `taskset`. The first parameter is a hexadecimal bitmask of |
| the cores to run on (bit N selects core N), which is passed to `sched_setaffinity`: |
| |
| blueline:/ # taskset f /data/benchmarktest64/bionic-benchmarks/bionic-benchmarks --benchmark_filter=BM_stdlib_strtoll |
| ... |
| ------------------------------------------------------------ |
| Benchmark Time CPU Iterations |
| ------------------------------------------------------------ |
| BM_stdlib_strtoll 64.3 ns 63.6 ns 10998697 |
| |
| blueline:/ # taskset f0 /data/benchmarktest64/bionic-benchmarks/bionic-benchmarks --benchmark_filter=BM_stdlib_strtoll |
| ... |
| ------------------------------------------------------------ |
| Benchmark Time CPU Iterations |
| ------------------------------------------------------------ |
| BM_stdlib_strtoll 21.3 ns 21.2 ns 33094801 |
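The masks `f` and `f0` used with `taskset` above can be derived from the core numbers rather than worked out by hand. A minimal sketch, assuming the Silver cores are CPUs 0-3 and the Gold cores are CPUs 4-7 as in the Pixel 3 example; the helper name `cpu_mask` is hypothetical:

```shell
# Build a hexadecimal affinity mask from a list of core numbers:
# bit N of the mask is set for each core N given as an argument.
cpu_mask() {
  mask=0
  for core in "$@"; do
    mask=$(( mask | (1 << core) ))
  done
  printf '%x\n' "$mask"
}

cpu_mask 0 1 2 3   # prints "f"
cpu_mask 4 5 6 7   # prints "f0"
```

The result can be passed straight to `taskset`, e.g. `taskset "$(cpu_mask 4 5 6 7)" <command>`.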
| |
| To lock the CPU frequency, use the sysfs interface at `/sys/devices/system/cpu/cpu*/cpufreq/`. |
| Changing the scaling governor to `performance` suppresses the warning that Google Benchmark |
| otherwise prints: |
| |
| ***WARNING*** CPU scaling is enabled, the benchmark real time measurements may be noisy and will incur extra overhead. |
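Switching the governor through that sysfs interface might look like the sketch below. It assumes each core exposes a writable `scaling_governor` file under its `cpufreq/` directory and that the shell is running as root (e.g. after `adb root`). The function name `set_governor` and its optional second argument are hypothetical; the second argument exists only so the loop can be tried against a scratch directory first.

```shell
# Write the given governor name to every core's scaling_governor file.
#   $1: governor name, e.g. "performance"
#   $2: optional root directory (defaults to the real sysfs CPU directory)
set_governor() {
  governor="$1"
  root="${2:-/sys/devices/system/cpu}"
  for f in "$root"/cpu[0-9]*/cpufreq/scaling_governor; do
    # Skip the unexpanded pattern if no core matches.
    [ -e "$f" ] || continue
    echo "$governor" > "$f"
  done
}

# Example (on the device, as root):
#   set_governor performance
```

It is worth noting the previous governor (by reading the same files) before changing it, so that it can be restored once benchmarking is finished.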
| |
| Some devices have a `perf-setup.sh` script that locks CPU and GPU frequencies. Some TradeFed |
| benchmarks appear to be using the script. For more information: |
| * run `adb shell perf-setup.sh` to execute the script; it is installed by default on devices running eng and userdebug builds |
| * see: https://android.googlesource.com/platform/platform_testing/+/refs/heads/main/scripts/perf-setup/ |