| # Overview of performance test suite |
| |
| For design of the tests, see https://grpc.io/docs/guides/benchmarking. |
| |
This document describes how to run gRPC end-to-end benchmarks using the gRPC
OSS benchmarks framework (recommended) or how to run them manually (for experts
only).
| |
| ## Approach 1: Use gRPC OSS benchmarks framework (Recommended) |
| |
| ### gRPC OSS benchmarks |
| |
| The scripts in this section generate LoadTest configurations for the GKE-based |
| gRPC OSS benchmarks framework. This framework is stored in a separate |
| repository, [grpc/test-infra]. |
| |
| These scripts, together with tools defined in [grpc/test-infra], are used in the |
| continuous integration setup defined in [grpc_e2e_performance_gke.sh] and |
| [grpc_e2e_performance_gke_experiment.sh]. |
| |
| #### Generating scenarios |
| |
| The benchmarks framework uses the same test scenarios as the legacy one. The |
| script [scenario_config_exporter.py](./scenario_config_exporter.py) can be used |
| to export these scenarios to files, and also to count and analyze existing |
| scenarios. |
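
For instance, the following sketch exports all `scalable` Go scenarios to files
(the `--export_scenarios` flag is an assumption about the script's interface;
check the script's `--help` output for the exact flags):

```
$ ./tools/run_tests/performance/scenario_config_exporter.py \
      --export_scenarios -l go --category=scalable
```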
| |
| The language(s) and category of the scenarios are of particular importance to |
| the tests. Continuous runs will typically run tests in the `scalable` category. |
| |
| The following example counts scenarios in the `scalable` category: |
| |
| ``` |
| $ ./tools/run_tests/performance/scenario_config_exporter.py --count_scenarios --category=scalable |
| Scenario count for all languages (category: scalable): |
Count  Language         Client   Server   Categories
   56  c++                                scalable
   19  python_asyncio                     scalable
   16  java                               scalable
   12  go                                 scalable
   12  node                               scalable
    9  csharp                             scalable
    9  dotnet                             scalable
    7  python                             scalable
    5  ruby                               scalable
    4  csharp           c++               scalable
    4  dotnet           c++               scalable
    4  php7             c++               scalable
    4  php7_protobuf_c  c++               scalable
    3  python_asyncio   c++               scalable
    2  ruby             c++               scalable
    2  python           c++               scalable
    1  csharp                    c++      scalable
    1  dotnet                    c++      scalable

  170  total scenarios (category: scalable)
| ``` |
| |
Client and server languages are only set for cross-language scenarios, where the
client or server language does not match the scenario language.
| |
| #### Generating load test configurations |
| |
| The benchmarks framework uses LoadTest resources configured by YAML files. Each |
| LoadTest resource specifies a driver, a server, and one or more clients to run |
| the test. Each test runs one scenario. The scenario configuration is embedded in |
| the LoadTest configuration. Example configurations for various languages can be |
| found here: |
| |
| https://github.com/grpc/test-infra/tree/master/config/samples |
| |
| The script [loadtest_config.py](./loadtest_config.py) generates LoadTest |
| configurations for tests running a set of scenarios. The configurations are |
| written in multipart YAML format, either to a file or to stdout. Each |
| configuration contains a single embedded scenario. |
| |
| The LoadTest configurations are generated from a template. Any configuration can |
| be used as a template, as long as it contains the languages required by the set |
| of scenarios we intend to run (for instance, if we are generating configurations |
| to run go scenarios, the template must contain a go client and a go server; if |
| we are generating configurations for cross-language scenarios that need a go |
| client and a C++ server, the template must also contain a C++ server; and the |
| same for all other languages). |
| |
The LoadTests specified in the script output all have unique names and can be
run by applying them to a cluster running the LoadTest controller with
`kubectl apply`:
| |
| ``` |
| $ kubectl apply -f loadtest_config.yaml |
| ``` |
| |
| > Note: The most common way of running tests generated by this script is to use |
| > a _test runner_. For details, see [running tests](#running-tests). |
| |
| A basic template for generating tests in various languages can be found here: |
| [loadtest_template_basic_all_languages.yaml](./templates/loadtest_template_basic_all_languages.yaml). |
The following example generates configurations for Go and Java tests using this
template, including tests against C++ clients and servers, and running each test
twice:
| |
| ``` |
| $ ./tools/run_tests/performance/loadtest_config.py -l go -l java \ |
| -t ./tools/run_tests/performance/templates/loadtest_template_basic_all_languages.yaml \ |
| -s client_pool=workers-8core -s driver_pool=drivers \ |
| -s server_pool=workers-8core \ |
| -s big_query_table=e2e_benchmarks.experimental_results \ |
| -s timeout_seconds=3600 --category=scalable \ |
| -d --allow_client_language=c++ --allow_server_language=c++ \ |
| --runs_per_test=2 -o ./loadtest.yaml |
| ``` |
| |
| The script `loadtest_config.py` takes the following options: |
| |
| - `-l`, `--language`<br> Language to benchmark. May be repeated. |
- `-t`, `--template`<br> Template file. A template is a configuration file that
  may contain multiple client and server configurations, and may also include
  substitution keys.
- `-s`, `--substitution`<br> Substitution keys, in the format `key=value`. These
  keys are substituted while processing the template. Environment variables that
  are set by the load test controller at runtime are ignored by default
  (`DRIVER_PORT`, `KILL_AFTER`, `POD_TIMEOUT`). The user can override this
  behavior by specifying these variables as keys.
- `-p`, `--prefix`<br> Test names consist of a prefix joined to a uuid with a
  dash. Test names are stored in `metadata.name`. The prefix is also added as
  the `prefix` label in `metadata.labels`. The prefix defaults to the user name
  if not set.
| - `-u`, `--uniquifier_element`<br> Uniquifier elements may be passed to the test |
| to make the test name unique. This option may be repeated to add multiple |
| elements. The uniquifier elements (plus a date string and a run index, if |
| applicable) are joined with a dash to form a _uniquifier_. The test name uuid |
| is derived from the scenario name and the uniquifier. The uniquifier is also |
| added as the `uniquifier` annotation in `metadata.annotations`. |
| - `-d`<br> This option is a shorthand for the addition of a date string as a |
| uniquifier element. |
- `-a`, `--annotation`<br> Metadata annotation to be stored in
  `metadata.annotations`, in the form `key=value`. May be repeated.
- `-r`, `--regex`<br> Regex to select scenarios to run. Each scenario is
  embedded in a LoadTest configuration containing a client and server of the
  language(s) required for the test. Defaults to `.*`, i.e., selects all
  scenarios.
| - `--category`<br> Select scenarios of a specified _category_, or of all |
| categories. Defaults to `all`. Continuous runs typically run tests in the |
| `scalable` category. |
| - `--allow_client_language`<br> Allows cross-language scenarios where the client |
| is of a specified language, different from the scenario language. This is |
| typically `c++`. This flag may be repeated. |
| - `--allow_server_language`<br> Allows cross-language scenarios where the server |
| is of a specified language, different from the scenario language. This is |
| typically `node` or `c++`. This flag may be repeated. |
| - `--instances_per_client`<br> This option generates multiple instances of the |
| clients for each test. The instances are named with the name of the client |
| combined with an index (or only an index, if no name is specified). If the |
| template specifies more than one client for a given language, it must also |
| specify unique names for each client. In the most common case, the template |
| contains only one unnamed client for each language, and the instances will be |
| named `0`, `1`, ... |
| - `--runs_per_test`<br> This option specifies that each test should be repeated |
| `n` times, where `n` is the value of the flag. If `n` > 1, the index of each |
| test run is added as a uniquifier element for that run. |
| - `-o`, `--output`<br> Output file name. The LoadTest configurations are added |
| to this file, in multipart YAML format. Output is streamed to `sys.stdout` if |
| not set. |
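
As an illustration of how these options combine, the following hypothetical
invocation tags the generated tests with a prefix, a manual uniquifier element,
a date string, and a custom annotation (all values are examples only):

```
$ ./tools/run_tests/performance/loadtest_config.py -l go \
    -t ./tools/run_tests/performance/templates/loadtest_template_basic_all_languages.yaml \
    -s client_pool=workers-8core -s driver_pool=drivers \
    -s server_pool=workers-8core \
    -s big_query_table=e2e_benchmarks.experimental_results \
    -s timeout_seconds=3600 --category=scalable \
    -p smoke-test -u manual -d -a purpose=manual-run \
    -o ./loadtest_smoke.yaml
```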
| |
| The script adds labels and annotations to the metadata of each LoadTest |
| configuration: |
| |
| The following labels are added to `metadata.labels`: |
| |
| - `language`<br> The language of the LoadTest scenario. |
| - `prefix`<br> The prefix used in `metadata.name`. |
| |
| The following annotations are added to `metadata.annotations`: |
| |
| - `scenario`<br> The name of the LoadTest scenario. |
| - `uniquifier`<br> The uniquifier used to generate the LoadTest name, including |
| the run index if applicable. |
| |
| [Labels](https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/) |
| can be used in selectors in resource queries. Adding the prefix, in particular, |
| allows the user (or an automation script) to select the resources started from a |
| given run of the config generator. |
| |
| [Annotations](https://kubernetes.io/docs/concepts/overview/working-with-objects/annotations/) |
| contain additional information that is available to the user (or an automation |
| script) but is not indexed and cannot be used to select objects. Scenario name |
| and uniquifier are added to provide the elements of the LoadTest name uuid in |
| human-readable form. Additional annotations may be added later for automation. |
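
For instance, assuming the LoadTest custom resource is installed on the cluster
as `loadtests`, labels and annotations can be queried with standard `kubectl`
commands (the prefix and test names below are illustrative):

```
$ # select all go tests generated with prefix "smoke-test"
$ kubectl get loadtests -l prefix=smoke-test,language=go

$ # read the scenario name of a single test from its annotations
$ kubectl get loadtests smoke-test-<uuid> -o jsonpath='{.metadata.annotations.scenario}'
```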
| |
| #### Concatenating load test configurations |
| |
| The LoadTest configuration generator can process multiple languages at a time, |
| assuming that they are supported by the template. The convenience script |
| [loadtest_concat_yaml.py](./loadtest_concat_yaml.py) is provided to concatenate |
| several YAML files into one, so configurations generated by multiple generator |
| invocations can be concatenated into one and run with a single command. The |
| script can be invoked as follows: |
| |
| ``` |
| $ loadtest_concat_yaml.py -i infile1.yaml infile2.yaml -o outfile.yaml |
| ``` |
| |
| #### Generating load test examples |
| |
| The script [loadtest_examples.sh](./loadtest_examples.sh) is provided to |
| generate example load test configurations in all supported languages. This |
| script takes only one argument, which is the output directory where the |
| configurations will be created. The script produces a set of basic |
| configurations, as well as a set of template configurations intended to be used |
| with prebuilt images. |
| |
| The [examples](https://github.com/grpc/test-infra/tree/master/config/samples) in |
| the repository [grpc/test-infra] are generated by this script. |
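
For example, based on the description above, the script is invoked with the
output directory as its only argument:

```
$ ./tools/run_tests/performance/loadtest_examples.sh ./loadtest-examples
```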
| |
| #### Generating configuration templates |
| |
| The script [loadtest_template.py](./loadtest_template.py) generates a load test |
| configuration template from a set of load test configurations. The source files |
| may be load test configurations or load test configuration templates. The |
| generated template supports all languages supported in any of the input |
| configurations or templates. |
| |
| The example template in |
[loadtest_template_basic_all_languages.yaml](./templates/loadtest_template_basic_all_languages.yaml)
| was generated from the example configurations in [grpc/test-infra] by the |
| following command: |
| |
| ``` |
| $ ./tools/run_tests/performance/loadtest_template.py \ |
| -i ../test-infra/config/samples/*_example_loadtest.yaml \ |
| --inject_client_pool --inject_server_pool \ |
| --inject_big_query_table --inject_timeout_seconds \ |
| -o ./tools/run_tests/performance/templates/loadtest_template_basic_all_languages.yaml \ |
| --name basic_all_languages |
| ``` |
| |
| The example template with prebuilt images in |
| [loadtest_template_prebuilt_all_languages.yaml](./templates/loadtest_template_prebuilt_all_languages.yaml) |
| was generated by the following command: |
| |
| ``` |
| $ ./tools/run_tests/performance/loadtest_template.py \ |
| -i ../test-infra/config/samples/templates/*_example_loadtest_with_prebuilt_workers.yaml \ |
| --inject_client_pool --inject_driver_image --inject_driver_pool \ |
| --inject_server_pool --inject_big_query_table --inject_timeout_seconds \ |
| -o ./tools/run_tests/performance/templates/loadtest_template_prebuilt_all_languages.yaml \ |
| --name prebuilt_all_languages |
| ``` |
| |
| The script `loadtest_template.py` takes the following options: |
| |
| - `-i`, `--inputs`<br> Space-separated list of the names of input files |
| containing LoadTest configurations. May be repeated. |
| - `-o`, `--output`<br> Output file name. Outputs to `sys.stdout` if not set. |
| - `--inject_client_pool`<br> If this option is set, the pool attribute of all |
| clients in `spec.clients` is set to `${client_pool}`, for later substitution. |
| - `--inject_driver_image`<br> If this option is set, the image attribute of the |
| driver(s) in `spec.drivers` is set to `${driver_image}`, for later |
| substitution. |
- `--inject_driver_pool`<br> If this option is set, the pool attribute of the
  driver(s) is set to `${driver_pool}`, for later substitution.
| - `--inject_server_pool`<br> If this option is set, the pool attribute of all |
| servers in `spec.servers` is set to `${server_pool}`, for later substitution. |
- `--inject_big_query_table`<br> If this option is set,
  `spec.results.bigQueryTable` is set to `${big_query_table}`.
| - `--inject_timeout_seconds`<br> If this option is set, `spec.timeoutSeconds` is |
| set to `${timeout_seconds}`. |
| - `--inject_ttl_seconds`<br> If this option is set, `spec.ttlSeconds` is set to |
| `${ttl_seconds}`. |
| - `-n`, `--name`<br> Name to be set in `metadata.name`. |
- `-a`, `--annotation`<br> Metadata annotation to be stored in
  `metadata.annotations`, in the form `key=value`. May be repeated.
| |
The options that inject substitution keys are the most useful for template
reuse. When running tests on different node pools, it becomes necessary to set
the pool, and usually also to store the data in a different table. When running
as part of a larger collection of tests, it may also be necessary to adjust the
test timeout and time-to-live, to ensure that all tests have time to complete.
| |
The template name is overwritten by `loadtest_config.py`, so it serves only as a
human-readable memo.
| |
Annotations, on the other hand, are passed on to the test configurations. They
may be set to literal values or to substitution keys themselves, allowing future
automation scripts to process the tests generated from these configurations in
different ways.
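
For instance, a hypothetical template could carry an annotation whose value is
itself a substitution key, to be filled in later by `loadtest_config.py` (the
`owner` key is purely illustrative):

```
$ ./tools/run_tests/performance/loadtest_template.py \
    -i <input_configs> --name annotated_template \
    -a owner='${owner}' -o ./annotated_template.yaml
$ # later, the key can be substituted when generating configurations:
$ ./tools/run_tests/performance/loadtest_config.py \
    -t ./annotated_template.yaml -s owner=automation <other_options> -o ./loadtest.yaml
```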
| |
| #### Running tests |
| |
| Collections of tests generated by `loadtest_config.py` are intended to be run |
| with a test runner. The code for the test runner is stored in a separate |
| repository, [grpc/test-infra]. |
| |
| The test runner applies the tests to the cluster, and monitors the tests for |
| completion while they are running. The test runner can also be set up to run |
| collections of tests in parallel on separate node pools, and to limit the number |
| of tests running in parallel on each pool. |
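
The invocation below is a hypothetical sketch only; the runner binary, its
location, and its flags are assumptions, and the authoritative interface is
documented in the [grpc/test-infra] references below:

```
$ # run two collections of tests, with at most two tests in parallel per pool
$ ../test-infra/bin/runner \
    -i ./loadtest_8core.yaml -i ./loadtest_32core.yaml \
    -c 8core:2 -c 32core:2 \
    -o ./results.xml
```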
| |
| For more information, see the |
| [tools README](https://github.com/grpc/test-infra/blob/master/tools/README.md) |
| in [grpc/test-infra]. |
| |
| For usage examples, see the continuous integration setup defined in |
| [grpc_e2e_performance_gke.sh] and [grpc_e2e_performance_gke_experiment.sh]. |
| |
| [grpc/test-infra]: https://github.com/grpc/test-infra |
| [grpc_e2e_performance_gke.sh]: ../../internal_ci/linux/grpc_e2e_performance_gke.sh |
| [grpc_e2e_performance_gke_experiment.sh]: ../../internal_ci/linux/grpc_e2e_performance_gke_experiment.sh |
| |
| ## Approach 2: Running benchmarks locally via legacy tooling (still useful sometimes) |
| |
| This approach is much more involved than using the gRPC OSS benchmarks framework |
| (see above), but can still be useful for hands-on low-level experiments |
| (especially when you know what you are doing). |
| |
| ### Prerequisites for running benchmarks manually: |
| |
In general, the benchmark workers and driver build scripts expect
[linux_performance_worker_init.sh](../../gce/linux_performance_worker_init.sh)
to have been run already.
| |
| ### To run benchmarks locally: |
| |
| - From the grpc repo root, start the |
| [run_performance_tests.py](../run_performance_tests.py) runner script. |
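
For example, to build and run the C++ benchmarks locally (a sketch; see the
script's `--help` output for the full set of options):

```
$ tools/run_tests/run_performance_tests.py -l c++
```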
| |
| ### On remote machines, to start the driver and workers manually: |
| |
The [run_performance_tests.py](../run_performance_tests.py) top-level runner
script can also be used with remote machines, but for tasks such as profiling
the server, it might be useful to run workers manually.
| |
| 1. You'll need a "driver" and separate "worker" machines. For example, you might |
| use one GCE "driver" machine and 3 other GCE "worker" machines that are in |
| the same zone. |
| |
| 2. Connect to each worker machine and start up a benchmark worker with a |
| "driver_port". |
| |
| - For example, to start the grpc-go benchmark worker: |
| [grpc-go worker main.go](https://github.com/grpc/grpc-go/blob/master/benchmark/worker/main.go) |
| --driver_port <driver_port> |
| |
| #### Commands to start workers in different languages: |
| |
- Note that these commands are what the top-level
  [run_performance_tests.py](../run_performance_tests.py) script uses to build
  and run different workers through the
  [build_performance.sh](./build_performance.sh) script and "run worker" scripts
  (such as [run_worker_java.sh](./run_worker_java.sh)).
| |
| ##### Running benchmark workers for C-core wrapped languages (C++, Python, C#, Node, Ruby): |
| |
- These are simpler since they all live in the main grpc repo.
| |
| ``` |
| $ cd <grpc_repo_root> |
| $ tools/run_tests/performance/build_performance.sh |
| $ tools/run_tests/performance/run_worker_<language>.sh |
| ``` |
| |
- Note that there is one "run_worker" script per language, e.g.,
  [run_worker_csharp.sh](./run_worker_csharp.sh) for C#.
| |
| ##### Running benchmark workers for gRPC-Java: |
| |
| - You'll need the [grpc-java](https://github.com/grpc/grpc-java) repo. |
| |
| ``` |
| $ cd <grpc-java-repo> |
| $ ./gradlew -PskipCodegen=true -PskipAndroid=true :grpc-benchmarks:installDist |
| $ benchmarks/build/install/grpc-benchmarks/bin/benchmark_worker --driver_port <driver_port> |
| ``` |
| |
| ##### Running benchmark workers for gRPC-Go: |
| |
| - You'll need the [grpc-go repo](https://github.com/grpc/grpc-go) |
| |
| ``` |
| $ cd <grpc-go-repo>/benchmark/worker && go install |
| $ # if profiling, it might be helpful to turn off inlining by building with "-gcflags=-l" |
| $ $GOPATH/bin/worker --driver_port <driver_port> |
| ``` |
| |
| #### Build the driver: |
| |
| - Connect to the driver machine (if using a remote driver) and from the grpc |
| repo root: |
| |
| ``` |
| $ tools/run_tests/performance/build_performance.sh |
| ``` |
| |
| #### Run the driver: |
| |
1. Get the 'scenario_json' relevant for the scenario to run. Note that "scenario
   json" configs are generated from [scenario_config.py](./scenario_config.py).
   The [driver](../../../test/cpp/qps/qps_json_driver.cc) takes a list of these
   configs as a json string of the form: `{scenarios: <json_list_of_scenarios>}`
   in its `--scenarios_json` command argument. One quick way to get a valid json
   string to pass to the driver is by running
   [run_performance_tests.py](../run_performance_tests.py) locally and copying
   the logged scenario json command arg.
| |
| 2. From the grpc repo root: |
| |
- Set the `QPS_WORKERS` environment variable to a comma-separated list of worker
  machines. Note that the driver will start the "benchmark server" on the first
  entry in the list, and the rest will be told to run as clients against the
  benchmark server.
| |
Example of running and profiling a go benchmark server:

```
$ export QPS_WORKERS=<host1>:10000,<host2>:10000,<host3>:10000
$ bins/opt/qps_json_driver --scenarios_json='<scenario_json_scenario_config_string>'
```
| |
| ### Example profiling commands |
| |
| While running the benchmark, a profiler can be attached to the server. |
| |
| Example to count syscalls in grpc-go server during a benchmark: |
| |
| - Connect to server machine and run: |
| |
| ``` |
| $ netstat -tulpn | grep <driver_port> # to get pid of worker |
| $ perf stat -p <worker_pid> -e syscalls:sys_enter_write # stop after test complete |
| ``` |
| |
Example memory profile of grpc-go server, with `go tool pprof`:
| |
| - After a run is done on the server, see its alloc profile with: |
| |
| ``` |
$ go tool pprof --text --alloc_space http://localhost:<pprof_port>/debug/pprof/heap
| ``` |
| |
| ### Configuration environment variables: |
| |
| - QPS_WORKER_CHANNEL_CONNECT_TIMEOUT |
| |
| Consuming process: qps_worker |
| |
| Type: integer (number of seconds) |
| |
  This can be used to configure the amount of time that benchmark clients wait
  for channels to the benchmark server to become ready. This is useful in
  certain benchmark environments in which the server can take a long time to
  become ready. Note: if setting this to a high value, then the scenario config
  under test should probably also have a large `warmup_seconds`.
| |
| - QPS_WORKERS |
| |
| Consuming process: qps_json_driver |
| |
| Type: comma separated list of host:port |
| |
  Set this to a comma-separated list of QPS worker processes/machines. Each
  scenario in a scenario config specifies a certain number of servers,
  `num_servers`, and the driver will start "benchmark servers" on the first
  `num_servers` `host:port` pairs in the comma-separated list. The rest will be
  told to run as clients against the benchmark server. A combined sketch of both
  variables follows this list.
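
A combined sketch of both variables, assuming the C++ worker and driver built by
[build_performance.sh](./build_performance.sh) (host names and ports are
illustrative):

```
$ # on each worker machine: wait up to 10 minutes for channels to become ready
$ export QPS_WORKER_CHANNEL_CONNECT_TIMEOUT=600
$ bins/opt/qps_worker --driver_port 10000

$ # on the driver machine: the first num_servers entries run benchmark servers
$ export QPS_WORKERS=host1:10000,host2:10000,host3:10000
$ bins/opt/qps_json_driver --scenarios_json='<scenario_json_scenario_config_string>'
```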