src/tools/rustc-perf/collector/compile-benchmarks/README.md - toolchain/rustc - Git at Google

 # The Compile-time Benchmark Suite

 This file describes the programs in the compile-time benchmark suite and explains why they
 were included.

 The suite changes over time. Sometimes the code for a benchmark is updated, in
 which case a small suffix will be added (starting with "-2", then "-3", and so
 on.)

 There are three categories of compile-time benchmarks, **Primary**, **Secondary**, and
 **Stable**.

 ## Primary

 These are real programs that are important in some way, and worth tracking.
 They mostly consist of real-world crates.

 - **bitmaps-3.1.0**: A bitmaps implementation. Stresses the compiler's trait
   handling by implementing a trait `Bits` for the type `BitsImpl<N>` for every
   `N` value from 1 to 1024.
 - **cargo-0.60.0**: The Rust package manager. A large program, and an important
   part of the Rust ecosystem.
 - **clap-3.1.6**: A command line argument parser library. A crate used by many
   Rust programs.
 - **cranelift-codegen-0.82.1**: The largest crate from a code generator. Used by
   wasmtime. Stresses obligation processing.
 - **diesel-1.4.8**: A type safe SQL query builder. Utilizes the type system to
   ensure a lot of invariants. Stresses anything related to resolving
   trait bounds, by having a lot of trait impls for a large number of different
   types.
 - **exa-0.10.1**: An `ls` replacement. A widely-used utility, and a binary
   crate.
 - **helloworld**: A trivial program. Gives a lower bound on compile time.
 - **html5ever-0.26.0**: An HTML parser. Stresses macro parsing code.
 - **hyper-0.14.18**: A fairly large crate. Utilizes async/await, and used by
   many Rust programs. The crate uses cargo features to enable large portions of its
   structure and is built with `--features=client,http1,http2,server,stream`.
 - **image-0.24.1**: Basic image processing functions and methods for
   converting to and from various image formats. Used often in graphics
   programming.
 - **libc-0.2.124**: An interface to `libc`. Contains many declarations of
   types, constants, and functions, but relatively little normal code. Stresses
   the parser. A very widely-used crate.
 - **regex-1.5.5**: A regular expression parser. Used by many Rust programs.
 - **ripgrep-13.0.0**: A line-oriented search tool. A widely-used utility, and a
   binary crate.
 - **serde-1.0.136**: A serialization/deserialization crate. Used by many other
   Rust programs.
 - **serde_derive-1.0.136**: A proc-macro sub-crate used by `serde`. Used by
   many other Rust programs. Stresses declarative macro expansion somewhat.
 - **stm32f4-0.14.0**: A crate that has many thousands of blanket impl blocks.
   It uses cargo features to enable large portions of its structure and is
   built with `--features=stm32f410` to have faster benchmarking times.
 - **syn-1.0.89**: A library for parsing Rust code. An important part of the Rust
   ecosystem.
 - **typenum-1.17.0**: A library that encodes integer computation within the trait system. Serves as
   a stress test for the trait solver, but at the same time it is also a very popular crate.
 - **unicode-normalization-0.1.19**: Unicode character composition and decomposition
   utilities. Uses huge `match` statements that stress the compiler in unusual
   ways.
 - **webrender-2022**: A web renderer. A large, complex crate used by Firefox
   and Servo. Webrender isn't released regularly so this is a development
   version (revision da1df33). The `-2022` suffix distinguishes it from earlier
   Webrender versions that used to be used in this benchmark suite.

 ## Secondary

 These are either artificial programs or real crates that stress one particular aspect of the
 compiler in interesting ways.

 - **await-call-tree**: A tree of async fns that await each other, creating a
   large type composed of many repeated `impl Future` types. Such types caused
   [poor performance](https://github.com/rust-lang/rust/issues/65147) in the
   past.
 - **coercions**: Contains a static array with 65,536 string literals, which
   caused [poor performance](https://github.com/rust-lang/rust/issues/32278) in
   the past.
 - **ctfe-stress-5**: A stress test for compile-time function evaluation.
 - **deeply-nested-multi**: A small program containing multiple examples
   ([one](https://github.com/rust-lang/rust/issues/38528),
   [two](https://github.com/rust-lang/rust/issues/72408),
   [three](https://github.com/rust-lang/rust/issues/75992))
   of code that caused exponential behavior in the past.
 - **deep-vector**: A test containing a single large vector of zeroes, which
   caused [poor performance](https://github.com/rust-lang/rust/issues/20936) in
   the past. Stresses macro expansion and type inference.
 - **derive**: A large number of simple structs with a `#[derive]` attribute for common built-in traits such as Copy and Debug.
 - **externs**: A large number of extern functions has caused [slowdowns in the past](https://github.com/rust-lang/rust/pull/78448).
 - **helloworld-tiny**: A trivial program optimized with flags that should reduce binary size.
   Gives a lower bound on compiled binary size.
 - **issue-46449**: A small program that caused [poor
   performance](https://github.com/rust-lang/rust/issues/46449) in the past.
 - **issue-58319**: A small program that caused [poor
   performance](https://github.com/rust-lang/rust/issues/58319) in the past.
 - **issue-88862**: A MCVE of a program that had a
   [severe performance regression](https://github.com/rust-lang/rust/issues/88862)
   when trying to normalize large opaque types with late-bound regions.
 - **many-assoc-items**: Contains a struct with many associated items, which
   caused [quadratic behavior](https://github.com/rust-lang/rust/issues/68957)
   in the past.
 - **match-stress**: Contains examples
   (one involving [a huge enum](https://github.com/rust-lang/rust/issues/7462),
   one involving
   [`exhaustive_patterns`](https://github.com/rust-lang/rust/pull/79394)) of
   `match` code that caused bad performance in the past.
 - **projection-caching**: A small program that causes extremely, deeply nested
   types which stress the trait system's projection cache. Removing that cache
   resulted in hours long compilations for some programs using futures,
   actix-web and other libraries with similarly nested type combinators.
 - **regression-31157**: A small program that caused a [large performance
   regression](https://github.com/rust-lang/rust/issues/31157) from the past.
 - **ripgrep-13.0.0-tiny**: A line-oriented search tool, optimized with flags that should reduce
   binary size.
 - **token-stream-stress**: A proc-macro crate. Constructs a long token stream
   much like the `quote` crate does, which caused [quadratic
   behavior](https://github.com/rust-lang/rust/issues/65080) in the past.
 - **tt-muncher**: Calls a quadratic TT muncher macro (based on `quote::quote!`)
   with a long input, which stresses macro expansion.
 - **tuple-stress**: Contains a single array of 65,535 nested `(i32, (f64, f64,
   f64))` tuples. The data was extracted and reduced from a [program dealing
   with grid coordinates](https://github.com/urschrei/ostn15_phf) that was
   causing rustc to [run out of
   memory](https://github.com/rust-lang/rust/issues/36799).
 - **ucd**: A Unicode crate. Contains large statics that
   [stress](https://github.com/rust-lang/rust/issues/53643) the borrow checker's
   implementation of NLL.
 - **unify-linearly**: Contains many variables that all have equality relations
   between them, which caused [exponential
   behavior](https://github.com/rust-lang/rust/pull/32062) in the past.
 - **unused-warnings**: Contains many unused imports, which caused [quadratic
   behavior](https://github.com/rust-lang/rust/issues/43572) in the past.
 - **wf-projection-stress-65510**: A stress test which showcases [quadratic
   behavior](https://github.com/rust-lang/rust/issues/65510) (in the number of
   associated type bounds).
 - **wg-grammar**: A parser generator.
   [Stresses](https://github.com/rust-lang/rust/issues/58178) the borrow
   checker's implementation of NLL.

 ## Stable

 These are benchmarks used in the
 [dashboard](https://perf.rust-lang.org/dashboard.html). They provide the
 longest continuous data set for compiler performance. As a result, they are
 quite old (e.g. 2017 or earlier), and not necessarily reflective of typical
 Rust code being written today.

 - **encoding**: An old crate providing character encoding support. Contains
   some large tables.
 - **futures**: v0.1.0 of the popular `futures` crate, which was used by many
   Rust programs. Newer versions of this crate (e.g. v0.3.21 from February 2021)
   contain very little code, instead relying on sub-crates. This makes them less
   interesting as benchmarks, because we only measure final crate compilation.
   This is why there is no futures crate among the primary benchmarks.
 - **html5ever**: See above. This is an older version (v0.5.4) of the crate.
 - **inflate**: An old implementation of the DEFLATE algorithm. Contains
   a very large function containing many locals and basic blocks, which stresses
   obligation processing.
 - **regex**: See above. This is an older version of the crate.
 - **piston-image**: See above. This is an older version of the `image` crate.
 - **style-servo**: An old version of Servo's `style` crate. A large crate, and
   one used by old versions of Firefox. Built with `--features=gecko`.
 - **syn**: See above. This is an older version (0.11.11) of the crate.
 - **tokio-webpush-simple**: A simple web server built with a very old version
   of tokio. Uses futures a lot, but doesn't use `async`/`await`.

 # How to update/add/remove benchmarks

 ## Add a new benchmark

 - Decide on which category it belongs to. Probably primary if it's a real-world
   crate, and secondary if it's a stress test or intended to catch specific
   regressions.
 - If it's a third-party crate:
   - If you are keen: talk with a maintainer of the crate to see if there is
     anything we should be aware of when using this crate as a compile-time
     benchmark.
   - Look at [crates.io](https://crates.io) to find the latest (non-prerelease) version.
   - Download it with `collector download -c $CATEGORY -a $ARTIFACT crate $NAME $VERSION`.
     The `$CATEGORY` is probably `primary`. `$ARTIFACT` is either `library` or `binary`, depending
     on what kind of artifact does the benchmark build.
 - It makes it easier for reviewers if you split things into two commits.
 - In the first commit, just add the code for the entire benchmark.
   - Do this by doing `git add` on the new directory.
   - There is no need to remove seemingly unnecessary files such as
     documentation or CI configuration.
 - In the second commit, do everything else.
   - Add `[workspace]` to the very bottom of the benchmark's `Cargo.toml`, if
     doesn't already have a `[workspace]` section. This means commands like
     `cargo build` will work within the benchmark directory.
   - Add any necessary stuff to the `perf-config.json` file.
     - If the benchmark is a sub-crate within a top-level crate, you'll need a
       `"cargo_toml"` entry.
     - If you get a "non-wrapped rustc" error when running it, you'll need a
       `"touch_file"` entry.
     - See [`collector/src/benchmark/mod.rs`](https://github.com/rust-lang/rustc-perf/blob/12cb796f8a932a891b385ba23a36d78a2867ace1/collector/src/benchmark/mod.rs#L24-L27) for a complete reference.
   - Consider adding one or more `N-*.patch` files for the `IncrPatched`
     scenario.
     - If it's a primary benchmark, you should definitely do this.
     - These usually consist of a patch that adds a single
       `println!("testing");` statement somewhere.
     - Creating the patch against what you've committed so far might be useful.
       Use `git diff` from the repository root, or `git diff --relative` within
       the benchmark directory. Copy the output into the `N-*.patch` file.
     - Do a test run with an `IncrPatched` scenario to make sure the patch
       applies correctly, e.g. `target/release/collector bench_local +nightly
       --id Test --profiles=Check --scenarios=IncrPatched
       --include=$NEW_BENCHMARK`
   - Add the new entry to `collector/compile-benchmarks/README.md`.
   - Add a new licensing entry to `collector/compile-benchmarks/REUSE.toml` (see existing entries
   for inspiration).
     - If the benchmark is artificial, use the `MIT OR Apache-2.0` license and set Rust Project
     developers as the copyright owners (see e.g. `await-call-tree` as an example).
     - If the benchmark is a third-party crate, make sure to use its license. Try to find the
     copyright owner in the crate's `COPYRIGHT` or `README` files. If you cannot find it, consider
     using the copyright owner `<crate-name> contributors`.
   - `git add` the `Cargo.lock` file, if it's not already part of the
     benchmark's committed code.
     - If the benchmark has a `.gitignore` file that contains `Cargo.lock`,
       you'll need to comment out that line so that `Cargo.lock` gets uploaded
       in the PR.
 - Consider the benchmarking time for the benchmark.
   - First, measure the entire compilation time with something like this, by
     doing this within the benchmark directory is good:
     ```
     CARGO_INCREMENTAL=0 cargo check ; cargo clean
     CARGO_INCREMENTAL=0 cargo build ; cargo clean
     CARGO_INCREMENTAL=0 cargo build --release ; cargo clean
     ```
   - Second, compare the final crate time with these commands:
     ```
     target/release/collector bench_local +nightly --id Test \
       --profiles=Check,Debug,Opt --scenarios=Full --include=$NEW,helloworld
     target/release/site results.db
     ```
     (See [here](../../site/README.md) for instructions on how to build the website).
     Then switch to wall-times, compare `Test` against itself, and toggle the
     "Show non-relevant results"/"Display raw data" check boxes to make sure it
     hasn't changed drastically.
     - E.g. `futures` was changed so it's just a facade for a bunch of
       sub-crates, and the final crate time became very similar to `helloworld`,
       which wasn't interesting.
 - File a PR, including the two sets of timing measurements in the description.

 ## Remove a benchmark

 - It makes it easier for reviewers if you split things into two commits.
 - In the first commit just remove the old code.
   - Do this with `git rm -r` on the directory.
 - In the second commit do everything else.
   - Remove the entry from `collector/compile-benchmarks/README.md`.
   - `git grep` for occurrences of the old benchmark name (e.g. in
     `.github/workflows/ci.yml` or `ci/check-*.sh`) and see if anything needs
     changing... usually not.
 - File a PR.

 ## Update a benchmark

 - Do this in two steps.
   - First add the new version of the benchmark. Compare the benchmarking time
     of the two versions to make sure nothing outrageous has happened. Once the
     PR is merged, make sure it's running correctly.
   - Second, remove the old version of the benchmark.
   Doing it in two steps ensures we have continuity of profiling coverage in the
   case where something goes wrong.
 - Compare the benchmarking time of the two versions.
 - When adding the new version, for `perf-config.json` and the `N-*.patch`
   files, use the corresponding files for the old version as a starting point.

 # Benchmark update policy

 ## Background

 rustc-perf is a "living benchmark suite" that is regularly changed. Some
 benchmarks in rustc-perf are verbatim copies of third-party crates. We
 periodically do a mass update of these benchmarks.

 Benefits of this approach:
 - We ensure we are measuring compilation of the crates most people are using.
   This is most relevant for popular crates.
 - We get coverage of newer language features.

 Costs of this approach:
 - It takes time and effort.
 - We lose some data continuity.
   - But the stable set of benchmarks used for the dashboard are not affected,
     and they provide the greatest continuity.
 - If the code hasn't changed much, it won't have much effect.

 ## Update policy

 - The third-party crates should be updated every three years. This is a
   reasonable refresh period that is neither too short or too long. It happens
   to match the Rust edition cycle, but this is just coincidence.
 - All third-party crates that have had at least one new release should be
   updated, even if not much code has changed. This avoids having to make
   decisions about whether a crate has changed enough.
 - When doing this mass update, there may be some benchmarks that are deemed no
   longer interesting and removed. For example, in the 2022 update we found that
   the `futures` crate was no longer interesting because all the functionality
   had been split into sub-crates that rustc-perf doesn't measure. Likewise,
   there may be some new benchmarks that are added.
 - New versions should be added before old versions are removed, to ensure
   continuity of profiling coverage.
 - The ad hoc addition and removal of individual benchmarks can continue
   independently of this update cycle, as per the judgment of the rustc-perf
   maintainers.

 History:
 - The first mass update of third-party crates occurred in [March/April
   2022](https://hackmd.io/d9uE7qgtTWKDLivy0uoVQw).
	# The Compile-time Benchmark Suite

	This file describes the programs in the compile-time benchmark suite and explains why they
	were included.

	The suite changes over time. Sometimes the code for a benchmark is updated, in
	which case a small suffix will be added (starting with "-2", then "-3", and so
	on.)

	There are three categories of compile-time benchmarks, Primary, Secondary, and
	Stable.

	## Primary

	These are real programs that are important in some way, and worth tracking.
	They mostly consist of real-world crates.

	- bitmaps-3.1.0: A bitmaps implementation. Stresses the compiler's trait
	handling by implementing a trait `Bits` for the type `BitsImpl<N>` for every
	`N` value from 1 to 1024.
	- cargo-0.60.0: The Rust package manager. A large program, and an important
	part of the Rust ecosystem.
	- clap-3.1.6: A command line argument parser library. A crate used by many
	Rust programs.
	- cranelift-codegen-0.82.1: The largest crate from a code generator. Used by
	wasmtime. Stresses obligation processing.
	- diesel-1.4.8: A type safe SQL query builder. Utilizes the type system to
	ensure a lot of invariants. Stresses anything related to resolving
	trait bounds, by having a lot of trait impls for a large number of different
	types.
	- exa-0.10.1: An `ls` replacement. A widely-used utility, and a binary
	crate.
	- helloworld: A trivial program. Gives a lower bound on compile time.
	- html5ever-0.26.0: An HTML parser. Stresses macro parsing code.
	- hyper-0.14.18: A fairly large crate. Utilizes async/await, and used by
	many Rust programs. The crate uses cargo features to enable large portions of its
	structure and is built with `--features=client,http1,http2,server,stream`.
	- image-0.24.1: Basic image processing functions and methods for
	converting to and from various image formats. Used often in graphics
	programming.
	- libc-0.2.124: An interface to `libc`. Contains many declarations of
	types, constants, and functions, but relatively little normal code. Stresses
	the parser. A very widely-used crate.
	- regex-1.5.5: A regular expression parser. Used by many Rust programs.
	- ripgrep-13.0.0: A line-oriented search tool. A widely-used utility, and a
	binary crate.
	- serde-1.0.136: A serialization/deserialization crate. Used by many other
	Rust programs.
	- serde_derive-1.0.136: A proc-macro sub-crate used by `serde`. Used by
	many other Rust programs. Stresses declarative macro expansion somewhat.
	- stm32f4-0.14.0: A crate that has many thousands of blanket impl blocks.
	It uses cargo features to enable large portions of its structure and is
	built with `--features=stm32f410` to have faster benchmarking times.
	- syn-1.0.89: A library for parsing Rust code. An important part of the Rust
	ecosystem.
	- typenum-1.17.0: A library that encodes integer computation within the trait system. Serves as
	a stress test for the trait solver, but at the same time it is also a very popular crate.
	- unicode-normalization-0.1.19: Unicode character composition and decomposition
	utilities. Uses huge `match` statements that stress the compiler in unusual
	ways.
	- webrender-2022: A web renderer. A large, complex crate used by Firefox
	and Servo. Webrender isn't released regularly so this is a development
	version (revision da1df33). The `-2022` suffix distinguishes it from earlier
	Webrender versions that used to be used in this benchmark suite.

	## Secondary

	These are either artificial programs or real crates that stress one particular aspect of the
	compiler in interesting ways.

	- await-call-tree: A tree of async fns that await each other, creating a
	large type composed of many repeated `impl Future` types. Such types caused
	[poor performance](https://github.com/rust-lang/rust/issues/65147) in the
	past.
	- coercions: Contains a static array with 65,536 string literals, which
	caused [poor performance](https://github.com/rust-lang/rust/issues/32278) in
	the past.
	- ctfe-stress-5: A stress test for compile-time function evaluation.
	- deeply-nested-multi: A small program containing multiple examples
	([one](https://github.com/rust-lang/rust/issues/38528),
	[two](https://github.com/rust-lang/rust/issues/72408),
	[three](https://github.com/rust-lang/rust/issues/75992))
	of code that caused exponential behavior in the past.
	- deep-vector: A test containing a single large vector of zeroes, which
	caused [poor performance](https://github.com/rust-lang/rust/issues/20936) in
	the past. Stresses macro expansion and type inference.
	- derive: A large number of simple structs with a `#[derive]` attribute for common built-in traits such as Copy and Debug.
	- externs: A large number of extern functions has caused [slowdowns in the past](https://github.com/rust-lang/rust/pull/78448).
	- helloworld-tiny: A trivial program optimized with flags that should reduce binary size.
	Gives a lower bound on compiled binary size.
	- issue-46449: A small program that caused [poor
	performance](https://github.com/rust-lang/rust/issues/46449) in the past.
	- issue-58319: A small program that caused [poor
	performance](https://github.com/rust-lang/rust/issues/58319) in the past.
	- issue-88862: A MCVE of a program that had a
	[severe performance regression](https://github.com/rust-lang/rust/issues/88862)
	when trying to normalize large opaque types with late-bound regions.
	- many-assoc-items: Contains a struct with many associated items, which
	caused [quadratic behavior](https://github.com/rust-lang/rust/issues/68957)
	in the past.
	- match-stress: Contains examples
	(one involving [a huge enum](https://github.com/rust-lang/rust/issues/7462),
	one involving
	[`exhaustive_patterns`](https://github.com/rust-lang/rust/pull/79394)) of
	`match` code that caused bad performance in the past.
	- projection-caching: A small program that causes extremely, deeply nested
	types which stress the trait system's projection cache. Removing that cache
	resulted in hours long compilations for some programs using futures,
	actix-web and other libraries with similarly nested type combinators.
	- regression-31157: A small program that caused a [large performance
	regression](https://github.com/rust-lang/rust/issues/31157) from the past.
	- ripgrep-13.0.0-tiny: A line-oriented search tool, optimized with flags that should reduce
	binary size.
	- token-stream-stress: A proc-macro crate. Constructs a long token stream
	much like the `quote` crate does, which caused [quadratic
	behavior](https://github.com/rust-lang/rust/issues/65080) in the past.
	- tt-muncher: Calls a quadratic TT muncher macro (based on `quote::quote!`)
	with a long input, which stresses macro expansion.
	- tuple-stress: Contains a single array of 65,535 nested `(i32, (f64, f64,
	f64))` tuples. The data was extracted and reduced from a [program dealing
	with grid coordinates](https://github.com/urschrei/ostn15_phf) that was
	causing rustc to [run out of
	memory](https://github.com/rust-lang/rust/issues/36799).
	- ucd: A Unicode crate. Contains large statics that
	[stress](https://github.com/rust-lang/rust/issues/53643) the borrow checker's
	implementation of NLL.
	- unify-linearly: Contains many variables that all have equality relations
	between them, which caused [exponential
	behavior](https://github.com/rust-lang/rust/pull/32062) in the past.
	- unused-warnings: Contains many unused imports, which caused [quadratic
	behavior](https://github.com/rust-lang/rust/issues/43572) in the past.
	- wf-projection-stress-65510: A stress test which showcases [quadratic
	behavior](https://github.com/rust-lang/rust/issues/65510) (in the number of
	associated type bounds).
	- wg-grammar: A parser generator.
	[Stresses](https://github.com/rust-lang/rust/issues/58178) the borrow
	checker's implementation of NLL.

	## Stable

	These are benchmarks used in the
	[dashboard](https://perf.rust-lang.org/dashboard.html). They provide the
	longest continuous data set for compiler performance. As a result, they are
	quite old (e.g. 2017 or earlier), and not necessarily reflective of typical
	Rust code being written today.

	- encoding: An old crate providing character encoding support. Contains
	some large tables.
	- futures: v0.1.0 of the popular `futures` crate, which was used by many
	Rust programs. Newer versions of this crate (e.g. v0.3.21 from February 2021)
	contain very little code, instead relying on sub-crates. This makes them less
	interesting as benchmarks, because we only measure final crate compilation.
	This is why there is no futures crate among the primary benchmarks.
	- html5ever: See above. This is an older version (v0.5.4) of the crate.
	- inflate: An old implementation of the DEFLATE algorithm. Contains
	a very large function containing many locals and basic blocks, which stresses
	obligation processing.
	- regex: See above. This is an older version of the crate.
	- piston-image: See above. This is an older version of the `image` crate.
	- style-servo: An old version of Servo's `style` crate. A large crate, and
	one used by old versions of Firefox. Built with `--features=gecko`.
	- syn: See above. This is an older version (0.11.11) of the crate.
	- tokio-webpush-simple: A simple web server built with a very old version
	of tokio. Uses futures a lot, but doesn't use `async`/`await`.

	# How to update/add/remove benchmarks

	## Add a new benchmark

	- Decide on which category it belongs to. Probably primary if it's a real-world
	crate, and secondary if it's a stress test or intended to catch specific
	regressions.
	- If it's a third-party crate:
	- If you are keen: talk with a maintainer of the crate to see if there is
	anything we should be aware of when using this crate as a compile-time
	benchmark.
	- Look at [crates.io](https://crates.io) to find the latest (non-prerelease) version.
	- Download it with `collector download -c $CATEGORY -a $ARTIFACT crate $NAME $VERSION`.
	The `$CATEGORY` is probably `primary`. `$ARTIFACT` is either `library` or `binary`, depending
	on what kind of artifact does the benchmark build.
	- It makes it easier for reviewers if you split things into two commits.
	- In the first commit, just add the code for the entire benchmark.
	- Do this by doing `git add` on the new directory.
	- There is no need to remove seemingly unnecessary files such as
	documentation or CI configuration.
	- In the second commit, do everything else.
	- Add `[workspace]` to the very bottom of the benchmark's `Cargo.toml`, if
	doesn't already have a `[workspace]` section. This means commands like
	`cargo build` will work within the benchmark directory.
	- Add any necessary stuff to the `perf-config.json` file.
	- If the benchmark is a sub-crate within a top-level crate, you'll need a
	`"cargo_toml"` entry.
	- If you get a "non-wrapped rustc" error when running it, you'll need a
	`"touch_file"` entry.
	- See [`collector/src/benchmark/mod.rs`](https://github.com/rust-lang/rustc-perf/blob/12cb796f8a932a891b385ba23a36d78a2867ace1/collector/src/benchmark/mod.rs#L24-L27) for a complete reference.
	- Consider adding one or more `N-*.patch` files for the `IncrPatched`
	scenario.
	- If it's a primary benchmark, you should definitely do this.
	- These usually consist of a patch that adds a single
	`println!("testing");` statement somewhere.
	- Creating the patch against what you've committed so far might be useful.
	Use `git diff` from the repository root, or `git diff --relative` within
	the benchmark directory. Copy the output into the `N-*.patch` file.
	- Do a test run with an `IncrPatched` scenario to make sure the patch
	applies correctly, e.g. `target/release/collector bench_local +nightly
	--id Test --profiles=Check --scenarios=IncrPatched
	--include=$NEW_BENCHMARK`
	- Add the new entry to `collector/compile-benchmarks/README.md`.
	- Add a new licensing entry to `collector/compile-benchmarks/REUSE.toml` (see existing entries
	for inspiration).
	- If the benchmark is artificial, use the `MIT OR Apache-2.0` license and set Rust Project
	developers as the copyright owners (see e.g. `await-call-tree` as an example).
	- If the benchmark is a third-party crate, make sure to use its license. Try to find the
	copyright owner in the crate's `COPYRIGHT` or `README` files. If you cannot find it, consider
	using the copyright owner `<crate-name> contributors`.
	- `git add` the `Cargo.lock` file, if it's not already part of the
	benchmark's committed code.
	- If the benchmark has a `.gitignore` file that contains `Cargo.lock`,
	you'll need to comment out that line so that `Cargo.lock` gets uploaded
	in the PR.
	- Consider the benchmarking time for the benchmark.
	- First, measure the entire compilation time with something like this, by
	doing this within the benchmark directory is good:
	```
	CARGO_INCREMENTAL=0 cargo check ; cargo clean
	CARGO_INCREMENTAL=0 cargo build ; cargo clean
	CARGO_INCREMENTAL=0 cargo build --release ; cargo clean
	```
	- Second, compare the final crate time with these commands:
	```
	target/release/collector bench_local +nightly --id Test \
	--profiles=Check,Debug,Opt --scenarios=Full --include=$NEW,helloworld
	target/release/site results.db
	```
	(See [here](../../site/README.md) for instructions on how to build the website).
	Then switch to wall-times, compare `Test` against itself, and toggle the
	"Show non-relevant results"/"Display raw data" check boxes to make sure it
	hasn't changed drastically.
	- E.g. `futures` was changed so it's just a facade for a bunch of
	sub-crates, and the final crate time became very similar to `helloworld`,
	which wasn't interesting.
	- File a PR, including the two sets of timing measurements in the description.

	## Remove a benchmark

	- It makes it easier for reviewers if you split things into two commits.
	- In the first commit just remove the old code.
	- Do this with `git rm -r` on the directory.
	- In the second commit do everything else.
	- Remove the entry from `collector/compile-benchmarks/README.md`.
	- `git grep` for occurrences of the old benchmark name (e.g. in
	`.github/workflows/ci.yml` or `ci/check-*.sh`) and see if anything needs
	changing... usually not.
	- File a PR.

	## Update a benchmark

	- Do this in two steps.
	- First add the new version of the benchmark. Compare the benchmarking time
	of the two versions to make sure nothing outrageous has happened. Once the
	PR is merged, make sure it's running correctly.
	- Second, remove the old version of the benchmark.
	Doing it in two steps ensures we have continuity of profiling coverage in the
	case where something goes wrong.
	- Compare the benchmarking time of the two versions.
	- When adding the new version, for `perf-config.json` and the `N-*.patch`
	files, use the corresponding files for the old version as a starting point.

	# Benchmark update policy

	## Background

	rustc-perf is a "living benchmark suite" that is regularly changed. Some
	benchmarks in rustc-perf are verbatim copies of third-party crates. We
	periodically do a mass update of these benchmarks.

	Benefits of this approach:
	- We ensure we are measuring compilation of the crates most people are using.
	This is most relevant for popular crates.
	- We get coverage of newer language features.

	Costs of this approach:
	- It takes time and effort.
	- We lose some data continuity.
	- But the stable set of benchmarks used for the dashboard are not affected,
	and they provide the greatest continuity.
	- If the code hasn't changed much, it won't have much effect.

	## Update policy

	- The third-party crates should be updated every three years. This is a
	reasonable refresh period that is neither too short or too long. It happens
	to match the Rust edition cycle, but this is just coincidence.
	- All third-party crates that have had at least one new release should be
	updated, even if not much code has changed. This avoids having to make
	decisions about whether a crate has changed enough.
	- When doing this mass update, there may be some benchmarks that are deemed no
	longer interesting and removed. For example, in the 2022 update we found that
	the `futures` crate was no longer interesting because all the functionality
	had been split into sub-crates that rustc-perf doesn't measure. Likewise,
	there may be some new benchmarks that are added.
	- New versions should be added before old versions are removed, to ensure
	continuity of profiling coverage.
	- The ad hoc addition and removal of individual benchmarks can continue
	independently of this update cycle, as per the judgment of the rustc-perf
	maintainers.

	History:
	- The first mass update of third-party crates occurred in [March/April
	2022](https://hackmd.io/d9uE7qgtTWKDLivy0uoVQw).