src/tools/rustc-perf/triage/2020-11-24.md - toolchain/rustc - Git at Google

 # 2020-11-24 Triage Log

 This week saw landing of #79237 which by itself provides no wins but opens the
 door to support for split debuginfo on macOS. This'll eventually show huge wins
 as we can likely avoid re-collecting debuginfo while retaining support for
 lldb and Rust backtraces. #79361 tracks the stabilization of the rustc flag, but
 the precise rollout to stable users is not yet 100% clear.

 Triage done by **@jyn514** and **@simulacrum**.
 Revision range: [c919f490bbcd2b29b74016101f7ec71aaa24bdbb..25a691003cf6676259ee7d4bed05b43cb6283cea](https://perf.rust-lang.org/?start=c919f490bbcd2b29b74016101f7ec71aaa24bdbb&end=25a691003cf6676259ee7d4bed05b43cb6283cea&absolute=false&stat=instructions%3Au)

 4 regressions, 4 improvements, 2 mixed results.
 5 of them in rollups.

 #### Regressions

 [#79167](https://github.com/rust-lang/rust/issues/79167): linux: try to use libc getrandom to allow interposition #78785
 - Large regression in [instruction counts](https://perf.rust-lang.org/compare.html?start=7d747db0d5dd8f08f2efb073e2e77a34553465a7&end=8d2d0014922e9f541694bfe87642749239990e0e&stat=instructions:u) (up to 7.7% on `incr-unchanged` builds of `deeply-nested-async-opt`)
 - The PR allows intercepting `getrandom` at runtime with `LD_PRELOAD`, so it's possible a regression was expected. However, 40% increased bootstrap times for `libcore` seems excessive.
 - Landed in a rollup, so it's possible another PR may be to blame. Opened [#79389](https://github.com/rust-lang/rust/pull/79389) measuring the impact.

 [#78646](https://github.com/rust-lang/rust/issues/78646): Use `PackedFingerprint` in `DepNode` to reduce memory consumption
 - Moderate regression in [instruction counts](https://perf.rust-lang.org/compare.html?start=172acf8f61018df3719e42e633ffd62ebecaa1e7&end=ae6aa22cf26fede2177abe4ff974030058885b7a&stat=instructions:u) (up to 3.2% on `full` builds of `keccak-check`)
 - Major improvement in [memory usage](https://perf.rust-lang.org/compare.html?start=172acf8f61018df3719e42e633ffd62ebecaa1e7&end=ae6aa22cf26fede2177abe4ff974030058885b7a&stat=max-rss) (up to 21.6 on `full` builds of `keccak-opt`)
 - The regression in cycle count is worse than the last perf run on the PR, but overall seems to be expected. Not leaving a comment.

 [#79237](https://github.com/rust-lang/rust/issues/79237): std: Update the backtrace crate submodule
 - Moderate regression in [instruction counts](https://perf.rust-lang.org/compare.html?start=d806d656578c2d6b34cf96809862e8cffb293a68&end=3adedb8f4c5bb71e9e8a21a047cf8ed121ce0e75&stat=instructions:u) (up to 1.4% on `incr-unchanged` builds of `unify-linearly-debug`), mostly on `debug` and `opt` builds.
 - **@ehuss** reports a 600% decrease in incremental builds when using `-Z run-dsymutil=no` on MacOS (!!). [#79361](https://github.com/rust-lang/rust/issues/79361) tracks enabling `-Z run-dsymutil=no` by default.
 - **@alexcrichton** theorizes the regression is because there's more code in libstd overall (since it now handles archives of debug symbol).
 - Not leaving a nag, since the regression is small and the improvement more than makes up for it.

 [#79273](https://github.com/rust-lang/rust/issues/79273): Rollup of 8 pull requests
 - Moderate regression in [instruction counts](https://perf.rust-lang.org/compare.html?start=3adedb8f4c5bb71e9e8a21a047cf8ed121ce0e75&end=da384694807172f0ca40eca2e49a11688aba6e93&stat=instructions:u) (up to 1.8% on `full` builds of `coercions-debug`). **@Mark-Simulacrum** thinks this is a false positive, since there are no similar regressions in `-opt` or `-check` builds.
 - Minor improvements in [instruction counts](https://perf.rust-lang.org/compare.html?start=3adedb8f4c5bb71e9e8a21a047cf8ed121ce0e75&end=da384694807172f0ca40eca2e49a11688aba6e93&stat=instructions:u) on `doc` builds (up to .4% on `unused-warnings-doc`). Likely due to [#79264](https://github.com/rust-lang/rust/pull/79264): Get rid of some doctree items.
 - Most regressions are in LLVM/codegen, so likely due to [#79067](https://github.com/rust-lang/rust/pull/79067/): Refactor the abi handling code a bit.

 #### Improvements

 [#79200](https://github.com/rust-lang/rust/issues/79200): Rollup of 14 pull requests
 - Moderate improvement in [instruction counts](https://perf.rust-lang.org/compare.html?start=3d3c8c5e0d534cdd794f1b3359089eba031d492c&end=fe982319aa0aa5bbfc2795791a753832292bd2ba&stat=instructions:u) (up to -1.9% on `full` builds of `ctfe-stress-4-opt`, up to -5.5 on `doc` builds)
 - Improvement is almost completely due to a -8.5% improvement on `eval_to_allocation_raw`
 - Unclear which PR caused the improvement; both [#79149](https://github.com/rust-lang/rust/pull/79149) and [#79101](https://github.com/rust-lang/rust/pull/79101) are likely candidates. [Left a nag](https://github.com/rust-lang/rust/pull/79200#issuecomment-733237927) asking the authors to use `rollup=never` in the future.

 [#79220](https://github.com/rust-lang/rust/issues/79220)
 - Moderate improvement in [instruction counts](https://perf.rust-lang.org/compare.html?start=09c9c9f7da72b774cc445c0f56fc0b9792a49647&end=4ec27e4b79891b0ebc2ad71a3c4ac94f67d93f93&stat=instructions:u) (up to -3.3% on `full` builds of `deeply-nested-async-check`)
 - Improvement is almost completely due to a -25.6% improvement in `normalize_generic_arg_after_erasing_regions` and -24.7% improvement in `erase_regions_ty`.
 - Likely due to [#79193](https://github.com/rust-lang/rust/pull/79193), which reverts an earlier PR. We should keep an eye on this, since it will likely regress again when the validation is re-enabled.

 [#78088](https://github.com/rust-lang/rust/issues/78088): Add lint for `panic!("{}")`
 - Moderate improvement in [instruction counts](https://perf.rust-lang.org/compare.html?start=4ec27e4b79891b0ebc2ad71a3c4ac94f67d93f93&end=74285eb3a83eac639f9c54ba8c4ccf9879b3b00a&stat=instructions:u) (up to -3.3% on `incr-full` builds of `futures-opt`)
 - The improvement is likely because the desugaring of `panic!` changed.

 [#78343](https://github.com/rust-lang/rust/issues/78343)
 - Moderate improvement in [instruction counts](https://perf.rust-lang.org/compare.html?start=d9a105fdd46c926ae606777a46dd90e5b838f92f&end=f32a0cce2fd5aaf5f361192af59cf1f2afa5f0ac&stat=instructions:u) (up to -3.0% on `incr-full` builds of `wg-grammar-opt`)
 - The improvement is likely because the way `panic!` is expanded changed.

 [#79319](https://github.com/rust-lang/rust/issues/79319)
 - Very large improvement in [instruction counts](https://perf.rust-lang.org/compare.html?start=a0d664bae6ca79c54cc054aa2403198e105190a2&end=32da90b431919eedb3e281a91caea063ba4edb77&stat=instructions:u) (up to -26.4% on `incr-patched: println` builds of `cargo-opt`)
 - Predominantly incremental perf getting better, likely due to
   #77697 Split each iterator adapter and source into individual modules
   which presumably shuffled CGU ordering in core/std, avoiding multiple LLVM
   module invalidations.

 #### Mixed

 [#78461](https://github.com/rust-lang/rust/issues/78461)
 - Very large regression in [instruction counts](https://perf.rust-lang.org/compare.html?start=da384694807172f0ca40eca2e49a11688aba6e93&end=a1a13b2bc4fa6370b9501135d97c5fe0bc401894&stat=instructions:u) (up to 36.6% on `incr-patched: println` builds of `clap-rs-debug`)
 - Moderate improvement in [instruction counts](https://perf.rust-lang.org/compare.html?start=da384694807172f0ca40eca2e49a11688aba6e93&end=a1a13b2bc4fa6370b9501135d97c5fe0bc401894&stat=instructions:u) (up to -2.6% on `incr-patched: Compiler new` builds of `regex-opt`)
 - Pretty much limited to just incremental builds, likely the addition of
   allocators to Vec is causing some problems in incremental caching.
   Potentially worth tracking down the specific cause.

 [#79186](https://github.com/rust-lang/rust/issues/79186)
 - Moderate improvement in [instruction counts](https://perf.rust-lang.org/compare.html?start=068320b39e3e4839d832b3aa71fa910ba170673b&end=40cf72108edb9b8633a9d284b238988309204494&stat=instructions:u) (up to -4.5% on `full` builds of `regression-31157-opt`)
 - Moderate regression in [instruction counts](https://perf.rust-lang.org/compare.html?start=068320b39e3e4839d832b3aa71fa910ba170673b&end=40cf72108edb9b8633a9d284b238988309204494&stat=instructions:u) (up to 4.4% on `full` builds of `deeply-nested-async-check`)
 - Seems to largely be an improvement due to less queries being run in some
   cases, but there is some upfront cost -- presumably the regressed test case
   didn't end up calling/using the now less needed queries, but paid the price in
   metadata decoding.

 #### Nags requiring follow up

 - [Left a comment](https://github.com/rust-lang/rust/pull/79167#issuecomment-733207145) nagging the author of the `LD_PRELOAD` PR.
 - [Left a comment](https://github.com/rust-lang/rust/pull/79273#issuecomment-733224830) asking why a codegen refactor could have regressed instruction count.

 #### Compiler team notes

 * https://github.com/rust-lang/rust/pull/78461 regressed incremental performance
   on debug builds of clap (interestingly, not opt builds). It may be worth
   investigating why, as the pattern of adding a generic parameter with a default
   really should not be causing regressions in downstream code. Not all of the
   regression is in LLVM.
   See by-query breakdown:
   https://perf.rust-lang.org/detailed-query.html?commit=a1a13b2bc4fa6370b9501135d97c5fe0bc401894&base_commit=da384694807172f0ca40eca2e49a11688aba6e93&benchmark=clap-rs-debug&run_name=incr-patched:%20println.
	# 2020-11-24 Triage Log

	This week saw landing of #79237 which by itself provides no wins but opens the
	door to support for split debuginfo on macOS. This'll eventually show huge wins
	as we can likely avoid re-collecting debuginfo while retaining support for
	lldb and Rust backtraces. #79361 tracks the stabilization of the rustc flag, but
	the precise rollout to stable users is not yet 100% clear.

	Triage done by @jyn514 and @simulacrum.
	Revision range: [c919f490bbcd2b29b74016101f7ec71aaa24bdbb..25a691003cf6676259ee7d4bed05b43cb6283cea](https://perf.rust-lang.org/?start=c919f490bbcd2b29b74016101f7ec71aaa24bdbb&end=25a691003cf6676259ee7d4bed05b43cb6283cea&absolute=false&stat=instructions%3Au)

	4 regressions, 4 improvements, 2 mixed results.
	5 of them in rollups.

	#### Regressions

	[#79167](https://github.com/rust-lang/rust/issues/79167): linux: try to use libc getrandom to allow interposition #78785
	- Large regression in [instruction counts](https://perf.rust-lang.org/compare.html?start=7d747db0d5dd8f08f2efb073e2e77a34553465a7&end=8d2d0014922e9f541694bfe87642749239990e0e&stat=instructions:u) (up to 7.7% on `incr-unchanged` builds of `deeply-nested-async-opt`)
	- The PR allows intercepting `getrandom` at runtime with `LD_PRELOAD`, so it's possible a regression was expected. However, 40% increased bootstrap times for `libcore` seems excessive.
	- Landed in a rollup, so it's possible another PR may be to blame. Opened [#79389](https://github.com/rust-lang/rust/pull/79389) measuring the impact.

	[#78646](https://github.com/rust-lang/rust/issues/78646): Use `PackedFingerprint` in `DepNode` to reduce memory consumption
	- Moderate regression in [instruction counts](https://perf.rust-lang.org/compare.html?start=172acf8f61018df3719e42e633ffd62ebecaa1e7&end=ae6aa22cf26fede2177abe4ff974030058885b7a&stat=instructions:u) (up to 3.2% on `full` builds of `keccak-check`)
	- Major improvement in [memory usage](https://perf.rust-lang.org/compare.html?start=172acf8f61018df3719e42e633ffd62ebecaa1e7&end=ae6aa22cf26fede2177abe4ff974030058885b7a&stat=max-rss) (up to 21.6 on `full` builds of `keccak-opt`)
	- The regression in cycle count is worse than the last perf run on the PR, but overall seems to be expected. Not leaving a comment.

	[#79237](https://github.com/rust-lang/rust/issues/79237): std: Update the backtrace crate submodule
	- Moderate regression in [instruction counts](https://perf.rust-lang.org/compare.html?start=d806d656578c2d6b34cf96809862e8cffb293a68&end=3adedb8f4c5bb71e9e8a21a047cf8ed121ce0e75&stat=instructions:u) (up to 1.4% on `incr-unchanged` builds of `unify-linearly-debug`), mostly on `debug` and `opt` builds.
	- @ehuss reports a 600% decrease in incremental builds when using `-Z run-dsymutil=no` on MacOS (!!). [#79361](https://github.com/rust-lang/rust/issues/79361) tracks enabling `-Z run-dsymutil=no` by default.
	- @alexcrichton theorizes the regression is because there's more code in libstd overall (since it now handles archives of debug symbol).
	- Not leaving a nag, since the regression is small and the improvement more than makes up for it.

	[#79273](https://github.com/rust-lang/rust/issues/79273): Rollup of 8 pull requests
	- Moderate regression in [instruction counts](https://perf.rust-lang.org/compare.html?start=3adedb8f4c5bb71e9e8a21a047cf8ed121ce0e75&end=da384694807172f0ca40eca2e49a11688aba6e93&stat=instructions:u) (up to 1.8% on `full` builds of `coercions-debug`). @Mark-Simulacrum thinks this is a false positive, since there are no similar regressions in `-opt` or `-check` builds.
	- Minor improvements in [instruction counts](https://perf.rust-lang.org/compare.html?start=3adedb8f4c5bb71e9e8a21a047cf8ed121ce0e75&end=da384694807172f0ca40eca2e49a11688aba6e93&stat=instructions:u) on `doc` builds (up to .4% on `unused-warnings-doc`). Likely due to [#79264](https://github.com/rust-lang/rust/pull/79264): Get rid of some doctree items.
	- Most regressions are in LLVM/codegen, so likely due to [#79067](https://github.com/rust-lang/rust/pull/79067/): Refactor the abi handling code a bit.

	#### Improvements

	[#79200](https://github.com/rust-lang/rust/issues/79200): Rollup of 14 pull requests
	- Moderate improvement in [instruction counts](https://perf.rust-lang.org/compare.html?start=3d3c8c5e0d534cdd794f1b3359089eba031d492c&end=fe982319aa0aa5bbfc2795791a753832292bd2ba&stat=instructions:u) (up to -1.9% on `full` builds of `ctfe-stress-4-opt`, up to -5.5 on `doc` builds)
	- Improvement is almost completely due to a -8.5% improvement on `eval_to_allocation_raw`
	- Unclear which PR caused the improvement; both [#79149](https://github.com/rust-lang/rust/pull/79149) and [#79101](https://github.com/rust-lang/rust/pull/79101) are likely candidates. [Left a nag](https://github.com/rust-lang/rust/pull/79200#issuecomment-733237927) asking the authors to use `rollup=never` in the future.

	[#79220](https://github.com/rust-lang/rust/issues/79220)
	- Moderate improvement in [instruction counts](https://perf.rust-lang.org/compare.html?start=09c9c9f7da72b774cc445c0f56fc0b9792a49647&end=4ec27e4b79891b0ebc2ad71a3c4ac94f67d93f93&stat=instructions:u) (up to -3.3% on `full` builds of `deeply-nested-async-check`)
	- Improvement is almost completely due to a -25.6% improvement in `normalize_generic_arg_after_erasing_regions` and -24.7% improvement in `erase_regions_ty`.
	- Likely due to [#79193](https://github.com/rust-lang/rust/pull/79193), which reverts an earlier PR. We should keep an eye on this, since it will likely regress again when the validation is re-enabled.

	[#78088](https://github.com/rust-lang/rust/issues/78088): Add lint for `panic!("{}")`
	- Moderate improvement in [instruction counts](https://perf.rust-lang.org/compare.html?start=4ec27e4b79891b0ebc2ad71a3c4ac94f67d93f93&end=74285eb3a83eac639f9c54ba8c4ccf9879b3b00a&stat=instructions:u) (up to -3.3% on `incr-full` builds of `futures-opt`)
	- The improvement is likely because the desugaring of `panic!` changed.

	[#78343](https://github.com/rust-lang/rust/issues/78343)
	- Moderate improvement in [instruction counts](https://perf.rust-lang.org/compare.html?start=d9a105fdd46c926ae606777a46dd90e5b838f92f&end=f32a0cce2fd5aaf5f361192af59cf1f2afa5f0ac&stat=instructions:u) (up to -3.0% on `incr-full` builds of `wg-grammar-opt`)
	- The improvement is likely because the way `panic!` is expanded changed.

	[#79319](https://github.com/rust-lang/rust/issues/79319)
	- Very large improvement in [instruction counts](https://perf.rust-lang.org/compare.html?start=a0d664bae6ca79c54cc054aa2403198e105190a2&end=32da90b431919eedb3e281a91caea063ba4edb77&stat=instructions:u) (up to -26.4% on `incr-patched: println` builds of `cargo-opt`)
	- Predominantly incremental perf getting better, likely due to
	#77697 Split each iterator adapter and source into individual modules
	which presumably shuffled CGU ordering in core/std, avoiding multiple LLVM
	module invalidations.

	#### Mixed

	[#78461](https://github.com/rust-lang/rust/issues/78461)
	- Very large regression in [instruction counts](https://perf.rust-lang.org/compare.html?start=da384694807172f0ca40eca2e49a11688aba6e93&end=a1a13b2bc4fa6370b9501135d97c5fe0bc401894&stat=instructions:u) (up to 36.6% on `incr-patched: println` builds of `clap-rs-debug`)
	- Moderate improvement in [instruction counts](https://perf.rust-lang.org/compare.html?start=da384694807172f0ca40eca2e49a11688aba6e93&end=a1a13b2bc4fa6370b9501135d97c5fe0bc401894&stat=instructions:u) (up to -2.6% on `incr-patched: Compiler new` builds of `regex-opt`)
	- Pretty much limited to just incremental builds, likely the addition of
	allocators to Vec is causing some problems in incremental caching.
	Potentially worth tracking down the specific cause.

	[#79186](https://github.com/rust-lang/rust/issues/79186)
	- Moderate improvement in [instruction counts](https://perf.rust-lang.org/compare.html?start=068320b39e3e4839d832b3aa71fa910ba170673b&end=40cf72108edb9b8633a9d284b238988309204494&stat=instructions:u) (up to -4.5% on `full` builds of `regression-31157-opt`)
	- Moderate regression in [instruction counts](https://perf.rust-lang.org/compare.html?start=068320b39e3e4839d832b3aa71fa910ba170673b&end=40cf72108edb9b8633a9d284b238988309204494&stat=instructions:u) (up to 4.4% on `full` builds of `deeply-nested-async-check`)
	- Seems to largely be an improvement due to less queries being run in some
	cases, but there is some upfront cost -- presumably the regressed test case
	didn't end up calling/using the now less needed queries, but paid the price in
	metadata decoding.

	#### Nags requiring follow up

	- [Left a comment](https://github.com/rust-lang/rust/pull/79167#issuecomment-733207145) nagging the author of the `LD_PRELOAD` PR.
	- [Left a comment](https://github.com/rust-lang/rust/pull/79273#issuecomment-733224830) asking why a codegen refactor could have regressed instruction count.

	#### Compiler team notes

	* https://github.com/rust-lang/rust/pull/78461 regressed incremental performance
	on debug builds of clap (interestingly, not opt builds). It may be worth
	investigating why, as the pattern of adding a generic parameter with a default
	really should not be causing regressions in downstream code. Not all of the
	regression is in LLVM.
	See by-query breakdown:
	https://perf.rust-lang.org/detailed-query.html?commit=a1a13b2bc4fa6370b9501135d97c5fe0bc401894&base_commit=da384694807172f0ca40eca2e49a11688aba6e93&benchmark=clap-rs-debug&run_name=incr-patched:%20println.