2020-11-24 Triage Log

This week saw the landing of #79237, which by itself provides no wins but opens the door to support for split debuginfo on macOS. This should eventually show huge wins, as we can likely avoid re-collecting debuginfo while retaining support for lldb and Rust backtraces. #79361 tracks the stabilization of the rustc flag, but the precise rollout to stable users is not yet clear.

Triage done by @jyn514 and @simulacrum. Revision range: c919f490bbcd2b29b74016101f7ec71aaa24bdbb..25a691003cf6676259ee7d4bed05b43cb6283cea

4 regressions, 4 improvements, 2 mixed results. 5 of them in rollups.

Regressions

#79167: linux: try to use libc getrandom to allow interposition #78785

Large regression in instruction counts (up to 7.7% on incr-unchanged builds of deeply-nested-async-opt)
The PR allows intercepting getrandom at runtime with LD_PRELOAD, so it's possible a regression was expected. However, a 40% increase in bootstrap time for libcore seems excessive.
Landed in a rollup, so it's possible another PR may be to blame. Opened #79389 measuring the impact.
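
For readers unfamiliar with why going through libc makes interposition possible, here is a minimal sketch, not the code from the PR. It assumes Linux with glibc, declares dlsym by hand to avoid a dependency on the libc crate, and falls back to /dev/urandom where the PR falls back differently. The names lookup_getrandom and fill_random are made up for illustration. Because the symbol is resolved through the dynamic linker at runtime, a library loaded with LD_PRELOAD that exports getrandom would be found first.

```rust
use std::ffi::{c_char, c_uint, c_void, CStr};
use std::fs::File;
use std::io::Read;

unsafe extern "C" {
    // Declared by hand (assumption: the platform libc exports dlsym).
    fn dlsym(handle: *mut c_void, symbol: *const c_char) -> *mut c_void;
}

// C signature: ssize_t getrandom(void *buf, size_t buflen, unsigned int flags)
type GetRandomFn = unsafe extern "C" fn(*mut c_void, usize, c_uint) -> isize;

/// Looks up `getrandom` through the dynamic linker instead of linking to it.
fn lookup_getrandom() -> Option<GetRandomFn> {
    let name = CStr::from_bytes_with_nul(b"getrandom\0").unwrap();
    // A null handle is RTLD_DEFAULT on Linux/glibc: search the global scope,
    // which includes any LD_PRELOAD-ed library ahead of libc itself.
    let sym = unsafe { dlsym(std::ptr::null_mut(), name.as_ptr()) };
    if sym.is_null() {
        None
    } else {
        Some(unsafe { std::mem::transmute::<*mut c_void, GetRandomFn>(sym) })
    }
}

/// Fills `buf` with random bytes and reports which source was used.
fn fill_random(buf: &mut [u8]) -> std::io::Result<&'static str> {
    if let Some(getrandom) = lookup_getrandom() {
        let mut filled = 0;
        while filled < buf.len() {
            let rest = &mut buf[filled..];
            let n = unsafe { getrandom(rest.as_mut_ptr().cast::<c_void>(), rest.len(), 0) };
            if n <= 0 {
                break; // error or no progress: give up and use the fallback
            }
            filled += n as usize;
        }
        if filled == buf.len() {
            return Ok("libc getrandom");
        }
    }
    // Fallback chosen for this sketch only.
    File::open("/dev/urandom")?.read_exact(buf)?;
    Ok("/dev/urandom")
}

fn main() -> std::io::Result<()> {
    let mut buf = [0u8; 16];
    let source = fill_random(&mut buf)?;
    println!("{} bytes from {}", buf.len(), source);
    Ok(())
}
```

The extra symbol lookup and indirect call are the kind of overhead such a change can add, though the sketch says nothing about whether that explains the numbers above.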

#78646: Use PackedFingerprint in DepNode to reduce memory consumption

Moderate regression in instruction counts (up to 3.2% on full builds of keccak-check)
Major improvement in memory usage (up to 21.6% on full builds of keccak-opt)
The regression in cycle count is worse than the last perf run on the PR, but overall seems to be expected. Not leaving a comment.
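
As a rough illustration of where the memory saving comes from, the sketch below compares a node that stores an 8-byte-aligned 128-bit fingerprint directly with one that stores it behind a packed wrapper. The struct names, the u16 kind field, and the repr(C) layouts are assumptions for the example, not rustc's actual definitions.

```rust
use std::mem::size_of;

/// 128-bit hash value; the two `u64` halves give it 8-byte alignment.
#[repr(C)]
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
struct Fingerprint(u64, u64);

/// Same bytes, but alignment 1, so no padding is needed in front of it.
#[repr(C, packed)]
#[derive(Clone, Copy)]
struct PackedFingerprint(Fingerprint);

/// Hypothetical node without packing: 2 bytes + 6 bytes padding + 16 bytes.
#[repr(C)]
struct PaddedNode {
    kind: u16,
    hash: Fingerprint,
}

/// Hypothetical node with packing: 2 bytes + 16 bytes.
#[repr(C)]
struct PackedNode {
    kind: u16,
    hash: PackedFingerprint,
}

impl PackedNode {
    /// Copies the fingerprint out by value; references to packed fields
    /// are not allowed, so reads go through a copy (possibly unaligned).
    fn hash(&self) -> Fingerprint {
        self.hash.0
    }
}

fn main() {
    let fp = Fingerprint(1, 2);
    let padded = PaddedNode { kind: 7, hash: fp };
    let packed = PackedNode { kind: 7, hash: PackedFingerprint(fp) };
    assert_eq!(padded.hash, packed.hash());
    assert_eq!(padded.kind, packed.kind);
    println!("padded: {} bytes", size_of::<PaddedNode>());
    println!("packed: {} bytes", size_of::<PackedNode>());
}
```

The trade-off is the usual one for packed data: less memory per node in exchange for unaligned reads, which is consistent with seeing a memory win alongside a small instruction-count regression.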

#79237: std: Update the backtrace crate submodule

Moderate regression in instruction counts (up to 1.4% on incr-unchanged builds of unify-linearly-debug), mostly on debug and opt builds.
@ehuss reports a 600% decrease in incremental build times when using -Z run-dsymutil=no on macOS (!!). #79361 tracks enabling -Z run-dsymutil=no by default.
@alexcrichton theorizes the regression is because there's more code in libstd overall (since it now handles archives of debug symbols).
Not leaving a nag, since the regression is small and the improvement more than makes up for it.

#79273: Rollup of 8 pull requests

Moderate regression in instruction counts (up to 1.8% on full builds of coercions-debug). @Mark-Simulacrum thinks this is a false positive, since there are no similar regressions in -opt or -check builds.
Minor improvements in instruction counts on doc builds (up to 0.4% on unused-warnings-doc). Likely due to #79264: Get rid of some doctree items.
Most regressions are in LLVM/codegen, so likely due to #79067: Refactor the abi handling code a bit.

Improvements

#79200: Rollup of 14 pull requests

Moderate improvement in instruction counts (up to -1.9% on full builds of ctfe-stress-4-opt, up to -5.5% on doc builds)
Improvement is almost completely due to a -8.5% improvement on eval_to_allocation_raw.
Unclear which PR caused the improvement; both #79149 and #79101 are likely candidates. Left a nag asking the authors to use rollup=never in the future.

#79220

Moderate improvement in instruction counts (up to -3.3% on full builds of deeply-nested-async-check)
Improvement is almost completely due to a -25.6% improvement in normalize_generic_arg_after_erasing_regions and -24.7% improvement in erase_regions_ty.
Likely due to #79193, which reverts an earlier PR. We should keep an eye on this, since it will likely regress again when the validation is re-enabled.

#78088: Add lint for panic!("{}")

Moderate improvement in instruction counts (up to -3.3% on incr-full builds of futures-opt)
The improvement is likely because the desugaring of panic! changed.
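
For context on what the lint is about: a single string literal passed to panic! has historically not been treated as a format string, so braces in it were printed literally. The sketch below shows the two unambiguous spellings, a placeholder with a matching argument and a "{}" format string that prints a brace-containing message as-is. The helper panic_message and the example strings are made up for illustration.

```rust
use std::panic;

/// Runs `f`, catches the panic it raises, and returns the panic message.
fn panic_message<F: FnOnce() + panic::UnwindSafe>(f: F) -> String {
    let payload = panic::catch_unwind(f).expect_err("closure should panic");
    if let Some(s) = payload.downcast_ref::<String>() {
        s.clone()
    } else if let Some(s) = payload.downcast_ref::<&'static str>() {
        s.to_string()
    } else {
        String::from("<non-string payload>")
    }
}

fn main() {
    // Hypothetical value, used only for illustration.
    let path = "config.toml";

    // Placeholder with a matching argument: formatted as usual.
    let formatted = panic_message(|| panic!("missing file: {}", path));

    // A message that itself contains braces is passed as an argument to
    // "{}", so it is printed verbatim rather than parsed as a format string.
    let literal = panic_message(|| panic!("{}", "unexpected token: {}"));

    println!("{}", formatted);
    println!("{}", literal);
}
```

Both panics also print to stderr through the default panic hook before the message is captured.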

#78343

Moderate improvement in instruction counts (up to -3.0% on incr-full builds of wg-grammar-opt)
The improvement is likely because the way panic! is expanded changed.

#79319

Very large improvement in instruction counts (up to -26.4% on incr-patched: println builds of cargo-opt)
Predominantly incremental perf getting better, likely due to #77697 (Split each iterator adapter and source into individual modules), which presumably shuffled CGU ordering in core/std, avoiding multiple LLVM module invalidations.

Mixed

#78461

Very large regression in instruction counts (up to 36.6% on incr-patched: println builds of clap-rs-debug)
Moderate improvement in instruction counts (up to -2.6% on incr-patched: Compiler new builds of regex-opt)
Pretty much limited to just incremental builds; the addition of allocators to Vec is likely causing some problems in incremental caching. Potentially worth tracking down the specific cause.

#79186

Moderate improvement in instruction counts (up to -4.5% on full builds of regression-31157-opt)
Moderate regression in instruction counts (up to 4.4% on full builds of deeply-nested-async-check)
Seems to largely be an improvement due to fewer queries being run in some cases, but there is some upfront cost: presumably the regressed test case didn't end up calling the now less frequently needed queries, but still paid the price in metadata decoding.

Nags requiring follow up

Left a comment nagging the author of the LD_PRELOAD PR.
Left a comment asking why a codegen refactor could have regressed instruction count.

Compiler team notes

https://github.com/rust-lang/rust/pull/78461 regressed incremental performance on debug builds of clap (interestingly, not opt builds). It may be worth investigating why, as the pattern of adding a generic parameter with a default really should not be causing regressions in downstream code. Not all of the regression is in LLVM. See by-query breakdown: https://perf.rust-lang.org/detailed-query.html?commit=a1a13b2bc4fa6370b9501135d97c5fe0bc401894&base_commit=da384694807172f0ca40eca2e49a11688aba6e93&benchmark=clap-rs-debug&run_name=incr-patched:%20println.
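
To make "adding a generic parameter with a default" concrete, here is a minimal sketch of the pattern with made-up types (Stack, Allocator, Global, Arena). It is not the Vec implementation. The point is that downstream code written against Stack<T> keeps compiling unchanged once the defaulted parameter A is added, which is why a downstream regression is surprising.

```rust
/// Hypothetical stand-in for an allocator trait.
trait Allocator {
    fn name(&self) -> &'static str;
}

struct Global;
impl Allocator for Global {
    fn name(&self) -> &'static str {
        "Global"
    }
}

struct Arena;
impl Allocator for Arena {
    fn name(&self) -> &'static str {
        "Arena"
    }
}

/// `A` is the later addition; the default keeps `Stack<T>` meaning
/// `Stack<T, Global>`.
struct Stack<T, A: Allocator = Global> {
    items: Vec<T>,
    alloc: A,
}

impl<T> Stack<T> {
    /// Defined only for the default allocator, so `Stack::new()` needs no
    /// extra annotation.
    fn new() -> Self {
        Stack { items: Vec::new(), alloc: Global }
    }
}

impl<T, A: Allocator> Stack<T, A> {
    fn new_in(alloc: A) -> Self {
        Stack { items: Vec::new(), alloc }
    }
    fn push(&mut self, item: T) {
        self.items.push(item);
    }
    fn len(&self) -> usize {
        self.items.len()
    }
    fn allocator_name(&self) -> &'static str {
        self.alloc.name()
    }
}

/// "Downstream" code that never mentions `A`.
fn downstream() -> Stack<u32> {
    let mut s = Stack::new();
    s.push(1);
    s.push(2);
    s
}

fn main() {
    let s = downstream();
    println!("{} items, allocator {}", s.len(), s.allocator_name());
    let mut a = Stack::new_in(Arena);
    a.push("x");
    println!("{} items, allocator {}", a.len(), a.allocator_name());
}
```

Even though the source is unchanged, every instantiation now carries the extra type argument, so any impact would show up in what the compiler hashes, caches, and monomorphizes rather than in user code.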