2022-03-15 Triage Log

Largely a quiet week. The perf highlight is the use of real-world crates such as syn, cargo, and serde when collecting profile-guided optimization (PGO) profiles for LLVM. Previously only libcore was used for LLVM, though rustc PGO had more crates involved. This led to some decent improvements in compilation of real-world crates (upwards of 5.5%).

On the regression side, the regressions were all largely small but contained inside of rollups, making them hard to diagnose and correct. The perf team continues to work on process improvements that let changes to the compiler land through CI quickly while minimizing the perf regressions that can sneak through.

Triage done by @rylev. Revision range: 10dccdc7fcbdc64ee9efe2c1ed975ab8c1d61287..3ba1ebea122238d1a5c613deb1bf60ce24bd8fd8

2 Regressions, 3 Improvements, 3 Mixed; 3 of them in rollups
42 comparisons made in total
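
Each entry in this log reports arithmetic means over the relevant changes for a pull request. As a rough illustration of how such figures relate to one another, here is a minimal Rust sketch. It assumes the input is a list of percent changes that have already been filtered for relevance, with positive values meaning regressions. The `Summary` type, the `summarize` function, and the sample data are hypothetical and are not taken from rustc-perf.

```rust
/// Summary of a set of relevant percent changes (positive = regression).
/// Hypothetical type for illustration; not part of rustc-perf.
#[derive(Debug, PartialEq)]
struct Summary {
    mean_regressions: Option<f64>,
    mean_improvements: Option<f64>,
    mean_all: Option<f64>,
}

/// Arithmetic mean, or `None` for an empty slice.
fn mean(values: &[f64]) -> Option<f64> {
    if values.is_empty() {
        None
    } else {
        Some(values.iter().sum::<f64>() / values.len() as f64)
    }
}

/// Splits the changes by sign and averages each group, plus the whole set.
/// Assumes `changes` has already been filtered down to relevant results.
fn summarize(changes: &[f64]) -> Summary {
    let regressions: Vec<f64> = changes.iter().copied().filter(|c| *c > 0.0).collect();
    let improvements: Vec<f64> = changes.iter().copied().filter(|c| *c < 0.0).collect();
    Summary {
        mean_regressions: mean(&regressions),
        mean_improvements: mean(&improvements),
        mean_all: mean(changes),
    }
}

fn main() {
    // Made-up percent changes, not taken from any PR in this log.
    let changes = [1.5, 0.5, -1.0];
    println!("{:?}", summarize(&changes));
}
```

Note that in this sketch the mean of all relevant changes is not the average of the other two means: every benchmark result is weighted equally, which is how an entry can report a positive regression mean alongside a negative overall mean.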

Regressions

Rollup of 8 pull requests #94814

Arithmetic mean of relevant regressions: 2.1%
Arithmetic mean of all relevant changes: 1.8%
Largest regression in instruction counts: 16.8% on incr-patched: println builds of cargo opt

Mostly an extremely large regression in compiling optimized builds of cargo in an incremental patch scenario.
It looks like the regression in the impacted test case is largely in codegen.
#94809 is the only change that meaningfully touches codegen, and luckily testing whether reverting the change makes a difference should be trivial to do. Left a comment here.

Rollup of 7 pull requests #94824

Arithmetic mean of relevant regressions: 0.5%
Arithmetic mean of relevant improvements: -0.3%
Arithmetic mean of all relevant changes: 0.4%
Largest regression in instruction counts: 1.5% on incr-unchanged builds of unicode_normalization check

Unfortunately there are many PRs that could plausibly contribute to the performance change:
- #93950 (Use modern formatting for format! macros)
- #94274 (Treat unstable lints as unknown)
- #94368 ([1/2] Implement macro meta-variable expressions)
The overall regression seems low enough that I don’t think we need to consider reverting, though. Unfortunately we don’t have a good process for determining the culprit in cases like this, where many PRs seem roughly equally likely to be the cause.
Left a comment as such here.

Improvements

Improve AdtDef interning. #94733

Arithmetic mean of relevant improvements: -0.5%
Largest improvement in instruction counts: -1.2% on full builds of match-stress-enum doc

Queryify is_doc_hidden #94897

Arithmetic mean of relevant improvements: -0.7%
Largest improvement in instruction counts: -1.1% on full builds of projection-caching doc

Gather LLVM PGO profiles from rustc-perf suite on real-world crates #94704

Arithmetic mean of relevant improvements: -2.8%
Largest improvement in instruction counts: -5.6% on incr-full builds of style-servo debug

Mixed

Treat constant values as mir::ConstantKind::Val #94059

Arithmetic mean of relevant regressions: 1.3%
Arithmetic mean of relevant improvements: -1.0%
Arithmetic mean of all relevant changes: -0.9%
Largest improvement in instruction counts: -6.6% on full builds of ctfe-stress-4 opt
Largest regression in instruction counts: 1.6% on full builds of keccak check

Since the regressions are all in secondary benchmarks and relatively small, we consider this to be an improvement rather than a mixed result.

Change several HashMaps to IndexMap to improve incremental hashing performance #90253

Arithmetic mean of relevant regressions: 0.3%
Arithmetic mean of relevant improvements: -0.8%
Arithmetic mean of all relevant changes: -0.2%
Largest improvement in instruction counts: -7.5% on incr-full builds of clap-rs check
Largest regression in instruction counts: 0.6% on full builds of deep-vector check

Perf was run previously, and it was found that there was a large improvement to clap-rs but otherwise an overall performance wash.
This story has not really changed, so the PR was marked as triaged.
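
For context on the PR title, here is a minimal Rust sketch of why an insertion-ordered map can be cheaper to hash deterministically than a `HashMap`. It assumes the cost being avoided is the sort needed to put an unordered map into a canonical order before hashing. This is an illustration only and not rustc's actual stable-hashing code: the function names are hypothetical, and a slice of pairs stands in for `IndexMap` so the example needs only the standard library.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Deterministic hash of a `HashMap`: iteration order is unspecified,
/// so the entries are sorted into a canonical order first.
/// Hypothetical helper for illustration.
fn stable_hash_unordered(map: &HashMap<String, u32>) -> u64 {
    let mut entries: Vec<(&String, &u32)> = map.iter().collect();
    entries.sort(); // the extra O(n log n) step on every hash
    let mut hasher = DefaultHasher::new();
    entries.hash(&mut hasher);
    hasher.finish()
}

/// Deterministic hash of an insertion-ordered collection (a stand-in
/// for an IndexMap): the existing order is already reproducible, so
/// the entries can be hashed directly without sorting.
fn stable_hash_ordered(entries: &[(String, u32)]) -> u64 {
    let mut hasher = DefaultHasher::new();
    entries.hash(&mut hasher);
    hasher.finish()
}

fn main() {
    let mut map = HashMap::new();
    map.insert("beta".to_string(), 2u32);
    map.insert("alpha".to_string(), 1u32);
    println!("unordered: {:x}", stable_hash_unordered(&map));

    let ordered = vec![("beta".to_string(), 2u32), ("alpha".to_string(), 1u32)];
    println!("ordered:   {:x}", stable_hash_ordered(&ordered));
}
```

The trade-off in this sketch is that the ordered version only gives the same hash when entries are inserted in the same order, whereas the sorted version is independent of insertion order.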

Use MaybeUninit in VecDeque to remove the undefined behavior of slice #94472

Arithmetic mean of relevant regressions: 0.9%
Arithmetic mean of all relevant changes: -2.0%
Largest improvement in instruction counts: -10.7% on incr-patched: println builds of tokio-webpush-simple opt
Largest regression in instruction counts: 1.1% on full builds of tokio-webpush-simple opt

Dominated by a large improvement in the tokio-webpush-simple opt incremental patch test case. Beyond that, the micro benchmarks indicate that this is largely a performance wash: most benchmarks don't seem to show a statistically significant difference, and those that do are a mix of small regressions and improvements.
Given all this, the PR was marked as triaged.
