2022-01-26 Triage Log
An awesome week. There were some bits of noise from PR #91032, which landed and then had to be backed out (and may soon land again), and we continue to wrestle with how to classify which things to include in rollup PRs. But overall there were some very real wins to the compiler's performance, and it is definitely reflected in the total bootstrap time graph. Great job!
Triage done by @pnkfelix. Revision range: 7bc7be860f99f4a40d45b0f74e2d01b02e072357..c54dfee65126a0ac385d55389a316e89095a0713
4 Regressions, 5 Improvements, 4 Mixed; 3 of them in rollups
29 comparisons made in total
Regressions
Update some rustc dependencies to deduplicate them #92896
- Average relevant regression: 0.5%
- Largest regression in instruction counts: 0.6% on full builds of match-stress-enum check
- 6 of the 7 significant relevant regressions were to variants of match-stress-enum.
- PR author guesses that it could be noise injected via one of the dependency updates, specifically hashbrown 0.11.0 to 0.11.2.
- Left it untriaged for now (I would like to circle back and see whether there's any way to test that hypothesis; but if it goes untouched for a week, then we might just rubber-stamp it as triaged).
Disable drop range tracking in generators #93165
- Average relevant regression: 284.5%
- Largest regression in instruction counts: 879.2% on full builds of deeply-nested-async check
- This regression was expected; it is a result of backing out PR #91032, which was included in rollup PR #93138, but was reverted due to correctness concerns (discussed under “Mixed” section below).
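The 879.2% figure here and the -89.5% improvement reported for the rollup look very different, yet they describe roughly the same change measured in opposite directions. A minimal sketch of the arithmetic, assuming both numbers are plain relative changes in instruction counts (the function name is illustrative and not part of any perf tooling):

```rust
/// Percent change observed when a change of `pct` percent is undone.
fn inverse_percent_change(pct: f64) -> f64 {
    let ratio = 1.0 + pct / 100.0; // new / old
    (1.0 / ratio - 1.0) * 100.0 // old / new, expressed as a percent change
}

fn main() {
    // Undoing a -89.5% improvement appears as a regression of about +852.4%.
    println!("{:.1}%", inverse_percent_change(-89.5));
    // A +879.2% regression corresponds to an improvement of about -89.8%.
    println!("{:.1}%", inverse_percent_change(879.2));
}
```

Under that assumption the two reported figures agree to within a fraction of a percentage point, which is consistent with the backout simply undoing the earlier gain.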
Revert “Do not hash leading zero bytes of i64 numbers in Sip128 hasher” #93014
- Average relevant regression: 1.5%
- Average relevant improvement: -0.7%
- Largest regression in instruction counts: 7.9% on incr-full builds of clap-rs check
- This regression was expected; this PR reverts a perf optimization to restore correctness.
Store a Symbol instead of an Ident in AssocItem #93095
- Average relevant regression: 0.8%
- Largest regression in instruction counts: 2.1% on incr-patched: compile one builds of regex check
- This PR was already triaged: it is a correctness fix for incremental compilation, and the lesser of two evils (when compared to PR #92837).
Improvements
Improve capacity estimation in Vec::from_iter #92138
- Average relevant improvement: -1.4%
- Largest improvement in instruction counts: -3.0% on full builds of projection-caching doc
- This hopefully closes the loop on a nearly four-year-old hypothesized performance fix, issue #48994.
Make Decodable and Decoder infallible. #93066
- Average relevant improvement: -0.8%
- Largest improvement in instruction counts: -2.1% on full builds of helloworld doc
- Wow. This compare page has an impressive amount of green.
- Perhaps even more notably, the bootstrap time had 8 seconds (-1.141%) shaved off via this PR, and from the bootstrap graph, the bulk of that improvement has stuck.
- Huge kudos to @nnethercote (the PR author) here.
Use indexmap to avoid sorting LocalDefIds #90842
- Average relevant improvement: -1.1%
- Largest improvement in instruction counts: -1.1% on incr-unchanged builds of ctfe-stress-4 check
- This is part of foundational work (namely #90317) to make our incr. comp. system more robust, in that we want to ensure that information untracked for incremental compilation does not indirectly influence values and cause stable hashes to deviate.
- The point is: the motivation here is not performance. That's just a happy accident, from what I can tell.
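To illustrate the general idea behind this change: iterating a hash map yields entries in an unspecified order, so code that needs deterministic output has to sort first, whereas an insertion-ordered map makes iteration order depend only on the sequence of insertions. The sketch below uses only the standard library; `OrderedMap` is a hypothetical, simplified stand-in for what the indexmap crate provides, and is neither that crate's implementation nor rustc's code.

```rust
use std::collections::HashMap;
use std::hash::Hash;

/// Hypothetical insertion-ordered map: entries live in a `Vec`, and a
/// `HashMap` records each key's position in that `Vec`.
struct OrderedMap<K, V> {
    entries: Vec<(K, V)>,
    index: HashMap<K, usize>,
}

impl<K: Hash + Eq + Clone, V> OrderedMap<K, V> {
    fn new() -> Self {
        OrderedMap { entries: Vec::new(), index: HashMap::new() }
    }

    /// Inserts or overwrites; a key keeps the position of its first insertion.
    fn insert(&mut self, key: K, value: V) {
        let existing = self.index.get(&key).copied();
        match existing {
            Some(pos) => self.entries[pos].1 = value,
            None => {
                self.index.insert(key.clone(), self.entries.len());
                self.entries.push((key, value));
            }
        }
    }

    fn get(&self, key: &K) -> Option<&V> {
        self.index.get(key).map(|&pos| &self.entries[pos].1)
    }

    /// Iterates in insertion order, so no sorting step is needed.
    fn iter(&self) -> impl Iterator<Item = &(K, V)> {
        self.entries.iter()
    }
}

fn main() {
    let mut map = OrderedMap::new();
    for (id, name) in [(30u32, "c"), (10, "a"), (20, "b")] {
        map.insert(id, name);
    }
    let keys: Vec<u32> = map.iter().map(|(k, _)| *k).collect();
    println!("{:?}", keys); // insertion order: [30, 10, 20]
    println!("{:?}", map.get(&10)); // Some("a")
}
```

The trade-off in this sketch is extra memory for the position table in exchange for a stable iteration order without a sort on every traversal.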
Rustdoc: remove ListAttributesIter and use impl Iterator instead #92353
- Average relevant improvement: -2.6%
- Largest improvement in instruction counts: -4.2% on full builds of deeply-nested doc
- rustdoc performance improvement.
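As a rough illustration of the pattern named in the PR title, the sketch below builds an iterator from standard adapters and returns it as `impl Iterator`, rather than defining a dedicated iterator struct with its own state and `next` method. The `Attribute` type and the `lists` function are hypothetical simplifications, not rustdoc's actual definitions.

```rust
/// Hypothetical, simplified attribute: a name plus an optional item list,
/// e.g. `#[doc(hidden, inline)]` has name "doc" and items ["hidden", "inline"].
struct Attribute {
    name: String,
    items: Option<Vec<String>>,
}

/// Yields every list item of every attribute called `name`.
fn lists<'a>(attrs: &'a [Attribute], name: &'a str) -> impl Iterator<Item = &'a str> + 'a {
    attrs
        .iter()
        .filter(move |attr| attr.name == name)
        // Attributes without a list contribute no items.
        .flat_map(|attr| attr.items.iter().flatten().map(|s| s.as_str()))
}

fn main() {
    let attrs = vec![
        Attribute {
            name: "doc".to_string(),
            items: Some(vec!["hidden".to_string(), "inline".to_string()]),
        },
        Attribute { name: "inline".to_string(), items: None },
        Attribute {
            name: "doc".to_string(),
            items: Some(vec!["no_default_passes".to_string()]),
        },
    ];
    let found: Vec<&str> = lists(&attrs, "doc").collect();
    println!("{:?}", found); // ["hidden", "inline", "no_default_passes"]
}
```

Because the adapter chain is a single concrete type known to the compiler, it can typically be inlined and optimized as a whole, which is one plausible reason such a refactoring might help performance.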
Rollup of 10 pull requests #93069
- Average relevant improvement: -0.7%
- Largest improvement in instruction counts: -1.0% on full builds of ripgrep opt
- This was auto-classified as not relevant and pnkfelix does not know why, so it has been moved into “Improvements”.
Mixed
Rollup of 17 pull requests #93138
- Average relevant regression: 1.6%
- Average relevant improvement: -57.9%
- Largest improvement in instruction counts: -89.5% on full builds of deeply-nested-async check
- Largest regression in instruction counts: 2.6% on full builds of await-call-tree check
- The noted improvement from this roll-up was due to the inclusion of PR #91032, “Introduce drop range tracking to generator interior analysis”.
- PR #91032 injected a family of ICEs, such as issue #93161, so the feature it added is being disabled.
- As for the improvement: The PR author, @eholk, made a note hypothesizing that the improvement to deeply-nested-async may be an artifact of how much is pruned from the generator type. (This may be a sign of an overly artificial benchmark; I wrote a comment asking for more clarification there.)
Emit simpler code from format_args #91359
- Average relevant regression: 1.2%
- Average relevant improvement: -0.7%
- Largest improvement in instruction counts: -2.2% on full builds of cranelift-codegen check
- Largest regression in instruction counts: 3.0% on full builds of html5ever opt
- These performance differences were anticipated ahead of time. @simulacrum posted a nice analysis explaining the probable root cause.
- Notably, “with -Csymbol-mangling-version=v0 the hashes (changes to which cause LLVM's workload to change) go away; [...] this patch is pretty much an improvement in terms of emitted IR (as roughly expected).”
Update hashbrown to 0.12.0 #92998
- Average relevant regression: 1.0%
- Average relevant improvement: -0.9%
- Largest improvement in instruction counts: -9.4% on incr-patched: println builds of webrender opt
- Largest regression in instruction counts: 2.5% on incr-unchanged builds of externs debug
- “an overall win but with a bit of noise since this code is extremely sensitive to inlining.”
Rollup of 8 pull requests #93288
- Average relevant regression: 1.5%
- Average relevant improvement: -0.7%
- Largest improvement in instruction counts: -0.8% on incr-full builds of keccak check
- Largest regression in instruction counts: 2.0% on incr-full builds of stm32f4 check
- stm32f4 regressed on many axes (check/debug/opt/doc, full and incremental); inflate check also regressed, for both full and incremental builds. keccak improved slightly.
- It is not obvious what caused the changes here in this rollup.
- stm32f4 was added in part to test compiler trait machinery.
- After looking at each of the 8 PRs in the rollup, the most likely causes are either PR #93064 (“Properly track DepNodes in trait evaluation provisional cache”) or PR #93175 (“Implement stable overlap check considering negative traits”). Left comments on each PR asking if they should have had perf runs.
- I am leaving this unmarked (i.e. untriaged) for now.
Untriaged Pull Requests