2024-01-16 Triage Log

This week had some small regressions that did not warrant further investigation, several of which were dismissed as being noise/blips in the data. There were also a number of gains. (Don't get excited about that 20.6% improvement; it's a measurement artifact from a temporary blip in the PR that immediately preceded this week's triage.)

Triage done by @pnkfelix. Revision range: 76101eec..f9c2421a

Summary:

| (instructions:u) | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.7% | [0.6%, 0.7%] | 2 |
| Regressions ❌ (secondary) | 3.1% | [0.8%, 4.1%] | 9 |
| Improvements ✅ (primary) | -1.2% | [-20.6%, -0.2%] | 133 |
| Improvements ✅ (secondary) | -0.8% | [-7.3%, -0.1%] | 31 |
| All ❌✅ (primary) | -1.2% | [-20.6%, 0.7%] | 135 |

3 Regressions, 5 Improvements, 5 Mixed; 3 of them in rollups.
55 artifact comparisons made in total.

Regressions

Rollup of 10 pull requests #119754 (Comparison Link)

| (instructions:u) | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 2.4% | [2.4%, 2.4%] | 2 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | - | - | 0 |
  • The 2 regressing (and secondary) benchmarks are tt-muncher debug {incr-full, full}. It's not transient.
  • I've skimmed over the list of PRs in the rollup. None of them are obvious culprits here. I looked at the ones related to debuginfo (#118903) and to code-coverage (#119033 and #119681), but none of those seem likely to be to blame here.
  • Since this only affects a secondary benchmark, and only the instruction count (not, e.g., cpu-clock:u or wall-time for these two benchmarks), I do not think it's worth further investigation, and I'm going to mark it as triaged.

Exhaustiveness: use an Option instead of allocating fictitious patterns #119688 (Comparison Link)

| (instructions:u) | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 3.8% | [3.6%, 4.1%] | 6 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | - | - | 0 |
  • This impacted the 6 variants of match-stress {incr-full,full} x {check,debug,opt}
  • I think the impact on match-stress was probably well-anticipated, and within a reasonable range for a stress-test benchmark.
  • Note that #119688 was a precursor to some further cleanup code (namely to remove the use of a local-arena within exhaustiveness checking).
  • Marking as triaged.

never patterns: Check bindings wrt never patterns #119610 (Comparison Link)

| (instructions:u) | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.3% | [0.3%, 0.4%] | 3 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 0.3% | [0.3%, 0.4%] | 3 |
  • This impacted 3 variants of unicode-normalization-0.1.19: debug incr-unchanged and check {incr-unchanged, incr-patched:println}.
  • Interestingly, during two different try runs, those three variants were found to have improved by similar amounts by this PR.
  • there's some weird interaction between that benchmark and the code paths impacted by this PR, and I do not think it's worth investing effort in further investigation.
  • marking as triaged.

Improvements

macro_rules: Add an expansion-local cache to span marker #119693 (Comparison Link)

| (instructions:u) | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | -1.4% | [-20.5%, -0.2%] | 80 |
| Improvements ✅ (secondary) | -0.8% | [-1.9%, -0.3%] | 16 |
| All ❌✅ (primary) | -1.4% | [-20.5%, -0.2%] | 80 |
  • the bitmaps changes (-20.5%, -17.9%, -13.1%) are all artifacts of returning to normal after a blip in the previous PR.
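To illustrate why a large apparent improvement can be nothing more than a recovery, here is a small arithmetic sketch. It assumes the benchmark returned exactly to its pre-blip instruction count, which this log does not verify; `undo_change` is a hypothetical helper name, not part of rustc-perf.

```python
def undo_change(pct: float) -> float:
    """Percent change that exactly reverses an earlier percent change.

    If a count moves from b to b * (1 + pct/100), going back to b is a
    change of (1 / (1 + pct/100) - 1) * 100 percent. The same formula
    maps a recovery back to the rise it undid (the map is its own inverse).
    """
    return (1.0 / (1.0 + pct / 100.0) - 1.0) * 100.0


# The three bitmaps improvements reported for this PR.
# ASSUMPTION: each one is an exact return to the pre-blip level.
for recovery in (-20.5, -17.9, -13.1):
    print(f"recovery of {recovery}% would undo a rise of about {undo_change(recovery):.1f}%")
```

Under that assumption, the percentage reported for the recovery is smaller in magnitude than the blip it reverses, because it is measured against the inflated count rather than the baseline.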

A more efficient slice comparison implementation for T: !BytewiseEq #116846 (Comparison Link)

| (instructions:u) | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | -0.5% | [-0.9%, -0.2%] | 15 |
| Improvements ✅ (secondary) | -0.6% | [-0.6%, -0.6%] | 1 |
| All ❌✅ (primary) | -0.5% | [-0.9%, -0.2%] | 15 |
  • it is too bad that work in PR #100124 stalled.

Remove a large amount of leb128-coded integers #119791 (Comparison Link)

| (instructions:u) | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 1.1% | [1.1%, 1.1%] | 1 |
| Improvements ✅ (primary) | -0.3% | [-0.3%, -0.2%] | 5 |
| Improvements ✅ (secondary) | -0.3% | [-0.5%, -0.1%] | 12 |
| All ❌✅ (primary) | -0.3% | [-0.3%, -0.2%] | 5 |
  • the 1.1% hit is to deep-vector debug full. It may be transient; the history is pretty up-and-down at the time of this PR, and has settled at a lower level than where it was when this PR landed.
  • in any case, the gains elsewhere, especially bootstrap, outweigh the loss to that one secondary benchmark. (Which ... I guess is what the rustc-perf bot now computes as well, since it categorized this as an Improvement rather than Mixed?)

Exhaustiveness: track overlapping ranges precisely #119396 (Comparison Link)

| (instructions:u) | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | -0.4% | [-1.7%, -0.2%] | 32 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | -0.4% | [-1.7%, -0.2%] | 32 |

Rollup of 6 pull requests #119889 (Comparison Link)

| (instructions:u) | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | -1.8% | [-1.9%, -1.7%] | 4 |
| Improvements ✅ (secondary) | -4.3% | [-7.4%, -1.3%] | 2 |
| All ❌✅ (primary) | -1.8% | [-1.9%, -1.7%] | 4 |

Mixed

Support async recursive calls (as long as they have indirection) #117703 (Comparison Link)

| (instructions:u) | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 0.2% | [0.2%, 0.2%] | 1 |
| Improvements ✅ (primary) | -0.3% | [-0.4%, -0.3%] | 3 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | -0.3% | [-0.4%, -0.3%] | 3 |
  • this is weird: it looks like an inverse blip occurred on the preceding PR, where tt-muncher check incr-unchanged had a single point with -0.2% instruction-count, and then it proceeded to “return to normal” on the succeeding PRs.
  • (It's harder for me to explain away “inverse blips” ...)
  • but at the same time, this does not seem like a significant regression by our usual metrics.
  • marking as triaged.
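The "blip" and "inverse blip" reasoning used in this log can be sketched as a three-point check: a point that departs from its predecessor while its successor lands back near the predecessor. This is only an illustration; `classify_point`, the 0.1% threshold, and the sample counts are hypothetical and are not rustc-perf's actual significance logic or data.

```python
def classify_point(prev: float, cur: float, nxt: float,
                   threshold_pct: float = 0.1) -> str:
    """Label the middle of three consecutive instruction-count measurements.

    threshold_pct is a HYPOTHETICAL cutoff chosen for this example only.
    """
    step_in = (cur - prev) / prev * 100.0   # change introduced at `cur`
    net = (nxt - prev) / prev * 100.0       # where things stand one step later
    if abs(step_in) >= threshold_pct and abs(net) < threshold_pct:
        # The deviation vanished by the next measurement.
        return "blip" if step_in > 0 else "inverse blip"
    if abs(step_in) >= threshold_pct:
        return "sustained change"
    return "steady"


# Made-up counts: a single -0.2% dip that is gone one measurement later.
print(classify_point(10_000, 9_980, 10_001))
```

With such a check, the PR that follows an inverse blip shows up as a small regression even though it only restores the earlier level, which matches the pattern described above.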

Rollup of 9 pull requests #119767 (Comparison Link)

| (instructions:u) | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 1.3% | [0.4%, 2.3%] | 2 |
| Regressions ❌ (secondary) | 0.9% | [0.5%, 1.2%] | 2 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | -0.3% | [-0.3%, -0.3%] | 1 |
| All ❌✅ (primary) | 1.3% | [0.4%, 2.3%] | 2 |
  • primary regressions: syn opt-full regressed by 2.3%, bitmaps check-incr-full by 0.35%. secondary regressions: coercions debug-full by 1.23%, ctfe-stress check-full by 0.51%
  • from the overall history, it seems like syn opt-full returned to “normal” with later PRs that don't necessarily seem like they would have affected syn (e.g. PR #117449). bitmaps check-incr-full's trend is likewise downward after this point.
  • marking as triaged
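The per-benchmark figures quoted above for this rollup can be checked against its summary rows. The sketch below assumes that "mean" is the arithmetic mean of the per-benchmark percent changes and that "range" is their minimum and maximum; that reading is consistent with the numbers here, but it is an assumption about how the summary is produced. `summarize` and `agrees` are hypothetical helper names.

```python
from statistics import mean


def summarize(changes_pct):
    """Return (mean, min, max, count) for a list of percent changes."""
    return mean(changes_pct), min(changes_pct), max(changes_pct), len(changes_pct)


def agrees(row, reported, tol=0.1):
    """True if mean/min/max match the reported row to within one displayed digit."""
    return all(abs(a - b) < tol for a, b in zip(row[:3], reported[:3])) \
        and row[3] == reported[3]


# Values taken from the bullet above for #119767.
primary = [2.3, 0.35]     # syn opt-full, bitmaps check-incr-full
secondary = [1.23, 0.51]  # coercions debug-full, ctfe-stress check-full

# Summary rows as reported in the table for #119767.
print(agrees(summarize(primary), (1.3, 0.4, 2.3, 2)))
print(agrees(summarize(secondary), (0.9, 0.5, 1.2, 2)))
```

The tolerance is deliberately loose because both the bullet and the table show rounded values.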

Add assume into NonZeroIntX::get #119452 (Comparison Link)

| (instructions:u) | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.5% | [0.4%, 0.9%] | 4 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | -0.7% | [-0.7%, -0.7%] | 1 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 0.3% | [-0.7%, 0.9%] | 5 |
  • scottmcm writes: “Instructions have a couple red in instruction counts for opt, but that's entirely reasonable for something intended to enable optimizations. Notably, the cycles are green, with no regressions. So I think this is fine.”
  • marking as triaged

Avoid some redundant work in GVN #119439 (Comparison Link)

| (instructions:u) | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.8% | [0.6%, 0.9%] | 4 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | -0.7% | [-1.1%, -0.5%] | 4 |
| Improvements ✅ (secondary) | -0.4% | [-0.4%, -0.4%] | 1 |
| All ❌✅ (primary) | 0.0% | [-1.1%, 0.9%] | 8 |
  • primary regressions are regex-1.5.5 debug-full, opt-incr-patched:Job, incr-full, and exa opt-full.
  • the exa regression looks like a blip. The regex ones were predicted during a try run for the PR. I assume they were deemed acceptable as they are offset by improvements elsewhere (or dismissed as noise?)
  • marking as triaged.

Sandwich MIR optimizations between DSE. #119672 (Comparison Link)

| (instructions:u) | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.7% | [0.2%, 1.4%] | 14 |
| Regressions ❌ (secondary) | 0.5% | [0.2%, 2.7%] | 14 |
| Improvements ✅ (primary) | -1.0% | [-2.2%, -0.2%] | 31 |
| Improvements ✅ (secondary) | -0.9% | [-2.2%, -0.2%] | 10 |
| All ❌✅ (primary) | -0.4% | [-2.2%, 1.4%] | 45 |
  • already marked as triaged by @lqd with the comment “As seen in the previous runs: some nice wins on bigger benchmarks, and overall gains outweigh the few losses.”