2024-05-07 Triage Log

Largely uneventful week; the most notable shifts were considered false-alarms that arose from changes related to cfg-checking (either cargo enabling it, or adding cfg's like rustfmt to the “well-known cfgs list”).

Triage done by @pnkfelix. Revision range: c65b2dc9..69f53f5e

Summary:

(instructions:u)meanrangecount
Regressions ❌
(primary)
3.0%[0.2%, 19.5%]65
Regressions ❌
(secondary)
1.3%[0.2%, 4.5%]103
Improvements ✅
(primary)
-0.9%[-2.2%, -0.2%]24
Improvements ✅
(secondary)
-0.7%[-1.4%, -0.4%]23
All ❌✅ (primary)1.9%[-2.2%, 19.5%]89

3 Regressions, 2 Improvements, 3 Mixed; 5 of them in rollups 54 artifact comparisons made in total

Regressions

Rollup of 7 pull requests #124675 (Comparison Link)

(instructions:u)meanrangecount
Regressions ❌
(primary)
0.5%[0.2%, 1.2%]11
Regressions ❌
(secondary)
0.8%[0.4%, 1.3%]17
Improvements ✅
(primary)
--0
Improvements ✅
(secondary)
--0
All ❌✅ (primary)0.5%[0.2%, 1.2%]11
  • all primary regressions are to doc-full scenarios, and the 1.2% is to helloworld.
  • not worth teasing apart a rollup PR.
  • marking as triaged.

Update cargo #124684 (Comparison Link)

(instructions:u)meanrangecount
Regressions ❌
(primary)
2.4%[0.2%, 19.1%]83
Regressions ❌
(secondary)
1.6%[0.2%, 5.7%]92
Improvements ✅
(primary)
--0
Improvements ✅
(secondary)
--0
All ❌✅ (primary)2.4%[0.2%, 19.1%]83
  • syn (mostly check builds, but also a debug incr-unchanged and opt incr-unchanged) had regressions ranging from 7.24% all the way up to 19.11%.
  • The most plausible hypothesis is that this is due to an explosion in the number of warnings emitted for this benchmark. (The number of warnings went from ~200 up to 1800, according to Urgau's analysis).
  • This means the code ends up becoming, at least in part, a benchmark of the lint machinery, regardless of whether that is our intent or not.
  • see also rustc-perf#1819 “Consider passing -Awarnings (or similar) to avoid false alarms from lint reporting
  • marking as triaged.

Rollup of 3 pull requests #124784 (Comparison Link)

(instructions:u)meanrangecount
Regressions ❌
(primary)
0.3%[0.2%, 0.4%]5
Regressions ❌
(secondary)
--0
Improvements ✅
(primary)
--0
Improvements ✅
(secondary)
--0
All ❌✅ (primary)0.3%[0.2%, 0.4%]5
  • all regressions were to syn, to various incr-unchanged and incr-patched:println scenarios.
  • current hypothesis is that this is due to PR #124742, which adds rustfmt to the well-known cfgs list.
  • that hypothesis implies that this is a (mostly-)false alarm, much like #124684.
  • marking as triaged

Improvements

Rollup of 10 pull requests #124646 (Comparison Link)

(instructions:u)meanrangecount
Regressions ❌
(primary)
--0
Regressions ❌
(secondary)
--0
Improvements ✅
(primary)
-1.0%[-2.8%, -0.2%]24
Improvements ✅
(secondary)
-0.9%[-1.6%, -0.3%]9
All ❌✅ (primary)-1.0%[-2.8%, -0.2%]24
  • the bulk of the improvements are to variations of html5ever and serde_derive.
  • skimming over the rollup list, I cannot identify an immediate root cause for improvement
  • but for now will treat it like a happy accident

Some hir cleanups #124401 (Comparison Link)

(instructions:u)meanrangecount
Regressions ❌
(primary)
--0
Regressions ❌
(secondary)
--0
Improvements ✅
(primary)
-0.1%[-0.2%, -0.1%]3
Improvements ✅
(secondary)
-1.1%[-2.0%, -0.2%]2
All ❌✅ (primary)-0.1%[-0.2%, -0.1%]3
  • all improvements are to variations of typenum
  • the hir cleanups in question are largely to store AnonConst (e.g. for array lengths) in the HIR arena, and then move the ConstArg span over to AnonConst span instead.
  • inspection of typenum didn't show any particular cases that seemed like the would stress AnonConst; maybe the benefit comes more from the places where we now pass a span by value instead of passing a pointer to it.

Mixed

Account for immutably borrowed locals in MIR copy-prop and GVN #123602 (Comparison Link)

(instructions:u)meanrangecount
Regressions ❌
(primary)
0.3%[0.2%, 0.9%]10
Regressions ❌
(secondary)
0.8%[0.2%, 2.6%]4
Improvements ✅
(primary)
-0.5%[-1.1%, -0.2%]6
Improvements ✅
(secondary)
-0.5%[-1.0%, -0.3%]8
All ❌✅ (primary)0.0%[-1.1%, 0.9%]16
  • html5ever opt-full regressed by 0.92%; libc in various incremental scenarios regressed by 0.30% to 0.39%.
  • the libc changes were anticipated in the perf build prior to merge; html5ever opt-full was not predicted there.
  • pnkfelix hypothesizes that this just reflects some extra-work from the compiler attempting to do the copy-propagation and global-value-numbering mir-optimizations on a larger set of immutably-borrowed locals, and is acceptable given the expected benefits.
  • marking as triaged

Rollup of 8 pull requests #124703 (Comparison Link)

(instructions:u)meanrangecount
Regressions ❌
(primary)
0.5%[0.2%, 0.6%]4
Regressions ❌
(secondary)
--0
Improvements ✅
(primary)
--0
Improvements ✅
(secondary)
-1.0%[-1.5%, -0.5%]4
All ❌✅ (primary)0.5%[0.2%, 0.6%]4
  • image opt-full regressed by 0.63%; html5ever debug-{incr-full,full} by ~0.5%, html5ever opt-incr-unchaged by 0.21%
  • already triaged by Kobzol, who hypothesizes that PR #124700 modified some inlining decisions.

Rollup of 4 pull requests #124716 (Comparison Link)

(instructions:u)meanrangecount
Regressions ❌
(primary)
--0
Regressions ❌
(secondary)
0.3%[0.3%, 0.5%]6
Improvements ✅
(primary)
-0.8%[-0.8%, -0.8%]1
Improvements ✅
(secondary)
--0
All ❌✅ (primary)-0.8%[-0.8%, -0.8%]1
  • all regressions are secondary (specifically on unused-warnings benchmark)
  • regression identified by Kobzol as caused by PR #124584 “Various improvements to entrypoint code”
  • seems like noise to pnkfelix
  • marked as triaged

Untriaged Pull Requests