2023-07-26 Triage Log

A relatively light week for performance changes. The one major regressing PR was reverted (for other reasons), and we saw some very nice compile-time gains from (1) changes to our codegen-unit merging logic and (2) changes to the stdlib slice iterators, which now encode their non-null guarantee directly, allowing the removal of a call to the assume intrinsic.

Triage done by @pnkfelix. Revision range: 6b9236ed..0308df23

Summary:

| (instructions:u) | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 1.4% | [0.6%, 10.2%] | 27 |
| Regressions ❌ (secondary) | 1.1% | [0.3%, 2.9%] | 19 |
| Improvements ✅ (primary) | -2.2% | [-8.3%, -0.4%] | 21 |
| Improvements ✅ (secondary) | -1.6% | [-2.0%, -1.2%] | 2 |
| All ❌✅ (primary) | -0.2% | [-8.3%, 10.2%] | 48 |
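
As a sanity check on the table above, the "All (primary)" row can be approximately recovered from the two primary rows. This is a minimal sketch that assumes the combined figure is a count-weighted arithmetic mean of the per-row means (the helper name `combined_mean` is made up for illustration); because the inputs are already rounded to one decimal, the result is only approximate.

```python
# Rows copied from the summary table: (mean percent change, benchmark count).
primary_regressions = (1.4, 27)
primary_improvements = (-2.2, 21)


def combined_mean(*rows):
    """Count-weighted mean of several (mean, count) rows (hypothetical helper)."""
    total = sum(count for _, count in rows)
    weighted = sum(mean * count for mean, count in rows)
    return weighted / total, total


mean, count = combined_mean(primary_regressions, primary_improvements)
print(f"All (primary): {mean:.1f}% over {count} benchmarks")
# prints "All (primary): -0.2% over 48 benchmarks"
```

The same check can be repeated for each per-PR table in this log by swapping in that table's primary rows.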

1 Regression, 1 Improvement, 4 Mixed; 1 of them in rollups. 35 artifact comparisons made in total.

Regressions

Prototype: Add unstable -Z reference-niches option #113166 (Comparison Link)

| (instructions:u) | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.7% | [0.3%, 1.1%] | 19 |
| Regressions ❌ (secondary) | 1.0% | [0.3%, 1.2%] | 4 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 0.7% | [0.3%, 1.1%] | 19 |

  • reverted in PR #113946
  • marked as triaged

Improvements

Inline overlap based CGU merging #113777 (Comparison Link)

| (instructions:u) | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 1.2% | [1.2%, 1.2%] | 1 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | -1.8% | [-4.5%, -0.3%] | 11 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | -1.6% | [-4.5%, 1.2%] | 12 |

This improved instruction counts for 9 opt-full primary benchmarks. (The one regression was to regex-1.5.5 opt-full, by 1.15%, but the wins elsewhere pay for this.)

As noted by @nnethercote, this results in a nearly 10-second reduction in bootstrap time (i.e. -1.495%, no small feat at all!).
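
To put those two bootstrap figures in context, the baseline bootstrap time they imply can be backed out with a line of arithmetic. This is a rough sketch only: it treats "nearly 10 seconds" as exactly 10 seconds, so the true baseline is presumably a little lower than what it prints.

```python
# Figures quoted above; the seconds value is approximate ("nearly 10 seconds").
saved_seconds = 10.0
relative_change = 0.01495  # 1.495% expressed as a fraction

# If saved = baseline * relative_change, then baseline = saved / relative_change.
implied_baseline = saved_seconds / relative_change

print(f"implied bootstrap time: about {implied_baseline:.0f} s "
      f"({implied_baseline / 60:.1f} min)")
```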

Mixed

Turn copy into moves during DSE. #113758 (Comparison Link)

| (instructions:u) | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 9.4% | [9.4%, 9.4%] | 1 |
| Regressions ❌ (secondary) | 0.6% | [0.6%, 0.6%] | 1 |
| Improvements ✅ (primary) | -1.0% | [-2.1%, -0.2%] | 14 |
| Improvements ✅ (secondary) | -0.8% | [-1.3%, -0.2%] | 2 |
| All ❌✅ (primary) | -0.3% | [-2.1%, 9.4%] | 15 |

  • regression is to webrender-2022 opt incr-patched (by 9.4%, as you can see from the table above)
  • from the flamegraphs, seems like codegen_module_perform_lto went from 8.6 seconds to 9.6 seconds, with half of the growth in LLVM_lto_optimize, and half in LLVM_module_codegen_emit_obj.
  • not marking as triaged for now.
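
For reference, the flamegraph numbers in the notes above work out as follows. This is just the arithmetic on the quoted figures, assuming the "half ... and half" split is exact; note that these are seconds from the flamegraphs, while the 9.4% in the table is an instruction count, so the two percentages are not directly comparable.

```python
# Time attributed to codegen_module_perform_lto, as quoted from the flamegraphs.
before_s, after_s = 8.6, 9.6

growth_s = after_s - before_s
growth_pct = 100 * growth_s / before_s

# Assumed even split between LLVM_lto_optimize and LLVM_module_codegen_emit_obj.
per_phase_s = growth_s / 2

print(f"growth: {growth_s:.1f} s ({growth_pct:.1f}%)")
print(f"per phase: about {per_phase_s:.1f} s each")
```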

Rollup of 7 pull requests #113890 (Comparison Link)

| (instructions:u) | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 0.2% | [0.2%, 0.2%] | 1 |
| Improvements ✅ (primary) | -0.3% | [-0.3%, -0.2%] | 4 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | -0.3% | [-0.3%, -0.2%] | 4 |

  • that doesn't seem worth dissecting
  • marking as triaged
  • (the specific secondary is tt-muncher check incr-unchanged 0.23%)

Always const-prop scalars and scalar pairs #113858 (Comparison Link)

| (instructions:u) | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.8% | [0.3%, 3.3%] | 42 |
| Regressions ❌ (secondary) | 0.7% | [0.2%, 1.1%] | 19 |
| Improvements ✅ (primary) | -0.6% | [-1.3%, -0.2%] | 6 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 0.6% | [-1.3%, 3.3%] | 48 |

  • we didn't anticipate such a high impact to the instruction-counts; the trial run said there were two primary regressions here, not 42.
    • exa-0.10.1 opt-full regressed by 3.34%
    • five various bitmaps-3.1.0 profiles/scenarios regressed by 1.01% to 1.21%
    • ripgrep-13.0.0 check-incr-unchanged regressed by 1.01%
    • bunch of others that regressed by a little less than 1%... seems not great.
  • not marking as triaged.

Get !nonnull metadata on slice iterators, without assumes #113344 (Comparison Link)

| (instructions:u) | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 8.3% | [8.3%, 8.3%] | 1 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | -1.1% | [-8.3%, -0.5%] | 63 |
| Improvements ✅ (secondary) | -0.7% | [-1.1%, -0.3%] | 15 |
| All ❌✅ (primary) | -1.0% | [-8.3%, 8.3%] | 64 |

  • cranelift-codegen-0.82.1 opt-full regressed by 8.31%
  • a slew of other benchmarks improved (regex-1.5.5 incr-patched by -8.28%, bitmaps incr by -1.2% to -1.4%, the rest by -1% or less)
  • overall, a nice win. That's enough to let me mark this as triaged.

Untriaged Pull Requests