2023-04-18 Triage Log

A busy two weeks (perf triage was not done last week). Overall, improvements outweigh regressions, with a mean change of -2.6% across a large swath of the test cases. Of particular note was the move to use SipHash-1-3 instead of SipHash-2-4 for StableHasher, which improved 184 benchmark tests by an average of 2.3%!

Triage done by @rylev. Revision range: 7c96e40..74864f

Summary:

| (instructions:u) | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 3.1% | [0.2%, 24.4%] | 11 |
| Regressions ❌ (secondary) | 4.9% | [0.4%, 37.4%] | 32 |
| Improvements ✅ (primary) | -2.9% | [-20.4%, -0.3%] | 205 |
| Improvements ✅ (secondary) | -4.0% | [-43.5%, -0.3%] | 160 |
| All ❌✅ (primary) | -2.6% | [-20.4%, 24.4%] | 216 |
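
The "All (primary)" figure is consistent with a count-weighted average of the primary regression and improvement rows. The sketch below checks that arithmetic under the assumption that the summary is computed this way (the log does not say so); `combined_mean` is a hypothetical helper name, not part of any perf tooling.

```python
def combined_mean(groups):
    """Count-weighted mean of (mean_percent, count) groups.

    Assumption: the 'All' row is a weighted average of the per-group means.
    """
    total = sum(count for _, count in groups)
    if total == 0:
        raise ValueError("no test cases")
    return sum(mean * count for mean, count in groups) / total


# Primary rows from the summary: regressions, then improvements.
primary = [(3.1, 11), (-2.9, 205)]
overall = combined_mean(primary)
total_cases = sum(count for _, count in primary)
print(f"{overall:.1f}% across {total_cases} test cases")
```

Because the published means are already rounded to one decimal place, the same check on other tables in this log can be off by about 0.1 percentage points.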

6 Regressions, 8 Improvements, 11 Mixed; 6 of them in rollups
119 artifact comparisons made in total

Regressions

Erase query cache values #109333 (Comparison Link)

| (instructions:u) | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.4% | [0.2%, 0.6%] | 43 |
| Regressions ❌ (secondary) | 0.4% | [0.3%, 0.7%] | 13 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 0.4% | [0.2%, 0.6%] | 43 |
  • Author has some ideas for how to tackle this here

Rollup of 7 pull requests #110012 (Comparison Link)

| (instructions:u) | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.6% | [0.6%, 0.7%] | 2 |
| Regressions ❌ (secondary) | 1.9% | [1.5%, 2.2%] | 6 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 0.6% | [0.6%, 0.7%] | 2 |

resolve: Preserve reexport chains in ModChildren #109500 (Comparison Link)

| (instructions:u) | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.7% | [0.4%, 1.0%] | 17 |
| Regressions ❌ (secondary) | 1.3% | [0.4%, 5.7%] | 22 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 0.7% | [0.4%, 1.0%] | 17 |
  • There are a few more regressions in the post merge perf run, but they seem to follow the same pattern as the regressions found pre-merge. I think we can just take this as a necessary trade off we need to make for the fixes that landed (as identified by @oli-obk here).

Better diagnostic when pattern matching tuple structs #109760 (Comparison Link)

| (instructions:u) | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 6.4% | [3.1%, 9.1%] | 3 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 6.4% | [3.1%, 9.1%] | 3 |

Rollup of 3 pull requests #110401 (Comparison Link)

| (instructions:u) | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.3% | [0.2%, 0.3%] | 5 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 0.3% | [0.2%, 0.3%] | 5 |
  • This is likely noise (bitmaps is a fairly noisy benchmark).

Bypass the varint path when encoding InitMask #110343 (Comparison Link)

| (instructions:u) | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 1.1% | [0.2%, 2.9%] | 11 |
| Regressions ❌ (secondary) | 3.7% | [0.2%, 8.1%] | 11 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 1.1% | [0.2%, 2.9%] | 11 |
  • Marked as triaged since this was identified as most likely being noise.

Improvements

Use SipHash-1-3 instead of SipHash-2-4 for StableHasher #107925 (Comparison Link)

| (instructions:u) | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | -2.3% | [-4.0%, -0.2%] | 184 |
| Improvements ✅ (secondary) | -2.6% | [-32.1%, -0.2%] | 153 |
| All ❌✅ (primary) | -2.3% | [-4.0%, -0.2%] | 184 |

resolve: Restore some effective visibility optimizations #109437 (Comparison Link)

| (instructions:u) | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | -0.3% | [-0.4%, -0.3%] | 6 |
| Improvements ✅ (secondary) | -0.6% | [-1.6%, -0.3%] | 12 |
| All ❌✅ (primary) | -0.3% | [-0.4%, -0.3%] | 6 |

Rollup of 6 pull requests #110024 (Comparison Link)

| (instructions:u) | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | -0.7% | [-0.7%, -0.6%] | 2 |
| Improvements ✅ (secondary) | -1.8% | [-2.1%, -1.5%] | 6 |
| All ❌✅ (primary) | -0.7% | [-0.7%, -0.6%] | 2 |

Rollup of 6 pull requests #110127 (Comparison Link)

| (instructions:u) | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | -5.9% | [-8.4%, -3.0%] | 3 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | -5.9% | [-8.4%, -3.0%] | 3 |

rustc_metadata: Filter encoded data more aggressively using DefKind #109765 (Comparison Link)

| (instructions:u) | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.7% | [0.7%, 0.7%] | 1 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | -1.0% | [-1.8%, -0.2%] | 101 |
| Improvements ✅ (secondary) | -1.2% | [-3.6%, -0.1%] | 34 |
| All ❌✅ (primary) | -0.9% | [-1.8%, 0.7%] | 102 |

Update cargo #110198 (Comparison Link)

| (instructions:u) | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | -1.1% | [-1.1%, -1.1%] | 1 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | -1.1% | [-1.1%, -1.1%] | 1 |

resolve: Pre-compute non-reexport module children #110160 (Comparison Link)

| (instructions:u) | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | -0.8% | [-1.5%, -0.2%] | 26 |
| Improvements ✅ (secondary) | -0.3% | [-0.3%, -0.3%] | 1 |
| All ❌✅ (primary) | -0.8% | [-1.5%, -0.2%] | 26 |

Implement StableHasher::write_u128 via write_u64 #110410 (Comparison Link)

| (instructions:u) | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | -0.7% | [-2.8%, -0.2%] | 13 |
| Improvements ✅ (secondary) | -1.9% | [-7.5%, -0.2%] | 24 |
| All ❌✅ (primary) | -0.7% | [-2.8%, -0.2%] | 13 |

Mixed

Check pattern refutability on THIR #108504 (Comparison Link)

| (instructions:u) | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 0.3% | [0.3%, 0.3%] | 3 |
| Improvements ✅ (primary) | -0.4% | [-0.5%, -0.2%] | 4 |
| Improvements ✅ (secondary) | -3.0% | [-3.2%, -2.8%] | 6 |
| All ❌✅ (primary) | -0.4% | [-0.5%, -0.2%] | 4 |
  • Marked as triaged since the improvements outweigh the regressions and the regressions are small and in secondary benchmarks.

Refactor unwind in MIR #102906 (Comparison Link)

| (instructions:u) | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 1.1% | [0.3%, 1.9%] | 2 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | -0.4% | [-0.6%, -0.3%] | 10 |
| Improvements ✅ (secondary) | -0.6% | [-0.8%, -0.2%] | 8 |
| All ❌✅ (primary) | -0.2% | [-0.6%, 1.9%] | 12 |
  • Perf wins outweigh losses.

Make elaboration generic over input #110031 (Comparison Link)

| (instructions:u) | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 2.0% | [2.0%, 2.0%] | 1 |
| Improvements ✅ (primary) | -1.1% | [-2.0%, -0.4%] | 3 |
| Improvements ✅ (secondary) | -0.5% | [-0.5%, -0.5%] | 1 |
| All ❌✅ (primary) | -1.1% | [-2.0%, -0.4%] | 3 |
  • Not enough perf changes here to be a problem.

Make elaboration generic over input #109900 (Comparison Link)

| (instructions:u) | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.6% | [0.3%, 1.0%] | 8 |
| Regressions ❌ (secondary) | 13.9% | [0.2%, 29.6%] | 6 |
| Improvements ✅ (primary) | -0.7% | [-1.5%, -0.3%] | 13 |
| Improvements ✅ (secondary) | -1.1% | [-1.8%, -0.1%] | 14 |
| All ❌✅ (primary) | -0.2% | [-1.5%, 1.0%] | 21 |
  • Not enough perf changes here to be a problem.

Permit MIR inlining without #[inline] #109247 (Comparison Link)

| (instructions:u) | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 1.0% | [0.2%, 25.6%] | 135 |
| Regressions ❌ (secondary) | 1.4% | [0.2%, 8.8%] | 97 |
| Improvements ✅ (primary) | -1.8% | [-5.3%, -0.3%] | 52 |
| Improvements ✅ (secondary) | -9.6% | [-43.2%, -1.2%] | 21 |
| All ❌✅ (primary) | 0.2% | [-5.3%, 25.6%] | 187 |
  • Perf may be worse than in the original perf run before merge. Of particular concern is that the cargo incr-patched: println test case has regressed significantly (42% increase in wall time). Pinged the author and reviewer about this.

Alloc hir::Lit in an arena to remove the destructor from Expr #109588 (Comparison Link)

| (instructions:u) | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 1.6% | [0.7%, 2.7%] | 4 |
| Regressions ❌ (secondary) | 6.3% | [5.0%, 8.1%] | 6 |
| Improvements ✅ (primary) | -0.5% | [-0.6%, -0.1%] | 3 |
| Improvements ✅ (secondary) | -0.6% | [-2.4%, -0.3%] | 21 |
| All ❌✅ (primary) | 0.7% | [-0.6%, 2.7%] | 7 |
  • It does look like something has made keccak noisy again. The bump starts with this PR, but that could be due to the previous change kicking off some bimodal behavior.

Alloc hir::Lit in an arena to remove the destructor from Expr #110440 (Comparison Link)

| (instructions:u) | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.3% | [0.2%, 0.3%] | 3 |
| Regressions ❌ (secondary) | 0.4% | [0.2%, 1.1%] | 9 |
| Improvements ✅ (primary) | -0.8% | [-0.9%, -0.7%] | 4 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | -0.3% | [-0.9%, 0.3%] | 7 |

Remove some suspicious cast truncations #110367 (Comparison Link)

| (instructions:u) | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.4% | [0.2%, 1.1%] | 52 |
| Regressions ❌ (secondary) | 0.5% | [0.3%, 0.8%] | 13 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | -0.9% | [-1.5%, -0.2%] | 6 |
| All ❌✅ (primary) | 0.4% | [0.2%, 1.1%] | 52 |
  • Looks like this was fixed in #110410, so marking as triaged.

Rollup of 9 pull requests #110458 (Comparison Link)

| (instructions:u) | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 1.2% | [1.2%, 1.2%] | 3 |
| Improvements ✅ (primary) | -1.1% | [-2.9%, -0.3%] | 12 |
| Improvements ✅ (secondary) | -3.5% | [-7.7%, -0.2%] | 11 |
| All ❌✅ (primary) | -1.1% | [-2.9%, -0.3%] | 12 |
  • Looks like this is all noise: keccak, cranelift-codegen and diesel have just started being noisy.

Rollup of 7 pull requests #110481 (Comparison Link)

| (instructions:u) | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 0.5% | [0.5%, 0.5%] | 3 |
| Improvements ✅ (primary) | -7.9% | [-19.2%, -0.6%] | 3 |
| Improvements ✅ (secondary) | -1.6% | [-1.8%, -1.3%] | 2 |
| All ❌✅ (primary) | -7.9% | [-19.2%, -0.6%] | 3 |
  • The regression is noise.

ci: add a runner for vanilla LLVM 16 #110242 (Comparison Link)

| (instructions:u) | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 1.7% | [0.8%, 2.8%] | 4 |
| Regressions ❌ (secondary) | 6.4% | [5.2%, 8.0%] | 6 |
| Improvements ✅ (primary) | -0.2% | [-0.2%, -0.2%] | 1 |
| Improvements ✅ (secondary) | -0.6% | [-0.6%, -0.6%] | 3 |
| All ❌✅ (primary) | 1.3% | [-0.2%, 2.8%] | 5 |
  • Noise