This was a fairly negative week for compiler performance, with regressions overall up to 14% on some workloads (primarily incr-unchanged scenarios), largely caused by #101620. We are still chasing down either a revert or a fix for that regression, though a partial mitigation in #101862 has been applied. Hopefully the full fix or revert will be part of the next triage report.
We also saw a number of other regressions land, though most were much smaller in magnitude.
Triage done by @simulacrum. Revision range: 17cbdfd07178349d0a3cecb8e7dde8f915666ced..8fd6d03e22fba2930ad377b87299de6a37076074
Summary:
(instructions:u) | mean | range | count |
---|---|---|---|
Regressions ❌ (primary) | 3.8% | [0.3%, 14.0%] | 148 |
Regressions ❌ (secondary) | 4.3% | [0.3%, 23.6%] | 141 |
Improvements ✅ (primary) | -6.4% | [-24.8%, -0.5%] | 24 |
Improvements ✅ (secondary) | -2.1% | [-4.0%, -0.4%] | 12 |
All ❌✅ (primary) | 2.4% | [-24.8%, 14.0%] | 172 |
1 Regressions, 2 Improvements, 6 Mixed; 2 of them in rollups 43 artifact comparisons made in total
Cleanup slice sort related closures in core and alloc #101816 (Comparison Link)
(instructions:u) | mean | range | count |
---|---|---|---|
Regressions ❌ (primary) | 1.1% | [1.1%, 1.1%] | 1 |
Regressions ❌ (secondary) | - | - | 0 |
Improvements ✅ (primary) | - | - | 0 |
Improvements ✅ (secondary) | - | - | 0 |
All ❌✅ (primary) | 1.1% | [1.1%, 1.1%] | 1 |
This regression occurred in a single benchmark and only in the incremental scenario, so it isn't worth follow up investigation at this time.
Partially revert #101433 #101902 (Comparison Link)
(instructions:u) | mean | range | count |
---|---|---|---|
Regressions ❌ (primary) | - | - | 0 |
Regressions ❌ (secondary) | - | - | 0 |
Improvements ✅ (primary) | -0.5% | [-0.7%, -0.4%] | 6 |
Improvements ✅ (secondary) | -0.6% | [-0.6%, -0.6%] | 1 |
All ❌✅ (primary) | -0.5% | [-0.7%, -0.4%] | 6 |
Do not fetch HIR node when iterating to find lint. #101862 (Comparison Link)
(instructions:u) | mean | range | count |
---|---|---|---|
Regressions ❌ (primary) | - | - | 0 |
Regressions ❌ (secondary) | - | - | 0 |
Improvements ✅ (primary) | -0.6% | [-2.3%, -0.2%] | 24 |
Improvements ✅ (secondary) | -13.1% | [-39.1%, -3.2%] | 6 |
All ❌✅ (primary) | -0.6% | [-2.3%, -0.2%] | 24 |
Simplify caching and storage for queries #101307 (Comparison Link)
(instructions:u) | mean | range | count |
---|---|---|---|
Regressions ❌ (primary) | - | - | 0 |
Regressions ❌ (secondary) | 1.7% | [1.7%, 1.7%] | 1 |
Improvements ✅ (primary) | -0.2% | [-0.3%, -0.2%] | 7 |
Improvements ✅ (secondary) | -0.3% | [-0.3%, -0.3%] | 3 |
All ❌✅ (primary) | -0.2% | [-0.3%, -0.2%] | 7 |
This change is either neutral or a slight improvement; the one regression occurs in a benchmark and only in one scenario, which is also prone to noise, so further investigation isn't necessary.
Rollup of 6 pull requests #101805 (Comparison Link)
(instructions:u) | mean | range | count |
---|---|---|---|
Regressions ❌ (primary) | 0.4% | [0.4%, 0.5%] | 2 |
Regressions ❌ (secondary) | - | - | 0 |
Improvements ✅ (primary) | - | - | 0 |
Improvements ✅ (secondary) | -4.5% | [-5.0%, -4.1%] | 6 |
All ❌✅ (primary) | 0.4% | [0.4%, 0.5%] | 2 |
Using the newish rollup-introspection tooling, @nnethercote narrowed this down to #101433. Put a comment on that PR and marked it as a perf-regression.
Compute lint levels by definition #101620 (Comparison Link)
(instructions:u) | mean | range | count |
---|---|---|---|
Regressions ❌ (primary) | 4.1% | [0.4%, 14.5%] | 133 |
Regressions ❌ (secondary) | 5.2% | [0.2%, 64.1%] | 131 |
Improvements ✅ (primary) | -5.1% | [-24.3%, -0.4%] | 30 |
Improvements ✅ (secondary) | -1.0% | [-1.2%, -0.5%] | 5 |
All ❌✅ (primary) | 2.4% | [-24.3%, 14.5%] | 163 |
Regression is partially addressed by #101862, but not completely. Week/week diff remains negative. Left a comment asking us to pursue the revert for the time being, since the suggested fix for this regression didn't recover performance fully.
Derive various impls instead of hand-rolling them #101858 (Comparison Link)
(instructions:u) | mean | range | count |
---|---|---|---|
Regressions ❌ (primary) | 1.2% | [0.2%, 2.2%] | 11 |
Regressions ❌ (secondary) | 0.9% | [0.3%, 2.1%] | 18 |
Improvements ✅ (primary) | - | - | 0 |
Improvements ✅ (secondary) | -0.7% | [-0.9%, -0.5%] | 6 |
All ❌✅ (primary) | 1.2% | [0.2%, 2.2%] | 11 |
This was a fairly large regression in a number of benchmarks; #101893 is pursuing further investigation here and tracking either improvements or a possible revert. It seems like moving to the derive'd impls may have also been a (correct) behavior change in some cases so a straightforward revert may be undesirable.
Change FnMutDelegate
to trait objects #101857 (Comparison Link)
(instructions:u) | mean | range | count |
---|---|---|---|
Regressions ❌ (primary) | - | - | 0 |
Regressions ❌ (secondary) | 1.2% | [0.9%, 1.6%] | 9 |
Improvements ✅ (primary) | -1.2% | [-1.2%, -1.2%] | 1 |
Improvements ✅ (secondary) | - | - | 0 |
All ❌✅ (primary) | -1.2% | [-1.2%, -1.2%] | 1 |
Regressions are limited to secondary benchmarks and stress tests at that. The goal here is an improvement in bootstrap times (measured at ~1.5% earlier, though the actual PR merge was during a window where that measurement was failing). That improvement is worth the slight loss on stress tests.
Rollup of 7 pull requests #101949 (Comparison Link)
(instructions:u) | mean | range | count |
---|---|---|---|
Regressions ❌ (primary) | 0.2% | [0.2%, 0.2%] | 3 |
Regressions ❌ (secondary) | 0.8% | [0.2%, 3.5%] | 15 |
Improvements ✅ (primary) | - | - | 0 |
Improvements ✅ (secondary) | -1.4% | [-1.6%, -1.3%] | 6 |
All ❌✅ (primary) | 0.2% | [0.2%, 0.2%] | 3 |
Still working on identifying the causes. Likely to be a minor enough delta that spending too much more time won't make sense.