2022-06-14 Triage Log

A mixed week. I suppose it is best to focus on the fact we made some big improvements to a large number of primary benchmarks, at the cost of some smaller regressions to a smaller number of primary benchmarks.

Triage done by @pnkfelix. Revision range: bb55bd449e65e611da928560d948982d73e50027..edab34ab2abbafc16a78daedf71dbacd2eb0b7bf

Summary:

	mean	max	count
Regressions 😿 (primary)	0.6%	1.6%	35
Regressions 😿 (secondary)	2.1%	8.1%	23
Improvements 🎉 (primary)	-0.8%	-3.5%	72
Improvements 🎉 (secondary)	-0.8%	-2.9%	62
All 😿🎉 (primary)	-0.4%	-3.5%	107

4 Regressions, 3 Improvements, 5 Mixed; 4 of them in rollups 47 artifact comparisons made in total

Regressions

Rollup of 5 pull requests #97825 (Comparison Link)

	mean	max	count
Regressions 😿 (primary)	0.2%	0.3%	4
Regressions 😿 (secondary)	N/A	N/A	0
Improvements 🎉 (primary)	N/A	N/A	0
Improvements 🎉 (secondary)	N/A	N/A	0
All 😿🎉 (primary)	0.2%	0.3%	4

stm32f4-0.14.0 regressed for check+debug+opt on incr-patched: negate. diesel-1.4.8 regressed for check on incr-patched: println
Given that all the regressions are to incremental, I'm assuming this is because of #97058
skimming over the perf details for stm32f4 check, it seems like the bulk of the time delta is coming from expand_crate. A total time delta of 0.66, and the biggest contributors to that delta are expand_crate (0.033), incr_comp_load_dep_graph (0.015), misc_checking_1 (0.007), hir_owner_nodes (0.005), generate_crate_metadata (0.005), incr_comp_encode_dep_graph (0.004), and wf_checking (0.004). The remainder are <= 0.003, most of them <= 0.000.
given the relatively small size and scope of the regression, and the fact that it was in a rollup, I do not think this is worth investigating further. marked as triaged.

Rollup of 6 pull requests #97968 (Comparison Link)

	mean	max	count
Regressions 😿 (primary)	0.3%	0.3%	3
Regressions 😿 (secondary)	N/A	N/A	0
Improvements 🎉 (primary)	N/A	N/A	0
Improvements 🎉 (secondary)	N/A	N/A	0
All 😿🎉 (primary)	0.3%	0.3%	3

diesel-1.4.8 regressed on check full, opt full, and check incr-full, by about 0.3% each time.
rylev says that diesel has started to become more noisy in its behavior, perhaps since we turned on PGO.
I do not think this is worth investigating further. marked as triaged.

Handle def_ident_span like def_span. #95880 (Comparison Link)

	mean	max	count
Regressions 😿 (primary)	0.6%	1.6%	92
Regressions 😿 (secondary)	0.8%	1.9%	28
Improvements 🎉 (primary)	N/A	N/A	0
Improvements 🎉 (secondary)	N/A	N/A	0
All 😿🎉 (primary)	0.6%	1.6%	92

Already triaged by the PR author.
“we encode more spans into metadata and invoke more queries, so the slight regression is to be expected.”

Rollup of 5 pull requests #98025 (Comparison Link)

	mean	max	count
Regressions 😿 (primary)	N/A	N/A	0
Regressions 😿 (secondary)	5.5%	7.9%	6
Improvements 🎉 (primary)	N/A	N/A	0
Improvements 🎉 (secondary)	N/A	N/A	0
All 😿🎉 (primary)	N/A	N/A	0

All six regressions are variations of issue-88862: {opt,debug} x { incr-full, full, incr-unchanged }. The two incr-unchanged cases regressed by 0.94% and 1.72% . The full cases regressed by 7.13% and 7.57% . The incr-full cases regressed by 7.86% and 7.80%.
None of these PRs strike me as something that would cause a problem for the code exercised by #88862. (I briefly thought it might be #98012, but issue-88862 doesn‘t exercise HKT’s...)
but also, given that issue-88862 is a canary that is trying to catch a catastrophic regression, I think we can accept a 7-8% regression here.

Improvements

Re-use the type op instead of calling the implied_outlives_bounds query directly #97081 (Comparison Link)

	mean	max	count
Regressions 😿 (primary)	N/A	N/A	0
Regressions 😿 (secondary)	N/A	N/A	0
Improvements 🎉 (primary)	-0.3%	-0.6%	22
Improvements 🎉 (secondary)	-0.4%	-0.7%	12
All 😿🎉 (primary)	-0.3%	-0.6%	22

Revert part of #94372 to improve performance #97905 (Comparison Link)

	mean	max	count
Regressions 😿 (primary)	N/A	N/A	0
Regressions 😿 (secondary)	N/A	N/A	0
Improvements 🎉 (primary)	-0.4%	-0.9%	82
Improvements 🎉 (secondary)	-0.5%	-1.0%	39
All 😿🎉 (primary)	-0.4%	-0.9%	82

Tidy up miscellaneous bounds suggestions #97778 (Comparison Link)

	mean	max	count
Regressions 😿 (primary)	N/A	N/A	0
Regressions 😿 (secondary)	N/A	N/A	0
Improvements 🎉 (primary)	-0.3%	-0.4%	2
Improvements 🎉 (secondary)	-0.7%	-0.8%	6
All 😿🎉 (primary)	-0.3%	-0.4%	2

Mixed

Folding revamp #97447 (Comparison Link)

	mean	max	count
Regressions 😿 (primary)	0.5%	0.6%	6
Regressions 😿 (secondary)	0.4%	0.7%	5
Improvements 🎉 (primary)	-0.4%	-0.7%	5
Improvements 🎉 (secondary)	-0.8%	-2.1%	23
All 😿🎉 (primary)	0.1%	-0.7%	11

already triaged by its author
“The perf effects are fairly small and there are more improvements than regressions.”

Make Encodable and Encoder infallible. #94732 (Comparison Link)

	mean	max	count
Regressions 😿 (primary)	0.4%	0.8%	16
Regressions 😿 (secondary)	0.5%	0.9%	15
Improvements 🎉 (primary)	-0.3%	-0.5%	25
Improvements 🎉 (secondary)	-0.4%	-0.5%	20
All 😿🎉 (primary)	-0.1%	0.8%	41

PR author already investigated.
“Good news: #97905 fixed the regressions here. That PR plus this PR combined gave a clear performance win.”
marked as triaged.

cleanup bound variable handling #97648 (Comparison Link)

	mean	max	count
Regressions 😿 (primary)	N/A	N/A	0
Regressions 😿 (secondary)	0.3%	0.3%	4
Improvements 🎉 (primary)	-0.7%	-1.7%	22
Improvements 🎉 (secondary)	-0.5%	-0.6%	16
All 😿🎉 (primary)	-0.7%	-1.7%	22

Rollup of 10 pull requests #98066 (Comparison Link)

	mean	max	count
Regressions 😿 (primary)	0.5%	0.6%	4
Regressions 😿 (secondary)	0.4%	0.5%	11
Improvements 🎉 (primary)	N/A	N/A	0
Improvements 🎉 (secondary)	-0.3%	-0.3%	4
All 😿🎉 (primary)	0.5%	0.6%	4

diesel-1.4.8 regressed by 0.5% for variations on check/debug/opt and incr-full/full. That‘s the only primary regression, and its within the noise level that we’re currently associating with diesel, I think.
The other regressions are wf-projection-stress-65510, projection-caching, regression-31157, and wg-grammar.
wf-projection-stress-65510 and regression-31157 are canaries where we are trying to catch a massive regression, not a minor one like the ones presented here.
given that this is a rollup and the remaining regressions are well under 0.5%, I think that's the limit to the amount of investigation I want to do here.

Remove RegionckMode in favor of calling new skip_region_resolution #98041 (Comparison Link)

	mean	max	count
Regressions 😿 (primary)	0.2%	0.2%	1
Regressions 😿 (secondary)	1.0%	1.2%	3
Improvements 🎉 (primary)	-0.9%	-3.7%	35
Improvements 🎉 (secondary)	-0.3%	-0.3%	1
All 😿🎉 (primary)	-0.8%	-3.7%	36

Skimming over the comparison (and even just from the table), it seems clear that the improvements far far far outweighed the gains here.
Marking as triaged.

2022-06-14 Triage Log

Regressions

Improvements

Mixed

Untriaged Pull Requests