2022-08-30 Triage Log

A somewhat difficult week to triage due to the large amount of noise coming from two benchmarks. Hopefully this noise settles down in the future. Other than that, improvements far outweighed regressions: 142 primary benchmark results changed in instruction count, with a mean improvement of 0.7%. There were no huge wins this week, however.

Triage done by @rylev. Revision range: 4a24f08b..0631ea5d

Summary:

(instructions:u)	mean	range	count
Regressions ❌ (primary)	1.0%	[0.2%, 2.6%]	4
Regressions ❌ (secondary)	1.3%	[0.3%, 2.6%]	23
Improvements ✅ (primary)	-0.7%	[-2.8%, -0.2%]	138
Improvements ✅ (secondary)	-1.3%	[-2.7%, -0.2%]	71
All ❌✅ (primary)	-0.7%	[-2.8%, 2.6%]	142
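
The "All" row can be read as the count-weighted mean of the regression and improvement rows above it. The sketch below shows that arithmetic; the `Row` type and `combined_mean` function are hypothetical names introduced here for illustration and are not part of the perf tooling. Because the table values are already rounded, the result only approximates the published figure.

```rust
/// One row of a summary table: mean relative change (in percent)
/// and the number of benchmark results it covers.
/// Hypothetical type, introduced only for this illustration.
#[derive(Clone, Copy)]
struct Row {
    mean_pct: f64,
    count: u32,
}

/// Count-weighted mean of several rows, or `None` if no results changed.
fn combined_mean(rows: &[Row]) -> Option<f64> {
    let total: u32 = rows.iter().map(|r| r.count).sum();
    if total == 0 {
        return None;
    }
    // Weight each row's mean by how many results it summarizes.
    let weighted: f64 = rows
        .iter()
        .map(|r| r.mean_pct * f64::from(r.count))
        .sum();
    Some(weighted / f64::from(total))
}
```

Calling `combined_mean` with the primary rows above (1.0% over 4 results and -0.7% over 138 results) gives roughly -0.65%, which agrees with the -0.7% reported for all 142 primary results once rounded to one decimal place.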

2 Regressions, 3 Improvements, 10 Mixed; 6 of them in rollups
40 artifact comparisons made in total

Regressions

add depth_limit in QueryVTable to avoid entering a new tcx in layout_of #100748 (Comparison Link)

(instructions:u)	mean	range	count
Regressions ❌ (primary)	0.5%	[0.3%, 0.7%]	8
Regressions ❌ (secondary)	0.7%	[0.3%, 1.3%]	13
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	0.5%	[0.3%, 0.7%]	8

Most of the regressions are happening in html5ever-0.26.0 and deeply-nested-multi, which have been noisy lately. The regressions are small enough that it's likely we're seeing that noise here too. Subsequent changes show improvements of the same magnitude, reversing the regressions here.
However, there are some regressions that seem like they might be real, and they are all in doc profile test cases. The common query across the potentially real regressions is the build_impl query. This change seems like strictly less work, so I'm not sure why it would regress.
Left a comment asking if anyone had any good ideas despite the cachegrind run not revealing anything.

Don't catch overflow when running with cargo doc #101039 (Comparison Link)

(instructions:u)	mean	range	count
Regressions ❌ (primary)	0.6%	[0.6%, 0.8%]	6
Regressions ❌ (secondary)	0.9%	[0.2%, 1.3%]	9
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	0.6%	[0.6%, 0.8%]	6

The primary regressions seem like noise (as they are reversed in the next perf run), but the secondary regressions seem like sustained regressions.
This was a fix for an issue that broke some crates, so the minor perf hit in secondary benchmarks is likely acceptable even if it is real.
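
The "reversed in the next perf run" reasoning used in this log can be sketched as a simple check: a change looks like noise if the following run moves the same benchmark by about the same amount in the opposite direction. The function name `looks_reversed` and the tolerance parameter are assumptions made for this illustration; the triage itself was done by inspecting the comparison pages, not with this code.

```rust
/// Returns true when `next_pct` roughly cancels `prev_pct`:
/// the two changes have opposite signs and their sum is within
/// `tolerance_pct` of zero. Hypothetical helper, for illustration only.
fn looks_reversed(prev_pct: f64, next_pct: f64, tolerance_pct: f64) -> bool {
    // A product below zero means one change is a regression
    // and the other an improvement.
    let opposite_sign = prev_pct * next_pct < 0.0;
    opposite_sign && (prev_pct + next_pct).abs() <= tolerance_pct
}
```

With a tolerance of 0.2 percentage points, a 0.6% regression followed by a 0.6% improvement counts as reversed, while a 0.9% regression followed by only a 0.2% improvement does not and would be treated as a sustained regression.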

Improvements

Symbols: do not write string values of preinterned symbols into compiled artifacts #100803 (Comparison Link)

(instructions:u)	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-0.6%	[-2.5%, -0.2%]	35
Improvements ✅ (secondary)	-1.7%	[-2.6%, -0.3%]	24
All ❌✅ (primary)	-0.6%	[-2.5%, -0.2%]	35

Elide superfluous storage markers #99946 (Comparison Link)

(instructions:u)	mean	range	count
Regressions ❌ (primary)	0.9%	[0.9%, 0.9%]	1
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-0.6%	[-1.9%, -0.2%]	14
Improvements ✅ (secondary)	-0.5%	[-1.5%, -0.3%]	38
All ❌✅ (primary)	-0.5%	[-1.9%, 0.9%]	15

Rollup of 13 pull requests #101115 (Comparison Link)

(instructions:u)	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-0.6%	[-0.6%, -0.6%]	1
Improvements ✅ (secondary)	-1.1%	[-1.3%, -0.9%]	8
All ❌✅ (primary)	-0.6%	[-0.6%, -0.6%]	1

Mixed

Rollup of 15 pull requests #100963 (Comparison Link)

(instructions:u)	mean	range	count
Regressions ❌ (primary)	0.2%	[0.2%, 0.3%]	5
Regressions ❌ (secondary)	0.4%	[0.2%, 0.6%]	17
Improvements ✅ (primary)	-0.6%	[-0.8%, -0.5%]	6
Improvements ✅ (secondary)	-0.4%	[-0.7%, -0.3%]	6
All ❌✅ (primary)	-0.2%	[-0.8%, 0.3%]	11

I looked for something obvious that might be causing this, and I couldn't find anything promising.
It seems there are some PRs that are very likely not the cause. We can start by testing the others to see if they yield results.

Check projection types before inlining MIR #100571 (Comparison Link)

(instructions:u)	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	0.9%	[0.7%, 1.2%]	8
Improvements ✅ (primary)	-0.4%	[-1.1%, -0.2%]	115
Improvements ✅ (secondary)	-0.4%	[-1.3%, -0.1%]	42
All ❌✅ (primary)	-0.4%	[-1.1%, -0.2%]	115

All of the regressions are secondary, and many are in the recently noisy deeply-nested-multi. Additionally, the improvements far outweigh the regressions.
Marked as triaged

Rollup of 8 pull requests #101017 (Comparison Link)

(instructions:u)	mean	range	count
Regressions ❌ (primary)	0.3%	[0.3%, 0.3%]	2
Regressions ❌ (secondary)	1.5%	[0.2%, 3.6%]	14
Improvements ✅ (primary)	-0.4%	[-0.6%, -0.3%]	8
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-0.3%	[-0.6%, 0.3%]	10

Looks like #10034 is the likely culprit for a large part of the regressions.
This is tracked by those working in the area.

Avoid reporting overflow in is_impossible_method #100705 (Comparison Link)

(instructions:u)	mean	range	count
Regressions ❌ (primary)	0.7%	[0.6%, 0.8%]	6
Regressions ❌ (secondary)	0.3%	[0.3%, 0.3%]	1
Improvements ✅ (primary)	-0.5%	[-0.6%, -0.4%]	2
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	0.4%	[-0.6%, 0.8%]	8

The primary perf regression on this PR seems to be reversed by #101039
Marked as triaged

Rollup of 8 pull requests #101037 (Comparison Link)

(instructions:u)	mean	range	count
Regressions ❌ (primary)	1.5%	[0.3%, 3.8%]	3
Regressions ❌ (secondary)	0.9%	[0.5%, 1.2%]	7
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-1.3%	[-1.3%, -1.2%]	2
All ❌✅ (primary)	1.5%	[0.3%, 3.8%]	3

Mostly a mixture of noise and a small regression from #101006.
That PR is a correctness fix, so it seems likely that we'll be ok with this small regression.

session: stabilize split debuginfo on linux #98051 (Comparison Link)

(instructions:u)	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	0.2%	[0.2%, 0.3%]	2
Improvements ✅ (primary)	-0.6%	[-0.7%, -0.5%]	6
Improvements ✅ (secondary)	-0.9%	[-1.1%, -0.6%]	6
All ❌✅ (primary)	-0.6%	[-0.7%, -0.5%]	6

It was determined that this was just noise.

interpret: remove support for uninitialized scalars #100043 (Comparison Link)

(instructions:u)	mean	range	count
Regressions ❌ (primary)	0.8%	[0.6%, 0.9%]	2
Regressions ❌ (secondary)	1.3%	[0.3%, 3.2%]	18
Improvements ✅ (primary)	-0.6%	[-0.7%, -0.5%]	6
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-0.2%	[-0.7%, 0.9%]	8

Looks like the primary regressions are due to noise.
The regressions in secondary benchmarks seem more real, though. The most impacted query is eval_to_allocation_raw, which could plausibly be affected by this change (judging only by the usage of eval_to_allocation_raw in const eval).
Indeed, it was found that an unconditional format! was causing this issue.
This should be fixed by #101154.
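
To illustrate why an unconditional format! can show up in instruction counts, here is a minimal sketch. It is not the code from the PR; `check_eager`, `check_lazy`, and the even/odd condition are hypothetical stand-ins, and the `built` counter exists only to make the extra work observable. The eager version builds the error string on every call, while the lazy version builds it only on the failure path.

```rust
use std::cell::Cell;

/// Hypothetical hot-path check that formats its error message up front.
/// `built` counts how many times a message string is constructed.
fn check_eager(value: u32, built: &Cell<u32>) -> Result<u32, String> {
    // The message is allocated and formatted even when `value` is fine.
    built.set(built.get() + 1);
    let msg = format!("unexpected value {}", value);
    if value % 2 == 0 {
        Ok(value)
    } else {
        Err(msg)
    }
}

/// Same check, but the message is only built when it is actually needed.
fn check_lazy(value: u32, built: &Cell<u32>) -> Result<u32, String> {
    if value % 2 == 0 {
        Ok(value)
    } else {
        built.set(built.get() + 1);
        Err(format!("unexpected value {}", value))
    }
}
```

Running both functions over the inputs 2, 4, 6, and 7 constructs four strings in the eager version and one in the lazy version. On a path that is executed very often and rarely fails, that difference is the kind of cost that can appear as a small but sustained regression.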

Rollup of 9 pull requests #101064 (Comparison Link)

(instructions:u)	mean	range	count
Regressions ❌ (primary)	0.6%	[0.5%, 0.7%]	6
Regressions ❌ (secondary)	1.0%	[1.0%, 1.0%]	1
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-1.3%	[-1.5%, -1.3%]	3
All ❌✅ (primary)	0.6%	[0.5%, 0.7%]	6

It was determined that this was just noise.

Avoid cloning a collection only to iterate over it #100497 (Comparison Link)

(instructions:u)	mean	range	count
Regressions ❌ (primary)	0.5%	[0.5%, 0.6%]	6
Regressions ❌ (secondary)	0.8%	[0.5%, 1.1%]	10
Improvements ✅ (primary)	-0.6%	[-0.8%, -0.5%]	3
Improvements ✅ (secondary)	-1.0%	[-1.2%, -0.5%]	4
All ❌✅ (primary)	0.2%	[-0.8%, 0.6%]	9

The regressions seem to just be noise. The improvements, though, seem real. See here.

Rollup of 8 pull requests #101152 (Comparison Link)

(instructions:u)	mean	range	count
Regressions ❌ (primary)	0.6%	[0.5%, 0.7%]	10
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-0.3%	[-0.5%, -0.2%]	24
Improvements ✅ (secondary)	-0.7%	[-1.9%, -0.3%]	34
All ❌✅ (primary)	-0.0%	[-0.5%, 0.7%]	34

#99821 is responsible for all the improvements and regressions for this rollup.
Given that the improvements outweigh the regressions, we mark this as triaged.