2024-01-08 Triage Log

Not a particularly notable week. The large swings aren’t spurious, but they are driven mainly by changes in high-level behavior (primarily diagnostics going from zero emissions to one), which causes a lot more work to happen. They aren’t really representative of the underlying rustc performance changing, though.

Triage done by @simulacrum. Revision range: 67b6975051b83ef2bd28f06e8467470d570aceb3..76101eecbe9aa80753664bbe637ad06d1925f315

Summary:

(instructions:u)	mean	range	count
Regressions ❌ (primary)	4.9%	[0.2%, 24.3%]	14
Regressions ❌ (secondary)	4.6%	[0.2%, 29.9%]	55
Improvements ✅ (primary)	-0.5%	[-1.5%, -0.2%]	61
Improvements ✅ (secondary)	-0.7%	[-1.0%, -0.4%]	14
All ❌✅ (primary)	0.5%	[-1.5%, 24.3%]	75
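
As a sanity check on the table above, the "All (primary)" mean can be approximately reproduced from the regression and improvement rows. This is a minimal sketch, assuming the "All" row is a count-weighted mean of the two primary rows; the function name `combined_mean` is made up for illustration, and because the inputs are already rounded to one decimal the result is only approximate.

```python
def combined_mean(groups):
    """Count-weighted mean of (mean_percent, count) pairs."""
    total = sum(count for _, count in groups)
    if total == 0:
        return None  # the tables show "-" when no benchmarks changed
    return sum(mean * count for mean, count in groups) / total

# Primary rows from the summary table: (mean %, count)
primary = [(4.9, 14), (-0.5, 61)]
overall = combined_mean(primary)
print(f"{overall:.1f}%")  # 0.5%
```

The same check on the per-PR tables can be off by 0.1% in places, which is expected when working from rounded means.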

4 Regressions, 4 Improvements, 6 Mixed; 1 of them in rollups

33 artifact comparisons made in total

Regressions

rustc_lint: Enforce rustc::potential_query_instability lint #119251 (Comparison Link)

(instructions:u)	mean	range	count
Regressions ❌ (primary)	0.2%	[0.2%, 0.3%]	2
Regressions ❌ (secondary)	0.3%	[0.2%, 0.4%]	8
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	0.2%	[0.2%, 0.3%]	2

Minor change in just a few benchmarks. It is not clear whether this is noise, but the overall change is required for correctness.

Merge unused_tuple_struct_fields into dead_code #118297 (Comparison Link)

(instructions:u)	mean	range	count
Regressions ❌ (primary)	0.6%	[0.3%, 0.8%]	7
Regressions ❌ (secondary)	8.3%	[0.2%, 30.4%]	28
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	0.6%	[0.3%, 0.8%]	7

The regressions come from this lint now firing in a few benchmarks, which forces a good deal of lazily loaded diagnostics infrastructure to actually be loaded.
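
To see why a single new diagnostic can produce a large relative swing, here is a toy model of lazy initialization. It is not rustc's actual diagnostics code: the class `ToyDiagnostics`, the cost constants, and the baseline figure are all made-up assumptions chosen only to show the shape of the effect.

```python
class ToyDiagnostics:
    """Hypothetical stand-in for a diagnostics subsystem with lazy setup."""
    SETUP_COST = 1000  # made-up one-time cost, arbitrary units
    EMIT_COST = 10     # made-up per-diagnostic cost, arbitrary units

    def __init__(self):
        self.initialized = False
        self.cost = 0

    def emit(self, message):
        # The first emission pays for the lazily loaded setup.
        if not self.initialized:
            self.cost += self.SETUP_COST
            self.initialized = True
        self.cost += self.EMIT_COST


def relative_overhead(baseline, emissions):
    """Diagnostics cost as a fraction of a benchmark's baseline cost."""
    diag = ToyDiagnostics()
    for i in range(emissions):
        diag.emit(f"warning {i}")
    return diag.cost / baseline


print(relative_overhead(5000, 0))  # 0.0
print(relative_overhead(5000, 1))  # 0.202
print(relative_overhead(5000, 2))  # 0.204
```

In this model, going from zero to one emission costs far more than going from one to two, and the relative effect is largest on small benchmarks with a low baseline.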

Exhaustiveness: Statically enforce revealing of opaques #119329 (Comparison Link)

(instructions:u)	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	1.9%	[1.7%, 2.0%]	6
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-	-	0

Only match-stress regresses, which seems acceptable for a correctness fix.

Inline a few utility functions around MIR #119459 (Comparison Link)

(instructions:u)	mean	range	count
Regressions ❌ (primary)	0.3%	[0.3%, 0.4%]	3
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	0.3%	[0.3%, 0.4%]	3

Potentially just noise. Overall impact is limited to just one benchmark and only incr-full.

Improvements

Separate immediate and in-memory ScalarPair representation #118991 (Comparison Link)

(instructions:u)	mean	range	count
Regressions ❌ (primary)	0.6%	[0.6%, 0.6%]	1
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-1.0%	[-1.5%, -0.6%]	13
Improvements ✅ (secondary)	-0.2%	[-0.2%, -0.2%]	1
All ❌✅ (primary)	-0.9%	[-1.5%, 0.6%]	14

rustc_span: Optimize syntax context comparisons #119531 (Comparison Link)

(instructions:u)	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-0.6%	[-0.8%, -0.4%]	5
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-0.6%	[-0.8%, -0.4%]	5

Exhaustiveness: remove Matrix.wildcard_row #119667 (Comparison Link)

(instructions:u)	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-2.8%	[-3.0%, -2.6%]	6
All ❌✅ (primary)	-	-	0

macro_rules: Add an expansion-local cache to span marker #119693 (Comparison Link)

(instructions:u)	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-1.4%	[-20.5%, -0.2%]	80
Improvements ✅ (secondary)	-0.8%	[-1.9%, -0.3%]	16
All ❌✅ (primary)	-1.4%	[-20.5%, -0.2%]	80

The largest improvements here are a recovery from a spurious regression in a previous PR, but this is still a good win even aside from that.

Mixed

Reorder check_item_type diagnostics so they occur next to the corresponding check_well_formed diagnostics #117213 (Comparison Link)

(instructions:u)	mean	range	count
Regressions ❌ (primary)	0.2%	[0.2%, 0.4%]	7
Regressions ❌ (secondary)	0.8%	[0.2%, 2.3%]	5
Improvements ✅ (primary)	-0.3%	[-0.3%, -0.3%]	2
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	0.1%	[-0.3%, 0.4%]	9

Stabilize THIR unsafeck #117673 (Comparison Link)

(instructions:u)	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	0.3%	[0.2%, 0.4%]	2
Improvements ✅ (primary)	-0.4%	[-0.9%, -0.2%]	39
Improvements ✅ (secondary)	-0.6%	[-1.1%, -0.4%]	9
All ❌✅ (primary)	-0.4%	[-0.9%, -0.2%]	39

Improvements outweigh regressions.

Replace a number of FxHashMaps/Sets with stable-iteration-order alternatives #119192 (Comparison Link)

(instructions:u)	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	0.3%	[0.1%, 0.5%]	7
Improvements ✅ (primary)	-0.4%	[-0.6%, -0.2%]	2
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-0.4%	[-0.6%, -0.2%]	2

Correctness fix, acceptable regressions.

Rollup of 9 pull requests #119662 (Comparison Link)

(instructions:u)	mean	range	count
Regressions ❌ (primary)	0.2%	[0.2%, 0.2%]	1
Regressions ❌ (secondary)	0.7%	[0.2%, 1.3%]	13
Improvements ✅ (primary)	-0.4%	[-0.4%, -0.2%]	6
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-0.3%	[-0.4%, 0.2%]	7

tt-muncher is the main regression, and it appears to be well beyond the noise level for that benchmark. Investigation is ongoing.

mark vec::IntoIter pointers as !nonnull #114205 (Comparison Link)

(instructions:u)	mean	range	count
Regressions ❌ (primary)	0.7%	[0.6%, 0.8%]	2
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-0.4%	[-0.4%, -0.4%]	1
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	0.4%	[-0.4%, 0.8%]	3

Likely slightly more work for LLVM.