2024-07-30 Triage Log

There were some notable regressions this week. Some of them are being addressed via follow-up PRs (such as the change to whitespace diagnostic reporting), and some via reverts (such as the dead code analysis that tried to flag pub structs without pub constructors). A few regressions have not yet been addressed. See report for details.

Triage done by @pnkfelix. Revision range: 9629b90b..7e3a9718

Summary:

(instructions:u)	mean	range	count
Regressions ❌ (primary)	1.3%	[0.2%, 6.1%]	43
Regressions ❌ (secondary)	1.9%	[0.1%, 10.4%]	46
Improvements ✅ (primary)	-1.0%	[-3.9%, -0.2%]	27
Improvements ✅ (secondary)	-1.6%	[-6.8%, -0.2%]	43
All ❌✅ (primary)	0.4%	[-3.9%, 6.1%]	70
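
As a rough cross-check of the table above, the "All (primary)" mean can be reproduced from the two primary rows, assuming it is the plain mean over all 70 significant primary results (which equals the count-weighted mean of the two group means). The function names below are illustrative only and are not part of rustc-perf; the inputs are the rounded figures from the table, so the result is approximate.

```rust
/// Count-weighted mean of per-group means:
/// sum(mean_i * count_i) / sum(count_i).
/// Returns NaN if the total count is zero.
fn combined_mean(groups: &[(f64, u32)]) -> f64 {
    let total: u32 = groups.iter().map(|&(_, n)| n).sum();
    let weighted: f64 = groups.iter().map(|&(m, n)| m * f64::from(n)).sum();
    weighted / f64::from(total)
}

/// Figures copied from the primary rows of the summary table:
/// 43 regressions with mean 1.3%, 27 improvements with mean -1.0%.
fn summary_all_primary() -> f64 {
    combined_mean(&[(1.3, 43), (-1.0, 27)])
}
```

With these inputs the sketch gives (43 × 1.3 − 27 × 1.0) / 70 ≈ 0.41, which rounds to the 0.4% shown in the table.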

5 Regressions, 6 Improvements, 6 Mixed; 8 of them in rollups
65 artifact comparisons made in total

Regressions

Do not use global caches if opaque types can be defined #126024 (Comparison Link)

(instructions:u)	mean	range	count
Regressions ❌ (primary)	3.4%	[1.6%, 5.5%]	6
Regressions ❌ (secondary)	3.1%	[0.4%, 5.4%]	11
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	3.4%	[1.6%, 5.5%]	6

This PR says it is fixing a soundness problem. (It's not clear to me if the wrong issue was linked; the linked one is an ICE that was not actually resolved.)
All six of the regressions are to hyper: {check,debug,opt} x {incr-full, full}.
we probably should just accept this cost
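
The `{check,debug,opt} x {incr-full, full}` shorthand above is a cross product of profiles and scenarios, which is where the count of six comes from. A minimal sketch of that expansion, with a helper name (`expand`) that is purely illustrative:

```rust
/// Expands profiles x scenarios into "profile scenario" labels,
/// iterating profiles in the outer loop and scenarios in the inner one.
fn expand(profiles: &[&str], scenarios: &[&str]) -> Vec<String> {
    profiles
        .iter()
        .flat_map(|p| scenarios.iter().map(move |s| format!("{p} {s}")))
        .collect()
}
```

Calling `expand(&["check", "debug", "opt"], &["incr-full", "full"])` yields six labels, one per regressed hyper configuration.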

Rollup of 5 pull requests #128169 (Comparison Link)

(instructions:u)	mean	range	count
Regressions ❌ (primary)	0.9%	[0.2%, 3.0%]	26
Regressions ❌ (secondary)	0.5%	[0.3%, 2.2%]	13
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	0.9%	[0.2%, 3.0%]	26

the bulk of the regressions are to syn (i.e. 8 out of the 9 that are > 1%).
this was due to a change in how diagnostics handle certain “whitespace” characters (PR #127528); there is a revert proposed in PR #128179, but there is also a PR to address the issue itself as a followup in PR #128200
not marking as triaged until either PR #128179 or PR #128200 is landed.

Rollup of 7 pull requests #128186 (Comparison Link)

(instructions:u)	mean	range	count
Regressions ❌ (primary)	0.3%	[0.2%, 0.5%]	11
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	0.3%	[0.2%, 0.5%]	11

already marked as triaged

Rollup of 9 pull requests #128253 (Comparison Link)

(instructions:u)	mean	range	count
Regressions ❌ (primary)	0.5%	[0.4%, 0.5%]	3
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	0.5%	[0.4%, 0.5%]	3

regressed incr-full for bitmaps-{check,opt} and typenum-check
seems like noise from the graph over time; marking as triaged.

Document 0x10.checked_shl(BITS - 1) does not overflow #128255 (Comparison Link)

(instructions:u)	mean	range	count
Regressions ❌ (primary)	0.5%	[0.5%, 0.6%]	4
Regressions ❌ (secondary)	2.2%	[2.2%, 2.2%]	1
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	0.5%	[0.5%, 0.6%]	4

noise, already marked as triaged
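
For context on the PR title: `checked_shl` only returns `None` when the shift amount is at least the bit width of the type; bits shifted out of the value do not count as overflow. A small sketch using `u32` (the choice of type and the function name are mine, not taken from the PR):

```rust
/// Three cases side by side for u32 (BITS = 32).
fn checked_shl_cases() -> (Option<u32>, Option<u32>, Option<u32>) {
    (
        // Ordinary shift: 1 << 4 = 16.
        0x1u32.checked_shl(4),
        // The set bit is shifted out of the value, but the shift amount
        // (31) is still < BITS, so this is Some(0) rather than None.
        0x10u32.checked_shl(u32::BITS - 1),
        // Shift amount equals BITS: this is the case that returns None.
        0x10u32.checked_shl(u32::BITS),
    )
}
```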

Improvements

Remove unnecessary impl sorting in queries and metadata #120812 (Comparison Link)

(instructions:u)	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-1.2%	[-2.1%, -0.4%]	2
Improvements ✅ (secondary)	-0.3%	[-0.4%, -0.3%]	2
All ❌✅ (primary)	-1.2%	[-2.1%, -0.4%]	2

rustdoc: clean up and fix ord violations in item sorting #128146 (Comparison Link)

(instructions:u)	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-0.7%	[-1.6%, -0.2%]	4
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-0.7%	[-1.6%, -0.2%]	4

Rollup of 6 pull requests #128195 (Comparison Link)

(instructions:u)	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-0.4%	[-0.5%, -0.4%]	5
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-0.4%	[-0.5%, -0.4%]	5

(just noise I think)

Switch from derivative to derive-where #127042 (Comparison Link)

(instructions:u)	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-0.2%	[-0.3%, -0.2%]	16
Improvements ✅ (secondary)	-0.5%	[-0.6%, -0.4%]	8
All ❌✅ (primary)	-0.2%	[-0.3%, -0.2%]	16

Always set result during finish() in debug builders #127946 (Comparison Link)

(instructions:u)	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-0.5%	[-0.6%, -0.5%]	6
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-0.5%	[-0.6%, -0.5%]	6

(just noise I think)

Rollup of 6 pull requests #128313 (Comparison Link)

(instructions:u)	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-1.0%	[-1.1%, -1.0%]	2
Improvements ✅ (secondary)	-0.9%	[-1.9%, -0.2%]	10
All ❌✅ (primary)	-1.0%	[-1.1%, -1.0%]	2

Mixed

Try to fix ICE from re-interning an AllocId with different allocation contents #127442 (Comparison Link)

(instructions:u)	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	0.8%	[0.2%, 2.5%]	4
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-0.7%	[-1.4%, -0.3%]	7
All ❌✅ (primary)	-	-	0

the regressions are to secondary benchmarks and this is fixing a subtle ICE that arises from a race condition (and may actually represent a chance of miscompilation, maybe?)
marked as triaged

Rollup of 8 pull requests #128155 (Comparison Link)

(instructions:u)	mean	range	count
Regressions ❌ (primary)	0.5%	[0.2%, 0.8%]	6
Regressions ❌ (secondary)	0.9%	[0.7%, 1.0%]	7
Improvements ✅ (primary)	-0.5%	[-0.6%, -0.4%]	4
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	0.1%	[-0.6%, 0.8%]	10

regressions are to hyper and exa. Mostly in hyper check-full, check-incr-full, and debug-incr-full.
bulk of time might be from a spike in time spent in the mir_const_qualif query?
not marking as triaged (though it is, to be clear, a relatively minor regression).

Allow optimizing u32::from::<char>. #124905 (Comparison Link)

(instructions:u)	mean	range	count
Regressions ❌ (primary)	0.2%	[0.2%, 0.3%]	4
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-0.2%	[-0.2%, -0.2%]	1
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	0.1%	[-0.2%, 0.3%]	5

regressions are to image opt {full, incr-full}, cargo opt {full, incr-full}, and syn opt incr-unchanged
It appears that it's due to extra time spent in LLVM opt, especially lto optimize, which makes sense given that this is meant to be enabling LLVM to attempt more such optimizations?
marked as triaged.
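
To illustrate the kind of fact involved: every `char` is a Unicode scalar value, so `u32::from(c)` can never exceed `0x10FFFF`. The log does not describe how the PR exposes this to LLVM, so the sketch below only shows the property itself; whether a given check actually gets folded away is an assumption, and the function names are hypothetical.

```rust
/// Converts a char to its code point; the result is always <= 0x10FFFF.
fn code_point(c: char) -> u32 {
    u32::from(c)
}

/// A check that holds for every valid char. With range information
/// available, an optimizer may be able to fold this to `true`
/// (an assumption here, not something this snippet verifies).
fn in_unicode_range(c: char) -> bool {
    code_point(c) <= 0x10FFFF
}
```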

Rollup of 3 pull requests #128301 (Comparison Link)

(instructions:u)	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	2.1%	[2.1%, 2.1%]	1
Improvements ✅ (primary)	-0.2%	[-0.3%, -0.2%]	2
Improvements ✅ (secondary)	-1.6%	[-3.0%, -0.2%]	2
All ❌✅ (primary)	-0.2%	[-0.3%, -0.2%]	2

sole regression is to secondary benchmark coercions debug-full.
seems like noise.
marked as triaged

Perform instsimplify before inline to eliminate some trivial calls #128265 (Comparison Link)

(instructions:u)	mean	range	count
Regressions ❌ (primary)	1.2%	[0.2%, 2.6%]	4
Regressions ❌ (secondary)	0.5%	[0.5%, 0.5%]	1
Improvements ✅ (primary)	-0.5%	[-0.8%, -0.2%]	12
Improvements ✅ (secondary)	-0.3%	[-0.4%, -0.3%]	2
All ❌✅ (primary)	-0.0%	[-0.8%, 2.6%]	16

main primary regressions are to ripgrep opt full and image opt-full
these changes were anticipated during review; they seem likely to be the result of changes to inlining decisions
marked as triaged

Rollup of 6 pull requests #128360 (Comparison Link)

(instructions:u)	mean	range	count
Regressions ❌ (primary)	0.6%	[0.4%, 0.7%]	4
Regressions ❌ (secondary)	4.4%	[0.3%, 12.0%]	10
Improvements ✅ (primary)	-0.3%	[-0.3%, -0.3%]	4
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	0.1%	[-0.3%, 0.7%]	8

primary regressions are to doc-full for html5ever, stm32f4, libc, and typenum
those are presumably due to PR #126247; pnkfelix thinks the above are not worth further investigation
however, Kobzol has pointed out that the secondary regressions are significant, and has identified the root cause as PR #128104
we are in any case planning to revert the changes to dead code analysis (see PR #128404) which should address those regressions.
marked as triaged.