2023-03-21 Triage Log

A mixed week, with some nice wins, but also at least two PR's that were subsequently reverted, such as the upgrade to LLVM 16. We do want to note PR #108944, which cut down on crate metadata, binary sizes, and was an overall win on execution time for many benchmarks.

Triage done by @pnkfelix. Revision range: 00587489..ef03fda3

Summary:

(instructions:u)	mean	range	count
Regressions ❌ (primary)	0.8%	[0.2%, 2.3%]	31
Regressions ❌ (secondary)	1.5%	[0.3%, 3.0%]	40
Improvements ✅ (primary)	-1.1%	[-6.7%, -0.2%]	101
Improvements ✅ (secondary)	-3.9%	[-16.8%, -0.5%]	45
All ❌✅ (primary)	-0.6%	[-6.7%, 2.3%]	132

1 Regressions, 4 Improvements, 11 Mixed; 2 of them in rollups 37 artifact comparisons made in total

Regressions

Rollup of 10 pull requests #109206 (Comparison Link)

(instructions:u)	mean	range	count
Regressions ❌ (primary)	0.5%	[0.4%, 0.7%]	2
Regressions ❌ (secondary)	1.1%	[0.9%, 1.3%]	4
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	0.5%	[0.4%, 0.7%]	2

as noted on PR, primary regressions are doc benchmarks, and secondary regressions are tt-muncher ... likely being noisy
marking as triaged.

Improvements

Remove identity_future indirection #104833 (Comparison Link)

(instructions:u)	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-0.5%	[-0.5%, -0.5%]	1
Improvements ✅ (secondary)	-3.9%	[-14.3%, -0.9%]	19
All ❌✅ (primary)	-0.5%	[-0.5%, -0.5%]	1

fix ignore header in MSVC test #108809 (Comparison Link)

(instructions:u)	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-0.3%	[-0.3%, -0.3%]	1
Improvements ✅ (secondary)	-1.0%	[-1.3%, -0.9%]	6
All ❌✅ (primary)	-0.3%	[-0.3%, -0.3%]	1

fast path for process_obligations #108815 (Comparison Link)

(instructions:u)	mean	range	count
Regressions ❌ (primary)	0.5%	[0.5%, 0.5%]	1
Regressions ❌ (secondary)	1.5%	[1.5%, 1.5%]	1
Improvements ✅ (primary)	-2.6%	[-8.6%, -0.3%]	9
Improvements ✅ (secondary)	-12.7%	[-21.9%, -7.8%]	6
All ❌✅ (primary)	-2.3%	[-8.6%, 0.5%]	10

Update host compiler to LLVM 16 #108802 (Comparison Link)

(instructions:u)	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-0.7%	[-1.4%, -0.3%]	38
Improvements ✅ (secondary)	-0.9%	[-1.7%, -0.6%]	11
All ❌✅ (primary)	-0.7%	[-1.4%, -0.3%]	38

Mixed

Rollup of 9 pull requests #109130 (Comparison Link)

(instructions:u)	mean	range	count
Regressions ❌ (primary)	0.3%	[0.3%, 0.4%]	6
Regressions ❌ (secondary)	0.6%	[0.3%, 1.0%]	6
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-0.5%	[-0.5%, -0.3%]	5
All ❌✅ (primary)	0.3%	[0.3%, 0.4%]	6

the primary regressions are all to stm32f4-0.14.0 incremental (unchanged and patched:negate).
nothing in the rollup stands out as an obvious culprit. (I was briefly curious about #109101, but I think that indeed should only affect compilation of type-erroneous code, IIUC)
marking as triaged, because I do not think these results warrant deeper investigation at this time.

remove obsolete givens from regionck #107376 (Comparison Link)

(instructions:u)	mean	range	count
Regressions ❌ (primary)	0.6%	[0.5%, 0.7%]	3
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-0.4%	[-0.5%, -0.3%]	3
All ❌✅ (primary)	0.6%	[0.5%, 0.7%]	3

primary benchmark regressions were to bitmaps incr-full { check, debug, opt }
This was reverted in PR #109183; but the revert PR didn't register analogous reversal of the performance impact here.
The performance regression to bitmaps incr-full was swallowed by big gains registered by PR #109035
Marking as triaged.

Ensure ptr::read gets all the same LLVM load metadata that dereferencing does #109035 (Comparison Link)

(instructions:u)	mean	range	count
Regressions ❌ (primary)	0.8%	[0.3%, 1.9%]	7
Regressions ❌ (secondary)	1.5%	[0.3%, 2.9%]	5
Improvements ✅ (primary)	-0.8%	[-1.3%, -0.3%]	14
Improvements ✅ (secondary)	-2.2%	[-3.7%, -0.4%]	20
All ❌✅ (primary)	-0.2%	[-1.3%, 1.9%]	21

already triaged, this was a big broad win.

Implement checked Shl/Shr at MIR building. #108282 (Comparison Link)

(instructions:u)	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	1.3%	[1.3%, 1.3%]	2
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-0.9%	[-1.3%, -0.5%]	8
All ❌✅ (primary)	-	-	0

already triaged

Flatten/inline format_args!() and (string and int) literal arguments into format_args!() #106824 (Comparison Link)

(instructions:u)	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	0.7%	[0.3%, 1.1%]	2
Improvements ✅ (primary)	-0.7%	[-1.3%, -0.3%]	4
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-0.7%	[-1.3%, -0.3%]	4

already triaged

Replace ZST operands and debuginfo by constants. #107270 (Comparison Link)

(instructions:u)	mean	range	count
Regressions ❌ (primary)	1.2%	[0.4%, 1.9%]	4
Regressions ❌ (secondary)	0.7%	[0.7%, 0.7%]	1
Improvements ✅ (primary)	-0.5%	[-0.6%, -0.3%]	3
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	0.5%	[-0.6%, 1.9%]	7

already triaged

Wrap the whole LocalInfo in ClearCrossCrate. #108944 (Comparison Link)

(instructions:u)	mean	range	count
Regressions ❌ (primary)	1.6%	[0.3%, 2.4%]	6
Regressions ❌ (secondary)	1.7%	[1.0%, 2.9%]	16
Improvements ✅ (primary)	-0.8%	[-1.3%, -0.5%]	45
Improvements ✅ (secondary)	-0.8%	[-0.8%, -0.7%]	5
All ❌✅ (primary)	-0.5%	[-1.3%, 2.4%]	51

primary instruction count regression was to unicode normalization (5 variations in [1.5%, 2.4%] range). Slightly more muted for cycle count and wall time there.
as noted by @lqd and @nnethercote on PR comments: crate metadata + binary size improved all around.
overall this is a win
marking as triaged.

Upgrade to LLVM 16 #107224 (Comparison Link)

(instructions:u)	mean	range	count
Regressions ❌ (primary)	1.2%	[0.3%, 3.7%]	63
Regressions ❌ (secondary)	1.1%	[0.2%, 2.7%]	25
Improvements ✅ (primary)	-1.1%	[-3.4%, -0.5%]	49
Improvements ✅ (secondary)	-1.2%	[-4.3%, -0.3%]	70
All ❌✅ (primary)	0.2%	[-3.4%, 3.7%]	112

instruction count deltas are all over the place
cycle counts and wall-time seem to paint this PR in a much more positive light
cycle counts regressions: [ +0.67%, +2.92%] +1.88% 22 (9); improvements: [-10.51%, -1.01%] -2.38% 141 (30)
wall-time regressions: [ +0.51%, +3.75%] +2.04% 9 (6); improvements: [-10.11%, -0.75%] -2.60% 75 (25)
marking as triaged, at least with respect to performance impact.
doesn't really matter, since it was reverted in PR #109326 anyway

Revert “Auto merge of #107224 - nikic:llvm-16, r=cuviper” #109326 (Comparison Link)

(instructions:u)	mean	range	count
Regressions ❌ (primary)	1.1%	[0.5%, 3.6%]	51
Regressions ❌ (secondary)	1.2%	[0.3%, 4.5%]	67
Improvements ✅ (primary)	-1.2%	[-3.6%, -0.3%]	64
Improvements ✅ (secondary)	-1.1%	[-2.4%, -0.2%]	25
All ❌✅ (primary)	-0.2%	[-3.6%, 3.6%]	115

already triaged. also, see notes for #107224.

Optimize dep node backtrace and ignore fatal errors #108524 (Comparison Link)

(instructions:u)	mean	range	count
Regressions ❌ (primary)	0.4%	[0.4%, 0.4%]	1
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-0.4%	[-0.6%, -0.2%]	9
Improvements ✅ (secondary)	-0.3%	[-0.3%, -0.3%]	1
All ❌✅ (primary)	-0.4%	[-0.6%, 0.4%]	10

already triaged

Remove the assume(!is_null) from Vec::as_ptr #106967 (Comparison Link)

(instructions:u)	mean	range	count
Regressions ❌ (primary)	0.6%	[0.3%, 1.7%]	16
Regressions ❌ (secondary)	0.7%	[0.2%, 1.4%]	15
Improvements ✅ (primary)	-0.8%	[-1.9%, -0.4%]	4
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	0.3%	[-1.9%, 1.7%]	20

results from final PR as it finally landed are a bit different (more broad in number of primary benchmarks effected) than the observations from trial performance run
we expected to see [~2.0%, 2.2%] opt-full regression to regex and serde-derive
we actually got a [~.7%, ~1.7%] regression to them instead; but a bunch of other primary benchmarks saw a slight dip (less than 0.7% of a regression for the new cases, many less than 0.4%)
the cycle-counts and wall-times indicate more of an effect on clap and webrender here, but the effect there shows potentially more benefit than harm
marking as triaged. Its too hard to make strong connections between the changes made here and actual costs/benefits.

2023-03-21 Triage Log

Regressions

Improvements

Mixed

Untriaged Pull Requests