2021-03-23 Triage Log

An overall busy but decent week for performance. While there were some performance regressions, they were mostly small and were outnumbered by performance gains. Perhaps the most interesting news is not a compiler performance improvement but rather the introduction of no-alias optimizations at the LLVM level. This slightly hurts optimized build time performance in some cases, but it should make some workloads run faster after compilation.

Triage done by @rylev. Revision range: f24ce9b0140d9be5a336954e878d0c1522966bb8..9b6339e4b9747d473270baa42e77e1d2fff39bf4

2 Regressions, 5 Improvements, 3 Mixed; 1 of them in rollups

Regressions

Implement (but don't use) valtree and refactor in preparation of use #82936

  • Moderate regression in instruction counts (up to 2.1% on full builds of ctfe-stress-4-opt)
  • Purely an addition of unused code (for a future feature). It is possible that this changed some inlining behavior. The benchmark in question is susceptible to high variance, though the regression appeared in full builds and not in incremental builds.
  • The query impacted is eval_to_allocation_raw which is what ctfe stresses, so we'll look into it.

Use TrustedRandomAccess for in-place iterators where possible #79846

  • Moderate regression in instruction counts (up to 1.5% on full builds of deep-vector-debug)
  • This is a change in the standard library that seems to only impact one benchmark: deep-vector-debug full compilation. It looks to be impacting typeck, but I'm not sure why this would be. It's also possible that it's noise.
  • There's also a possibility that this has some strange interaction with the performance gained in #83360.

Improvements

Add a check for ASCII characters in to_upper and to_lower #81358

  • Moderate improvement in instruction counts (up to -2.7% on incr-unchanged builds of match-stress-enum-check)
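
As a rough illustration of what an ASCII check in a case-conversion routine can look like, here is a minimal sketch. It is not the code from #81358: the function name `to_upper_sketch` is hypothetical, and the non-ASCII branch is simplified to return only the first character of the full mapping, whereas the real conversion can produce several characters.

```rust
/// Hypothetical sketch of an ASCII fast path for uppercasing a `char`.
/// This is an assumption-laden illustration, not the standard library's code.
pub fn to_upper_sketch(c: char) -> char {
    if c.is_ascii() {
        // Fast path: ASCII needs no Unicode table lookup.
        c.to_ascii_uppercase()
    } else {
        // Slow path: defer to the full Unicode mapping. Simplification:
        // keep only the first character of a multi-character mapping.
        c.to_uppercase().next().unwrap_or(c)
    }
}
```

The idea is that most characters seen in source code are ASCII, so checking for that case first can skip the more expensive general lookup.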

ast/hir: Rename field-related structures #83188

  • Moderate improvement in instruction counts (up to -1.7% on incr-unchanged builds of deep-vector-check)

Revert performance-sensitive change in #82436 #83293

  • Very large improvement in instruction counts (up to -10.6% on incr-full builds of packed-simd-check)

Rollup of 9 pull requests #83360

  • Moderate improvement in instruction counts (up to -1.5% on full builds of deep-vector-debug)

Simplify encoder and decoder #83273

  • Moderate improvement in instruction counts (up to -1.5% on incr-unchanged builds of tokio-webpush-simple-check)

Mixed

feat: Update hashbrown to instantiate less llvm IR #77566

  • Large improvement in instruction counts (up to -5.1% on full builds of cargo-debug)
  • Moderate regression in instruction counts (up to 2.6% on incr-full builds of ctfe-stress-4-debug)
  • Just an update of hashbrown, so we have less visibility into what the changes actually are (without digging deeper into hashbrown itself). The update is important because hashbrown's newer versions emit less LLVM IR.
  • This is largely a performance gain, but there is one regression in ctfe that remained. The reviewers determined that merging was more important than investigating that issue, but it will be looked into.

Replace closures_captures and upvar_capture with closure_min_captures #82951

  • Moderate improvement in instruction counts (up to -3.4% on incr-unchanged builds of clap-rs-check)
  • Moderate regression in instruction counts (up to 1.2% on incr-unchanged builds of tuple-stress-check)
  • A perf run was done during review, but it seems that the small regressions seen then (all under 1%) have gotten slightly worse, pushing some of them over our arbitrary threshold of 1%.
  • The regression is in the encode_query_results_for query, which explains why incremental builds are affected. While this is minor, we will look into it.

Enable mutable noalias for LLVM >= 12 #82834

  • Large improvement in instruction counts (up to -9.0% on incr-patched: Job builds of regex-debug)
  • Moderate regression in instruction counts (up to 3.8% on incr-unchanged builds of syn-opt)
  • The negative performance impacts are almost exclusively in the optimized build benchmarks. This was deemed to be acceptable since this is largely a change which impacts optimizations. This will hopefully be taken care of in LLVM itself, and some of that compile performance will be won back.
  • More discussion of this can be found here.
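
For readers unfamiliar with what mutable noalias buys at run time, here is a minimal sketch. The function `write_both` is a hypothetical example, not taken from the PR, and whether the optimization described in the comments actually fires depends on the rustc and LLVM versions and the optimization level.

```rust
/// Hypothetical example. Rust guarantees that two live `&mut` references
/// never point at the same memory, so `a` and `b` cannot alias.
pub fn write_both(a: &mut i32, b: &mut i32) -> i32 {
    *a = 1;
    *b = 2;
    // If the backend is told that `a` is noalias, it may conclude that the
    // store through `b` cannot have changed `*a` and return the constant 1
    // without reloading from memory. Without that information, a reload
    // would be needed.
    *a
}
```

Passing this extra aliasing information to LLVM gives the optimizer more to work with, which is consistent with the note above that the compile-time cost shows up mainly in optimized builds.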

Nags requiring follow up