Fix dist/s_fast to lint only files changed in the current branch


    • Type: Improvement
    • Resolution: Fixed
    • Priority: Minor - P4
    • WT12.0.0, 9.0.0-rc0
    • Affects Version/s: None
    • Component/s: Tools
    • None
    • Storage Engines - Transactions
    • 209.476
    • SE Transactions - 2026-06-05
    • 1

      Issue Summary

      dist/s_fast (which calls dist/s_all -F) is slow and scopes its file checks incorrectly when a branch has multiple merge commits from develop. On a branch with 36 files actually changed, fast mode scanned 439 files, defeating the purpose of s_fast. A secondary bug caused s_evergreen_validate to prompt for Evergreen credentials on every s_fast run, even when no Evergreen yml files were touched. In addition, s_copyright and s_evergreen had no fast-mode awareness: s_copyright added ~17s, and s_evergreen ran its evg_cfg.py check and pip-version scan unconditionally.

      Root Causes

      • The root cause is in last_commit_from_dev() in dist/common_functions.py.
      • It uses git rev-list HEAD...develop | tail -n 1 to find the oldest commit in the symmetric difference, then walks one parent back (~).
      • On a branch with merge commits from develop, this resolves to the point where the branch was first created rather than to the actual merge-base, so every file develop has touched since then counts as changed.
      • Every script that uses filter_if_fast inherits this wrong base: s_clang_format, s_style, s_string, s_copyright, function.py, and others all scan the inflated file set.
      • s_evergreen had no fast-mode awareness at all — it always ran evg_cfg.py check and the pip-version scan.
      • s_evergreen_validate had a fast-mode guard but it used last_commit_from_dev, so develop's own changes to evergreen.yml files were mistaken for "your changes", causing evergreen validate to run and prompt for credentials on every invocation.
      • s_copyright ran as a sequential (blocking) step in s_all with no fast-mode support, scanning 7,000+ source files unconditionally.
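
      The base-selection problem can be illustrated with a toy commit graph. The sketch below is a simplified model, not the code in dist/common_functions.py: the commit names, timestamps, and helper functions are all hypothetical, and rev-list ordering is approximated by sorting on a timestamp. It compares the parent of the oldest commit in the symmetric difference with the best common ancestor that git merge-base would report.

```python
# Toy commit graph (hypothetical): commit id -> (timestamp, parents).
# develop: d0 <- d1 <- d2 <- d3
# branch:  d0 <- f1 <- m1 (merge of f1 and d2) <- f2 (HEAD)
COMMITS = {
    "d0": (0, []),
    "d1": (1, ["d0"]),
    "f1": (2, ["d0"]),
    "d2": (3, ["d1"]),
    "m1": (4, ["f1", "d2"]),
    "f2": (5, ["m1"]),
    "d3": (6, ["d2"]),
}

def ancestors(commit):
    """Return the commit and everything reachable from it."""
    seen, stack = set(), [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(COMMITS[current][1])
    return seen

def old_base(head, develop):
    """Model of 'rev-list head...develop | tail -n 1' followed by '~'."""
    symmetric_difference = ancestors(head) ^ ancestors(develop)
    oldest = min(symmetric_difference, key=lambda c: COMMITS[c][0])
    return COMMITS[oldest][1][0]  # first parent of the oldest commit

def merge_base(head, develop):
    """Common ancestor that is not a proper ancestor of another common ancestor."""
    common = ancestors(head) & ancestors(develop)
    best = [c for c in common
            if not any(c != other and c in ancestors(other) for other in common)]
    return best[0]

print(old_base("f2", "d3"))    # d0
print(merge_base("f2", "d3"))  # d2
```

      In this model the old approach lands on d0, the original fork point, so a diff against it includes everything develop changed in d1 and d2. The merge-base is d2, which leaves only the branch's own changes.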

      Changes

      1. Fix last_commit_from_dev() in dist/common_functions.py

      Replace the broken rev-list approach with git merge-base origin/develop HEAD, which correctly returns the most recent common ancestor regardless of how many times develop has been merged in.

      def last_commit_from_dev():
          # Find the most recent common ancestor between the current branch and origin/develop.
          # Using git merge-base correctly handles branches that have merged develop multiple
          # times: it returns the latest develop commit reachable from HEAD, so that
          # filter_if_fast only sees files changed by the author's own commits rather than
          # every file touched by develop since the branch was first created.
          return subprocess.run("git merge-base origin/develop HEAD",
              shell=True, capture_output=True, text=True).stdout.strip()
      

      All scripts using filter_if_fast benefit automatically.

      2. Add fast-mode guard to dist/s_evergreen

      Add check_fast_mode_flag and an early-exit guard that skips the script when no evergreen.*\.yml files are in the changed set.

      check_fast_mode_flag
      
      # In fast mode, skip if no evergreen yml files were modified in our commits.
      if is_fast_mode; then
          search=`git diff --name-only "$(last_commit_from_dev)" | grep -E 'evergreen.*\.yml$'`
          if test -z "$search"; then
              exit 0
          fi
      fi
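
      To make the guard's pattern concrete, the following Python sketch applies the same regular expression to a list of changed paths. It is an illustration only: the function name and the sample paths are hypothetical, and re.search is used because grep matches anywhere in the line.

```python
import re

# Same pattern as the grep -E call in the guard above.
EVERGREEN_YML = re.compile(r"evergreen.*\.yml$")

def touches_evergreen_yml(changed_paths):
    """Return True if any changed path matches the evergreen yml pattern."""
    return any(EVERGREEN_YML.search(path) for path in changed_paths)

# Hypothetical changed-file lists.
print(touches_evergreen_yml(["src/btree/bt_cursor.c", "test/evergreen.yml"]))  # True
print(touches_evergreen_yml(["src/btree/bt_cursor.c", "dist/s_evergreen"]))    # False
print(touches_evergreen_yml(["test/evergreen.yml.orig"]))                      # False
```

      Because .* also spans directory separators, a path such as test/evergreen/extra.yml would match as well, so the guard errs on the side of running the check.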
      

      3. dist/s_evergreen_validate — no change needed

      Its existing last_commit_from_dev-based guard works correctly once fix #1 lands. The spurious credential prompts disappear automatically.

      4. Add fast-mode support to dist/s_copyright

      s_copyright is a sequential (blocking) step in s_all and took ~17s unconditionally. Its main file scan already uses do_in_parallel, which internally calls filter_if_fast, so adding a single check_fast_mode_flag call is sufficient to enable fast-mode filtering.

      . `dirname -- ${BASH_SOURCE[0]}`/common_functions.sh
      setup_trap
      cd_top
      check_fast_mode_flag   # new — enables filter_if_fast inside do_in_parallel
      

      The five one-off checks at the bottom (test/syscall/wt2336_base/base.run and four special_copyright calls) remain unconditional since they are trivially fast single-file checks.

      Performance

      Measured on a branch with 36 files changed from the merge-base:

      Scenario                                                         Wall time
      Before (broken rev-list, 439 files, no s_copyright fast mode)    ~99s
      After (merge-base, 36 files, s_copyright fast mode)              ~30s
      Improvement                                                      ~3.3x

      Acceptance Criteria

      • dist/s_fast on a branch with merge commits from develop scans only the files whose content differs from the merge-base (not every file develop touched since the branch was created).
      • Running dist/s_fast does not prompt for Evergreen credentials when no evergreen*.yml files were modified.
      • s_copyright completes in under 1s in fast mode when the changed file count is small.
      • dist/s_fast on a branch without merge commits behaves identically to before.
      • Files changed in early commits (before any merge from develop) are still linted, since their content at HEAD still differs from the merge-base.
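
      The first and last criteria follow from how a content diff against a base behaves. The sketch below models this with hypothetical file snapshots (the paths, version labels, and function name are assumptions for illustration, not part of the change): a file counts as changed when its content at HEAD differs from its content at the chosen base.

```python
# Hypothetical file contents at three points in a toy history.
# src/mine.c was edited on the branch before develop was merged in;
# src/theirs.c was edited only on develop and arrived through the merge.
SNAPSHOTS = {
    "fork_point": {"src/mine.c": "v1", "src/theirs.c": "v1"},
    "merge_base": {"src/mine.c": "v1", "src/theirs.c": "v2"},
    "head":       {"src/mine.c": "v2", "src/theirs.c": "v2"},
}

def changed_files(base, head="head"):
    """List paths whose content differs between the base and head snapshots."""
    old, new = SNAPSHOTS[base], SNAPSHOTS[head]
    return sorted(path for path in old.keys() | new.keys()
                  if old.get(path) != new.get(path))

print(changed_files("merge_base"))  # ['src/mine.c']
print(changed_files("fork_point"))  # ['src/mine.c', 'src/theirs.c']
```

      Against the merge-base only the author's file is reported, even though it was edited before the merge. Against the original fork point, the file that only develop touched is reported too.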

            Assignee:
            Haribabu Kommi
            Reporter:
            Haribabu Kommi