__live_restore_fh_read fast path uses WT_RET instead of return, causing spurious lock acquisition and redundant I/O on fully-migrated file handles

XMLWordPrintableJSON

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major - P3
    • WT12.0.0, 9.0.0-rc0
    • Affects Version/s: None
    • Component/s: Live Restore
    • None
    • Environment:
      GCC 13, Ubuntu 24.04
    • Storage Engines - Persistence
    • 90.034
    • None
    • 1

      __live_restore_fh_read(src/live_restore/live_restore_fs.c) contains a fast path at line 760 that is intended to return immediately when the destination file is fully migrated (WTI_DEST_COMPLETE), bypassing the file lock and the slow path entirely.

      The fast path dispatches the read with WT_RET rather than a return. The WT_RET macro only returns from the calling function when the inner expression evaluates to a non-zero (error) value. On success, (which is the normal case) execution falls through and continues into the slow path body at line 763, where __wt_readlock is called. The slow path then re-evaluates WTI_DEST_COMPLETE via __live_restore_can_service_read, returns true and issues a second redundant read from the destination before releasing the lock.

      Note: The bug is not a corner case like it fires on every read to every file handle once migration finishes, which is the normal operating state. No data loss or crash occurs, but the fast-path contract is violated, every affected read performs twice the destination I/O, and reads unnecessarily contend with concurrent write locks that the fast path is designed to bypass entirely.

      For ReProduce:
      A Catch2 regression test in the test/catch2/live_restore/api/test_live_restore_fh_read_fast_path.cpp file (tag is [live_restore_fh_read_fast_path]) demonstrates the defect mechanically.

      I intentionally did not include the test code in the PR; we can discuss whether the test code is appropriate for the PR. I have attached the test code. However, I have implemented the fix in the PR.

      Purposed Fix:
      change at line 761 of src/live_restore/live_restore_fs.c, replace WT_RET with return.

        1. test_live_restore_fh_read_fast_path.cpp
          4 kB
        2. after_fix.txt
          0.2 kB
        3. before_fix.txt
          1 kB

            Assignee:
            Luke Pearson
            Reporter:
            Tuna KARABACAK
            Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

              Created:
              Updated:
              Resolved: