WiredTiger / WT-9636

Provide a way for an application to interrupt a long-running compact command

    • Type: Improvement
    • Resolution: Fixed
    • Priority: Major - P3
    • WT11.1.0, 6.2.0-rc0
    • Affects Version/s: None
    • Component/s: None
    • Labels:
    • StorEng - Refinement Pipeline

      Summary
      Running compact on a large table can take a long time; compacting a multi-TB table, for example, can last many hours. Currently, a user or application has no way to interrupt compact if they discover that it is placing too much load on the system.

      MongoDB has a killOp() command that is intended to halt long-running commands, and a user would reasonably expect it to work on a compact command. It does not, because killOp can only interrupt a command while it is executing server code, i.e., between calls into WiredTiger. In the case of compact, the caller enters WiredTiger and does not return until the command ends, possibly many hours later.

      Motivation

      Letting a user halt a long-running compact command would provide a better user experience.

      Suggested Solution
      One possible implementation (there may be others) would be to allow the compact command to take a callback function as an argument. WiredTiger could call this periodically during compaction, and it would return a boolean indicating whether compact should halt. There is already code in compact that supports ending early based on an application-provided timeout, so we could check the callback in the same place and leverage the existing infrastructure for ending the compaction.
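
      To make the suggestion concrete, here is a minimal, self-contained sketch of the idea. It is not WiredTiger code: compact_table, compact_opts, compact_interrupt_fn, check_killed, kill_state and CHECK_INTERVAL are all hypothetical names invented for illustration, and the loop body only stands in for the real block-relocation work. The point it shows is that the callback check sits next to the timeout check, so both share one early-exit path.

```c
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical: how many blocks to process between interrupt/timeout checks. */
#define CHECK_INTERVAL 100

/* Hypothetical callback type: returns true when compaction should halt. */
typedef bool (*compact_interrupt_fn)(void *cookie);

/* Hypothetical result codes for this sketch. */
enum compact_result { COMPACT_DONE, COMPACT_TIMED_OUT, COMPACT_INTERRUPTED };

/* Hypothetical options: a timeout plus the proposed callback. */
struct compact_opts {
    double timeout_secs;            /* 0 disables the timeout */
    compact_interrupt_fn interrupt; /* NULL disables the callback */
    void *cookie;                   /* passed back to the callback unchanged */
};

/*
 * Stand-in for a long-running compaction. Every CHECK_INTERVAL blocks it
 * checks the timeout and then the callback, and returns early if either
 * one says to stop. blocks_done reports how far it got.
 */
enum compact_result
compact_table(size_t total_blocks, const struct compact_opts *opts, size_t *blocks_done)
{
    time_t start = time(NULL);
    size_t i;

    for (i = 0; i < total_blocks; i++) {
        if (i % CHECK_INTERVAL == 0) {
            if (opts->timeout_secs > 0 &&
                difftime(time(NULL), start) > opts->timeout_secs) {
                *blocks_done = i;
                return COMPACT_TIMED_OUT;
            }
            if (opts->interrupt != NULL && opts->interrupt(opts->cookie)) {
                *blocks_done = i;
                return COMPACT_INTERRUPTED;
            }
        }
        /* ... the real work of relocating block i would happen here ... */
    }
    *blocks_done = total_blocks;
    return COMPACT_DONE;
}

/* Hypothetical application state: ask for a halt on the Nth check. */
struct kill_state {
    int checks;     /* number of times the callback has been invoked */
    int kill_after; /* return true once checks reaches this value */
};

/* Example callback an application might supply. */
bool
check_killed(void *cookie)
{
    struct kill_state *ks = cookie;

    return ++ks->checks >= ks->kill_after;
}
```

      In a real application the callback would more likely read a flag set by something like killOp rather than count invocations; the counter is used here only so the behavior is deterministic. Passing a cookie pointer lets the application carry its own state into the callback without globals.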

            Assignee:
            sue.loverso@mongodb.com Susan LoVerso
            Reporter:
            keith.smith@mongodb.com Keith Smith
