Reclaiming oplog may block stepup/stepdown


    • Replication
    • Fully Compatible
    • ALL
    • v8.1, v8.0, v7.0
    • Repl 2025-06-09
    • None
    • 3
    • TBD
    • None
    • None
    • None
    • None
    • None
    • None
    • None

      The OplogCapMaintainerThread deletes oplog entries by calling reclaimOplog() while holding both the global lock and the RSTL (replication state transition lock). When there is a lot of oplog to reclaim, this call can run for longer than 30 seconds. During that time a stepup or stepdown cannot obtain the RSTL, and it crashes as a result.

      A possible fix is to make reclaimOplog() interruptible, to take the global lock without the RSTL if that is safe, or both.
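
      To illustrate the "interruptible" option, here is a minimal, self-contained sketch. It is not server code: ToyOplog, stateTransitionLock, reclaimInterruptibly, batchSize and interruptRequested are all hypothetical names invented for this example, and a plain std::mutex stands in for the RSTL. The idea shown is to delete in small batches, release the lock between batches, and check an interrupt flag before each batch, so that a waiting state transition never has to wait for more than one batch.

```cpp
#include <atomic>
#include <cstddef>
#include <deque>
#include <mutex>

// Hypothetical stand-in for the oplog and the lock that stepup/stepdown needs.
// A std::mutex is used here only for illustration; it is not the real RSTL.
struct ToyOplog {
    std::mutex stateTransitionLock;
    std::deque<int> entries;  // oldest entry at the front
};

// Removes up to `excess` of the oldest entries, at most `batchSize` per lock
// acquisition. The interrupt flag is checked before each batch, while the
// lock is not held. Returns the number of entries actually removed.
std::size_t reclaimInterruptibly(ToyOplog& oplog,
                                 std::size_t excess,
                                 std::size_t batchSize,
                                 const std::atomic<bool>& interruptRequested) {
    if (batchSize == 0) {
        return 0;  // nothing can make progress with an empty batch
    }
    std::size_t reclaimed = 0;
    while (reclaimed < excess) {
        if (interruptRequested.load()) {
            break;  // a state transition wants the lock: stop early
        }
        // The lock is held for one batch only and released at the end of
        // this loop iteration, giving waiters a chance to acquire it.
        std::lock_guard<std::mutex> lk(oplog.stateTransitionLock);
        for (std::size_t i = 0;
             i < batchSize && reclaimed < excess && !oplog.entries.empty();
             ++i) {
            oplog.entries.pop_front();
            ++reclaimed;
        }
        if (oplog.entries.empty()) {
            break;  // nothing left to reclaim
        }
    }
    return reclaimed;
}
```

      In this sketch a caller would set interruptRequested before trying to take the lock for a stepup or stepdown, and the reclaim loop would resume on its next scheduled pass. Whether the real reclaimOplog() can safely be split into batches like this is exactly the open question raised above.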

              Assignee:
              Solomon Lifshits
              Reporter:
              Matthew Russotto
              Votes:
              0
              Watchers:
              14

                Created:
                Updated:
                Resolved: