  1. Core Server
  2. SERVER-30487

RangeDeleter holds WT transaction open while waiting for majority

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major - P3
    • Fix Version/s: 3.4.9
    • Affects Version/s: 3.4.7
    • Component/s: Sharding
    • Labels: None
    • Backwards Compatibility: Fully Compatible
    • Operating System: ALL

      At log level 5, with additional debug messages on entry to and exit from _waitForMajority, we see 3 WT transactions (536, 537, 538) begun and committed for the 3 documents deleted. An additional WT transaction (540) is then begun, remains open while _waitForMajority is called, and is finally rolled back.

      2017-08-02T16:48:48.562-0400 I SHARDING [RangeDeleter] Deleter starting delete for: test.c from { _id: MinKey } -> { _id: MaxKey }, with opId: 358
      2017-08-02T16:48:48.562-0400 D SHARDING [RangeDeleter] begin removal of { : MinKey } to { : MaxKey } in test.c with write concern: { w: 1, j: false, wtimeout: 0 }
      
      2017-08-02T16:48:48.562-0400 D STORAGE  [RangeDeleter] WT begin_transaction for snapshot id 536
      2017-08-02T16:48:48.563-0400 D STORAGE  [RangeDeleter] WT commit_transaction for snapshot id 536
      2017-08-02T16:48:48.563-0400 D STORAGE  [RangeDeleter] WT begin_transaction for snapshot id 537
      2017-08-02T16:48:48.563-0400 D STORAGE  [RangeDeleter] WT commit_transaction for snapshot id 537
      2017-08-02T16:48:48.563-0400 D STORAGE  [RangeDeleter] WT begin_transaction for snapshot id 538
      2017-08-02T16:48:48.563-0400 D STORAGE  [RangeDeleter] WT commit_transaction for snapshot id 538
      
      2017-08-02T16:48:48.563-0400 D STORAGE  [RangeDeleter] WT begin_transaction for snapshot id 540
      2017-08-02T16:48:48.563-0400 D SHARDING [RangeDeleter] end removal of { : MinKey } to { : MaxKey } in test.c (took 0ms)
      2017-08-02T16:48:48.563-0400 I SHARDING [RangeDeleter] rangeDeleter deleted 3 documents for test.c from { _id: MinKey } -> { _id: MaxKey }
      2017-08-02T16:48:48.563-0400 I SHARDING [RangeDeleter] xxx enter _waitForMajority
      2017-08-02T16:48:48.580-0400 I SHARDING [RangeDeleter] xxx exit _waitForMajority
      2017-08-02T16:48:48.580-0400 D STORAGE  [RangeDeleter] WT rollback_transaction for snapshot id 540
      

      If there is replication lag, this can result in a very long-running transaction. That in turn can leave the instance stuck with a full cache, causing a stall of as much as an hour until the _waitForMajority times out.
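
      To look for the same pattern in a longer log, the begin/commit/rollback lines can be paired up by snapshot id and the time each transaction stayed open computed. The following is a minimal sketch, not part of the server or its tooling. It assumes the log line layout seen in the excerpt above; the regular expression and the function name transaction_lifetimes are illustrative only.

```python
import re
from datetime import datetime, timedelta

# Assumed layout, taken from the excerpt in this ticket:
#   <timestamp> D STORAGE  [<thread>] WT <op>_transaction for snapshot id <n>
WT_LINE = re.compile(
    r"^\s*(?P<ts>\S+) D STORAGE\s+\[(?P<thread>[^\]]+)\] "
    r"WT (?P<op>begin|commit|rollback)_transaction for snapshot id (?P<id>\d+)"
)


def parse_ts(ts):
    # Timestamps in the excerpt look like 2017-08-02T16:48:48.562-0400
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S.%f%z")


def transaction_lifetimes(lines):
    """Return (snapshot_id, end_op, open_ms) for every transaction that was closed."""
    opened = {}   # snapshot id -> time of begin_transaction
    closed = []
    for line in lines:
        m = WT_LINE.match(line)
        if not m:
            continue  # not a WT transaction line
        snap_id = int(m.group("id"))
        when = parse_ts(m.group("ts"))
        if m.group("op") == "begin":
            opened[snap_id] = when
        elif snap_id in opened:
            open_for = when - opened.pop(snap_id)
            # Whole milliseconds the transaction was open.
            closed.append((snap_id, m.group("op"), open_for // timedelta(milliseconds=1)))
    return closed


if __name__ == "__main__":
    sample = [
        "2017-08-02T16:48:48.563-0400 D STORAGE  [RangeDeleter] WT begin_transaction for snapshot id 540",
        "2017-08-02T16:48:48.563-0400 I SHARDING [RangeDeleter] xxx enter _waitForMajority",
        "2017-08-02T16:48:48.580-0400 I SHARDING [RangeDeleter] xxx exit _waitForMajority",
        "2017-08-02T16:48:48.580-0400 D STORAGE  [RangeDeleter] WT rollback_transaction for snapshot id 540",
    ]
    print(transaction_lifetimes(sample))  # [(540, 'rollback', 17)]
```

      Run over the excerpt above, transaction 540 stands out as the only one that ends in a rollback and that spans the _waitForMajority call. With replication lag, its open time would grow with the length of the wait.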

            Assignee:
            kaloian.manassiev@mongodb.com Kaloian Manassiev
            Reporter:
            bruce.lucas@mongodb.com Bruce Lucas (Inactive)
            Votes:
            0
            Watchers:
            14

              Created:
              Updated:
              Resolved: