Core Server / SERVER-39847

Migrating session info can trigger fassert when destination shard has transaction history truncated by oplog

Details

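    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.6.12, 4.0.7, 4.1.9
    • Component/s: Sharding
    • Labels: None
    • Backwards Compatibility: Fully Compatible
    • Operating System: ALL
    • Backport Requested: v4.0, v3.6
    • Sprint: Sharding 2019-03-11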

    Description

      Note: this description is based on the v4.0 code. The code organization in current master has changed slightly, but the sequence of events is the same.

      1. Shard0 performs a retryable write with stmt: 0.
      2. Shard0 migrates the chunk to Shard1.
      3. The history of stmt: 0 is transferred to Shard1.
      4. Time passes, and the history of stmt: 0 rolls off Shard1's oplog.
      5. Shard0 migrates the chunk to Shard1 again. (Note: to trigger this bug, Shard0 must not have rolled over its own oplog yet!)
      6. Shard1 checks whether the stmt has already been executed.
      7. Shard1 finds the stmt in its cached map, but when it tries to retrieve the actual oplog entry, it discovers that the oplog has already been truncated and throws IncompleteTransactionHistory.
      8. The exception is caught, but the code lets execution continue as an attempt to "repair/recover" the lost history.
      9. However, after Shard1 inserts the oplog entry and the commit callback runs, the callback finds that the stmt is already in the map and triggers the fassert.
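
The sequence above can be illustrated with a small toy model. This is only a sketch under simplifying assumptions, not the server code: `ToyShardSession`, `receive_migrated_history`, `truncate_oplog`, and `FatalAssertion` are hypothetical names, and the oplog and the cached statement map are reduced to plain dictionaries. It shows how a cached map that outlives a truncated oplog can lead to the failed assertion in step 9.

```python
class IncompleteTransactionHistory(Exception):
    """Toy stand-in for the server error of the same name."""


class FatalAssertion(Exception):
    """Toy stand-in for an fassert (the real server would abort)."""


class ToyShardSession:
    """Hypothetical model of one session's retryable-write state on Shard1."""

    def __init__(self):
        self.oplog = {}            # optime -> stmt id; can be truncated
        self.committed_stmts = {}  # stmt id -> optime; in-memory cached map
        self._next_optime = 1

    def truncate_oplog(self):
        # Step 4: the oplog rolls over, but the cached map is left untouched.
        self.oplog.clear()

    def check_statement_executed(self, stmt_id):
        # Steps 6-7: consult the cached map, then look up the oplog entry.
        if stmt_id not in self.committed_stmts:
            return False
        if self.committed_stmts[stmt_id] not in self.oplog:
            raise IncompleteTransactionHistory(stmt_id)
        return True

    def _on_commit(self, stmt_id, optime):
        # Step 9: the commit callback assumes the stmt is not yet in the map.
        if stmt_id in self.committed_stmts:
            raise FatalAssertion(f"stmt {stmt_id} is already in the map")
        self.committed_stmts[stmt_id] = optime

    def receive_migrated_history(self, stmt_id):
        try:
            if self.check_statement_executed(stmt_id):
                return "skipped"
        except IncompleteTransactionHistory:
            # Step 8: let it through as an attempt to repair lost history.
            pass
        optime = self._next_optime
        self._next_optime += 1
        self.oplog[optime] = stmt_id
        self._on_commit(stmt_id, optime)
        return "inserted"


def reproduce():
    """Replay steps 3-9 against the toy model and record what happens."""
    shard1 = ToyShardSession()
    events = [shard1.receive_migrated_history(0)]  # step 3: first migration
    shard1.truncate_oplog()                        # step 4: oplog rolls over
    try:
        shard1.receive_migrated_history(0)         # steps 5-9: second migration
    except FatalAssertion:
        events.append("fassert")
    return events
```

In this model, `reproduce()` records an insert followed by the fatal assertion. Without the `truncate_oplog()` call, the second migration is skipped because the oplog entry is still available.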

            People

              Assignee: randolph@mongodb.com Randolph Tan
              Reporter: randolph@mongodb.com Randolph Tan
              Votes: 1
              Watchers: 12

              Dates

                Created:
                Updated:
                Resolved: