  1. Core Server
  2. SERVER-39847

Migrating session info can trigger fassert when destination shard has transaction history truncated by oplog

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major - P3
    • Fix Version/s: 3.6.12, 4.0.7, 4.1.9
    • Affects Version/s: None
    • Component/s: Sharding
    • None
    • Fully Compatible
    • ALL
    • v4.0, v3.6
    • Sharding 2019-03-11

      Note: this description is based on the v4.0 code. The code organization in current master has changed slightly, but the sequence of events is the same.

      1. Shard0 performs a retryable write with stmt: 0.
      2. Shard0 migrates a chunk to Shard1.
      3. The history of stmt: 0 is transferred to Shard1.
      4. Enough time passes that the history of stmt: 0 rolls off Shard1's oplog.
      5. Shard0 migrates a chunk to Shard1 again. (Note: to trigger this bug, Shard0 must not have rolled over its own oplog yet!)
      6. Shard1 checks whether the stmt has already been executed.
      7. Shard1 already has the stmt in its cached map, but when it tries to retrieve the actual oplog entry, it finds that the oplog has been truncated and throws IncompleteTransactionHistory.
      8. The exception is caught, but the migration is allowed to continue as an attempt to "repair/recover" the lost history.
      9. However, after Shard1 inserts the oplog entry and the commit callback runs, the callback finds that the stmt is already in the map and triggers the fassert.
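
The sequence above can be sketched as a small toy model. This is not the server code: `ToySession`, `receive_migrated_history`, `Fassert`, and the dictionary-based oplog are all hypothetical names invented for illustration, and only `IncompleteTransactionHistory` is borrowed from the description. The sketch assumes the cached map is keyed by statement id and that the commit callback asserts the statement is not already present.

```python
class IncompleteTransactionHistory(Exception):
    """Toy stand-in for the error named in the description."""


class Fassert(Exception):
    """Toy stand-in for a fatal assertion (a real server would abort)."""


class ToySession:
    """Hypothetical model of the destination shard's session state."""

    def __init__(self):
        self.oplog = {}      # timestamp -> stmt id (can be truncated)
        self.committed = {}  # stmt id -> timestamp (the "cached map")
        self._next_ts = 1

    def check_statement_executed(self, stmt_id):
        ts = self.committed.get(stmt_id)
        if ts is None:
            return False
        if ts not in self.oplog:
            # The map remembers the stmt, but its oplog entry is gone.
            raise IncompleteTransactionHistory(stmt_id)
        return True

    def _on_commit(self, stmt_id, ts):
        # Assumed invariant: a stmt is registered at most once.
        if stmt_id in self.committed:
            raise Fassert(f"stmt {stmt_id} is already in the map")
        self.committed[stmt_id] = ts

    def write_history(self, stmt_id):
        ts = self._next_ts
        self._next_ts += 1
        self.oplog[ts] = stmt_id
        self._on_commit(stmt_id, ts)

    def truncate_oplog(self):
        self.oplog.clear()


def receive_migrated_history(session, stmt_id):
    """Steps 6-9: check, swallow the exception, insert, commit."""
    try:
        if session.check_statement_executed(stmt_id):
            return "skipped"
    except IncompleteTransactionHistory:
        pass  # let it through as an attempt to "repair" the history
    session.write_history(stmt_id)
    return "inserted"


def reproduce(truncate=True):
    shard1 = ToySession()
    events = [receive_migrated_history(shard1, 0)]  # steps 2-3
    if truncate:
        shard1.truncate_oplog()                     # step 4
    try:
        events.append(receive_migrated_history(shard1, 0))  # steps 5-9
    except Fassert:
        events.append("fassert")
    return events
```

In this model, `reproduce()` ends in the toy fassert, while `reproduce(truncate=False)` skips the second transfer because the oplog entry is still present. The sketch only illustrates why swallowing the exception and then re-registering the same stmt conflict with each other; it makes no claim about how the issue was actually fixed.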

            randolph@mongodb.com Randolph Tan