  1. Core Server
  2. SERVER-46466

Race with findAndModify retryable write and session migration

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical - P2
    • Fix Version/s: 3.6.18, 4.0.17
    • Affects Version/s: 3.6.0, 4.0.0
    • Component/s: Sharding
    • Labels: None
    • Backwards Compatibility: Fully Compatible
    • Operating System: ALL
    • Backport Requested: v4.0, v3.6
    • Sprint: Sharding 2020-03-09


      1. A findAndModify retryable write with txnNumber 10 is executed on shardA.
      2. Migration of a chunk from shardA to shardB starts.
      3. The session migration thread pulls the oplog entry for the write in step 1, passes all the checks, and is about to write the oplog entry.
      4. A new retryable write with txnNumber 11 starts and successfully writes to the oplog.
      5. The session migration thread writes the oplog entry for txnNumber 10. The primary has now written an oplog entry with a higher optime but a lower txnNumber.
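      The interleaving in the steps above can be sketched as a small, deterministic Python model. This is only an illustration under simplifying assumptions, not the server's code: `ToyOplog`, `OplogEntry`, `simulate_race`, and `txn_number_regressions` are hypothetical names, optimes are plain integers, and the "checks" in step 3 are reduced to a single comparison against the latest txnNumber seen for the session.

```python
from dataclasses import dataclass, field


@dataclass
class OplogEntry:
    optime: int        # toy optime: position in the oplog
    session_id: str
    txn_number: int
    source: str        # who wrote the entry (illustrative only)


@dataclass
class ToyOplog:
    entries: list = field(default_factory=list)

    def append(self, session_id, txn_number, source):
        entry = OplogEntry(len(self.entries) + 1, session_id, txn_number, source)
        self.entries.append(entry)
        return entry


def txn_number_regressions(oplog):
    """Return (earlier, later) pairs where, within one session, an entry
    with a higher optime carries a lower txnNumber."""
    highest = {}
    regressions = []
    for entry in oplog.entries:
        prev = highest.get(entry.session_id)
        if prev is not None and entry.txn_number < prev.txn_number:
            regressions.append((prev, entry))
        else:
            highest[entry.session_id] = entry
    return regressions


def simulate_race():
    oplog = ToyOplog()
    latest_txn = 0  # latest txnNumber recorded for the session

    # Step 3: the migration thread checks txnNumber 10 against the session
    # state and passes, but has not written its oplog entry yet.
    migration_check_passed = 10 >= latest_txn

    # Step 4: a new retryable write with txnNumber 11 writes to the oplog.
    oplog.append("session-1", 11, "client write")
    latest_txn = 11

    # Step 5: the migration thread acts on its now-stale check result.
    if migration_check_passed:
        oplog.append("session-1", 10, "session migration")
    return oplog


if __name__ == "__main__":
    for earlier, later in txn_number_regressions(simulate_race()):
        print(f"optime {later.optime} has txnNumber {later.txn_number}, "
              f"but optime {earlier.optime} already had txnNumber {earlier.txn_number}")
```

      The problem in this model is that the check (step 3) and the write (step 5) are not atomic with respect to other writers on the same session, so the check result can be stale by the time the entry is written.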


      Secondaries can potentially hit this fassert:

      Note: this race is no longer possible in v4.2 because the session is checked out when the session migration thread processes the oplog entries, which rules out this interleaving.
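      As a rough sketch of why holding the session across both the check and the write removes the race, the toy model below uses a `threading.Lock` as a stand-in for session checkout. This is an assumption made for illustration only: `ToySession`, `write`, and `run_with_checkout` are hypothetical names and do not reflect how v4.2 actually implements checkout.

```python
import threading


class ToySession:
    def __init__(self):
        # Stand-in for session checkout: only one holder at a time.
        self.checkout = threading.Lock()
        self.latest_txn = 0
        self.oplog = []  # txnNumbers in the order they were written

    def write(self, txn_number):
        """Check and write in one step; the caller must hold `checkout`."""
        if txn_number < self.latest_txn:
            return False
        self.latest_txn = txn_number
        self.oplog.append(txn_number)
        return True


def run_with_checkout():
    session = ToySession()
    migration_has_session = threading.Event()
    client_started = threading.Event()

    def migration():
        with session.checkout:
            migration_has_session.set()
            # Give the client a chance to attempt its write while the
            # migration thread sits between its check and its write.
            client_started.wait(timeout=1.0)
            session.write(10)

    def client():
        migration_has_session.wait()
        client_started.set()
        with session.checkout:  # blocks until the migration thread is done
            session.write(11)

    threads = [threading.Thread(target=migration), threading.Thread(target=client)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return session.oplog


if __name__ == "__main__":
    print(run_with_checkout())
```

      In this model the write with txnNumber 11 cannot land between the migration thread's check and its write, so the txnNumbers reach the oplog in increasing order.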

      Here are the conditions required to hit this race:

      • running a version older than v4.2
      • using retryable writes with findAndModify
      • migrations happening while retryable writes are in use

            randolph@mongodb.com Randolph Tan