mongodump mishandles txns at start of dump

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: mongodump
    • Tools and Replicator

      Problem Statement/Rationale

      To decide where to start backing up the oplog, mongodump does the following:

      • Grab the oldest active txn's optime.
      • If there's no active txn, grab the oplog's latest optime.

      The problem is that another transaction could start between those two steps. Its first oplog entry would then predate the optime chosen in the second step, so the dumped oplog would lose that transaction.
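
The race can be illustrated with a small in-memory model. This is only a sketch: `ToyServer`, `current_start_optime`, and the integer optimes are hypothetical stand-ins invented for illustration and are not mongodump's actual code or any driver API.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class ToyServer:
    """Hypothetical in-memory stand-in for the source; not a real driver API."""
    oplog: List[int] = field(default_factory=list)        # optimes, oldest first
    active_txns: List[int] = field(default_factory=list)  # first optime of each open txn

    def write(self) -> int:
        optime = len(self.oplog) + 1
        self.oplog.append(optime)
        return optime

    def start_txn(self) -> int:
        optime = self.write()
        self.active_txns.append(optime)
        return optime


def current_start_optime(server: ToyServer, race: Callable[[], None]) -> int:
    # Step 1: oldest active transaction's optime, if any.
    oldest: Optional[int] = min(server.active_txns, default=None)
    race()  # whatever the server does between the two steps
    if oldest is not None:
        return oldest
    # Step 2: no active transaction was seen, so use the latest optime.
    return server.oplog[-1]


server = ToyServer()
server.write()  # unrelated earlier write
txn_first_optime: List[int] = []


def race() -> None:
    txn_first_optime.append(server.start_txn())  # txn begins inside the window
    server.write()                               # a later entry lands as well


start = current_start_optime(server, race)
# Entries of the new transaction that fall before the chosen start point.
missed = [t for t in server.oplog if txn_first_optime[0] <= t < start]
```

In this model the chosen start point lands after the new transaction's first entry, so `missed` is non-empty.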

      Steps to Reproduce

      This requires a transaction that spans multiple oplog entries. The only way we expect that to happen in a replset is if a transaction exceeds 16 MiB. User transactions are rare to begin with, and transactions that large are rarer still. Combined with the small race window, the following scenario is arguably more academic than practical.

      • A transaction is prepared in the window between the two fetches described above.
      • The commit lands after at least one of the transaction's documents has been read from the source.

      Expected Results

      The oplog buffer should contain the full transaction.

      Actual Results

      The oplog buffer currently misses this transaction.

      Expected fix

      Reverse the order of the steps: fetch the oplog's latest optime first, then check the active transactions to see whether we should start earlier. When reading the transactions, set afterClusterTime to the oplog fetch's timestamp.
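
A minimal sketch of the reversed order, using plain callables in place of real server reads. `fixed_start_optime`, `read_latest`, `read_active`, and the integer optimes are hypothetical names for illustration only; a real implementation would read the active transactions with afterClusterTime set to the oplog fetch's timestamp, which this single-process model does not need.

```python
from typing import Callable, List


def fixed_start_optime(
    latest_oplog_optime: Callable[[], int],
    active_txn_optimes: Callable[[], List[int]],
) -> int:
    # Step 1: fetch the oplog's latest optime first.
    latest = latest_oplog_optime()
    # Step 2: read the active transactions. Any transaction that starts
    # after step 1 has an optime greater than `latest`, so it cannot be
    # missed by starting at `latest`.
    txns = active_txn_optimes()
    # Go back earlier only if an open transaction predates `latest`.
    return min([latest, *txns])


oplog = [1, 2, 3]
active = [2]  # an open transaction whose first entry is optime 2


def read_latest() -> int:
    return oplog[-1]


def read_active() -> List[int]:
    # Simulate a new transaction starting between the two reads.
    nxt = oplog[-1] + 1
    oplog.append(nxt)
    active.append(nxt)
    return list(active)


start = fixed_start_optime(read_latest, read_active)
```

Here the start point falls back to the older open transaction, and the transaction that began during the window sits after the start point either way.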

      Note that the server's logic here is more complex: it reads the oplog, then the transactions, then the oplog again. This is because the top of the oplog may be a commit, which the server doesn't tolerate very well. mongodump, though, effectively ignores the commit in that case, so we can use the simpler workflow.

      Additional Notes

      This problem is more likely to occur in mongomirror; see MGOMIRROR-616.

            Assignee:
            Unassigned
            Reporter:
            Felipe Gasper