  1. WiredTiger
  2. WT-13252

Reset txn id of stable update restored with unstable prepared tombstone

    • Storage Engines
    • 5
    • 2024-07-23 - Mining crypto
    • v8.0, v7.0, v6.0, v5.0

      In the BF, we noticed that txn IDs in the update chain are preserved after a node restart. This causes false-positive write conflict errors and eventually leads to an Evergreen timeout.

      Consider the following scenario:
      1) Insert {k1:v1} happens at TS(100) with txnid = 9000.
      2) Remove {k1:v1} happens at TS(300) with txnid = 9001.
      3) An unclean shutdown happens with the lastCheckpointTS at TS(200).
      4) Node restarts with a recoveryTS (stable ts) as TS(200).
      5) The oplog replay phase of startup recovery re-applies the remove op (step #2), which fails with a WT_ROLLBACK error.

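      • Say the remove operation's snapshot has snapshot_min_txn_id = snapshot_max_txn_id = 10, but the earlier update (the step #1 insert op) still has a txn ID of 9000. Because 9000 is not below the snapshot's maximum txn ID, the update is treated as not visible and fails the txn ID visibility check.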

      The expected behavior in the above example is that RTS (rollback to stable) resets the txn ID of the earlier update (the step #1 insert op) to WT_TXN_NONE (0) after the crash. Both the txn ID visibility check and the timestamp check would then pass, allowing the remove operation to succeed without a write conflict error.
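
The scenario above can be sketched as a toy model in Python. This is only an illustration, not WiredTiger's code: the `Update` and `Snapshot` classes and the helpers `txn_id_visible`, `can_modify`, and `reset_stable_txn_ids` are hypothetical names, the snapshot is assumed to have no concurrent transactions, and the timestamp check is reduced to a simple comparison.

```python
from dataclasses import dataclass
from typing import List

WT_TXN_NONE = 0  # value named in the ticket; everything else here is hypothetical


@dataclass
class Update:
    txn_id: int  # transaction ID stored on the update
    ts: int      # commit timestamp of the update


@dataclass
class Snapshot:
    min_txn_id: int
    max_txn_id: int


def txn_id_visible(txn_id: int, snap: Snapshot) -> bool:
    """Simplified visibility: an ID is visible only if it is below the snapshot max.

    Assumes no concurrent transactions between min_txn_id and max_txn_id.
    """
    return txn_id == WT_TXN_NONE or txn_id < snap.max_txn_id


def can_modify(prev: Update, snap: Snapshot, op_ts: int) -> bool:
    """Return False where the real engine would report a write conflict."""
    # Simplified timestamp check: the new op must not be older than prev.
    return txn_id_visible(prev.txn_id, snap) and prev.ts <= op_ts


def reset_stable_txn_ids(chain: List[Update], stable_ts: int) -> None:
    """Model of the expected post-crash behavior: clear txn IDs on stable updates."""
    for upd in chain:
        if upd.ts <= stable_ts:
            upd.txn_id = WT_TXN_NONE


# Step 1: the insert at TS(100) with txn ID 9000 survives the restart unchanged.
insert = Update(txn_id=9000, ts=100)
# After the restart, txn IDs start again from a low value.
snap = Snapshot(min_txn_id=10, max_txn_id=10)

conflict_before = not can_modify(insert, snap, op_ts=300)
reset_stable_txn_ids([insert], stable_ts=200)
conflict_after = not can_modify(insert, snap, op_ts=300)
print(conflict_before, conflict_after)
```

In this model the replayed remove at TS(300) conflicts while the insert keeps txn ID 9000, and stops conflicting once the ID of the stable update has been reset.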

        1. Screenshot 2024-07-11 at 1.24.51 AM.png
          208 kB
          Suganthi Mani
        2. test_bf.py
          4 kB
          Monica Ng

            Assignee:
            monica.ng@mongodb.com Monica Ng
            Reporter:
            suganthi.mani@mongodb.com Suganthi Mani