SERVER-67538 (Core Server)

Multi-doc transactions should fail if on an old, incompatible snapshot

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical - P2
    • Fix Version/s: 5.0.14, 6.0.2, 6.1.0-rc1, 6.2.0-rc0
    • Affects Version/s: 6.0.0, 5.0.10, 6.1.0-rc1
    • Component/s: None
    • None
    • Fully Compatible
    • ALL
    • v6.1
    • Steps To Reproduce:

      Run this noPassthrough jstest.

      /**
       * @tags: [
       *   requires_replication,
       * ]
       */
      (function() {
      'use strict';
      
      const replTest = new ReplSetTest({nodes: 2});
      replTest.startSet();
      replTest.initiate();
      
      const primary = replTest.getPrimary();
      const db = primary.getDB('test');
      
      assert.commandWorked(db['c'].insertOne({_id: 5, num: 0}));
      
      const s0 = db.getMongo().startSession();
      s0.startTransaction();
      assert.commandWorked(s0.getDatabase('test')['c'].deleteOne({_id: 5}));
      s0.commitTransaction();
      
      const clusterTime = s0.getClusterTime().clusterTime;
      
      assert.commandWorked(db['c'].createIndex({'value': 1}));
      
      // Start a transaction whose snapshot predates the completion of the index build, and which reserves
      // an oplog entry (i.e. writes) after the index build commits.
      try {
          const s1 = db.getMongo().startSession();
          s1.startTransaction({readConcern: {level: "snapshot", atClusterTime: clusterTime}});
          s1.getDatabase('test').c.insertOne({_id: 5, num: 1});
      
          // Transaction should have failed.
          assert(0);
      } catch (e) {
          assert(e.hasOwnProperty("errorLabels"), tojson(e));
          assert.contains("TransientTransactionError", e.errorLabels, tojson(e));
          assert.eq(e["code"], ErrorCodes.SnapshotUnavailable, tojson(e));
      }
      
      replTest.stopSet();
      })();
      
    • Sprint: Execution Team 2022-07-25, Execution Team 2022-08-08, Execution Team 2022-08-22, Execution Team 2022-09-05
    • 100

      In the following scenario:

      1. Multi-doc transaction starts, reading from a snapshot @ timestamp (9,1).
      2. (10,1): index build on collection A completes.
      3. (11,1): transaction writes to collection A, which involves updating the index that was built at (10,1).

       
      At step 3 the transaction should fail with the SnapshotUnavailable error code, the TransientTransactionError label, and the message "Unable to read from a snapshot due to pending collection catalog changes; please retry the operation". It currently does not fail.
      This bug opens a race window that can leave the data inconsistent, because the index key for the transaction's write does not get updated.
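
      Because the expected error carries the TransientTransactionError label and asks the caller to retry, a client would normally handle it with a retry loop. The following is a minimal sketch in plain JavaScript, not shell or driver code: runWithRetry, hasTransientLabel, and simulatedTxn are hypothetical names, and the failure is simulated rather than produced by a server. It assumes each retry starts a fresh transaction on a newer snapshot.

```javascript
// Hypothetical helper: true if the error carries the TransientTransactionError label.
function hasTransientLabel(err) {
    return Array.isArray(err.errorLabels) &&
        err.errorLabels.includes('TransientTransactionError');
}

// Hypothetical retry loop: re-runs the transaction body while the failure is transient.
function runWithRetry(txnBody, maxAttempts = 3) {
    for (let attempt = 1;; attempt++) {
        try {
            return txnBody(attempt);
        } catch (err) {
            // Rethrow non-transient errors, or give up once maxAttempts is reached.
            if (!hasTransientLabel(err) || attempt >= maxAttempts) {
                throw err;
            }
        }
    }
}

// Simulated transaction body: the first attempt behaves as if it ran on a stale snapshot.
function simulatedTxn(attempt) {
    if (attempt === 1) {
        const err = new Error(
            'Unable to read from a snapshot due to pending collection catalog changes; ' +
            'please retry the operation');
        err.codeName = 'SnapshotUnavailable';
        err.errorLabels = ['TransientTransactionError'];
        throw err;
    }
    return {committed: true, attempts: attempt};
}

const retryResult = runWithRetry(simulatedTxn);
console.log(JSON.stringify(retryResult));
```

      In this sketch the second attempt succeeds. Against a real deployment, a retry that reuses the same atClusterTime would presumably hit the same stale snapshot, so the retried transaction needs a newer read timestamp.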

      Adding an _indexCatalogEntry->isReady(opCtx) check in _indexKeysOrWriteToSideTable resolves the issue: step 3 then fails with SnapshotUnavailable instead of silently proceeding on an invalid, stale snapshot.
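
      The effect of such a readiness check can be illustrated with a toy model of the scenario's timestamps. This is only a sketch in plain JavaScript, not the server's C++ implementation: ToyIndexEntry, compareTs, and toyIndexWrite are hypothetical names, and the model assumes the check reduces to comparing the transaction's snapshot timestamp with the timestamp at which the index build committed.

```javascript
// Timestamps are modelled as [seconds, increment] pairs, like (9,1) in the scenario.
function compareTs(a, b) {
    return a[0] !== b[0] ? a[0] - b[0] : a[1] - b[1];
}

// Hypothetical stand-in for an index catalog entry; readyAt is the timestamp
// at which the index build committed.
class ToyIndexEntry {
    constructor(readyAt) {
        this.readyAt = readyAt;
    }

    // Toy analogue of an isReady-style check: fail for a snapshot that predates
    // the index build commit instead of answering from a stale view.
    isReady(snapshotTs) {
        if (compareTs(snapshotTs, this.readyAt) < 0) {
            const err = new Error(
                'Unable to read from a snapshot due to pending collection catalog changes; ' +
                'please retry the operation');
            err.codeName = 'SnapshotUnavailable';
            err.errorLabels = ['TransientTransactionError'];
            throw err;
        }
        return true;
    }
}

// Toy write path: the index key is written only after the readiness check passes.
function toyIndexWrite(indexEntry, snapshotTs, key, indexKeys) {
    indexEntry.isReady(snapshotTs);
    indexKeys.push(key);
}

// Scenario from the ticket: snapshot at (9,1), index build commits at (10,1).
const toyIndex = new ToyIndexEntry([10, 1]);
const toyKeys = [];
let staleError = null;
try {
    toyIndexWrite(toyIndex, [9, 1], 5, toyKeys);
} catch (e) {
    staleError = e;
}
console.log(staleError.codeName, toyKeys.length);
```

      With the check in place, the stale transaction is rejected before any index key is written, so the toy index never ends up missing a key for a committed document.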

            Assignee: Josef Ahmad (josef.ahmad@mongodb.com)
            Reporter: Josef Ahmad (josef.ahmad@mongodb.com)
            Votes: 0
            Watchers: 10
