  1. Core Server
  2. SERVER-40898

Transaction table updates may be applied out of order when retryable writes and transactions are in the same secondary batch

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major - P3
    • Fix Version/s: 4.1.12
    • Affects Version/s: None
    • Component/s: Replication
    • Labels:
      None
    • Backwards Compatibility: Fully Compatible
    • Operating System: ALL
    • Steps To Reproduce:
      
      load("jstests/libs/write_concern_util.js");
      
      let rst = new ReplSetTest({nodes: 2});
      rst.startSet();
      rst.initiate();
      
      let primary = rst.getPrimary();
      let secondary = rst.getSecondary();
      
      // Pause replication so ops are applied in a single batch on secondary.
      stopServerReplication(secondary);
      
      const session = primary.getDB("test").getMongo().startSession({causalConsistency: false});
      const sessionDb = session.getDatabase("test");
      const sessionColl = sessionDb["test"];
      
      // Create collection.
      sessionColl.insert({});
      
      var k = 0;
      jsTestLog("Running some retryable writes.");
      for (var i = 0; i < 5; i++) {
          assert.commandWorked(primary.getDB("test").runCommand({
              insert: 'user',
              documents: [{x: k, retryableWrite: 1}],
              lsid: {id: session.getSessionId().id},
              txnNumber: NumberLong(k)
          }));
          k++;
      }
      
      jsTestLog("Running some transactions.");
      for (var i = 0; i < 5; i++) {
          assert.commandWorked(sessionDb.runCommand({
              insert: "test",
              documents: [{x: k}],
              readConcern: {level: "snapshot"},
              txnNumber: NumberLong(k),
              startTransaction: true,
              autocommit: false
          }));
          assert.commandWorked(sessionDb.adminCommand(
              {commitTransaction: 1, writeConcern: {w: 1}, txnNumber: NumberLong(k), autocommit: false}));
          k++;
      }
      
      restartServerReplication(secondary);
      // expect db hash mismatch on 'config.transactions'
      rst.stopSet();
      
      
    • Sprint: Sharding 2019-05-06, Sharding 2019-05-20
    • 0

      When applying operations on a secondary, we may need to update the config.transactions table for either retryable writes or multi-statement transactions. We do this by producing "derived" operations in the SessionUpdateTracker. These ops represent writes to the config.transactions table and are scheduled onto oplog writer threads like normal ops. As we iterate through the operations in a batch and assign them to applier threads, we may defer the transactions table updates for retryable writes by storing them in a map from session IDs to oplog entries. If the session updates are not explicitly flushed while ops are being scheduled in the first call to SyncTail::_fillWriterVectors, those derived ops are scheduled later, after an explicit flush. For multi-statement transaction oplog entries, however, we return the derived ops immediately and schedule them right away.

      As a consequence, the transactions table update for a retryable write may be applied after the transactions table update for a multi-statement transaction, even if the retryable write appears before the transaction in the oplog. We could fix this by deferring the transactions table updates for multi-statement transactions as well, while making sure that the "in-progress" and "committed" writes are kept distinct.
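
      The ordering problem can be illustrated with a small standalone model. This is only a sketch: it is plain JavaScript rather than server code, and every name in it (scheduleSessionTableUpdates, applyInOrder, the "kind" field, the session id "s1") is hypothetical. It assumes, as a simplification, that deferred retryable-write updates are flushed once at the end of the batch and that all updates for one session are applied in the order they were scheduled.

```javascript
// Simplified model of the scheduling described in this ticket.
// All names are hypothetical; this is NOT the server's SessionUpdateTracker.

// Returns the order in which config.transactions updates get scheduled.
function scheduleSessionTableUpdates(batch) {
    const scheduled = [];        // derived ops, in scheduling order
    const deferred = new Map();  // session id -> latest deferred retryable-write update

    for (const entry of batch) {
        const update = {lsid: entry.lsid, txnNumber: entry.txnNumber, kind: entry.kind};
        if (entry.kind === "retryableWrite") {
            // Retryable writes: defer, keeping only the latest update per session.
            deferred.set(entry.lsid, update);
        } else {
            // Multi-statement transactions: derived op is scheduled right away.
            scheduled.push(update);
        }
    }

    // Assumed flush point: deferred updates are scheduled after everything else.
    for (const update of deferred.values()) {
        scheduled.push(update);
    }
    return scheduled;
}

// Applies updates in order; the last write per session wins.
function applyInOrder(updates) {
    const table = new Map();
    for (const u of updates) {
        table.set(u.lsid, u.txnNumber);
    }
    return table;
}

// Same shape as the repro: 5 retryable writes, then 5 transactions, one session.
const batch = [];
for (let k = 0; k < 5; k++) {
    batch.push({lsid: "s1", txnNumber: k, kind: "retryableWrite"});
}
for (let k = 5; k < 10; k++) {
    batch.push({lsid: "s1", txnNumber: k, kind: "transaction"});
}

const primaryTable = applyInOrder(batch);  // strict oplog order
const secondaryTable = applyInOrder(scheduleSessionTableUpdates(batch));

console.log("primary txnNumber:", primaryTable.get("s1"));      // 9
console.log("secondary txnNumber:", secondaryTable.get("s1"));  // 4
```

      In this model the deferred retryable-write update (txnNumber 4) lands after the transaction updates (txnNumbers 5 through 9), so the two tables end up with different values for the same session, which corresponds to the mismatch on 'config.transactions' that the repro script expects.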

            Assignee:
            blake.oler@mongodb.com Blake Oler
            Reporter:
            william.schultz@mongodb.com William Schultz (Inactive)
            Votes:
            0
            Watchers:
            6
