Core Server / SERVER-40898

Transaction table updates may be applied out of order when retryable writes and transactions are in the same secondary batch

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.1.12
    • Component/s: Replication
    • Labels:
      None
    • Backwards Compatibility:
      Fully Compatible
    • Operating System:
      ALL
    • Steps To Reproduce:

       
      load("jstests/libs/write_concern_util.js");
       
      let rst = new ReplSetTest({nodes: 2});
      rst.startSet();
      rst.initiate();
       
      let primary = rst.getPrimary();
      let secondary = rst.getSecondary();
       
      // Pause replication so ops are applied in a single batch on secondary.
      stopServerReplication(secondary);
       
      const session = primary.getDB("test").getMongo().startSession({causalConsistency: false});
      const sessionDb = session.getDatabase("test");
      const sessionColl = sessionDb["test"];
       
      // Create collection.
      sessionColl.insert({});
       
      var k = 0;
      jsTestLog("Running some retryable writes.");
      for (var i = 0; i < 5; i++) {
          assert.commandWorked(primary.getDB("test").runCommand({
              insert: 'user',
              documents: [{x: k, retryableWrite: 1}],
              lsid: {id: session.getSessionId().id},
              txnNumber: NumberLong(k)
          }));
          k++;
      }
       
      jsTestLog("Running some transactions.");
      for (var i = 0; i < 5; i++) {
          assert.commandWorked(sessionDb.runCommand({
              insert: "test",
              documents: [{x: k}],
              readConcern: {level: "snapshot"},
              txnNumber: NumberLong(k),
              startTransaction: true,
              autocommit: false
          }));
          assert.commandWorked(sessionDb.adminCommand(
              {commitTransaction: 1, writeConcern: {w: 1}, txnNumber: NumberLong(k), autocommit: false}));
          k++;
      }
       
      restartServerReplication(secondary);
      // expect db hash mismatch on 'config.transactions'
      rst.stopSet();
      
      

    • Sprint:
      Sharding 2019-05-06, Sharding 2019-05-20
    • Linked BF Score:
      0

      Description

      When applying operations on a secondary, we may need to update the config.transactions table for either retryable writes or multi-statement transactions. We update the transactions table by producing "derived" operations in the SessionUpdateTracker. These ops represent writes to the config.transactions table and are scheduled onto oplog writer threads like normal ops. As we iterate through the operations in a batch and assign them to applier threads, we may defer the transactions table updates for retryable writes by storing them in a map from session ids to oplog entries. If we don't explicitly flush the session updates while scheduling ops inside the first call to SyncTail::_fillWriterVectors, those ops are scheduled later, after an explicit flush. For transactions table updates derived from multi-statement transaction oplog entries, however, we return the derived ops immediately and schedule them right away.

      The consequence is that the transactions table update for a retryable write may be applied after the transactions table update for a multi-statement transaction, even if the retryable write appeared before the multi-statement transaction in the oplog. We may consider fixing this by deferring multi-statement transaction updates to the transactions table as well, while making sure that the "in-progress" and "committed" writes are kept distinct.
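
      The ordering problem described above can be illustrated with a small toy model. This is only a sketch in plain JavaScript (run with Node, not the mongo shell), and it is not the server's SessionUpdateTracker: the function names scheduleBatch and applyUpdates, the deferAll flag, and the op shape {sessionId, txnNumber, kind} are all hypothetical. It assumes each derived op simply overwrites the session's row in the transactions table, and it does not model the "in-progress" versus "committed" distinction.

```javascript
// Toy model only (hypothetical names); not the real SessionUpdateTracker.
// Returns the order in which derived transactions table updates get scheduled.
function scheduleBatch(oplog, deferAll) {
    const scheduled = [];        // derived ops in the order they are scheduled
    const deferred = new Map();  // sessionId -> latest deferred table update

    for (const op of oplog) {
        const update = {sessionId: op.sessionId, txnNumber: op.txnNumber};
        if (op.kind === "retryableWrite" || deferAll) {
            // Deferred: only the most recent update per session is kept.
            deferred.set(op.sessionId, update);
        } else {
            // Transaction entry: derived op is scheduled right away.
            scheduled.push(update);
        }
    }

    // Flush at the end of the batch: deferred updates land after
    // everything that was scheduled immediately.
    for (const update of deferred.values()) {
        scheduled.push(update);
    }
    return scheduled;
}

// Assumption: each update overwrites the session's row, last write wins.
function applyUpdates(updates) {
    const table = new Map();
    for (const u of updates) {
        table.set(u.sessionId, u.txnNumber);
    }
    return table;
}

// Same shape as the repro: 5 retryable writes, then 5 transactions, one session.
const oplog = [];
for (let k = 0; k < 5; k++) {
    oplog.push({sessionId: "s1", txnNumber: k, kind: "retryableWrite"});
}
for (let k = 5; k < 10; k++) {
    oplog.push({sessionId: "s1", txnNumber: k, kind: "transaction"});
}

// Applying in oplog order (what the primary's table reflects).
const primaryTable = applyUpdates(oplog);
// Deferring only retryable writes (the behavior described in this ticket).
const secondaryTable = applyUpdates(scheduleBatch(oplog, false));
// Deferring both kinds of update (the direction suggested above).
const deferredAllTable = applyUpdates(scheduleBatch(oplog, true));

console.log(primaryTable.get("s1"));     // 9
console.log(secondaryTable.get("s1"));   // 4
console.log(deferredAllTable.get("s1")); // 9
```

      In this model the secondary ends with txnNumber 4 for the session while the primary has 9, which is the kind of divergence the repro expects to surface as a db hash mismatch on config.transactions. Deferring both kinds of update restores the oplog order in the toy model.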

              People

              Assignee:
              blake.oler Blake Oler
              Reporter:
              william.schultz William Schultz (Inactive)