Core Server / SERVER-39074

Update outside transaction erasing update from committed prepared transaction

    • Fully Compatible
    • ALL

      Run this concurrency workload in concurrency_sharded_replication.yml (may need to be repeated):

      'use strict';
      
      load('jstests/concurrency/fsm_libs/extend_workload.js');  // for extendWorkload
      load('jstests/concurrency/fsm_workloads/multi_statement_transaction_simple.js');  // for $config
      
      var $config = extendWorkload($config, function($config, $super) {
      
          $config.threadCount = 20;
          $config.iterations = 100;
      
          $config.data.counter = 0;
      
          /**
           * Updates a document that may be written to by the transaction run in the base workload.
           */
          $config.states.updateTxnDoc = function updateTxnDoc(db, collName) {
              this.counter += 1;
      
              // Choose a random document that may be written to by the base workload. The base collection
              // contains documents with _id ranging from 0 to the number of accounts.
              const transactionDocId = Random.randInt(this.numAccounts);
              const threadUniqueField = 'thread' + this.tid;
              assertWhenOwnColl.writeOK(db[collName].update({_id: transactionDocId},
                                                            {$set: {[threadUniqueField]: this.counter}}));
          };
      
          $config.transitions = {
              init: {transferMoney: 1},
              transferMoney: {transferMoney: 0.5, checkMoneyBalance: 0.1, updateTxnDoc: 0.4},
              checkMoneyBalance: {transferMoney: 0.5, updateTxnDoc: 0.5},
              updateTxnDoc: {transferMoney: 0.5, updateTxnDoc: 0.5},
          };
      
          return $config;
      });
    • Repl 2019-02-11, Repl 2019-02-25, Storage NYC 2019-03-25

      In the attached concurrency workload, multi-statement transactions simulate transferring money between accounts by subtracting some amount from one document's "balance" field and adding it to another document's "balance" field. The workload periodically asserts that the sum of all balances never changes, using a collection scan in a transaction with snapshot read concern.
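
To make the invariant concrete, here is a minimal in-memory sketch of what the transfers are expected to preserve. It does not use MongoDB or the FSM libraries; the helper names (`makeAccounts`, `transferMoney`, `totalBalance`) and the starting balance of 100 are assumptions chosen for illustration and are not taken from the base workload.

```javascript
// Hypothetical in-memory model of the invariant; not the real workload code.

// Build documents shaped like the workload's accounts: {_id, balance}.
function makeAccounts(numAccounts, initialBalance) {
    const accounts = [];
    for (let i = 0; i < numAccounts; i++) {
        accounts.push({_id: i, balance: initialBalance});
    }
    return accounts;
}

// Move `amount` from one account to another. In the real workload both
// writes happen inside one multi-statement transaction.
function transferMoney(accounts, fromId, toId, amount) {
    accounts[fromId].balance -= amount;
    accounts[toId].balance += amount;
}

// Equivalent of the collection scan that sums every balance.
function totalBalance(accounts) {
    return accounts.reduce((sum, doc) => sum + doc.balance, 0);
}

const accounts = makeAccounts(10, 100);
const expectedTotal = totalBalance(accounts);

transferMoney(accounts, 3, 7, 25);
transferMoney(accounts, 7, 0, 40);

// Transfers only move money around, so the total must be unchanged.
const totalAfterTransfers = totalBalance(accounts);
```

If both writes of every transfer are applied, `totalAfterTransfers` equals `expectedTotal`; the reported failure corresponds to this equality not holding.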

      The workload also concurrently updates other fields in these documents outside of a transaction. When run against a sharded cluster, it fails because the sum of all balances does change, implying that one of the updates from a committed transaction was lost. Every thread observes the unexpected sum, so this doesn't seem to be an issue with reading at the wrong timestamp. The non-transaction updates also don't modify the "balance" field, and they use $set rather than replacement updates.
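
The distinction between $set and a replacement update matters for the reasoning above, so here is a small sketch of it. It models the two update styles on plain objects; `applySet` and `applyReplacement` are hypothetical helpers written for this example, not MongoDB functions, and they only approximate the server's semantics for top-level fields.

```javascript
// Hypothetical helpers modeling update semantics on plain objects.

// Models {$set: fields}: named fields are written, all others are kept.
function applySet(doc, fields) {
    return Object.assign({}, doc, fields);
}

// Models a replacement update: everything except _id is replaced.
function applyReplacement(doc, replacement) {
    return Object.assign({_id: doc._id}, replacement);
}

// A document whose balance was already changed by a committed transfer.
const committedDoc = {_id: 4, balance: 75};

// What the workload's updateTxnDoc state does: $set on a per-thread field.
const afterSet = applySet(committedDoc, {thread3: 1});

// What the workload does NOT do: replace the whole document.
const afterReplacement = applyReplacement(committedDoc, {thread3: 1});
```

Here `afterSet` still carries `balance: 75`, while `afterReplacement` has no `balance` field at all. Since the workload only uses the first form, a correct $set should never be able to change the sum of balances on its own.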

      This failure goes away if updates set ignore_prepare=false (it is currently set to true here), but according to the replication team, this should not be required for local updates. If that's true, then it's possible the non-transaction updates are somehow overwriting the updates of committed prepared transactions.
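
As a purely illustrative model of that hypothesis, the sketch below shows how the observed symptom could arise if a non-transaction update built its result on a version of the document from before the prepared transaction's write. This is an assumption made for the example, not a statement about the storage engine's actual behavior or the root cause; the `store` map, `sumBalances`, and the field name `thread5` are all hypothetical.

```javascript
// Hypothetical model of a lost update; not a description of the real bug.
const store = new Map([
    [0, {_id: 0, balance: 100}],
    [1, {_id: 1, balance: 100}],
]);

function sumBalances(docs) {
    let sum = 0;
    for (const doc of docs.values()) {
        sum += doc.balance;
    }
    return sum;
}

// Step 1: the non-transaction update picks its base version of document 0
// before the transaction's write is visible to it.
const staleBase = Object.assign({}, store.get(0));

// Step 2: the transaction commits a transfer of 30 from account 0 to 1.
store.set(0, Object.assign({}, store.get(0), {balance: 70}));
store.set(1, Object.assign({}, store.get(1), {balance: 130}));
const sumAfterCommit = sumBalances(store);  // 200

// Step 3: the $set is applied on top of the stale base and written back,
// silently restoring the pre-transaction balance of account 0.
store.set(0, Object.assign({}, staleBase, {thread5: 1}));
const sumAfterLostUpdate = sumBalances(store);  // 230
```

In this model the debit from account 0 is erased while the credit to account 1 survives, so every later reader sees the same wrong total. That matches the report that all threads observe the unexpected sum, although other explanations are not ruled out by this sketch.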

      Example failure in evergreen.
