  1. Core Server
  2. SERVER-56839

Index seeks concurrent with recently-committed prepared transactions can return wrong results

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major - P3
    • Fix Version/s: 4.4.7, 5.0.0-rc2, 4.2.16, 5.1.0-rc0
    • Affects Version/s: 4.2.0, 4.4.0, 5.0.0
    • Component/s: None
    • None
    • Fully Compatible
    • ALL
    • v5.0, v4.4, v4.2

      Create this failpoint:

      diff --git a/src/mongo/db/storage/wiredtiger/wiredtiger_index.cpp b/src/mongo/db/storage/wiredtiger/wiredtiger_index.cpp
      index f91a65eb05..b91ce66b6f 100644
      --- a/src/mongo/db/storage/wiredtiger/wiredtiger_index.cpp
      +++ b/src/mongo/db/storage/wiredtiger/wiredtiger_index.cpp
      @@ -82,6 +82,7 @@ namespace {
      
       MONGO_FAIL_POINT_DEFINE(WTCompactIndexEBUSY);
       MONGO_FAIL_POINT_DEFINE(WTEmulateOutOfOrderNextIndexKey);
      +MONGO_FAIL_POINT_DEFINE(WTIndexPauseAfterSearchNear);
      
       using std::string;
       using std::vector;
      @@ -1050,6 +1051,16 @@ protected:
      
               LOGV2_TRACE_CURSOR(20089, "cmp: {cmp}", "cmp"_attr = cmp);
      
      +        WTIndexPauseAfterSearchNear.executeIf(
      +            [](const BSONObj&) {
      +                LOGV2(0, "hanging in search_near");
      +                WTIndexPauseAfterSearchNear.pauseWhileSet();
      +            },
      +            [&](const BSONObj& data) {
      +                LOGV2(0, "indexName", "name"_attr = data["indexName"].str());
      +                return data["indexName"].str() == _idx.indexName();
      +            });
      +
      

      And run this test:

      /**
       * Reproduces a bug where searching for a key returns an adjacent key in a recently-committed
       * prepared transaction.
       *
       * Create an index with a single key, "a". Insert a new key for "b" in a prepared transaction. This
       * creates a prepared, but uncommitted index entry before the key we want to search for, "c", which
       * doesn't exist. We will query (search_near internally) for "c" and the cursor will initially land
       * on "a". This is less than the key we're searching for, so the cursor is advanced to the next key,
       * expecting to land on something greater than or equal to "c". Before this happens, the prepared
       * transaction commits, making "b" visible. As a result, the cursor lands on and returns "b" even
       * though we queried for "c".
       */
      (function() {
      "use strict";
      load("jstests/core/txns/libs/prepare_helpers.js");
      load("jstests/libs/fail_point_util.js");
      load('jstests/libs/parallel_shell_helpers.js');
      
      const replTest = new ReplSetTest({nodes: 1});
      replTest.startSet();
      replTest.initiate();
      
      const primary = replTest.getPrimary();
      const dbName = 'test';
      const collName = 'coll';
      
      const db = primary.getDB(dbName);
      assert.commandWorked(db[collName].createIndex({x: 1}));
      assert.commandWorked(db[collName].insert({x: 'a'}));
      
      const session = primary.startSession({causalConsistency: false});
      const sessionDB = session.getDatabase(dbName);
      const sessionColl = sessionDB.getCollection(collName);
      session.startTransaction();
      assert.commandWorked(sessionColl.insert({x: 'b'}));
      let prepareTimestamp = PrepareHelpers.prepareTransaction(session);
      
      let failpoint = configureFailPoint(primary, "WTIndexPauseAfterSearchNear", {indexName: 'x_1'});
      
      // After the query on 'c' starts, we commit the transaction and advance the cursor, landing on 'b'
      // instead of returning nothing.
      const awaitShell = startParallelShell(function() {
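          // findOne() returns null when nothing matches. On affected builds this assertion fails
          // (or the server crashes in debug builds) because the document for 'b' is returned.
          assert.eq(null, db.coll.findOne({x: 'c'}));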
      }, primary.port);
      
      failpoint.wait();
      assert.commandWorked(PrepareHelpers.commitTransaction(session, prepareTimestamp));
      failpoint.off();
      awaitShell();
      
      replTest.stopSet();
      })();
      
    • Sprint: Execution Team 2021-05-31
    • 50

      Read-only queries that perform index scans have the potential to return the wrong keys if they scan near recently-committed prepared transactions.

      The bug is as follows, in this order:

      • An index has a key for "a" and it is visible to all operations.
      • A key for "b" is not visible because it is prepared and uncommitted by a transaction.
      • Another operation queries for a key, "c". This operation calls search_near() for a key starting with "c". Because no visible key sorts at or after "c", and because read-only queries ignore prepared updates, the cursor lands on the adjacent key, "a".
      • The prepared transaction involving "b" commits.
      • Because "a" compares less than "c", we call next() on the cursor to uphold our contract of returning a key that compares greater than or equal to "c" (this holds for the actual key we would store for "c"). Because the transaction committed concurrently, next() lands on "b", which is now visible and sorts after "a". We assume that next() always lands on a key after "c", but that assumption does not hold here, so we return keys other than the ones we searched for.

      This does not affect write operations since they enforce and block on prepare conflicts. This only affects read-only queries.
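
      The sequence above can be modeled with a small in-memory sketch. Everything in it is hypothetical: ToyIndex, seek, runScenario, and the recheck option are illustrative names, not WiredTiger or server APIs, and searchNear here is a simplification of search_near() that prefers the smallest visible key at or after the target and otherwise falls back to the largest visible key before it. The recheck loop is one conceivable guard for illustration only, not necessarily the change made for this ticket.

```javascript
// Toy model of an ordered index whose entries may be prepared but uncommitted.
// All names are hypothetical; this does not mirror the real storage engine API.
class ToyIndex {
    constructor() {
        this.entries = [];  // {key, prepared}, kept sorted by key
    }

    insert(key, prepared = false) {
        this.entries.push({key, prepared});
        this.entries.sort((l, r) => (l.key < r.key ? -1 : l.key > r.key ? 1 : 0));
    }

    commit(key) {
        this.entries.find(e => e.key === key).prepared = false;
    }

    // A read-only query ignores prepared updates, so it only sees committed keys.
    visibleKeys() {
        return this.entries.filter(e => !e.prepared).map(e => e.key);
    }

    // Simplified search_near: smallest visible key >= target, else the largest
    // visible key < target, else null for an empty index.
    searchNear(target) {
        const keys = this.visibleKeys();
        const atOrAfter = keys.find(k => k >= target);
        if (atOrAfter !== undefined) {
            return atOrAfter;
        }
        return keys.length > 0 ? keys[keys.length - 1] : null;
    }

    // Next visible key strictly after `current`, or null at the end of the index.
    next(current) {
        const k = this.visibleKeys().find(k => k > current);
        return k === undefined ? null : k;
    }
}

// Seek to the first key >= target. `betweenSearchAndNext` stands in for the
// failpoint pause; `recheck` enables the hypothetical guard.
function seek(index, target, {betweenSearchAndNext = () => {}, recheck = false} = {}) {
    let key = index.searchNear(target);
    betweenSearchAndNext();
    if (key !== null && key < target) {
        key = index.next(key);  // assumed to land at or after target
        while (recheck && key !== null && key < target) {
            key = index.next(key);  // keep advancing while still before target
        }
    }
    return key;
}

function runScenario(recheck) {
    const index = new ToyIndex();
    index.insert("a");        // committed and visible
    index.insert("b", true);  // prepared, not yet visible
    return seek(index, "c", {betweenSearchAndNext: () => index.commit("b"), recheck});
}

console.log(runScenario(false));  // "b": the wrong key is returned
console.log(runScenario(true));   // null: no key at or after "c" exists
```

      Without the re-check, the seek returns "b" even though the target was "c", which matches the behavior described above. With the re-check, the seek reports that no matching key exists.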

            Assignee:
            louis.williams@mongodb.com Louis Williams
            Reporter:
            louis.williams@mongodb.com Louis Williams
            Votes:
            0
            Watchers:
            8
