Core Server / SERVER-45060

Operations can use a Collection without having a storage snapshot where that collection is visible

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Major - P3
    • Affects Version/s: 4.3.2
    • Component/s: None
    • ALL

      Using this failpoint:

      diff --git a/src/mongo/db/catalog/uncommitted_collections.cpp b/src/mongo/db/catalog/uncommitted_collections.cpp
      index db9a6e3c97..b48b88e9d9 100644
      --- a/src/mongo/db/catalog/uncommitted_collections.cpp
      +++ b/src/mongo/db/catalog/uncommitted_collections.cpp
      @@ -34,9 +34,14 @@
       #include "mongo/db/catalog/collection_catalog.h"
       #include "mongo/db/catalog/uncommitted_collections.h"
       #include "mongo/db/storage/durable_catalog.h"
      +#include "mongo/logv2/log.h"
       #include "mongo/util/assert_util.h"
      +#include "mongo/util/fail_point.h"
       
       namespace mongo {
      +
      +MONGO_FAIL_POINT_DEFINE(hangAfterRegisteringCollection);
      +
       namespace {
       const auto getUncommittedCollections =
           OperationContext::declareDecoration<UncommittedCollections>();
      @@ -69,9 +74,20 @@ void UncommittedCollections::addToTxn(OperationContext* opCtx,
       
       
           opCtx->recoveryUnit()->registerPreCommitHook(
      -        [collListUnowned, uuid, createTime](OperationContext* opCtx) {
      +        [collListUnowned, nss, uuid, createTime](OperationContext* opCtx) {
                   UncommittedCollections::commit(opCtx, uuid, createTime, collListUnowned.lock().get());
      +
      +            hangAfterRegisteringCollection.executeIf(
      +                [&](const BSONObj& data) {
      +                    LOGV2(46156012, "hanging after registering collection.", "nss"_attr = nss);
      +                    hangAfterRegisteringCollection.pauseWhileSet(opCtx);
      +                },
      +                [&](const BSONObj& data) {
      +                    auto collElem = data["collection"];
      +                    return !collElem || collElem.str() == nss.ns();
      +                });
               });
      +
           opCtx->recoveryUnit()->onCommit(
               [collListUnowned, collPtr, createTime](boost::optional<Timestamp> commitTs) {
                   // Verify that the collection was given a minVisibleTimestamp equal to the transactions
      

      This no_passthrough test fails:

      (function() {
      "use strict";
      
      load("jstests/libs/fail_point_util.js");
      
      const replSet = new ReplSetTest({nodes: 1});
      replSet.startSet();
      replSet.initiate();
      
      const primary = replSet.getPrimary();
      const primaryDB = primary.getDB("test");
      const primaryColl = primaryDB.getCollection("coll");
      
      // Set failpoint
      let failPoint =
          configureFailPoint(primaryDB, "hangAfterRegisteringCollection", {collection: "test.coll"});
      
      // Implicitly create collection. This will hang on the failpoint.
      let awaitCreate = startParallelShell(function() {
          assert.commandWorked(db.getMongo().getCollection('test.coll').insert({a: 1}));
      }, primary.port);
      
      // Wait for failpoint to hit.
      failPoint.wait();
      
      // Should fail
      primaryColl.createIndex({a: 1});
      
      failPoint.off();
      
      awaitCreate();
      replSet.stopSet();
      })();
      
    • Execution Team 2020-01-13, Execution Team 2020-01-27, Execution Team 2020-02-10, Execution Team 2020-02-24, Execution Team 2020-03-09
    • 20

      After SERVER-43859, collection creation only takes a MODE_IX lock. The registration of the Collection object in the CollectionCatalog is no longer atomic with the commit of the storage transaction, which was previously protected by a MODE_X lock.

      We use a pre-commit hook to register collections in the CollectionCatalog before committing the WriteUnitOfWork (WUOW). The collection only becomes visible in the durable catalog once the WUOW commits.

      There is now a window of time where the collection can be registered in the CollectionCatalog, but not visible to any storage snapshots, even those reading without a timestamp (so minVisibleSnapshot does not help). This causes certain debug invariants to fail when confirming that the in-memory IndexCatalog is consistent with the DurableCatalog. See here for example.
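
The ordering described above can be illustrated with a small toy model in plain JavaScript. This is only a sketch under stated assumptions: `ToyWriteUnitOfWork`, `inMemoryCatalog`, `durableCatalog`, and `isConsistent` are hypothetical names invented for illustration and are not MongoDB server or shell APIs. The model splits a commit into "run pre-commit hooks" and "commit storage" so that the state between the two steps can be observed.

```javascript
// Toy model only: none of these names are real MongoDB APIs.
class ToyWriteUnitOfWork {
    constructor(durableCatalog) {
        this.durableCatalog = durableCatalog;
        this.pendingNamespaces = [];  // collections created in this unit of work
        this.preCommitHooks = [];
    }

    createCollection(ns) {
        this.pendingNamespaces.push(ns);
    }

    registerPreCommitHook(fn) {
        this.preCommitHooks.push(fn);
    }

    // The commit is split into two steps so the caller can inspect
    // the state in between them.
    runPreCommitHooks() {
        this.preCommitHooks.forEach((fn) => fn());
    }

    commitStorage() {
        this.pendingNamespaces.forEach((ns) => this.durableCatalog.add(ns));
    }
}

const inMemoryCatalog = new Set();  // stands in for the CollectionCatalog
const durableCatalog = new Set();   // stands in for what a storage snapshot can see

// Loosely mirrors the debug check: anything registered in memory
// should also be visible in storage.
function isConsistent(ns) {
    return !inMemoryCatalog.has(ns) || durableCatalog.has(ns);
}

const wuow = new ToyWriteUnitOfWork(durableCatalog);
wuow.createCollection("test.coll");
wuow.registerPreCommitHook(() => inMemoryCatalog.add("test.coll"));

const before = isConsistent("test.coll");
wuow.runPreCommitHooks();  // collection is now registered in memory
const duringWindow = isConsistent("test.coll");
wuow.commitStorage();      // collection is now visible in storage
const afterCommit = isConsistent("test.coll");

console.log({before, duringWindow, afterCommit});
```

In this model the check holds before the hooks run and after the storage commit, but not in between. That in-between state is the one the failpoint in the reproduction holds open while createIndex runs.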

            Assignee:
            maria.vankeulen@mongodb.com Maria van Keulen
            Reporter:
            louis.williams@mongodb.com Louis Williams
            Votes:
            0
            Watchers:
            5
