  1. Core Server
  2. SERVER-26179

Do not join the TaskRunner within a runner task in CollectionBulkLoaderImpl::init


    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.3.15
    • Component/s: Replication
    • Labels:
      None
    • Backwards Compatibility:
      Fully Compatible
    • Operating System:
      ALL
    • Steps To Reproduce:

      Apply the patch below so that the cloner attempts to build an index that the IndexCatalog reports as already existing because buildIndexes=false, then run the test:

      python buildscripts/resmoke.py --executor=replica_sets jstests/replsets/buildindexes.js
      

      diff --git a/jstests/replsets/buildindexes.js b/jstests/replsets/buildindexes.js
      index f6a8a78..16abd10 100644
      --- a/jstests/replsets/buildindexes.js
      +++ b/jstests/replsets/buildindexes.js
      @@ -5,15 +5,22 @@
           var name = "buildIndexes";
           var host = getHostName();
       
      -    var replTest = new ReplSetTest({name: name, nodes: 3});
      +    var replTest = new ReplSetTest({name: name, nodes: 2});
       
      -    var nodes = replTest.startSet();
      +    replTest.startSet();
      +    replTest.initiate();
      +
      +    // Create an index before having the secondary start its initial sync to verify that the
      +    // 'buildIndexes=false' mode causes index builds to be ignored during the cloning process.
      +    assert.commandWorked(replTest.getPrimary().getDB("test").mycoll.createIndex({field: 1}));
      +
      +    replTest.add({});
       
           var config = replTest.getReplSetConfig();
           config.members[2].priority = 0;
           config.members[2].buildIndexes = false;
      -
      -    replTest.initiate(config);
      +    config.version = 2;
      +    assert.commandWorked(replTest.getPrimary().adminCommand({replSetReconfig: config}));
       
           var master = replTest.getPrimary().getDB(name);
           var slaveConns = replTest.liveNodes.slaves;
      

    • Sprint:
      Repl 2016-10-10

      Description

      The CollectionBulkLoaderImpl was created inside a TaskRunner task, and init() was called within that same task. When init() failed, the loader was not returned to the caller, so its destructor ran inside the task. The destructor waits for the TaskRunner to join, which can never happen because the wait is issued from within the runner's own active task.
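
      The self-join described above can be illustrated with a minimal sketch. MiniRunner and Loader are hypothetical stand-ins written for this example, not the real TaskRunner or CollectionBulkLoaderImpl. So that the example terminates, join() here returns false where a real blocking join would hang, and the second function shows one possible way to avoid the problem (moving ownership out of the task), which is not necessarily the fix that was applied.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <memory>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical stand-in for a task runner: one task at a time on its own thread.
class MiniRunner {
public:
    ~MiniRunner() { join(); }

    // Marks the runner active and runs `task` on a worker thread.
    void schedule(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> lk(_mutex);
            assert(!_active && !_worker.joinable());
            _active = true;
        }
        _worker = std::thread([this, task = std::move(task)] {
            {
                std::lock_guard<std::mutex> lk(_mutex);
                _workerId = std::this_thread::get_id();
            }
            task();
            {
                std::lock_guard<std::mutex> lk(_mutex);
                _active = false;
            }
            _cv.notify_all();
        });
    }

    // Waits until the runner is idle. Returns false instead of blocking when
    // called from the runner's own active task, since that wait could never end.
    bool join() {
        std::unique_lock<std::mutex> lk(_mutex);
        if (_active && _workerId == std::this_thread::get_id()) {
            return false;  // a blocking join would deadlock here
        }
        _cv.wait(lk, [this] { return !_active; });
        lk.unlock();
        if (_worker.joinable()) {
            _worker.join();
        }
        return true;
    }

private:
    std::mutex _mutex;
    std::condition_variable _cv;
    bool _active = false;
    std::thread::id _workerId;
    std::thread _worker;
};

// Hypothetical loader whose destructor joins the runner it was given.
class Loader {
public:
    Loader(MiniRunner* runner, bool* joined) : _runner(runner), _joined(joined) {}
    ~Loader() { *_joined = _runner->join(); }

private:
    MiniRunner* _runner;
    bool* _joined;
};

// Problem shape: init "fails" and the loader is destroyed inside the task.
bool destroyInsideTask() {
    MiniRunner runner;
    bool joined = true;
    runner.schedule([&] {
        auto loader = std::make_unique<Loader>(&runner, &joined);
        // Pretend init() failed: `loader` goes out of scope inside the task.
    });
    runner.join();
    return joined;
}

// Alternative shape: ownership leaves the task, destruction happens on the caller.
bool destroyOutsideTask() {
    MiniRunner runner;
    bool joined = false;
    std::unique_ptr<Loader> loader;
    runner.schedule([&] { loader = std::make_unique<Loader>(&runner, &joined); });
    runner.join();   // the task has finished, so the runner is idle
    loader.reset();  // destructor joins from the caller thread
    return joined;
}
```

      In this sketch destroyInsideTask() reports that the join could not complete, while destroyOutsideTask() reports a clean join because the destructor runs only after the task has finished.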

      Original description

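      The CollectionBulkLoaderImpl is constructed with an already-active TaskRunner because it is given the same TaskRunner instance that StorageInterfaceImpl::createCollectionForBulkLoading() created. If CollectionBulkLoaderImpl::init() returns an error status, CollectionBulkLoaderImpl::commit() is never called, so the CollectionBulkLoaderImpl never calls TaskRunner::runSynchronousTask() to set TaskRunner::_active back to false. In the stack traces below, Thread 3 is blocked in TaskRunner::join() inside the loader's destructor, and Thread 9 is blocked in TaskRunner::runSynchronousTask() called from createCollectionForBulkLoading().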

      Thread 3 (Thread 0x7f3dacc7f700 (LWP 28390)):
      #0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
      #1  0x00007f3dc7f8815c in __gthread_cond_wait (__mutex=<optimized out>, __cond=<optimized out>) at /data/mci/toolchain-builder/build-gcc-v2.sh-cAV/x86_64-mongodb-linux/libstdc++-v3/include/x86_64-mongodb-linux/bits/gthr-default.h:864
      #2  std::condition_variable::wait (this=<optimized out>, __lock=...) at ../../../../../gcc-5.3.0/libstdc++-v3/src/c++11/condition_variable.cc:53
      #3  0x00007f3dc6d17c6b in wait<(lambda at src/mongo/db/repl/task_runner.cpp:130:25)> (this=0x7f3dccca3938, __lock=..., __p=...) at /opt/mongodbtoolchain/v2/include/c++/5.3.0/condition_variable:98
      #4  mongo::repl::TaskRunner::join (this=0x7f3dccca3900) at src/mongo/db/repl/task_runner.cpp:130
      #5  0x00007f3dc6d59498 in mongo::repl::CollectionBulkLoaderImpl::~CollectionBulkLoaderImpl (this=0x7f3dccca3a80) at src/mongo/db/repl/collection_bulk_loader_impl.cpp:81
      #6  0x00007f3dc6d59cbe in mongo::repl::CollectionBulkLoaderImpl::~CollectionBulkLoaderImpl (this=0x7f3dccca3a80) at src/mongo/db/repl/collection_bulk_loader_impl.cpp:80
      #7  0x00007f3dc6d57af5 in operator() (__ptr=0x7f3dccca393c, this=<optimized out>) at /opt/mongodbtoolchain/v2/include/c++/5.3.0/bits/unique_ptr.h:76
      #8  ~unique_ptr (this=<optimized out>) at /opt/mongodbtoolchain/v2/include/c++/5.3.0/bits/unique_ptr.h:236
      #9  operator() (this=<optimized out>, txn=0x7f3dccb3f2c0) at src/mongo/db/repl/storage_interface_impl.cpp:297
      #10 std::_Function_handler<mongo::Status (mongo::OperationContext*), mongo::repl::StorageInterfaceImpl::createCollectionForBulkLoading(mongo::NamespaceString const&, mongo::CollectionOptions const&, mongo::BSONObj, std::vector<mongo::BSONObj, std::allocator<mongo::BSONObj> > const&)::$_0>::_M_invoke(std::_Any_data const&, mongo::OperationContext*&&) (__functor=..., __args=<optimized out>) at /opt/mongodbtoolchain/v2/include/c++/5.3.0/functional:1856
      #11 0x00007f3dc6d192aa in operator() (this=0x80, __args=<optimized out>) at /opt/mongodbtoolchain/v2/include/c++/5.3.0/functional:2271
      #12 operator() (this=0x7f3dc95d6bf0, txn=<optimized out>, taskStatus=...) at src/mongo/db/repl/task_runner.cpp:230
      #13 std::_Function_handler<mongo::repl::TaskRunner::NextAction (mongo::OperationContext*, mongo::Status const&), mongo::repl::TaskRunner::runSynchronousTask(std::function<mongo::Status (mongo::OperationContext*)>, mongo::repl::TaskRunner::NextAction)::$_4>::_M_invoke(std::_Any_data const&, mongo::OperationContext*&&, mongo::Status const&) (__functor=..., __args=..., __args=...) at /opt/mongodbtoolchain/v2/include/c++/5.3.0/functional:1856
      #14 0x00007f3dc6d189ed in operator() (this=0x7f3dccca393c, __args=..., __args=...) at /opt/mongodbtoolchain/v2/include/c++/5.3.0/functional:2271
      #15 mongo::repl::(anonymous namespace)::runSingleTask(std::function<mongo::repl::TaskRunner::NextAction (mongo::OperationContext*, mongo::Status const&)> const&, mongo::OperationContext*, mongo::Status const&) (task=..., txn=<optimized out>, status=...) at src/mongo/db/repl/task_runner.cpp:66
      #16 0x00007f3dc6d182c0 in mongo::repl::TaskRunner::_runTasks (this=0x7f3dccca3900) at src/mongo/db/repl/task_runner.cpp:151
      #17 0x00007f3dc611d8ad in operator() (this=<optimized out>) at /opt/mongodbtoolchain/v2/include/c++/5.3.0/functional:2271
      #18 mongo::ThreadPool::_doOneTask (this=0x7f3dcd065dc0, lk=0x7f3dacc7e6d0) at src/mongo/util/concurrency/thread_pool.cpp:326
      #19 0x00007f3dc611ebcd in mongo::ThreadPool::_consumeTasks (this=0x7f3dcd065dc0) at src/mongo/util/concurrency/thread_pool.cpp:278
      #20 0x00007f3dc611e5d5 in mongo::ThreadPool::_workerThreadBody (pool=0x7f3dcd065dc0, threadName=...) at src/mongo/util/concurrency/thread_pool.cpp:228
      #21 0x00007f3dc69162d0 in std::(anonymous namespace)::execute_native_thread_routine (__p=<optimized out>) at ../../../../../gcc-5.3.0/libstdc++-v3/src/c++11/thread.cc:84
      #22 0x00007f3dc302e184 in start_thread (arg=0x7f3dacc7f700) at pthread_create.c:312
      #23 0x00007f3dc2d5b37d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
      ...
      Thread 9 (Thread 0x7f3daec83700 (LWP 28354)):
      #0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
      #1  0x00007f3dc7f8815c in __gthread_cond_wait (__mutex=<optimized out>, __cond=<optimized out>) at /data/mci/toolchain-builder/build-gcc-v2.sh-cAV/x86_64-mongodb-linux/libstdc++-v3/include/x86_64-mongodb-linux/bits/gthr-default.h:864
      #2  std::condition_variable::wait (this=<optimized out>, __lock=...) at ../../../../../gcc-5.3.0/libstdc++-v3/src/c++11/condition_variable.cc:53
      #3  0x00007f3dc6d18f7b in wait<(lambda at src/mongo/db/repl/task_runner.cpp:250:31)> (this=0x100000000, __lock=..., __p=...) at /opt/mongodbtoolchain/v2/include/c++/5.3.0/condition_variable:98
      #4  mongo::repl::TaskRunner::runSynchronousTask(std::function<mongo::Status (mongo::OperationContext*)>, mongo::repl::TaskRunner::NextAction) (this=<optimized out>, func=..., nextAction=<optimized out>) at src/mongo/db/repl/task_runner.cpp:250
      #5  0x00007f3dc6d54454 in mongo::repl::StorageInterfaceImpl::createCollectionForBulkLoading (this=<optimized out>, nss=..., options=..., idIndexSpec=..., secondaryIndexSpecs=...) at src/mongo/db/repl/storage_interface_impl.cpp:255
      #6  0x00007f3dc7073bba in mongo::repl::CollectionCloner::_beginCollectionCallback (this=0x7f3dccb92690, cbd=...) at src/mongo/db/repl/collection_cloner.cpp:341
      #7  0x00007f3dc7074bec in operator() (this=<optimized out>, __args=...) at /opt/mongodbtoolchain/v2/include/c++/5.3.0/functional:2271
      #8  operator() (this=<optimized out>, txn=<optimized out>, status=...) at src/mongo/db/repl/collection_cloner.cpp:118
      #9  std::_Function_handler<mongo::repl::TaskRunner::NextAction (mongo::OperationContext*, mongo::Status const&), mongo::repl::CollectionCloner::CollectionCloner(mongo::executor::TaskExecutor*, mongo::OldThreadPool*, mongo::HostAndPort const&, mongo::NamespaceString const&, mongo::CollectionOptions const&, std::function<void (mongo::Status const&)> const&, mongo::repl::StorageInterface*)::$_0::operator()(std::function<void (mongo::executor::TaskExecutor::CallbackArgs const&)> const&) const::{lambda(mongo::OperationContext*, mongo::Status const&)#1}>::_M_invoke(std::_Any_data const&, mongo::OperationContext*&&, mongo::Status const&) (__functor=..., __args=..., __args=...) at /opt/mongodbtoolchain/v2/include/c++/5.3.0/functional:1856
      #10 0x00007f3dc6d189ed in operator() (this=0x7f3daec82094, __args=..., __args=...) at /opt/mongodbtoolchain/v2/include/c++/5.3.0/functional:2271
      #11 mongo::repl::(anonymous namespace)::runSingleTask(std::function<mongo::repl::TaskRunner::NextAction (mongo::OperationContext*, mongo::Status const&)> const&, mongo::OperationContext*, mongo::Status const&) (task=..., txn=<optimized out>, status=...) at src/mongo/db/repl/task_runner.cpp:66
      #12 0x00007f3dc6d182c0 in mongo::repl::TaskRunner::_runTasks (this=0x7f3dccb92c18) at src/mongo/db/repl/task_runner.cpp:151
      #13 0x00007f3dc611d8ad in operator() (this=<optimized out>) at /opt/mongodbtoolchain/v2/include/c++/5.3.0/functional:2271
      #14 mongo::ThreadPool::_doOneTask (this=0x7f3dcd064000, lk=0x7f3daec826d0) at src/mongo/util/concurrency/thread_pool.cpp:326
      #15 0x00007f3dc611ebcd in mongo::ThreadPool::_consumeTasks (this=0x7f3dcd064000) at src/mongo/util/concurrency/thread_pool.cpp:278
      #16 0x00007f3dc611e5d5 in mongo::ThreadPool::_workerThreadBody (pool=0x7f3dcd064000, threadName=...) at src/mongo/util/concurrency/thread_pool.cpp:228
      #17 0x00007f3dc69162d0 in std::(anonymous namespace)::execute_native_thread_routine (__p=<optimized out>) at ../../../../../gcc-5.3.0/libstdc++-v3/src/c++11/thread.cc:84
      #18 0x00007f3dc302e184 in start_thread (arg=0x7f3daec83700) at pthread_create.c:312
      #19 0x00007f3dc2d5b37d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
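
      The original description points at a flag that is set on entry and only cleared on the success path. A generic way to make such a flag safe on error paths is a scope guard. The sketch below assumes hypothetical ScopeGuard, FakeRunner, and startLoad names and does not reflect the actual TaskRunner code or the change that resolved this ticket.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical scope guard: runs `onExit` at scope exit unless dismissed.
class ScopeGuard {
public:
    explicit ScopeGuard(std::function<void()> onExit) : _onExit(std::move(onExit)) {}
    ~ScopeGuard() {
        if (_onExit) {
            _onExit();
        }
    }
    ScopeGuard(const ScopeGuard&) = delete;
    ScopeGuard& operator=(const ScopeGuard&) = delete;

    // Cancels the cleanup, for the path where a later step takes over.
    void dismiss() { _onExit = nullptr; }

private:
    std::function<void()> _onExit;
};

// Hypothetical stand-in for the runner state discussed above.
struct FakeRunner {
    bool active = false;
};

// Marks the runner active and "runs init". On failure the guard clears the
// flag; on success the flag stays set for a later commit step to clear.
bool startLoad(FakeRunner& runner, bool initSucceeds) {
    assert(!runner.active);
    runner.active = true;
    ScopeGuard resetOnError([&runner] { runner.active = false; });
    if (!initSucceeds) {
        return false;  // guard fires here and resets the flag
    }
    resetOnError.dismiss();
    return true;
}
```

      With this shape, a failed init leaves the runner inactive without relying on a later commit call.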
      


      This issue does not affect the version of initial sync in MongoDB 3.2.

      python buildscripts/resmoke.py --executor=replica_sets jstests/replsets/buildindexes.js --mongodSetParameters='{use3dot2InitialSync: true, initialSyncOplogBuffer: "inMemoryBlockingQueue"}'
      
