Core Server / SERVER-73106

[v4.4] Chunk migration attempts to wait for replication with session checked out when getLastErrorDefaults are used in replica set config, leading to server crash

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major - P3
    • Fix Version/s: 4.4.19
    • Affects Version/s: 4.4.1
    • Component/s: Sharding
    • None
    • Sharding EMEA
    • Fully Compatible
    • ALL
    • Sharding EMEA 2023-02-06

      Chunk migration tries to avoid waiting for replication when persisting the range deletion task document by passing a default-constructed WriteConcernOptions. However, when getLastErrorDefaults is present in the replica set config, the write may still end up using a write concern that requires waiting for replication. This triggers an invariant failure in ReplicationCoordinatorImpl::awaitReplication() because the OperationContext has a logical session checked out.

      // It is illegal to wait for write concern with a session checked out, so persist the
      // range deletion task with an immediately satisfiable write concern and then wait for
      // majority after yielding the session.
      migrationutil::persistRangeDeletionTaskLocally(
          outerOpCtx, recipientDeletionTask, WriteConcernOptions());
      
      [ShardedClusterFixture:job0:shard0:primary] {"t":{"$date":"2023-01-19T20:54:21.510+00:00"},"s":"D2", "c":"REPL",     "id":22549,   "ctx":"migrateThread","msg":"Waiting for write concern. OpTime: {replOpTime}, write concern: {writeConcern}","attr":{"replOpTime":{"ts":{"$timestamp":{"t":1674161661,"i":576}},"t":1},"writeConcern":{"w":2,"wtimeout":0,"provenance":"getLastErrorDefaults"}}}
      [ShardedClusterFixture:job0:shard0:primary] {"t":{"$date":"2023-01-19T20:54:21.510+00:00"},"s":"F",  "c":"-",        "id":23079,   "ctx":"migrateThread","msg":"Invariant failure","attr":{"expr":"OperationContextSession::get(opCtx) == nullptr","file":"src/mongo/db/repl/replication_coordinator_impl.cpp","line":1897}}
      
      #0  0x00007f465f176e87 in raise () from /lib/x86_64-linux-gnu/libc.so.6
      #1  0x00007f465f1787f1 in abort () from /lib/x86_64-linux-gnu/libc.so.6
      #2  0x00005577fd59e947 in mongo::invariantFailed (expr=<optimized out>, expr@entry=0x5577ff7147b8 "OperationContextSession::get(opCtx) == nullptr", file=<optimized out>, file@entry=0x5577ff710b68 "src/mongo/db/repl/replication_coordinator_impl.cpp", line=<optimized out>, line@entry=1897) at src/mongo/util/assert_util.cpp:117
      #3  0x00005577fd2b9a07 in mongo::invariantWithLocation<bool> (testOK=<optimized out>, line=1897, file=0x5577ff710b68 "src/mongo/db/repl/replication_coordinator_impl.cpp", expr=0x5577ff7147b8 "OperationContextSession::get(opCtx) == nullptr") at src/mongo/util/invariant.h:69
      #4  mongo::repl::ReplicationCoordinatorImpl::awaitReplication (this=0x557802323000, opCtx=0x55780cd00fc0, opTime=..., writeConcern=...) at src/mongo/db/repl/replication_coordinator_impl.cpp:1897
      #5  0x00005577fddded87 in mongo::waitForWriteConcern (opCtx=0x55780cd00fc0, replOpTime=..., writeConcern=..., result=result@entry=0x7f4635baa810) at src/mongo/db/write_concern.cpp:354
      #6  0x00005577fdaafb86 in mongo::ServiceEntryPointMongod::Hooks::waitForWriteConcern(mongo::OperationContext*, mongo::CommandInvocation const*, mongo::repl::OpTime const&, mongo::BSONObjBuilder&) const::{lambda()#1}::operator()() const (__closure=__closure@entry=0x7f4635baa900) at src/mongo/db/service_entry_point_mongod.cpp:120
      #7  0x00005577fdaaff5b in mongo::ServiceEntryPointMongod::Hooks::waitForWriteConcern (this=<optimized out>, opCtx=<optimized out>, invocation=<optimized out>, lastOpBeforeRun=..., commandResponseBuilder=...) at src/mongo/db/service_entry_point_mongod.cpp:127
      #8  0x00005577fdabcbb4 in <lambda(auto:45&&)>::operator()<mongo::BSONObjBuilder> (bb=..., __closure=<synthetic pointer>) at src/mongo/db/service_entry_point_common.cpp:776
      #9  mongo::(anonymous namespace)::runCommandImpl (opCtx=<optimized out>, invocation=<optimized out>, request=..., replyBuilder=0x5578077a35e0, startOperationTime=..., behaviors=..., extraFieldsBuilder=0x7f4635bab050, sessionOptions=...) at src/mongo/db/service_entry_point_common.cpp:810
      #10 0x00005577fdabfeff in mongo::(anonymous namespace)::execCommandDatabase (behaviors=..., replyBuilder=<optimized out>, request=..., command=<optimized out>, opCtx=<optimized out>) at src/mongo/db/service_entry_point_common.cpp:1203
      #11 <lambda()>::operator()(void) const (__closure=<optimized out>) at src/mongo/db/service_entry_point_common.cpp:1391
      #12 0x00005577fdac0ecb in mongo::(anonymous namespace)::receivedCommands (behaviors=..., message=..., opCtx=<optimized out>) at src/mongo/db/service_entry_point_common.cpp:1316
      #13 mongo::ServiceEntryPointCommon::handleRequest (opCtx=0x55780cd00fc0, m=..., behaviors=...) at src/mongo/db/service_entry_point_common.cpp:1740
      #14 0x00005577fdaae95c in mongo::ServiceEntryPointMongod::handleRequest (this=<optimized out>, opCtx=<optimized out>, m=...) at src/mongo/db/service_entry_point_mongod.cpp:291
      #15 0x00005577fe18e509 in mongo::(anonymous namespace)::loopbackBuildResponse (opCtx=0x55780cd00fc0, lastError=0x7f4635bace90, toSend=...) at src/mongo/db/dbdirectclient.cpp:146
      #16 0x00005577fe18ea68 in mongo::DBDirectClient::call (this=<optimized out>, toSend=..., response=..., assertOk=<optimized out>, actualServer=<optimized out>) at src/mongo/db/dbdirectclient.cpp:151
      #17 0x00005577fef04252 in mongo::DBClientBase::runCommandWithTarget (this=0x7f4635bace00, request=...) at src/mongo/client/dbclient_base.cpp:221
      #18 0x00005577fdbe67ef in mongo::DBClientBase::runCommand (this=this@entry=0x7f4635bace00, request=...) at src/mongo/client/dbclient_base.h:256
      #19 0x00005577fdbe9c71 in mongo::PersistentTaskStore<mongo::RangeDeletionTask>::add (this=this@entry=0x7f4635bacf70, opCtx=opCtx@entry=0x55780cd00fc0, task=..., writeConcern=...) at src/mongo/db/s/persistent_task_store.h:64
      #20 0x00005577fdbdfdc3 in mongo::migrationutil::persistRangeDeletionTaskLocally (opCtx=0x55780cd00fc0, deletionTask=..., writeConcern=...) at src/mongo/db/s/migration_util.cpp:604
      #21 0x00005577fdbcb16a in mongo::MigrationDestinationManager::_migrateDriver (this=0x55780245ada8, outerOpCtx=<optimized out>) at src/mongo/db/s/migration_destination_manager.cpp:1098
      #22 0x00005577fdbccf41 in mongo::MigrationDestinationManager::_migrateThread (this=0x55780245ada8) at src/mongo/db/s/migration_destination_manager.cpp:898
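
The sequence in the description, log, and backtrace can be sketched as a small standalone model. This is not server code: `WriteConcern`, `resolveWriteConcern`, `needsToWaitForOtherNodes`, the simplified `waitForWriteConcern`, and `Outcome` are hypothetical stand-ins, and the rule that a write concern without an explicit `w` picks up getLastErrorDefaults is an assumption made for illustration. It only shows why an "empty" write concern can still lead to a replication wait while a session is checked out.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Simplified stand-in for a write concern (NOT the server's WriteConcernOptions).
struct WriteConcern {
    std::optional<int> w;    // unset: the caller did not specify "w"
    std::string provenance;  // where the effective value came from
};

// Assumed resolution rule for this sketch: an explicit "w" wins; otherwise the
// replica set's getLastErrorDefaults (if configured) is substituted.
WriteConcern resolveWriteConcern(const WriteConcern& requested,
                                 const std::optional<WriteConcern>& getLastErrorDefaults) {
    assert(!requested.w || *requested.w >= 0);  // "w" is never negative in this model
    if (requested.w.has_value()) {
        return requested;
    }
    if (getLastErrorDefaults.has_value()) {
        WriteConcern resolved = *getLastErrorDefaults;
        resolved.provenance = "getLastErrorDefaults";
        return resolved;
    }
    return WriteConcern{1, "implicitDefault"};
}

// A write concern needs other nodes only when more than one node must acknowledge.
bool needsToWaitForOtherNodes(const WriteConcern& wc) {
    return wc.w.value_or(1) > 1;
}

enum class Outcome { kSatisfiedLocally, kWaitedForReplication, kInvariantFailure };

// Models the invariant from the log: waiting for replication is illegal while a
// session is checked out. The real server aborts; this sketch returns a value.
Outcome waitForWriteConcern(bool sessionCheckedOut, const WriteConcern& wc) {
    if (!needsToWaitForOtherNodes(wc)) {
        return Outcome::kSatisfiedLocally;
    }
    if (sessionCheckedOut) {
        return Outcome::kInvariantFailure;
    }
    return Outcome::kWaitedForReplication;
}
```

In this model, resolving an unspecified write concern against defaults of `{w: 2}` and then calling `waitForWriteConcern(true, resolved)` yields `Outcome::kInvariantFailure`, which mirrors the `"provenance":"getLastErrorDefaults"` log line followed by the invariant failure. With no defaults configured, the same call is satisfied locally and no replication wait is attempted.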
      

            Assignee: Antonio Fuschetto (antonio.fuschetto@mongodb.com)
            Reporter: Max Hirschhorn (max.hirschhorn@mongodb.com)
            Votes: 0
            Watchers: 4
