Core Server / SERVER-23944

Failure to commit chunk migration due to shutdown should not fassert

    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • 3.3.14
    • Affects Version/s: 3.2.5, 3.3.5
    • Component/s: Sharding
    • Fully Compatible
    • ALL
    • Sharding 16 (06/24/16), Sharding 2016-10-10
    • 18

      If the commit chunk migration code fails to apply the metadata change transaction to the config server, it makes a best-effort attempt to determine whether the operation was actually applied. If this check fails for any reason, we currently terminate the server in order to avoid data corruption or loss.

      Before terminating the server, we should check whether it is already shutting down and, if so, skip the fatal assertion. The log below shows the problem: the check fails with CallbackCanceled while the signal processing thread is already exiting, and the server aborts with fassert 34431.
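
      The proposed decision can be illustrated with a minimal, self-contained sketch. It is not the actual server code: `Status`, `ErrorCode`, `CommitAction` and `classifyCommitVerification` are hypothetical names invented for this example, and how the real code detects an in-progress shutdown is assumed here to be a simple boolean.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for illustration only; these are NOT the MongoDB types.
enum class ErrorCode { OK, CallbackCanceled, ShutdownInProgress, Other };

struct Status {
    ErrorCode code;
    std::string reason;
    bool isOK() const { return code == ErrorCode::OK; }
};

// What the shard should do after trying to verify the outcome of the commit.
enum class CommitAction {
    Proceed,              // verification succeeded, the outcome is known
    ReturnShutdownError,  // verification failed, but the node is shutting down
    Fassert               // verification failed for any other reason
};

// verifyStatus: result of the best-effort check against the config server.
// shuttingDown: assumed flag telling whether this node is already shutting down.
CommitAction classifyCommitVerification(const Status& verifyStatus, bool shuttingDown) {
    if (verifyStatus.isOK()) {
        return CommitAction::Proceed;
    }
    if (shuttingDown) {
        // The failure is most likely a side effect of shutdown (for example a
        // canceled callback), so report an error instead of aborting the process.
        return CommitAction::ReturnShutdownError;
    }
    // The metadata state is unknown and the node would keep running:
    // terminate to avoid data corruption or loss.
    return CommitAction::Fassert;
}
```

      With this shape, the CallbackCanceled failure from the log would map to ReturnShutdownError when shutdown is in progress, and to Fassert only when the server would otherwise continue running with unknown metadata state.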

      [js_test:multi_mongos2] 2016-04-27T00:18:01.698+0000 d20511| 2016-04-27T00:18:01.195+0000 I -        [conn8] Fatal assertion 34431 CallbackCanceled: Callback canceled
      [js_test:multi_mongos2] 2016-04-27T00:18:01.698+0000 d20511| 2016-04-27T00:18:01.195+0000 I -        [conn8]
      [js_test:multi_mongos2] 2016-04-27T00:18:01.698+0000 d20511|
      [js_test:multi_mongos2] 2016-04-27T00:18:01.699+0000 d20511| ***aborting after fassert() failure
      [js_test:multi_mongos2] 2016-04-27T00:18:01.699+0000 d20511|
      [js_test:multi_mongos2] 2016-04-27T00:18:01.699+0000 d20511|
      [js_test:multi_mongos2] 2016-04-27T00:18:01.700+0000 d20511| 2016-04-27T00:18:01.198+0000 W SHARDING [signalProcessingThread] error encountered while cleaning up distributed ping entry for ip-10-45-46-73:20511:1461716245:2082352190 :: caused by :: ShutdownInProgress: Shutdown in progress
      [js_test:multi_mongos2] 2016-04-27T00:18:01.701+0000 d20511| 2016-04-27T00:18:01.198+0000 I CONTROL  [signalProcessingThread] now exiting
      [js_test:multi_mongos2] 2016-04-27T00:18:01.701+0000 d20511| 2016-04-27T00:18:01.198+0000 I NETWORK  [signalProcessingThread] shutdown: going to close listening sockets...
      [js_test:multi_mongos2] 2016-04-27T00:18:01.701+0000 d20511| 2016-04-27T00:18:01.198+0000 I NETWORK  [signalProcessingThread] closing listening socket: 13
      [js_test:multi_mongos2] 2016-04-27T00:18:01.702+0000 d20511| 2016-04-27T00:18:01.198+0000 I NETWORK  [signalProcessingThread] closing listening socket: 14
      [js_test:multi_mongos2] 2016-04-27T00:18:01.703+0000 d20511| 2016-04-27T00:18:01.198+0000 I NETWORK  [signalProcessingThread] removing socket file: /tmp/mongodb-20511.sock
      [js_test:multi_mongos2] 2016-04-27T00:18:01.703+0000 d20511| 2016-04-27T00:18:01.198+0000 I NETWORK  [signalProcessingThread] shutdown: going to flush diaglog...
      [js_test:multi_mongos2] 2016-04-27T00:18:01.703+0000 d20511| 2016-04-27T00:18:01.198+0000 I STORAGE  [signalProcessingThread] WiredTigerKVEngine shutting down
      [js_test:multi_mongos2] 2016-04-27T00:18:01.704+0000 d20511| 2016-04-27T00:18:01.203+0000 F -        [conn8] Got signal: 6 (Aborted).
      [js_test:multi_mongos2] 2016-04-27T00:18:01.704+0000 d20511|
      [js_test:multi_mongos2] 2016-04-27T00:18:01.704+0000 d20511|  0x15edd22 0x15eca49 0x15ed332 0x3cd2e0f7e0 0x3cd2a32625 0x3cd2a33e05 0x1573be1 0x121095d 0x1216283 0xc86bbb 0xc88933 0x11caa60 0xdc5eb5 0x9f4e5a 0x1597cd1 0x3cd2e07aa1 0x3cd2ae893d
      [js_test:multi_mongos2] 2016-04-27T00:18:01.705+0000 d20511| ----- BEGIN BACKTRACE -----
      [js_test:multi_mongos2] 2016-04-27T00:18:01.736+0000 d20511|  mongod(mongo::printStackTrace(std::ostream&)+0x32) [0x15edd22]
      [js_test:multi_mongos2] 2016-04-27T00:18:01.736+0000 d20511|  mongod(+0x11ECA49) [0x15eca49]
      [js_test:multi_mongos2] 2016-04-27T00:18:01.736+0000 d20511|  mongod(+0x11ED332) [0x15ed332]
      [js_test:multi_mongos2] 2016-04-27T00:18:01.737+0000 d20511|  libpthread.so.0(+0xF7E0) [0x3cd2e0f7e0]
      [js_test:multi_mongos2] 2016-04-27T00:18:01.737+0000 d20511|  libc.so.6(gsignal+0x35) [0x3cd2a32625]
      [js_test:multi_mongos2] 2016-04-27T00:18:01.738+0000 d20511|  libc.so.6(abort+0x175) [0x3cd2a33e05]
      [js_test:multi_mongos2] 2016-04-27T00:18:01.738+0000 d20511|  mongod(mongo::fassertFailedWithStatus(int, mongo::Status const&)+0xB1) [0x1573be1]
      [js_test:multi_mongos2] 2016-04-27T00:18:01.740+0000 d20511|  mongod(mongo::MigrationSourceManager::commitDonateChunk(mongo::OperationContext*)+0x343D) [0x121095d]
      [js_test:multi_mongos2] 2016-04-27T00:18:01.740+0000 d20511|  mongod(+0xE16283) [0x1216283]
      [js_test:multi_mongos2] 2016-04-27T00:18:01.741+0000 d20511|  mongod(mongo::Command::run(mongo::OperationContext*, mongo::rpc::RequestInterface const&, mongo::rpc::ReplyBuilderInterface*)+0x80B) [0xc86bbb]
      [js_test:multi_mongos2] 2016-04-27T00:18:01.742+0000 d20511|  mongod(mongo::Command::execCommand(mongo::OperationContext*, mongo::Command*, mongo::rpc::RequestInterface const&, mongo::rpc::ReplyBuilderInterface*)+0x8B3) [0xc88933]
      [js_test:multi_mongos2] 2016-04-27T00:18:01.743+0000 d20511|  mongod(mongo::runCommands(mongo::OperationContext*, mongo::rpc::RequestInterface const&, mongo::rpc::ReplyBuilderInterface*)+0x260) [0x11caa60]
      [js_test:multi_mongos2] 2016-04-27T00:18:01.743+0000 d20511|  mongod(mongo::assembleResponse(mongo::OperationContext*, mongo::Message&, mongo::DbResponse&, mongo::HostAndPort const&)+0xB35) [0xdc5eb5]
      [js_test:multi_mongos2] 2016-04-27T00:18:01.744+0000 d20511|  mongod(mongo::MyMessageHandler::process(mongo::Message&, mongo::AbstractMessagingPort*)+0xEA) [0x9f4e5a]
      [js_test:multi_mongos2] 2016-04-27T00:18:01.744+0000 d20511|  mongod(mongo::PortMessageServer::handleIncomingMsg(void*)+0x311) [0x1597cd1]
      [js_test:multi_mongos2] 2016-04-27T00:18:01.745+0000 d20511|  libpthread.so.0(+0x7AA1) [0x3cd2e07aa1]
      

            Assignee:
            kaloian.manassiev@mongodb.com Kaloian Manassiev
            Reporter:
            kaloian.manassiev@mongodb.com Kaloian Manassiev
            Votes:
            0
            Watchers:
            6