  1. Core Server
  2. SERVER-30632

determinePresplittingPoints() can divide by zero (SIGFPE) when numShards is zero

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major - P3
    • Fix Version/s: 3.6.0-rc3
    • Affects Version/s: 3.6.0-rc1
    • Component/s: Sharding
    • Labels: None
    • Backwards Compatibility: Fully Compatible
    • Operating System: ALL
    • Sprint: Sharding 2017-09-11, Sharding 2017-10-23, Sharding 2017-11-13
    • 0

      In determinePresplittingPoints(), numChunks defaults to twice the number of shards when the "shardCollection" command request does not specify the number of initial chunks. However, determinePresplittingPoints() can be called with numShards=0 if ShardRegistry::getAllShardIds() leaves its output vector unchanged (that is, empty). In that case numChunks is also 0, and the division that computes intervalSize raises SIGFPE.

      Note: This crash occurred after the "_configsvrAddShard" command had already completed, so it isn't clear to me what would have caused the sharding catalog to have behaved this way.

      // hashes are signed, 64-bit ints. So we divide the range (-MIN long, +MAX long)
      // into intervals of size (2^64/numChunks) and create split points at the
      // boundaries.  The logic below ensures that initial chunks are all
      // symmetric around 0.
      long long intervalSize = (std::numeric_limits<long long>::max() / numChunks) * 2;
      long long current = 0;
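
To make the failure mode concrete, here is a minimal, standalone sketch of the same arithmetic with a guard in front of the division. It is not the server's code and not necessarily how the issue was fixed: the function name computeHashedSplitPoints, its signature, and the choice to return false are all hypothetical, and the loop that places the split points is only an illustration of "symmetric around 0" built on the intervalSize expression quoted above.

```cpp
#include <algorithm>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical helper, not the server's function. Fills `splits` with sorted
// split points and returns true, or returns false when no chunk count can be
// derived (no explicit request and no shards).
bool computeHashedSplitPoints(int requestedChunks,
                              std::size_t numShards,
                              std::vector<long long>* splits) {
    splits->clear();

    // Default described in the report: twice the number of shards.
    long long numChunks = requestedChunks > 0
        ? static_cast<long long>(requestedChunks)
        : 2 * static_cast<long long>(numShards);

    // Assumed guard: with numShards == 0 and no explicit chunk count,
    // numChunks is 0 and the division below would raise SIGFPE.
    if (numChunks <= 0) {
        return false;
    }

    // A single chunk needs no split points. Returning early also keeps
    // (max / 1) * 2 from overflowing in this sketch.
    if (numChunks == 1) {
        return true;
    }

    long long intervalSize = (std::numeric_limits<long long>::max() / numChunks) * 2;
    long long current = 0;

    // Illustrative placement: an even chunk count gets a split at 0, an odd
    // count straddles 0; the remaining splits are added in +/- pairs.
    if (numChunks % 2 == 0) {
        splits->push_back(current);
        current += intervalSize;
    } else {
        current += intervalSize / 2;
    }
    for (long long i = 0; i < (numChunks - 1) / 2; ++i) {
        splits->push_back(current);
        splits->push_back(-current);
        current += intervalSize;
    }

    std::sort(splits->begin(), splits->end());
    return true;
}
```

Under these assumptions, a call with requestedChunks=0 and numShards=0 returns false instead of trapping, while numShards=2 yields three split points (one negative, 0, and its positive mirror). Whether the caller should fail the command or retry fetching the shard list in that case is a separate design question that this sketch does not settle.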
      
      $ /opt/mongodbtoolchain/gdb/bin/gdb ./mongod ./dump_conn10.16907.core
      ...
      Core was generated by `/data/mci/b8f5344d4794ffe82ac2e080adf4fa5e/src/mongod --oplogSize 40 --port 215'.
      Program terminated with signal SIGFPE, Arithmetic exception.
      #0  0x00007f86771ee405 in mongo::(anonymous namespace)::determinePresplittingPoints (opCtx=<optimized out>, allSplits=0x7f8633d21f40, initSplits=0x7f8633d21f20, request=..., shardKeyPattern=..., proposedKey=..., isEmpty=true, numShards=0) at src/mongo/db/s/config/configsvr_shard_collection_command.cpp:477
      [Current thread is 1 (LWP 18646)]
      (gdb) bt
      #0  0x00007f86771ee405 in mongo::(anonymous namespace)::determinePresplittingPoints (opCtx=<optimized out>, allSplits=0x7f8633d21f40, initSplits=0x7f8633d21f20, request=..., shardKeyPattern=..., proposedKey=..., isEmpty=true, numShards=0) at src/mongo/db/s/config/configsvr_shard_collection_command.cpp:477
      #1  mongo::(anonymous namespace)::ConfigSvrShardCollectionCommand::run (this=this@entry=0x7f86797a9b00 <mongo::(anonymous namespace)::configsvrShardCollectionCmd>, opCtx=opCtx@entry=0x7f867e0ac640, dbname=..., cmdObj=..., result=...) at src/mongo/db/s/config/configsvr_shard_collection_command.cpp:804
      #2  0x00007f8678362e48 in mongo::BasicCommand::enhancedRun (this=0x7f86797a9b00 <mongo::(anonymous namespace)::configsvrShardCollectionCmd>, opCtx=0x7f867e0ac640, request=..., result=...) at src/mongo/db/commands.cpp:376
      #3  0x00007f86772963c4 in mongo::(anonymous namespace)::runCommandImpl (opCtx=opCtx@entry=0x7f867e0ac640, command=command@entry=0x7f86797a9b00 <mongo::(anonymous namespace)::configsvrShardCollectionCmd>, request=..., replyBuilder=replyBuilder@entry=0x7f867df1eec0, startOperationTime=..., startOperationTime@entry=...) at src/mongo/db/service_entry_point_mongod.cpp:472
      #4  0x00007f8677297bd9 in mongo::(anonymous namespace)::execCommandDatabase (opCtx=opCtx@entry=0x7f867e0ac640, command=command@entry=0x7f86797a9b00 <mongo::(anonymous namespace)::configsvrShardCollectionCmd>, request=..., replyBuilder=<optimized out>, this=<optimized out>, this=<optimized out>, this=<optimized out>) at src/mongo/db/service_entry_point_mongod.cpp:692
      #5  0x00007f86772994f5 in mongo::(anonymous namespace)::runCommands (opCtx=opCtx@entry=0x7f867e0ac640, message=..., this=<optimized out>, this=<optimized out>) at src/mongo/db/service_entry_point_mongod.cpp:799
      #6  0x00007f867729a517 in mongo::ServiceEntryPointMongod::handleRequest (this=<optimized out>, opCtx=0x7f867e0ac640, m=...) at src/mongo/db/service_entry_point_mongod.cpp:1066
      #7  0x00007f86772a2f3e in mongo::ServiceStateMachine::_processMessage (this=this@entry=0x7f867df3c1d0, guard=...) at src/mongo/transport/service_state_machine.cpp:317
      #8  0x00007f86772a10eb in mongo::ServiceStateMachine::_runNextInGuard (this=this@entry=0x7f867df3c1d0, guard=...) at src/mongo/transport/service_state_machine.cpp:406
      #9  0x00007f86772a287f in mongo::ServiceStateMachine::runNext (this=0x7f867df3c1d0) at src/mongo/transport/service_state_machine.cpp:372
      #10 0x00007f867729e1b1 in mongo::ServiceEntryPointImpl::<lambda()>::operator() (__closure=0x7f867e055d60) at src/mongo/transport/service_entry_point_impl.cpp:89
      #11 std::_Function_handler<void(), mongo::ServiceEntryPointImpl::startSession(mongo::transport::SessionHandle)::<lambda()> >::_M_invoke(const std::_Any_data &) (__functor=...) at /opt/mongodbtoolchain/v2/include/c++/5.4.0/functional:1871
      #12 0x00007f8678831494 in std::function<void ()>::operator()() const (this=<optimized out>) at /opt/mongodbtoolchain/v2/include/c++/5.4.0/functional:2267
      #13 mongo::(anonymous namespace)::runFunc (ctx=0x7f867e055d40) at src/mongo/transport/service_entry_point_utils.cpp:55
      #14 0x00007f8673dddaa1 in ?? ()
      #15 0x00007f8633d25700 in ?? ()
      #16 0x0000000000000000 in ?? ()
      

            Assignee: Randolph Tan (randolph@mongodb.com)
            Reporter: Max Hirschhorn (max.hirschhorn@mongodb.com)
            Votes: 0
            Watchers: 4

              Created:
              Updated:
              Resolved: