[SERVER-30632] determinePresplittingPoints() can divide by zero (SIGFPE) when numShards is zero
Created: 13/Aug/17  Updated: 30/Oct/23  Resolved: 06/Nov/17

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 3.6.0-rc1
Fix Version/s: 3.6.0-rc3

Type: Bug
Priority: Major - P3
Reporter: Max Hirschhorn
Assignee: Randolph Tan
Resolution: Fixed
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Sharding 2017-09-11, Sharding 2017-10-23, Sharding 2017-11-13
Participants:
Linked BF Score: 0

 Description   

In determinePresplittingPoints(), the numChunks variable defaults to twice the number of shards when the "shardCollection" command request does not specify the number of initial chunks. However, determinePresplittingPoints() can be called with numShards=0 if ShardRegistry::getAllShardIds() leaves its output vector unchanged (that is, empty). In that case numChunks is 0, and the division that computes intervalSize (shown below) raises SIGFPE.

Note: This crash occurred after the "_configsvrAddShard" command had already completed, so it isn't clear to me what would have caused the sharding catalog to have behaved this way.

// hashes are signed, 64-bit ints. So we divide the range (-MIN long, +MAX long)
// into intervals of size (2^64/numChunks) and create split points at the
// boundaries.  The logic below ensures that initial chunks are all
// symmetric around 0.
long long intervalSize = (std::numeric_limits<long long>::max() / numChunks) * 2;
long long current = 0;
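
For illustration only, here is a minimal standalone sketch of how the interval arithmetic above could be guarded. It is not the server's code or the committed fix: the function name computeHashedSplitPoints, the use of 0 to mean "initial chunk count unspecified", the exception type, and the early return for a single chunk are all assumptions made for this example. The split-point loop is one way to keep the points symmetric around 0, as the comment above describes.

```cpp
#include <algorithm>
#include <cstddef>
#include <limits>
#include <stdexcept>
#include <vector>

// Hypothetical helper (not part of the MongoDB source): computes hashed
// split points for a given shard count and requested initial chunk count.
// requestedChunks == 0 is assumed here to mean "not specified".
std::vector<long long> computeHashedSplitPoints(std::size_t numShards,
                                                std::size_t requestedChunks) {
    // Default from the ticket: twice the number of shards when unspecified.
    const std::size_t numChunks =
        requestedChunks != 0 ? requestedChunks : 2 * numShards;

    // Guard: with no shards and no explicit chunk count, numChunks is 0 and
    // the division below would be an integer division by zero (SIGFPE).
    if (numChunks == 0) {
        throw std::invalid_argument("cannot pre-split: no shards available");
    }

    // Assumption for this sketch: one chunk needs no split points. Returning
    // early also avoids signed overflow in (max / 1) * 2.
    if (numChunks == 1) {
        return {};
    }

    const long long n = static_cast<long long>(numChunks);
    const long long intervalSize =
        (std::numeric_limits<long long>::max() / n) * 2;

    std::vector<long long> splits;
    long long current = 0;
    if (n % 2 == 0) {
        // Even chunk count: 0 itself is a boundary.
        splits.push_back(current);
        current += intervalSize;
    } else {
        // Odd chunk count: the middle chunk straddles 0.
        current += intervalSize / 2;
    }
    // Emit the remaining boundaries in +/- pairs, moving outward from 0.
    for (long long i = 0; i < (n - 1) / 2; ++i) {
        splits.push_back(current);
        splits.push_back(-current);
        current += intervalSize;
    }
    std::sort(splits.begin(), splits.end());
    return splits;
}
```

Under these assumptions, calling computeHashedSplitPoints(0, 0) reports an error instead of crashing, while computeHashedSplitPoints(2, 0) yields three split points (four chunks) symmetric around 0.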

$ /opt/mongodbtoolchain/gdb/bin/gdb ./mongod ./dump_conn10.16907.core
...
Core was generated by `/data/mci/b8f5344d4794ffe82ac2e080adf4fa5e/src/mongod --oplogSize 40 --port 215'.
Program terminated with signal SIGFPE, Arithmetic exception.
#0  0x00007f86771ee405 in mongo::(anonymous namespace)::determinePresplittingPoints (opCtx=<optimized out>, allSplits=0x7f8633d21f40, initSplits=0x7f8633d21f20, request=..., shardKeyPattern=..., proposedKey=..., isEmpty=true, numShards=0) at src/mongo/db/s/config/configsvr_shard_collection_command.cpp:477
[Current thread is 1 (LWP 18646)]
(gdb) bt
#0  0x00007f86771ee405 in mongo::(anonymous namespace)::determinePresplittingPoints (opCtx=<optimized out>, allSplits=0x7f8633d21f40, initSplits=0x7f8633d21f20, request=..., shardKeyPattern=..., proposedKey=..., isEmpty=true, numShards=0) at src/mongo/db/s/config/configsvr_shard_collection_command.cpp:477
#1  mongo::(anonymous namespace)::ConfigSvrShardCollectionCommand::run (this=this@entry=0x7f86797a9b00 <mongo::(anonymous namespace)::configsvrShardCollectionCmd>, opCtx=opCtx@entry=0x7f867e0ac640, dbname=..., cmdObj=..., result=...) at src/mongo/db/s/config/configsvr_shard_collection_command.cpp:804
#2  0x00007f8678362e48 in mongo::BasicCommand::enhancedRun (this=0x7f86797a9b00 <mongo::(anonymous namespace)::configsvrShardCollectionCmd>, opCtx=0x7f867e0ac640, request=..., result=...) at src/mongo/db/commands.cpp:376
#3  0x00007f86772963c4 in mongo::(anonymous namespace)::runCommandImpl (opCtx=opCtx@entry=0x7f867e0ac640, command=command@entry=0x7f86797a9b00 <mongo::(anonymous namespace)::configsvrShardCollectionCmd>, request=..., replyBuilder=replyBuilder@entry=0x7f867df1eec0, startOperationTime=..., startOperationTime@entry=...) at src/mongo/db/service_entry_point_mongod.cpp:472
#4  0x00007f8677297bd9 in mongo::(anonymous namespace)::execCommandDatabase (opCtx=opCtx@entry=0x7f867e0ac640, command=command@entry=0x7f86797a9b00 <mongo::(anonymous namespace)::configsvrShardCollectionCmd>, request=..., replyBuilder=<optimized out>, this=<optimized out>, this=<optimized out>, this=<optimized out>) at src/mongo/db/service_entry_point_mongod.cpp:692
#5  0x00007f86772994f5 in mongo::(anonymous namespace)::runCommands (opCtx=opCtx@entry=0x7f867e0ac640, message=..., this=<optimized out>, this=<optimized out>) at src/mongo/db/service_entry_point_mongod.cpp:799
#6  0x00007f867729a517 in mongo::ServiceEntryPointMongod::handleRequest (this=<optimized out>, opCtx=0x7f867e0ac640, m=...) at src/mongo/db/service_entry_point_mongod.cpp:1066
#7  0x00007f86772a2f3e in mongo::ServiceStateMachine::_processMessage (this=this@entry=0x7f867df3c1d0, guard=...) at src/mongo/transport/service_state_machine.cpp:317
#8  0x00007f86772a10eb in mongo::ServiceStateMachine::_runNextInGuard (this=this@entry=0x7f867df3c1d0, guard=...) at src/mongo/transport/service_state_machine.cpp:406
#9  0x00007f86772a287f in mongo::ServiceStateMachine::runNext (this=0x7f867df3c1d0) at src/mongo/transport/service_state_machine.cpp:372
#10 0x00007f867729e1b1 in mongo::ServiceEntryPointImpl::<lambda()>::operator() (__closure=0x7f867e055d60) at src/mongo/transport/service_entry_point_impl.cpp:89
#11 std::_Function_handler<void(), mongo::ServiceEntryPointImpl::startSession(mongo::transport::SessionHandle)::<lambda()> >::_M_invoke(const std::_Any_data &) (__functor=...) at /opt/mongodbtoolchain/v2/include/c++/5.4.0/functional:1871
#12 0x00007f8678831494 in std::function<void ()>::operator()() const (this=<optimized out>) at /opt/mongodbtoolchain/v2/include/c++/5.4.0/functional:2267
#13 mongo::(anonymous namespace)::runFunc (ctx=0x7f867e055d40) at src/mongo/transport/service_entry_point_utils.cpp:55
#14 0x00007f8673dddaa1 in ?? ()
#15 0x00007f8633d25700 in ?? ()
#16 0x0000000000000000 in ?? ()



 Comments   
Comment by Githook User [ 06/Nov/17 ]

Author: Randolph Tan (renctan) <randolph@10gen.com>

Message: SERVER-30632 Always check that there are shards when sharding collection
Branch: master
https://github.com/mongodb/mongo/commit/3432a9283ff1f12ef941b44d00cdb990495752f4

Comment by Kaloian Manassiev [ 14/Aug/17 ]

esha.maharishi, this was probably introduced as part of your work to clean up the shardCollection command.
