[SERVER-20010] Segfault while dropping an index that failed to build Created: 18/Aug/15  Updated: 06/Dec/22  Resolved: 08/Dec/17

Status: Closed
Project: Core Server
Component/s: Index Maintenance
Affects Version/s: 2.6.9
Fix Version/s: None

Type: Bug
Priority: Major - P3
Reporter: Kyle Erf
Assignee: Backlog - Query Team (Inactive)
Resolution: Done
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment: 3-node replica set


Issue Links:
Related
is related to SERVER-17923 Creating/dropping multiple background... Closed
Assigned Teams:
Query
Operating System: Linux
Participants:

 Description   

We have a user hitting a situation where they

1. started a background build on index A, a sparse index
2. attempted to build index B concurrently on the same collection, which failed with

 Btree::insert: key too large to index 


3. this failure closed the collection's cursors, which in turn also caused the build of index A to fail
4. the background index A build fails, but the index is still listed in db.col.getIndexes()
5. the user attempts to drop index A with the intention of rebuilding it, but on removing the index, they get

2015-08-12T18:01:55.540-0700 [conn503836] SEVERE: Invalid access at address: 0
2015-08-12T18:01:55.546-0700 [conn503836] SEVERE: Got signal: 11 (Segmentation fault).
Backtrace:0x1219651 0x1218a2e 0x1218b1f 0x7f2172d84cb0 0x8dcd27 0xb999d5 0x948df7 0x94bf59 0xa31b8a 0xa33d21 0xa35468 0xd74992 0xbb2bd2 0xbb41f0 0x7728f8 0x11cedeb 0x7f2172d7ce9a 0x7f21720902ed
 /home/madmin/deploy/lib/mongodb/bin/mongod(_ZN5mongo15printStackTraceERSo+0x21) [0x1219651]
 /home/madmin/deploy/lib/mongodb/bin/mongod() [0x1218a2e]
 /home/madmin/deploy/lib/mongodb/bin/mongod() [0x1218b1f]
 /lib/x86_64-linux-gnu/libpthread.so.0(+0xfcb0) [0x7f2172d84cb0]
 /home/madmin/deploy/lib/mongodb/bin/mongod(_ZN5mongo12IndexCatalog23killMatchingIndexBuildsERKNS0_17IndexKillCriteriaE+0x217) [0x8dcd27]
 /home/madmin/deploy/lib/mongodb/bin/mongod(_ZN5mongo12IndexBuilder23killMatchingIndexBuildsEPNS_10CollectionERKNS_12IndexCatalog17IndexKillCriteriaE+0x15) [0xb999d5]
 /home/madmin/deploy/lib/mongodb/bin/mongod(_ZN5mongo14CmdDropIndexes15stopIndexBuildsEPNS_8DatabaseERKNS_7BSONObjE+0x407) [0x948df7]
 /home/madmin/deploy/lib/mongodb/bin/mongod(_ZN5mongo14CmdDropIndexes3runERKSsRNS_7BSONObjEiRSsRNS_14BSONObjBuilderEb+0x179) [0x94bf59]
 /home/madmin/deploy/lib/mongodb/bin/mongod(_ZN5mongo12_execCommandEPNS_7CommandERKSsRNS_7BSONObjEiRSsRNS_14BSONObjBuilderEb+0x3a) [0xa31b8a]
 /home/madmin/deploy/lib/mongodb/bin/mongod(_ZN5mongo7Command11execCommandEPS0_RNS_6ClientEiPKcRNS_7BSONObjERNS_14BSONObjBuilderEb+0x1691) [0xa33d21]
 /home/madmin/deploy/lib/mongodb/bin/mongod(_ZN5mongo12_runCommandsEPKcRNS_7BSONObjERNS_11_BufBuilderINS_16TrivialAllocatorEEERNS_14BSONObjBuilderEbi+0x6f8) [0xa35468]
 /home/madmin/deploy/lib/mongodb/bin/mongod(_ZN5mongo11newRunQueryERNS_7MessageERNS_12QueryMessageERNS_5CurOpES1_+0x23d2) [0xd74992]
 /home/madmin/deploy/lib/mongodb/bin/mongod() [0xbb2bd2]
 /home/madmin/deploy/lib/mongodb/bin/mongod(_ZN5mongo16assembleResponseERNS_7MessageERNS_10DbResponseERKNS_11HostAndPortE+0x5a0) [0xbb41f0]
 /home/madmin/deploy/lib/mongodb/bin/mongod(_ZN5mongo16MyMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortEPNS_9LastErrorE+0x98) [0x7728f8]
 /home/madmin/deploy/lib/mongodb/bin/mongod(_ZN5mongo17PortMessageServer17handleIncomingMsgEPv+0x50b) [0x11cedeb]
 /lib/x86_64-linux-gnu/libpthread.so.0(+0x7e9a) [0x7f2172d7ce9a]
 /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d) [0x7f21720902ed]



 Comments   
Comment by Ramon Fernandez Marina [ 11/Sep/15 ]

Sorry you're running into this issue, klimashkin. This ticket is scheduled for the next 2.6 dot release, but there's currently no target date for that. As Eric mentions above, this bug is not present in the MongoDB 3.0 series, so if this issue is critical for you, you may want to consider upgrading to MongoDB 3.0.6.

Regards,
Ramón.

Comment by Paul Klimashkin [ 11/Sep/15 ]

This is a really critical issue. Our MongoDB crashed twice a day after the update to 2.6.11.

Comment by Eric Milkie [ 18/Aug/15 ]

The work for SERVER-17923 prevents this from happening in the 3.0 and 3.2 branches. It hasn't been backported to the 2.6 branch.

Comment by J Rassi [ 18/Aug/15 ]

Looks like client->curop() is returning null in IndexCatalog::killMatchingIndexBuilds(), which sounds like a vaguely familiar race condition. Related to SERVER-16274 / SERVER-15871? milkie, would you mind taking a look?
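To illustrate the suspected failure mode, here is a minimal Python model of a kill loop that tolerates a client with no current operation. It is only a sketch: the `Client` and `CurOp` classes, their fields, and `kill_matching_index_builds` are hypothetical stand-ins, not the server's C++ types, and this is not the change made for SERVER-17923.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CurOp:
    """Hypothetical stand-in for per-operation state (not the server's class)."""
    opnum: int
    ns: str
    killed: bool = False


@dataclass
class Client:
    """Hypothetical stand-in for a client thread; curop may be absent."""
    curop: Optional[CurOp]


def kill_matching_index_builds(clients: List[Client], ns: str) -> List[int]:
    """Mark matching operations as killed and return their op numbers."""
    killed = []
    for client in clients:
        op = client.curop
        if op is None:
            # Assumed analogue of the crash: using op without this check
            # corresponds to dereferencing a null curop pointer.
            continue
        if op.ns == ns and not op.killed:
            op.killed = True
            killed.append(op.opnum)
    return killed
```

With one client building on `db.col`, one client with no current operation, and one client on another namespace, the function returns only the op number of the first client and skips the second instead of failing.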

addr2line output:

$ addr2line -e ./mongod -ifC 0x1219651 0x1218a2e 0x1218b1f 0x7f2172d84cb0 0x8dcd27 0xb999d5 0x948df7 0x94bf59 0xa31b8a 0xa33d21 0xa35468 0xd74992 0xbb2bd2 0xbb41f0 0x7728f8 0x11cedeb 0x7f2172d7ce9a 0x7f21720902ed
mongo::printStackTrace(std::ostream&)
/srv/10gen/mci-exec/mci/shell/src/src/mongo/util/stacktrace.cpp:304
abruptQuit
/srv/10gen/mci-exec/mci/shell/src/src/mongo/util/signal_handlers.cpp:107
abruptQuitWithAddrSignal
/srv/10gen/mci-exec/mci/shell/src/src/mongo/util/signal_handlers.cpp:201
??
??:0
mongo::IndexCatalog::killMatchingIndexBuilds(mongo::IndexCatalog::IndexKillCriteria const&)
/srv/10gen/mci-exec/mci/shell/src/src/mongo/db/curop.h:219
mongo::IndexBuilder::killMatchingIndexBuilds(mongo::Collection*, mongo::IndexCatalog::IndexKillCriteria const&)
/srv/10gen/mci-exec/mci/shell/src/src/mongo/db/index_builder.cpp:128
mongo::CmdDropIndexes::stopIndexBuilds(mongo::Database*, mongo::BSONObj const&)
/srv/10gen/mci-exec/mci/shell/src/src/mongo/db/commands/drop_indexes.cpp:88
~vector
/usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../include/c++/4.1.2/bits/stl_vector.h:272
mongo::CmdDropIndexes::run(std::string const&, mongo::BSONObj&, int, std::string&, mongo::BSONObjBuilder&, bool)
/srv/10gen/mci-exec/mci/shell/src/src/mongo/db/commands/drop_indexes.cpp:108
mongo::_execCommand(mongo::Command*, std::string const&, mongo::BSONObj&, int, std::string&, mongo::BSONObjBuilder&, bool)
/srv/10gen/mci-exec/mci/shell/src/src/mongo/db/dbcommands.cpp:1385
mongo::Command::execCommand(mongo::Command*, mongo::Client&, int, char const*, mongo::BSONObj&, mongo::BSONObjBuilder&, bool)
/srv/10gen/mci-exec/mci/shell/src/src/mongo/db/dbcommands.cpp:1650
mongo::_runCommands(char const*, mongo::BSONObj&, mongo::_BufBuilder<mongo::TrivialAllocator>&, mongo::BSONObjBuilder&, bool, int)
/srv/10gen/mci-exec/mci/shell/src/src/mongo/db/dbcommands.cpp:1722
mongo::newRunQuery(mongo::Message&, mongo::QueryMessage&, mongo::CurOp&, mongo::Message&)
/srv/10gen/mci-exec/mci/shell/src/src/mongo/db/query/new_find.cpp:442
receivedQuery
/srv/10gen/mci-exec/mci/shell/src/src/mongo/db/instance.cpp:269
mongo::assembleResponse(mongo::Message&, mongo::DbResponse&, mongo::HostAndPort const&)
/srv/10gen/mci-exec/mci/shell/src/src/mongo/db/instance.cpp:510
std::string::_M_rep() const
/usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../include/c++/4.1.2/bits/basic_string.h:283
~basic_string
/usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../include/c++/4.1.2/bits/basic_string.h:478
~HostAndPort
/srv/10gen/mci-exec/mci/shell/src/src/mongo/util/net/hostandport.h:31
mongo::MyMessageHandler::process(mongo::Message&, mongo::AbstractMessagingPort*, mongo::LastError*)
/srv/10gen/mci-exec/mci/shell/src/src/mongo/db/db.cpp:203
boost::shared_ptr<mongo::Socket>::operator->() const
/srv/10gen/mci-exec/mci/shell/src/src/third_party/boost/boost/smart_ptr/shared_ptr.hpp:424
mongo::PortMessageServer::handleIncomingMsg(void*)
/srv/10gen/mci-exec/mci/shell/src/src/mongo/util/net/message_server_port.cpp:210
??
??:0
??
??:0
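
The addr2line invocation above can be rebuilt from the logged Backtrace: line with a short script. This is a minimal sketch, assuming the log line has the same shape as the one in the description; the function name `addr2line_command` and the default binary path are illustrative choices. The output is only meaningful when run against the exact mongod binary that produced the trace, with its debug symbols available.

```python
import re
from typing import List


def addr2line_command(backtrace_line: str, binary: str = "./mongod") -> List[str]:
    """Build an addr2line argument list from a 'Backtrace:0x... 0x...' log line."""
    prefix = "Backtrace:"
    if not backtrace_line.startswith(prefix):
        raise ValueError("expected a line starting with 'Backtrace:'")
    # Collect every hexadecimal address that follows the prefix, in order.
    addresses = re.findall(r"0x[0-9a-fA-F]+", backtrace_line[len(prefix):])
    # -e names the binary, -i expands inlined frames, -f prints function
    # names, and -C demangles C++ symbols.
    return ["addr2line", "-e", binary, "-ifC", *addresses]


if __name__ == "__main__":
    line = "Backtrace:0x1219651 0x1218a2e 0x1218b1f"
    print(" ".join(addr2line_command(line)))
```

The argument list can be passed to `subprocess.run` as-is, which avoids shell quoting issues if the binary path contains spaces.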
