[SERVER-14342] Shard crashes on split operation if collection doesn't exist on that shard Created: 24/Jun/14  Updated: 04/Feb/15  Resolved: 17/Jul/14

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 2.4.10
Fix Version/s: 2.4.11

Type: Bug
Priority: Major - P3
Reporter: Ger Hartnett
Assignee: Randolph Tan
Resolution: Done
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Duplicate
duplicates SERVER-11178 Create IndexCatalog and remove Catalo... Closed
Related
Operating System: ALL
Participants:

 Description   
Issue Status as of Aug 21, 2014

ISSUE SUMMARY
On a sharded cluster, a shard may crash when a split operation (SplitChunkCommand) is issued on a namespace that does not exist on that shard. One way to reproduce this is to connect directly to a shard, drop a sharded collection, and then issue the split command on that collection through mongos.

Non-sharded systems are not affected by this issue.

USER IMPACT
The shard where the namespace doesn't exist writes a stack trace to its log and crashes.

WORKAROUNDS
N/A

AFFECTED VERSIONS
MongoDB 2.4 production releases up to 2.4.10 are affected by this issue.

FIX VERSION
The fix is included in the 2.4.11 production release.

RESOLUTION DETAILS
Do not perform any split operations if the namespace doesn't exist on a given shard.

Original description

Mon Jun 16 15:17:05.222 Invalid access at address: 0xbc from thread: conn48
Mon Jun 16 15:17:05.222 Got signal: 11 (Segmentation fault).
Mon Jun 16 15:17:05.227 Backtrace:
0xde8c31 0x6d0b19 0x6d10a2 0x7f1613e04030 0xce06dc 0x8e1f9a 0x8e2a1d 0x8e3f72 0xa89910 0xa8e1dc 0xa0182e 0xa02b73 0x6eb838 0xdd51fe 0x7f1613dfbb50 0x7f161319f0ed 
 /intucell/packages/mongodb-2.4.10/bin/mongod(_ZN5mongo15printStackTraceERSo+0x21) [0xde8c31]
 /intucell/packages/mongodb-2.4.10/bin/mongod(_ZN5mongo10abruptQuitEi+0x399) [0x6d0b19]
 /intucell/packages/mongodb-2.4.10/bin/mongod(_ZN5mongo24abruptQuitWithAddrSignalEiP7siginfoPv+0x262) [0x6d10a2]
 /lib/x86_64-linux-gnu/libpthread.so.0(+0xf030) [0x7f1613e04030]
 /intucell/packages/mongodb-2.4.10/bin/mongod(_ZN5mongo17SplitChunkCommand3runERKSsRNS_7BSONObjEiRSsRNS_14BSONObjBuilderEb+0x8e9c) [0xce06dc]
 /intucell/packages/mongodb-2.4.10/bin/mongod(_ZN5mongo12_execCommandEPNS_7CommandERKSsRNS_7BSONObjEiRSsRNS_14BSONObjBuilderEb+0x3a) [0x8e1f9a]
 /intucell/packages/mongodb-2.4.10/bin/mongod(_ZN5mongo7Command11execCommandEPS0_RNS_6ClientEiPKcRNS_7BSONObjERNS_14BSONObjBuilderEb+0x71d) [0x8e2a1d]
 /intucell/packages/mongodb-2.4.10/bin/mongod(_ZN5mongo12_runCommandsEPKcRNS_7BSONObjERNS_11_BufBuilderINS_16TrivialAllocatorEEERNS_14BSONObjBuilderEbi+0x5f2) [0x8e3f72]
 /intucell/packages/mongodb-2.4.10/bin/mongod(_ZN5mongo11runCommandsEPKcRNS_7BSONObjERNS_5CurOpERNS_11_BufBuilderINS_16TrivialAllocatorEEERNS_14BSONObjBuilderEbi+0x40) [0xa89910]
 /intucell/packages/mongodb-2.4.10/bin/mongod(_ZN5mongo8runQueryERNS_7MessageERNS_12QueryMessageERNS_5CurOpES1_+0xd7c) [0xa8e1dc]
 /intucell/packages/mongodb-2.4.10/bin/mongod() [0xa0182e]
 /intucell/packages/mongodb-2.4.10/bin/mongod(_ZN5mongo16assembleResponseERNS_7MessageERNS_10DbResponseERKNS_11HostAndPortE+0x393) [0xa02b73]
 /intucell/packages/mongodb-2.4.10/bin/mongod(_ZN5mongo16MyMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortEPNS_9LastErrorE+0x98) [0x6eb838]
 /intucell/packages/mongodb-2.4.10/bin/mongod(_ZN5mongo17PortMessageServer17handleIncomingMsgEPv+0x42e) [0xdd51fe]
 /lib/x86_64-linux-gnu/libpthread.so.0(+0x6b50) [0x7f1613dfbb50]
 /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d) [0x7f161319f0ed]
*** End of input ***

$ addr2line -i -e mongod 0xce06dc
/mntfast/data/slave/Linux_64bit_V2.4/mongo/src/mongo/db/namespace_details-inl.h:101
/mntfast/data/slave/Linux_64bit_V2.4/mongo/src/mongo/db/namespace_details.h:214
/mntfast/data/slave/Linux_64bit_V2.4/mongo/src/mongo/db/namespace_details-inl.h:72
/mntfast/data/slave/Linux_64bit_V2.4/mongo/src/mongo/s/d_split.cpp:807

d_split.cpp v 2.4.10

...
                    Client::ReadContext ctx( ns );
                    NamespaceDetails *d = nsdetails( ns );
 
807:                const IndexDetails *idx = d->findIndexByPrefix( keyPattern ,
                                                                    true ); /* exclude multikeys */
                    if ( idx == NULL ) {
                        break;
                    }

This can be triggered when the namespace does not exist on the shard at the time the split command is issued: nsdetails( ns ) then returns NULL, and line 807 dereferences it. One way to reproduce this is to connect to the shard, drop the collection, and then issue the split command through mongos.
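
The kind of guard described under RESOLUTION DETAILS can be sketched as below. This is a minimal, self-contained illustration and not the actual patch: `Catalog`, `lookupNamespace` and `checkSplitPreconditions` are hypothetical names, the `NamespaceDetails` and `IndexDetails` structs are simplified stand-ins for the server classes, key patterns are plain strings instead of BSON, and the error messages are made up for the example.

```cpp
#include <cassert>
#include <map>
#include <string>

// Simplified stand-ins for the server types (assumption: not the real classes).
struct IndexDetails {
    std::string keyPattern;  // a plain string here; the server uses BSON
};

struct NamespaceDetails {
    IndexDetails shardKeyIndex;

    // Returns the index whose key starts with keyPattern, or nullptr if none.
    const IndexDetails* findIndexByPrefix(const std::string& keyPattern) const {
        const bool matches =
            shardKeyIndex.keyPattern.compare(0, keyPattern.size(), keyPattern) == 0;
        return matches ? &shardKeyIndex : nullptr;
    }
};

// Toy catalog mapping a namespace name to its metadata (hypothetical).
using Catalog = std::map<std::string, NamespaceDetails>;

// Plays the role of nsdetails( ns ): nullptr when the namespace is unknown.
const NamespaceDetails* lookupNamespace(const Catalog& catalog, const std::string& ns) {
    auto it = catalog.find(ns);
    return it == catalog.end() ? nullptr : &it->second;
}

// Returns true if a split may proceed; otherwise sets errmsg and returns false.
bool checkSplitPreconditions(const Catalog& catalog, const std::string& ns,
                             const std::string& keyPattern, std::string& errmsg) {
    assert(!ns.empty());

    const NamespaceDetails* d = lookupNamespace(catalog, ns);
    if (d == nullptr) {
        // The guard: fail the command instead of dereferencing a null pointer.
        errmsg = "ns not found: " + ns;
        return false;
    }

    const IndexDetails* idx = d->findIndexByPrefix(keyPattern);
    if (idx == nullptr) {
        errmsg = "no index on shard key for " + ns;
        return false;
    }
    return true;
}
```

With a catalog containing only "test.foo", a call for "test.foo" succeeds, while a call for a dropped namespace returns false with an error message rather than crashing the process.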



 Comments   
Comment by Githook User [ 17/Jul/14 ]

Author: Randolph Tan (renctan) <randolph@10gen.com>

Message: SERVER-14342 Invalid access: seg fault in SplitChunkCommand::run
Branch: v2.4
https://github.com/mongodb/mongo/commit/4ad6d96e91f4a1d5a22ae64547a9129daf807285

Comment by Randolph Tan [ 27/Jun/14 ]

SERVER-11178 changed the code to use the Collection class instead and added a check to verify that it's not NULL.
