- Type: Bug
- Resolution: Done
- Priority: Critical - P2
- Affects Version/s: None
- Component/s: None
- Linux
I'm using MongoDB 2.0.7 with two shards, each a three-member replica set.
All crashes occurred while draining one of the shards.
Several times during the last week I experienced the following mongos crashes:
Sun Sep 30 15:23:34 [conn271] ns: musicgroup.entity could not initialize cursor across all shards because : stale config detected for ns: musicgroup.entity ParallelCursor::_init @ mongodb-sh1/<...> attempt: 0
Sun Sep 30 15:23:34 [mongosMain] connection accepted from 192.168.18.6:54797 #447
Sun Sep 30 15:23:34 [conn447] authenticate: { authenticate: 1, user: "rating", nonce: "7642e5d689939a9c", key: "2ee0b88e7572c5fac957830fa727075d" }
Sun Sep 30 15:23:36 [conn230] ns: rating.entity could not initialize cursor across all shards because : stale config detected for ns: rating.entity ParallelCursor::_init @ mongodb-sh1/<...> attempt: 0
Sun Sep 30 15:23:40 [conn380] ns: rating.entity could not initialize cursor across all shards because : stale config detected for ns: rating.entity ParallelCursor::_init @ mongodb-sh1/<...> attempt: 0
Sun Sep 30 15:23:41 [mongosMain] connection accepted from 192.168.18.6:54904 #448
Sun Sep 30 15:23:41 [conn448] authenticate: { authenticate: 1, user: "rating", nonce: "83c1b541f74be683", key: "510ff1afd1d2cf9d41128135fdeccd43" }
Sun Sep 30 15:23:41 [conn448] getaddrinfo("host1f.load.net") failed: Name or service not known
Sun Sep 30 15:23:41 [conn448] DBException in process: could not initialize cursor across all shards because : socket exception @ mongodb-sh1/<...>
Sun Sep 30 15:23:41 [conn448] getaddrinfo("host1f.load.net") failed: Name or service not known
Sun Sep 30 15:23:41 [conn448] DBException in process: could not initialize cursor across all shards because : socket exception @ mongodb-sh1/<...>
Sun Sep 30 15:23:41 [conn448] getaddrinfo("host1f.load.net") failed: Name or service not known
Sun Sep 30 15:23:41 [conn448] DBException in process: could not initialize cursor across all shards because : socket exception @ mongodb-sh1/<...>
Sun Sep 30 15:23:41 [conn448] getaddrinfo("host1f.load.net") failed: Name or service not known
Sun Sep 30 15:23:41 [conn448] DBException in process: could not initialize cursor across all shards because : socket exception @ mongodb-sh1/<...>
Sun Sep 30 15:23:41 [conn327] getaddrinfo("host1f.load.net") failed: Name or service not known
Sun Sep 30 15:23:41 [conn327] warning: could not get last error from a shard host1g.load.net :: caused by :: socket exception
Received signal 6
Backtrace: 0x528c54 0x7ff4a8b77af0 0x7ff4a8b77a75 0x7ff4a8b7b5c0 0x7ff4a8b70941 0x730908 0x712fa5 0x700860 0x72ad97 0x73c8d1 0x5aad20 0x7ff4a9ef89ca 0x7ff4a8c2a70d
mongos(_ZN5mongo17printStackAndExitEi+0x64)[0x528c54]
/lib/libc.so.6(+0x33af0)[0x7ff4a8b77af0]
/lib/libc.so.6(gsignal+0x35)[0x7ff4a8b77a75]
/lib/libc.so.6(abort+0x180)[0x7ff4a8b7b5c0]
/lib/libc.so.6(__assert_fail+0xf1)[0x7ff4a8b70941]
mongos(_ZN5mongo10ClientInfo12getLastErrorERKNS_7BSONObjERNS_14BSONObjBuilderEb+0x3898)[0x730908]
mongos(_ZN5mongo7Command20runAgainstRegisteredEPKcRNS_7BSONObjERNS_14BSONObjBuilderEi+0x805)[0x712fa5]
mongos(_ZN5mongo14SingleStrategy7queryOpERNS_7RequestE+0x4d0)[0x700860]
mongos(_ZN5mongo7Request7processEi+0x157)[0x72ad97]
mongos(_ZN5mongo21ShardedMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortEPNS_9LastErrorE+0x71)[0x73c8d1]
mongos(_ZN5mongo3pms9threadRunEPNS_13MessagingPortE+0x260)[0x5aad20]
/lib/libpthread.so.0(+0x69ca)[0x7ff4a9ef89ca]
/lib/libc.so.6(clone+0x6d)[0x7ff4a8c2a70d]
*********************
Wed Oct 3 18:28:32 [conn273] creating new connection to:host1d.load.net:27017
Wed Oct 3 18:28:32 [conn273] getaddrinfo("host1d.load.net") failed: Name or service not known
Wed Oct 3 18:28:32 [conn273] warning: could not get last error from a shard host1d.load.net:27017 :: caused by :: socket exception
Wed Oct 3 18:28:32 [conn393] Request::process ns: chords.entity msg id:169529 attempt: 0
Wed Oct 3 18:28:32 [conn393] CursorCache::get id: 4966267460045049647
Wed Oct 3 18:28:32 [conn446] Request::process ns: admin.$cmd msg id:183399 attempt: 0
Wed Oct 3 18:28:32 [conn446] single query: admin.$cmd { ismaster: 1 } ntoreturn: -1 options : 0
Received signal 6
Backtrace: 0x528c54 0x7fed6cc0faf0 0x7fed6cc0fa75 0x7fed6cc135c0 0x7fed6cc08941 0x730908 0x712fa5 0x700860 0x72ad97 0x73c8d1 0x5aad20 0x7fed6df909ca 0x7fed6ccc270d
Wed Oct 3 18:28:32 [conn446] Request::process ns: admin.$cmd msg id:183400 attempt: 0
Wed Oct 3 18:28:32 [conn446] single query: admin.$cmd { ismaster: 1 } ntoreturn: -1 options : 0
mongos(_ZN5mongo17printStackAndExitEi+0x64)[0x528c54]
/lib/libc.so.6(+0x33af0)[0x7fed6cc0faf0]
/lib/libc.so.6(gsignal+0x35)[0x7fed6cc0fa75]
/lib/libc.so.6(abort+0x180)[0x7fed6cc135c0]
/lib/libc.so.6(__assert_fail+0xf1)[0x7fed6cc08941]
mongos(_ZN5mongo10ClientInfo12getLastErrorERKNS_7BSONObjERNS_14BSONObjBuilderEb+0x3898)[0x730908]
mongos(_ZN5mongo7Command20runAgainstRegisteredEPKcRNS_7BSONObjERNS_14BSONObjBuilderEi+0x805)[0x712fa5]
mongos(_ZN5mongo14SingleStrategy7queryOpERNS_7RequestE+0x4d0)[0x700860]
mongos(_ZN5mongo7Request7processEi+0x157)[0x72ad97]
mongos(_ZN5mongo21ShardedMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortEPNS_9LastErrorE+0x71)[0x73c8d1]
mongos(_ZN5mongo3pms9threadRunEPNS_13MessagingPortE+0x260)[0x5aad20]
/lib/libpthread.so.0(+0x69ca)[0x7fed6df909ca]
/lib/libc.so.6(clone+0x6d)[0x7fed6ccc270d]
=== CursorCache at shutdown - sharded: 2 passthrough: 1900
**********************
Thu Oct 4 21:31:09 [conn131] reconnect host2f.load.net:27017 ok
Thu Oct 4 21:31:09 [conn249] getaddrinfo("host2f.load.net") failed: Name or service not known
Thu Oct 4 21:31:09 [conn249] warning: could not get last error from a shard host2f.load.net:27017 :: caused by :: socket exception
Received signal 6
Backtrace: 0x528c54 0x7f652e6d9af0 0x7f652e6d9a75 0x7f652e6dd5c0 0x7f652e6d2941 0x730908 0x712fa5 0x700860 0x72ad97 0x73c8d1 0x5aad20 0x7f652fa5a9ca 0x7f652e78c70d
mongos(_ZN5mongo17printStackAndExitEi+0x64)[0x528c54]
/lib/libc.so.6(+0x33af0)[0x7f652e6d9af0]
/lib/libc.so.6(gsignal+0x35)[0x7f652e6d9a75]
/lib/libc.so.6(abort+0x180)[0x7f652e6dd5c0]
/lib/libc.so.6(__assert_fail+0xf1)[0x7f652e6d2941]
mongos(_ZN5mongo10ClientInfo12getLastErrorERKNS_7BSONObjERNS_14BSONObjBuilderEb+0x3898)[0x730908]
mongos(_ZN5mongo7Command20runAgainstRegisteredEPKcRNS_7BSONObjERNS_14BSONObjBuilderEi+0x805)[0x712fa5]
mongos(_ZN5mongo14SingleStrategy7queryOpERNS_7RequestE+0x4d0)[0x700860]
mongos(_ZN5mongo7Request7processEi+0x157)[0x72ad97]
mongos(_ZN5mongo21ShardedMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortEPNS_9LastErrorE+0x71)[0x73c8d1]
mongos(_ZN5mongo3pms9threadRunEPNS_13MessagingPortE+0x260)[0x5aad20]
/lib/libpthread.so.0(+0x69ca)[0x7f652fa5a9ca]
/lib/libc.so.6(clone+0x6d)[0x7f652e78c70d]
=== Logstream::get called in uninitialized state
Thu Oct 4 21:31:09 [conn188] Assertion: 13548:BufBuilder grow() > 64MB
0x512279 0x4b4bfc 0x4b64bb 0x55942a 0x559ec9 0x55a247 0x55a41e 0x75cc98 0x75e408 0x5878dc 0x6f2d38 0x6f7b18 0x6f86c3 0x72ad0d 0x73c8d1 0x5aad20 0x7f652fa5a9ca 0x7f652e78c70d
CursorCache at shutdown - sharded: 2 passthrough: 967
Received signal 11
Backtrace: 0x528c54 0x7f652e6d9af0 0x4fa867 0x4eb598 0x4eeafc 0x4ef270 0x4f3be9 0x4f4638 0x6df524 0x6f8d29 0x72ad97 0x73c8d1 0x5aad20 0x7f652fa5a9ca 0x7f652e78c70d
mongos(_ZN5mongo17printStackAndExitEi+0x64)[0x528c54]
/lib/libc.so.6(+0x33af0)[0x7f652e6d9af0]
Received signal 11
Backtrace: 0x528c54 0x7f652e6d9af0 0x54b4e6 0x5610c6 0x562e90 0x54048a 0x55102a 0x54abc0 0x54b295 0x550b4e 0x550d01 0x5495b0 0x54a97a 0x6e6877 0x5380cd 0x539984 0x53a45b 0x560f59 0x563474 0x5acb89
mongos(_ZN5mongo17printStackAndExitEi+0x64)[0x528c54]
/lib/libc.so.6(+0x33af0)[0x7f652e6d9af0]
mongos(_ZNK5mongo10FieldRange10nontrivialEv+0x87)[0x4fa867]
mongos(_ZN5mongo10FieldRangeaNERKS0_+0x1b8)[0x4eb598]
mongos(_ZN5mongo13FieldRangeSet16processOpElementEPKcRKNS_11BSONElementEbb+0x27c)[0x4eeafc]
mongos(_ZN5mongo13FieldRangeSet17processQueryFieldERKNS_11BSONElementEb+0x220)[0x4ef270]
mongos(_ZN5mongo13FieldRangeSetC1EPKcRKNS_7BSONObjEbb+0x179)[0x4f3be9]
mongos(_ZN5mongo16OrRangeGeneratorC1EPKcRKNS_7BSONObjEb+0x48)[0x4f4638]
mongos(_ZNK5mongo12ChunkManager17getShardsForQueryERSt3setINS_5ShardESt4lessIS2_ESaIS2_EERKNS_7BSONObjE+0x34)[0x6df524]
mongos(_ZN5mongo13ShardStrategy7queryOpERNS_7RequestE+0x559)[0x6f8d29]
mongos(_ZN5mongo7Request7processEi+0x157)[0x72ad97]
mongos(_ZN5mongo21ShardedMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortEPNS_9LastErrorE+0x71)[0x73c8d1]
mongos(_ZN5mongo3pms9threadRunEPNS_13MessagingPortE+0x260)[0x5aad20]
/lib/libpthread.so.0(+0x69ca)[0x7f652fa5a9ca]
/lib/libc.so.6(clone+0x6d)[0x7f652e78c70d]
All mongod instances are up, and their logs show nothing unusual.
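Each signal 6 abort in the logs is preceded by a `getaddrinfo(...) failed` line, so it may help to tally which hostnames mongos could not resolve. The following is a minimal sketch, assuming the mongos log has one entry per line; the helper name `failed_lookups` and the `sample` text are illustrative, not part of mongos or any MongoDB tool.

```python
import re
from collections import Counter

# Matches mongos log lines such as:
#   Sun Sep 30 15:23:41 [conn448] getaddrinfo("host1f.load.net") failed: Name or service not known
GETADDRINFO_FAILED = re.compile(r'\[\w+\] getaddrinfo\("(?P<host>[^"]+)"\) failed')


def failed_lookups(log_text):
    """Count getaddrinfo failures per hostname (hypothetical helper)."""
    counts = Counter()
    for line in log_text.splitlines():
        match = GETADDRINFO_FAILED.search(line)
        if match:
            counts[match.group("host")] += 1
    return counts


# A few lines copied from the crash logs, used here only as sample input.
sample = '''Sun Sep 30 15:23:41 [conn448] getaddrinfo("host1f.load.net") failed: Name or service not known
Sun Sep 30 15:23:41 [conn327] getaddrinfo("host1f.load.net") failed: Name or service not known
Thu Oct 4 21:31:09 [conn249] getaddrinfo("host2f.load.net") failed: Name or service not known'''

if __name__ == "__main__":
    for host, count in failed_lookups(sample).most_common():
        print(host, count)
```

Running the same function over a full mongos log file (read with `open(path).read()`) would show whether the unresolved names all belong to the shard being drained.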