-
Type:
Bug
-
Resolution: Done
-
Priority:
Critical - P2
-
None
-
Affects Version/s: 1.6.4
-
Component/s: Stability
-
None
-
Environment:CentOS release 5.4 (Final) , mongodb-linux-x86_64-1.6.4.tgz
-
Linux
-
None
-
0
-
None
-
None
-
None
-
None
-
None
-
None
This is a Mongo cluster consists of 3 machines running one configsvr plus one mongos each and other 6 machines running one primary shardsvr plus one secondary shardsvr each. All shard mongods and most of configsvrs(2/3) shutted down last night.
This sentence appears frequently among the logs: 'got kill or ctrl c or hup signal 1 (Hangup), will terminate after current cmd ends', while i'm sure that no one gave out any shutting down instruction ('cause i am currently the only user of this cluster, and i've just left my python scripts running on the same machines of the configsvrs loading some data into the cluster before i got to bed )
Survivors:
configsvr0 (strange?)
mongos0
mongos1
mongos2
all arbiters(arbiter0 ~ arbiter5)
Cluster construction:
10.210.208.10
configsvr0 at 10.210.208.10:59900
mongos0 at 10.210.208.10:58800
10.210.208.11
configsvr1 at 10.210.208.11:59901
mongos1 at 10.210.208.11:58801
10.210.208.12
configsvr1 at 10.210.208.12:59902
mongos2 at 10.210.208.12:58802
arbiter0 at 10.210.208.12:50009
arbiter1 at 10.210.208.12:50019
arbiter2 at 10.210.208.12:50029
arbiter3 at 10.210.208.12:50039
arbiter4 at 10.210.208.12:50049
arbiter5 at 10.210.208.12:50059
10.210.208.23
shardsvr0 pri at 10.210.208.23:50000
shardsvr1 sec at 10.210.208.23:50011
10.210.208.24
shardsvr1 pri at 10.210.208.24:50010
shardsvr0 sec at 10.210.208.24:50001
10.210.208.25
shardsvr2 pri at 10.210.208.25:50020
shardsvr3 pri at 10.210.208.25:50031
10.210.208.27
shardsvr3 pri at 10.210.208.27:50030
shardsvr2 sec at 10.210.208.27:50021
10.210.208.28
shardsvr4 pri at 10.210.208.28:50040
shardsvr5 sec at 10.210.208.28:50051
10.210.208.29
shardsvr5 pri at 10.210.208.29:50050
shardsvr4 sec at 10.210.208.29:50041
Logs around the shutting down:
[configsvr0]
Mon Dec 13 00:17:19 [conn26] CMD fsync: sync:1 lock:0
Mon Dec 13 00:17:19 [conn26] fsync from getlasterror
Mon Dec 13 00:17:20 [conn13] CMD fsync: sync:1 lock:0
Mon Dec 13 00:17:20 [conn13] fsync from getlasterror
Mon Dec 13 00:17:20 [conn13] CMD fsync: sync:1 lock:0
Mon Dec 13 00:17:20 [conn13] fsync from getlasterror
Mon Dec 13 00:17:25 [conn9] end connection 10.210.208.23:52061
Mon Dec 13 00:17:25 [conn11] end connection 10.210.208.25:36326
Mon Dec 13 00:17:25 [conn19] end connection 10.210.208.29:59391
Mon Dec 13 00:17:25 [conn37] end connection 10.210.208.29:41741
Mon Dec 13 00:17:25 [conn12] end connection 10.210.208.27:50387
Mon Dec 13 00:17:25 [conn16] end connection 10.210.208.27:50390
Mon Dec 13 00:17:25 [conn13] end connection 10.210.208.28:54518
Mon Dec 13 00:17:25 [conn30] end connection 10.210.208.28:41073
Mon Dec 13 00:17:28 [conn10] end connection 10.210.208.24:34116
Mon Dec 13 00:17:41 [conn38] end connection 10.210.208.12:60115
Mon Dec 13 00:17:41 [conn29] end connection 10.210.208.10:39314
Mon Dec 13 00:17:41 [conn26] end connection 10.210.208.11:52043
Mon Dec 13 00:17:43 [conn39] CMD fsync: sync:1 lock:0
Mon Dec 13 00:17:43 [conn39] end connection 10.210.208.12:56668
Mon Dec 13 00:17:48 [conn35] CMD fsync: sync:1 lock:0
Mon Dec 13 00:17:48 [conn35] end connection 10.210.208.10:47180
Mon Dec 13 00:17:49 [conn31] CMD fsync: sync:1 lock:0
Mon Dec 13 00:17:50 [conn31] end connection 10.210.208.11:54136
Mon Dec 13 00:18:11 [initandlisten] connection accepted from 10.210.208.10:52738 #40
Mon Dec 13 00:18:11 [initandlisten] connection accepted from 10.210.208.12:56015 #41
[mongos0]
Sun Dec 12 23:47:31 connection accepted from 127.0.0.1:39476 #40
Sun Dec 12 23:47:31 [conn40] end connection 127.0.0.1:39476
Sun Dec 12 23:50:49 connection accepted from 127.0.0.1:39477 #41
Sun Dec 12 23:50:49 [conn41] end connection 127.0.0.1:39477
Mon Dec 13 00:17:41 [Balancer] ~ScopedDBConnection: _conn != null
Mon Dec 13 00:17:41 [Balancer] caught exception while doing balance: DBClientBase::findOne: transport error
Mon Dec 13 00:17:48 [LockPinger] ~ScopedDBConnection: _conn != null
Mon Dec 13 00:17:48 [LockPinger] warning: couldn't ping: SyncClusterConnection::udpate prepare failed: DBClientBase::findOne: transport error10.210.208.11:59901:{}DBClientBase::findOne: transport error10.210.208.12:59902:{}
Mon Dec 13 00:18:11 [Balancer] SyncClusterConnection connecting to [10.210.208.10:59900]
Mon Dec 13 00:18:11 [Balancer] SyncClusterConnection connecting to [10.210.208.11:59901]
Mon Dec 13 00:18:11 [Balancer] SyncClusterConnection connect fail to: 10.210.208.11:59901 errmsg: couldn't connect to server 10.210.208.11:59901
Mon Dec 13 00:18:11 [Balancer] SyncClusterConnection connecting to [10.210.208.12:59902]
Mon Dec 13 00:18:11 [Balancer] SyncClusterConnection connect fail to: 10.210.208.12:59902 errmsg: couldn't connect to server 10.210.208.12:59902
Mon Dec 13 00:18:11 [Balancer] trying reconnect to 10.210.208.11:59901
Mon Dec 13 00:18:11 [Balancer] reconnect 10.210.208.11:59901 failed couldn't connect to server 10.210.208.11:59901
Mon Dec 13 00:18:11 [Balancer] MessagingPort say send() errno:9 Bad file descriptor 10.210.208.11:59901
Mon Dec 13 00:18:11 [Balancer] trying reconnect to 10.210.208.12:59902
Mon Dec 13 00:18:11 [Balancer] reconnect 10.210.208.12:59902 failed couldn't connect to server 10.210.208.12:59902
Mon Dec 13 00:18:11 [Balancer] MessagingPort say send() errno:9 Bad file descriptor 10.210.208.12:59902
Mon Dec 13 00:18:11 [Balancer] ~ScopedDBConnection: _conn != null
Mon Dec 13 00:18:11 [Balancer] caught exception while doing balance: DBClientBase::findOne: transport error
Mon Dec 13 00:18:13 [conn37] DBException in process: DBClientBase::findOne: transport error
Mon Dec 13 00:18:13 [conn37] DBException in process: DBClientBase::findOne: transport error
Mon Dec 13 00:18:13 [conn37] DBException in process: DBClientBase::findOne: transport error
Mon Dec 13 00:18:13 [conn37] DBException in process: DBClientBase::findOne: transport error
Mon Dec 13 00:18:13 [conn37] DBException in process: dbconnectionpool: connect failed s003/10.210.208.27:50030,10.210.208.25:50031 : connect failed to set s003/10.210.208.27:50030,10.210.208.25:50031
[configsvr1]
Mon Dec 13 00:17:19 [conn18] CMD fsync: sync:1 lock:0
Mon Dec 13 00:17:19 [conn18] fsync from getlasterror
Mon Dec 13 00:17:19 [conn18] CMD fsync: sync:1 lock:0
Mon Dec 13 00:17:19 [conn18] fsync from getlasterror
Mon Dec 13 00:17:25 [conn24] end connection 10.210.208.29:33776
Mon Dec 13 00:17:25 [conn41] end connection 10.210.208.29:38668
Mon Dec 13 00:17:25 [conn21] end connection 10.210.208.27:52271
Mon Dec 13 00:17:25 [conn18] end connection 10.210.208.28:60717
Mon Dec 13 00:17:25 [conn34] end connection 10.210.208.28:43880
Mon Dec 13 00:17:25 [conn17] end connection 10.210.208.27:52268
Mon Dec 13 00:17:25 [conn14] end connection 10.210.208.23:47991
Mon Dec 13 00:17:25 [conn16] end connection 10.210.208.25:48129
Mon Dec 13 00:17:25 got kill or ctrl c or hup signal 1 (Hangup), will terminate after current cmd ends
Mon Dec 13 00:17:25 [interruptThread] now exiting
Mon Dec 13 00:17:25 dbexit:
Mon Dec 13 00:17:25 [interruptThread] shutdown: going to close listening sockets...
Mon Dec 13 00:17:25 [interruptThread] closing listening socket: 6
Mon Dec 13 00:17:25 [interruptThread] closing listening socket: 7
Mon Dec 13 00:17:25 [interruptThread] closing listening socket: 8
Mon Dec 13 00:17:25 [interruptThread] closing listening socket: 9
Mon Dec 13 00:17:25 [interruptThread] shutdown: going to flush oplog...
Mon Dec 13 00:17:25 flushing op log and files
Mon Dec 13 00:17:25 [interruptThread] shutdown: going to close sockets...
Mon Dec 13 00:17:25 [interruptThread] shutdown: waiting for fs preallocator...
Mon Dec 13 00:17:25 [interruptThread] shutdown: closing all files...
Mon Dec 13 00:17:25 closeAllFiles() finished
Mon Dec 13 00:17:25 [interruptThread] shutdown: removing fs lock...
Mon Dec 13 00:17:25 dbexit: really exiting now
[mongos1]
Mon Dec 13 00:15:19 [conn214] autosplitting wb.message size: 209715562 shard: ns:wb.message at: s003:s003/10.210.208.27:50030,10.210.208.25:50031 lastmod: 19|35 min:
max:
{ _id: "221101202741529267" }on:
{ _id: "221101201854330112" }(splitThreshold 209715200)
Mon Dec 13 00:15:19 [conn214] config change: { _id: "xxxxxx-2010-12-12T16:15:19-201", server: "xxxxxx", time: new Date(1292170519946), what: "split", ns: "wb.message", details: { before: { min:
, max:
{ _id: "221101202741529267" }}, left: { min:
{ _id: "221101201853081284" }, max:
{ _id: "221101201854330112" }}, right: { min:
{ _id: "221101201854330112" }, max:
{ _id: "221101202741529267" } } } }
Mon Dec 13 00:17:25 [conn214] MessagingPort recv() errno:104 Connection reset by peer 10.210.208.27:50030
Mon Dec 13 00:17:25 [conn214] SocketException: 9001 socket exception
Mon Dec 13 00:17:25 [conn214] DBException in process: DBClientBase::findOne: transport error
Mon Dec 13 00:17:25 [conn214] DBException in process: DBClientBase::findOne: transport error
Mon Dec 13 00:17:25 [conn216] end connection 127.0.0.1:47942
Mon Dec 13 00:17:25 ERROR: couldn't unsert sharding : DBClientBase::findOne: transport error
Mon Dec 13 00:17:25 ERROR: couldn't unsert sharding : DBClientBase::findOne: transport error
Mon Dec 13 00:17:25 ERROR: couldn't unsert sharding : DBClientBase::findOne: transport error
Mon Dec 13 00:17:25 ERROR: couldn't unsert sharding : DBClientBase::findOne: transport error
Mon Dec 13 00:17:25 ERROR: couldn't unsert sharding : DBClientBase::findOne: transport error
Mon Dec 13 00:17:25 [conn214] DBException in process: DBClientBase::findOne: transport error
Mon Dec 13 00:17:25 [conn214] DBException in process: DBClientBase::findOne: transport error
Mon Dec 13 00:17:25 [conn214] DBException in process: DBClientBase::findOne: transport error
Mon Dec 13 00:17:25 [conn214] DBException in process: DBClientBase::findOne: transport error
Mon Dec 13 00:17:25 [conn214] DBException in process: DBClientBase::findOne: transport error
Mon Dec 13 00:17:25 [conn214] DBException in process: DBClientBase::findOne: transport error
Mon Dec 13 00:17:25 [conn214] DBException in process: DBClientBase::findOne: transport error
Mon Dec 13 00:17:25 [conn214] DBException in process: dbconnectionpool: connect failed s003/10.210.208.27:50030,10.210.208.25:50031 : connect failed to set s003/10.210.208.27:50030,10.210.208.25:50031
Mon Dec 13 00:17:25 [conn214] DBException in process: dbconnectionpool: connect failed s003/10.210.208.27:50030,10.210.208.25:50031 : connect failed to set s003/10.210.208.27:50030,10.210.208.25:50031
Mon Dec 13 00:17:25 [conn214] trying reconnect to 10.210.208.23:50000
Mon Dec 13 00:17:25 [conn214] reconnect 10.210.208.23:50000 failed couldn't connect to server 10.210.208.23:50000
Mon Dec 13 00:17:25 [conn214] MessagingPort say send() errno:9 Bad file descriptor 10.210.208.23:50000
Mon Dec 13 00:17:28 [conn214] trying reconnect to 10.210.208.23:50000
Mon Dec 13 00:17:28 [conn214] reconnect 10.210.208.23:50000 failed couldn't connect to server 10.210.208.23:50000
[configsvr2]
Mon Dec 13 00:17:21 [conn11] CMD fsync: sync:1 lock:0
Mon Dec 13 00:17:21 [conn11] fsync from getlasterror
Mon Dec 13 00:17:21 [conn11] CMD fsync: sync:1 lock:0
Mon Dec 13 00:17:21 [conn11] fsync from getlasterror
Mon Dec 13 00:17:26 [conn7] end connection 10.210.208.23:57380
Mon Dec 13 00:17:26 [conn9] end connection 10.210.208.25:56757
Mon Dec 13 00:17:26 [conn17] end connection 10.210.208.29:36447
Mon Dec 13 00:17:26 [conn34] end connection 10.210.208.29:56668
Mon Dec 13 00:17:26 [conn10] end connection 10.210.208.27:48560
Mon Dec 13 00:17:26 [conn14] end connection 10.210.208.27:48564
Mon Dec 13 00:17:26 [conn27] end connection 10.210.208.28:58957
Mon Dec 13 00:17:26 [conn11] end connection 10.210.208.28:54975
Mon Dec 13 00:17:27 got kill or ctrl c or hup signal 1 (Hangup), will terminate after current cmd ends
Mon Dec 13 00:17:27 [interruptThread] now exiting
Mon Dec 13 00:17:27 dbexit:
Mon Dec 13 00:17:27 [interruptThread] shutdown: going to close listening sockets...
Mon Dec 13 00:17:27 [interruptThread] closing listening socket: 6
Mon Dec 13 00:17:27 [interruptThread] closing listening socket: 7
Mon Dec 13 00:17:27 [interruptThread] closing listening socket: 8
Mon Dec 13 00:17:27 [interruptThread] closing listening socket: 9
Mon Dec 13 00:17:27 [interruptThread] shutdown: going to flush oplog...
Mon Dec 13 00:17:27 flushing op log and files
Mon Dec 13 00:17:27 [interruptThread] shutdown: going to close sockets...
Mon Dec 13 00:17:27 [interruptThread] shutdown: waiting for fs preallocator...
Mon Dec 13 00:17:27 [interruptThread] shutdown: closing all files...
Mon Dec 13 00:17:27 closeAllFiles() finished
Mon Dec 13 00:17:27 [interruptThread] shutdown: removing fs lock...
Mon Dec 13 00:17:27 dbexit: really exiting now
[mongos2]
Sun Dec 12 22:48:47 [conn33] ~ScopedDBConnection: _conn != null
Sun Dec 12 22:48:47 [conn33] ERROR: splitIfShould failed: write $cmd failed on a shard: { "got" : { "id" : "wb.message-_id\"221101202746363817\"", "lastmod" :
, "ns" : "wb.message", "min" :
{ "_id" : "221101202746363817" }, "max" :
{ "_id" : "221101202856575202" }, "shard" : "s003" }, "whatFailed" : { "ns" : "config.chunks", "q" : { "query" :
{ "ns" : "wb.message" }, "orderby" :
{ "lastmod" : -1 }}, "res" : { "lastmod" :
{ "t" : 19000 , "i" : 17 } } }, "errmsg" : "pre-condition failed", "ok" : 0 } 10.210.208.10:59900
Sun Dec 12 22:49:01 [Balancer] SyncClusterConnection connecting to [10.210.208.10:59900]
Sun Dec 12 22:49:01 [Balancer] SyncClusterConnection connecting to [10.210.208.11:59901]
Sun Dec 12 22:49:01 [Balancer] SyncClusterConnection connecting to [10.210.208.12:59902]
Mon Dec 13 00:17:42 [Balancer] ~ScopedDBConnection: _conn != null
Mon Dec 13 00:17:42 [Balancer] caught exception while doing balance: DBClientBase::findOne: transport error
Mon Dec 13 00:17:44 [LockPinger] ~ScopedDBConnection: _conn != null
Mon Dec 13 00:17:44 [LockPinger] warning: couldn't ping: SyncClusterConnection::udpate prepare failed: DBClientBase::findOne: transport error10.210.208.11:59901:{}DBClientBase::findOne: transport error10.210.208.12:59902:{}
Mon Dec 13 00:17:50 [conn33] DBException in process: DBClientBase::findOne: transport error
Mon Dec 13 00:17:50 [conn33] DBException in process: DBClientBase::findOne: transport error
Mon Dec 13 00:17:50 [conn33] DBException in process: DBClientBase::findOne: transport error
Mon Dec 13 00:17:50 [conn33] DBException in process: DBClientBase::findOne: transport error
Mon Dec 13 00:17:50 [conn33] DBException in process: dbconnectionpool: connect failed s004/10.210.208.28:50040,10.210.208.29:50041 : connect failed to set s004/10.210.208.28:50040,10.210.208.29:50041
Mon Dec 13 00:17:50 [conn33] DBException in process: dbconnectionpool: connect failed s004/10.210.208.28:50040,10.210.208.29:50041 : connect failed to set s004/10.210.208.28:50040,10.210.208.29:50041
Mon Dec 13 00:17:50 [conn33] DBException in process: dbconnectionpool: connect failed s004/10.210.208.28:50040,10.210.208.29:50041 : connect failed to set s004/10.210.208.28:50040,10.210.208.29:50041
Mon Dec 13 00:17:50 [conn33] DBException in process: dbconnectionpool: connect failed s004/10.210.208.28:50040,10.210.208.29:50041 : connect failed to set s004/10.210.208.28:50040,10.210.208.29:50041
Mon Dec 13 00:17:50 [conn33] DBException in process: dbconnectionpool: connect failed s004/10.210.208.28:50040,10.210.208.29:50041 : connect failed to set s004/10.210.208.28:50040,10.210.208.29:50041
[arbiter0]
Sun Dec 12 12:05:48 [ReplSetHealthPollTask] replSet 10.210.208.24:50001 STARTUP2
Sun Dec 12 12:05:50 [ReplSetHealthPollTask] replSet 10.210.208.24:50001 RECOVERING
Sun Dec 12 12:06:32 [ReplSetHealthPollTask] replSet 10.210.208.24:50001 SECONDARY
Mon Dec 13 00:17:26 [conn7] end connection 10.210.208.23:55982
Mon Dec 13 00:17:27 [ReplSetHealthPollTask] replSet info 10.210.208.23:50000 is now down (or slow to respond)
Mon Dec 13 00:17:29 [conn8] replSet info voting yea for 1
Mon Dec 13 00:17:29 [conn8] end connection 10.210.208.24:59841
Mon Dec 13 00:17:30 [ReplSetHealthPollTask] replSet info 10.210.208.24:50001 is now down (or slow to respond)
[arbiter1]
Sun Dec 12 12:05:40 [ReplSetHealthPollTask] replSet 10.210.208.23:50011 STARTUP2
Sun Dec 12 12:05:40 [initandlisten] connection accepted from 10.210.208.23:46807 #7
Sun Dec 12 12:05:42 [ReplSetHealthPollTask] replSet 10.210.208.23:50011 RECOVERING
Sun Dec 12 12:06:00 [ReplSetHealthPollTask] replSet 10.210.208.23:50011 SECONDARY
Mon Dec 13 00:17:26 [conn7] end connection 10.210.208.23:46807
Mon Dec 13 00:17:28 [ReplSetHealthPollTask] replSet info 10.210.208.23:50011 is now down (or slow to respond)
Mon Dec 13 00:17:29 [conn6] end connection 10.210.208.24:59102
Mon Dec 13 00:17:31 [ReplSetHealthPollTask] replSet info 10.210.208.24:50010 is now down (or slow to respond)
[arbiter2]
Sun Dec 12 12:06:05 [ReplSetHealthPollTask] replSet info 10.210.208.27:50021 is now up
Sun Dec 12 12:06:05 [ReplSetHealthPollTask] replSet 10.210.208.27:50021 STARTUP2
Sun Dec 12 12:06:05 [initandlisten] connection accepted from 10.210.208.27:35212 #7
Sun Dec 12 12:06:07 [ReplSetHealthPollTask] replSet 10.210.208.27:50021 SECONDARY
Mon Dec 13 00:17:26 [conn6] end connection 10.210.208.25:40042
Mon Dec 13 00:17:26 [conn7] end connection 10.210.208.27:35212
Mon Dec 13 00:17:27 [ReplSetHealthPollTask] replSet info 10.210.208.27:50021 is now down (or slow to respond)
Mon Dec 13 00:17:27 [ReplSetHealthPollTask] replSet info 10.210.208.25:50020 is now down (or slow to respond)
[arbiter3]
Sun Dec 12 12:05:51 [ReplSetHealthPollTask] replSet info 10.210.208.25:50031 is now up
Sun Dec 12 12:05:51 [ReplSetHealthPollTask] replSet 10.210.208.25:50031 STARTUP2
Sun Dec 12 12:05:53 [ReplSetHealthPollTask] replSet 10.210.208.25:50031 RECOVERING
Sun Dec 12 12:06:51 [ReplSetHealthPollTask] replSet 10.210.208.25:50031 SECONDARY
Mon Dec 13 00:17:26 [conn7] end connection 10.210.208.25:47388
Mon Dec 13 00:17:26 [conn6] end connection 10.210.208.27:54764
Mon Dec 13 00:17:27 [ReplSetHealthPollTask] replSet info 10.210.208.27:50030 is now down (or slow to respond)
Mon Dec 13 00:17:27 [ReplSetHealthPollTask] replSet info 10.210.208.25:50031 is now down (or slow to respond)
[arbiter4]
Sun Dec 12 12:06:12 [ReplSetHealthPollTask] replSet info 10.210.208.29:50041 is now up
Sun Dec 12 12:06:12 [ReplSetHealthPollTask] replSet 10.210.208.29:50041 STARTUP2
Sun Dec 12 12:06:14 [initandlisten] connection accepted from 10.210.208.29:43618 #9
Sun Dec 12 12:06:14 [ReplSetHealthPollTask] replSet 10.210.208.29:50041 SECONDARY
Mon Dec 13 00:17:26 [conn9] end connection 10.210.208.29:43618
Mon Dec 13 00:17:26 [conn8] end connection 10.210.208.28:43289
Mon Dec 13 00:17:28 [ReplSetHealthPollTask] replSet info 10.210.208.28:50040 is now down (or slow to respond)
Mon Dec 13 00:17:28 [ReplSetHealthPollTask] replSet info 10.210.208.29:50041 is now down (or slow to respond)
[arbiter5]
Sun Dec 12 12:06:09 [ReplSetHealthPollTask] replSet info 10.210.208.28:50051 is now up
Sun Dec 12 12:06:09 [ReplSetHealthPollTask] replSet 10.210.208.28:50051 STARTUP2
Sun Dec 12 12:06:10 [initandlisten] connection accepted from 10.210.208.28:50840 #9
Sun Dec 12 12:06:11 [ReplSetHealthPollTask] replSet 10.210.208.28:50051 SECONDARY
Mon Dec 13 00:17:26 [conn8] end connection 10.210.208.29:34396
Mon Dec 13 00:17:26 [conn9] end connection 10.210.208.28:50840
Mon Dec 13 00:17:27 [ReplSetHealthPollTask] replSet info 10.210.208.28:50051 is now down (or slow to respond)
Mon Dec 13 00:17:28 [ReplSetHealthPollTask] replSet info 10.210.208.29:50050 is now down (or slow to respond)
[shardsvr0 pri]
Mon Dec 13 00:16:28 [conn11] insert wb.message 342ms
Mon Dec 13 00:16:32 [conn11] insert wb.message 352ms
Mon Dec 13 00:16:34 [conn11] insert wb.message 509ms
Mon Dec 13 00:16:35 [conn11] insert wb.message 405ms
Mon Dec 13 00:16:36 [conn11] insert wb.message 605ms
Mon Dec 13 00:16:37 [conn11] insert wb.message 236ms
Mon Dec 13 00:17:19 [conn11] insert wb.message 442ms
Mon Dec 13 00:17:20 [conn11] insert wb.message 364ms
Mon Dec 13 00:17:28 [conn11] insert wb.message 187ms
Mon Dec 13 00:17:31 got kill or ctrl c or hup signal 1 (Hangup), will terminate after current cmd ends
Mon Dec 13 00:17:31 [interruptThread] now exiting
Mon Dec 13 00:17:31 dbexit:
Mon Dec 13 00:17:31 [interruptThread] shutdown: going to close listening sockets...
Mon Dec 13 00:17:31 [interruptThread] closing listening socket: 23
Mon Dec 13 00:17:31 [interruptThread] closing listening socket: 24
Mon Dec 13 00:17:31 [interruptThread] closing listening socket: 25
Mon Dec 13 00:17:31 [interruptThread] closing listening socket: 26
Mon Dec 13 00:17:31 [interruptThread] shutdown: going to flush oplog...
Mon Dec 13 00:17:31 [interruptThread] shutdown: going to close sockets...
Mon Dec 13 00:17:31 [interruptThread] shutdown: waiting for fs preallocator...
Mon Dec 13 00:17:31 [interruptThread] shutdown: closing all files...
Mon Dec 13 00:17:31 [conn3] end connection 127.0.0.1:55238
Mon Dec 13 00:17:31 closeAllFiles() finished
Mon Dec 13 00:17:31 [interruptThread] shutdown: removing fs lock...
Mon Dec 13 00:17:31 dbexit: really exiting now
[shardsvr1 sec]
Sun Dec 12 21:18:41 done allocating datafile /data0/mongo/ssv50011/test.0, size: 64MB, took 0.094 secs
Sun Dec 12 21:18:41 allocating new datafile /data0/mongo/ssv50011/test.1, filling with zeroes...
Sun Dec 12 21:18:41 [rs_sync] building new index on
for test.system.js
Sun Dec 12 21:18:41 [rs_sync] done for 0 records 0secs
Sun Dec 12 21:18:41 done allocating datafile /data0/mongo/ssv50011/test.1, size: 128MB, took 0.186 secs
Sun Dec 12 23:47:41 [rs_sync] building new index on
for wb.system.js
Sun Dec 12 23:47:41 [rs_sync] done for 0 records 0.002secs
Mon Dec 13 00:17:31 got kill or ctrl c or hup signal 1 (Hangup), will terminate after current cmd ends
Mon Dec 13 00:17:31 [interruptThread] now exiting
Mon Dec 13 00:17:31 dbexit:
Mon Dec 13 00:17:31 [interruptThread] shutdown: going to close listening sockets...
Mon Dec 13 00:17:31 [interruptThread] closing listening socket: 21
Mon Dec 13 00:17:31 [interruptThread] closing listening socket: 22
Mon Dec 13 00:17:31 [interruptThread] closing listening socket: 23
Mon Dec 13 00:17:31 [interruptThread] closing listening socket: 24
Mon Dec 13 00:17:31 [interruptThread] shutdown: going to flush oplog...
Mon Dec 13 00:17:31 [interruptThread] shutdown: going to close sockets...
Mon Dec 13 00:17:31 [interruptThread] shutdown: waiting for fs preallocator...
Mon Dec 13 00:17:31 [interruptThread] shutdown: closing all files...
Mon Dec 13 00:17:31 [conn3] end connection 127.0.0.1:56366
Mon Dec 13 00:17:31 closeAllFiles() finished
Mon Dec 13 00:17:31 [interruptThread] shutdown: removing fs lock...
Mon Dec 13 00:17:31 dbexit: really exiting now
[shardsvr1 pri]
Mon Dec 13 00:16:43 [conn21] query admin.$cmd ntoreturn:1 command:
reslen:85 1475ms
Mon Dec 13 00:16:43 [conn16] query admin.$cmd ntoreturn:1 command:
reslen:85 1474ms
Mon Dec 13 00:16:43 [conn11] query admin.$cmd ntoreturn:1 command:
reslen:85 1472ms
Mon Dec 13 00:16:44 [conn10] insert wb.message 276ms
Mon Dec 13 00:16:45 [conn10] insert wb.message 150ms
Mon Dec 13 00:16:47 [conn10] insert wb.message 426ms
Mon Dec 13 00:16:48 [conn10] insert wb.message 391ms
Mon Dec 13 00:17:25 [conn10] insert wb.message 885ms
Mon Dec 13 00:17:26 [conn10] insert wb.message 161ms
Mon Dec 13 00:17:27 [conn7] end connection 10.210.208.23:42169
Mon Dec 13 00:17:27 [conn23] end connection 10.210.208.28:44860
Mon Dec 13 00:17:27 [conn10] insert wb.message 806ms
Mon Dec 13 00:17:27 [conn9] end connection 10.210.208.23:42173
Mon Dec 13 00:17:28 [ReplSetHealthPollTask] replSet info 10.210.208.23:50011 is now down (or slow to respond)
Mon Dec 13 00:17:30 got kill or ctrl c or hup signal 1 (Hangup), will terminate after current cmd ends
Mon Dec 13 00:17:30 [interruptThread] now exiting
Mon Dec 13 00:17:30 dbexit:
Mon Dec 13 00:17:30 [interruptThread] shutdown: going to close listening sockets...
Mon Dec 13 00:17:30 [interruptThread] closing listening socket: 10
Mon Dec 13 00:17:30 [interruptThread] closing listening socket: 11
Mon Dec 13 00:17:30 [interruptThread] closing listening socket: 12
Mon Dec 13 00:17:30 [interruptThread] closing listening socket: 13
Mon Dec 13 00:17:30 [interruptThread] shutdown: going to flush oplog...
Mon Dec 13 00:17:30 [interruptThread] shutdown: going to close sockets...
Mon Dec 13 00:17:30 [interruptThread] shutdown: waiting for fs preallocator...
Mon Dec 13 00:17:30 [interruptThread] shutdown: closing all files...
Mon Dec 13 00:17:30 [conn3] end connection 127.0.0.1:59773
Mon Dec 13 00:17:30 closeAllFiles() finished
Mon Dec 13 00:17:30 [interruptThread] shutdown: removing fs lock...
Mon Dec 13 00:17:30 dbexit: really exiting now
[shardsvr0 sec]
Sun Dec 12 19:16:13 [initandlisten] connection accepted from 10.210.208.12:38017 #15
Sun Dec 12 19:16:22 [initandlisten] connection accepted from 10.210.208.12:38048 #16
Mon Dec 13 00:17:27 [conn2] end connection 10.210.208.23:37048
Mon Dec 13 00:17:27 [conn5] end connection 10.210.208.11:38353
Mon Dec 13 00:17:27 [conn4] end connection 10.210.208.11:44394
Mon Dec 13 00:17:27 [conn7] end connection 10.210.208.11:52965
Mon Dec 13 00:17:27 [conn6] end connection 10.210.208.11:52963
Mon Dec 13 00:17:27 [conn8] end connection 10.210.208.11:52979
Mon Dec 13 00:17:27 [initandlisten] connection accepted from 10.210.208.11:40674 #17
Mon Dec 13 00:17:29 [ReplSetHealthPollTask] replSet info 10.210.208.23:50000 is now down (or slow to respond)
Mon Dec 13 00:17:29 [rs Manager] replSet info electSelf 1
Mon Dec 13 00:17:29 [rs Manager] replSet PRIMARY
Mon Dec 13 00:17:30 got kill or ctrl c or hup signal 1 (Hangup), will terminate after current cmd ends
Mon Dec 13 00:17:30 [interruptThread] now exiting
Mon Dec 13 00:17:30 dbexit:
Mon Dec 13 00:17:30 [interruptThread] shutdown: going to close listening sockets...
Mon Dec 13 00:17:30 [interruptThread] closing listening socket: 10
Mon Dec 13 00:17:30 [interruptThread] closing listening socket: 11
Mon Dec 13 00:17:30 [interruptThread] closing listening socket: 12
Mon Dec 13 00:17:30 [interruptThread] closing listening socket: 13
Mon Dec 13 00:17:30 [interruptThread] shutdown: going to flush oplog...
Mon Dec 13 00:17:30 [interruptThread] shutdown: going to close sockets...
Mon Dec 13 00:17:30 [interruptThread] shutdown: waiting for fs preallocator...
Mon Dec 13 00:17:30 [interruptThread] shutdown: closing all files...
Mon Dec 13 00:17:30 [conn3] end connection 127.0.0.1:48491
Mon Dec 13 00:17:30 closeAllFiles() finished
Mon Dec 13 00:17:30 [interruptThread] shutdown: removing fs lock...
Mon Dec 13 00:17:30 dbexit: really exiting now
[shardsvr2 pri]
Mon Dec 13 00:16:19 [conn8] getmore local.oplog.rs cid:8347868128797114205 getMore: { ts:
} bytes:20 nreturned:0 3000ms
Mon Dec 13 00:16:40 [conn8] getmore local.oplog.rs cid:8347868128797114205 getMore: { ts:
} bytes:20 nreturned:0 3000ms
Mon Dec 13 00:17:26 got kill or ctrl c or hup signal 1 (Hangup), will terminate after current cmd ends
Mon Dec 13 00:17:26 [interruptThread] now exiting
Mon Dec 13 00:17:26 dbexit:
Mon Dec 13 00:17:26 [interruptThread] shutdown: going to close listening sockets...
Mon Dec 13 00:17:26 [interruptThread] closing listening socket: 17
Mon Dec 13 00:17:26 [interruptThread] closing listening socket: 18
Mon Dec 13 00:17:26 [interruptThread] closing listening socket: 19
Mon Dec 13 00:17:26 [interruptThread] closing listening socket: 20
Mon Dec 13 00:17:26 [interruptThread] shutdown: going to flush oplog...
Mon Dec 13 00:17:26 [interruptThread] shutdown: going to close sockets...
Mon Dec 13 00:17:26 [interruptThread] shutdown: waiting for fs preallocator...
Mon Dec 13 00:17:26 [interruptThread] shutdown: closing all files...
Mon Dec 13 00:17:26 [conn3] end connection 127.0.0.1:42282
Mon Dec 13 00:17:26 closeAllFiles() finished
Mon Dec 13 00:17:26 [interruptThread] shutdown: removing fs lock...
Mon Dec 13 00:17:26 dbexit: really exiting now
[shardsvr3 sec]
Sun Dec 12 19:16:13 [initandlisten] connection accepted from 10.210.208.12:38172 #16
Sun Dec 12 19:16:21 [initandlisten] connection accepted from 10.210.208.12:38191 #17
Mon Dec 13 00:17:26 got kill or ctrl c or hup signal 1 (Hangup), will terminate after current cmd ends
Mon Dec 13 00:17:26 [interruptThread] now exiting
Mon Dec 13 00:17:26 dbexit:
Mon Dec 13 00:17:26 [interruptThread] shutdown: going to close listening sockets...
Mon Dec 13 00:17:26 [interruptThread] closing listening socket: 17
Mon Dec 13 00:17:26 [interruptThread] closing listening socket: 18
Mon Dec 13 00:17:26 [interruptThread] closing listening socket: 19
Mon Dec 13 00:17:26 [interruptThread] closing listening socket: 20
Mon Dec 13 00:17:26 [interruptThread] shutdown: going to flush oplog...
Mon Dec 13 00:17:26 [interruptThread] shutdown: going to close sockets...
Mon Dec 13 00:17:26 [interruptThread] shutdown: waiting for fs preallocator...
Mon Dec 13 00:17:26 [interruptThread] shutdown: closing all files...
Mon Dec 13 00:17:26 [conn3] end connection 127.0.0.1:37911
Mon Dec 13 00:17:26 [conn2] end connection 10.210.208.27:41132
Mon Dec 13 00:17:26 [rs_sync] MessagingPort flush send() errno:9 Bad file descriptor 10.210.208.27:50030
Mon Dec 13 00:17:26 [rs_sync] caught exception (socket exception) in destructor (~PiggyBackData)
Mon Dec 13 00:17:26 [rs_sync] replSet syncThread: 10278 dbclient error communicating with server
Mon Dec 13 00:17:26 closeAllFiles() finished
Mon Dec 13 00:17:26 [interruptThread] shutdown: removing fs lock...
Mon Dec 13 00:17:26 dbexit: really exiting now
[shardsvr3 pri]
Mon Dec 13 00:17:07 [conn12] insert wb.message 717ms
Mon Dec 13 00:17:07 [conn12] insert wb.message 175ms
Mon Dec 13 00:17:16 [conn12] insert wb.message 110ms
Mon Dec 13 00:17:16 [conn12] insert wb.message 414ms
Mon Dec 13 00:17:28 [conn12] insert wb.message 114ms
Mon Dec 13 00:17:29 [conn8] end connection 10.210.208.25:52401
Mon Dec 13 00:17:29 got kill or ctrl c or hup signal 1 (Hangup), will terminate after current cmd ends
Mon Dec 13 00:17:29 [interruptThread] now exiting
Mon Dec 13 00:17:29 dbexit:
Mon Dec 13 00:17:29 [interruptThread] shutdown: going to close listening sockets...
Mon Dec 13 00:17:29 [interruptThread] closing listening socket: 22
Mon Dec 13 00:17:29 [interruptThread] closing listening socket: 23
Mon Dec 13 00:17:29 [interruptThread] closing listening socket: 24
Mon Dec 13 00:17:29 [interruptThread] closing listening socket: 25
Mon Dec 13 00:17:29 [interruptThread] shutdown: going to flush oplog...
Mon Dec 13 00:17:29 [interruptThread] shutdown: going to close sockets...
Mon Dec 13 00:17:29 [interruptThread] shutdown: waiting for fs preallocator...
Mon Dec 13 00:17:29 [interruptThread] shutdown: closing all files...
Mon Dec 13 00:17:29 [conn3] end connection 127.0.0.1:57920
Mon Dec 13 00:17:29 closeAllFiles() finished
Mon Dec 13 00:17:29 [interruptThread] shutdown: removing fs lock...
Mon Dec 13 00:17:29 [conn12] end connection 10.210.208.11:46296
Mon Dec 13 00:17:29 dbexit: really exiting now
[shardsvr2 sec]
Sun Dec 12 19:16:15 [initandlisten] connection accepted from 10.210.208.12:60017 #15
Sun Dec 12 19:16:23 [initandlisten] connection accepted from 10.210.208.12:60040 #16
Mon Dec 13 00:17:29 [conn1] end connection 10.210.208.25:43951
Mon Dec 13 00:17:29 [rs_sync] replSet syncThread: 10278 dbclient error communicating with server
Mon Dec 13 00:17:29 got kill or ctrl c or hup signal 1 (Hangup), will terminate after current cmd ends
Mon Dec 13 00:17:29 [interruptThread] now exiting
Mon Dec 13 00:17:29 dbexit:
Mon Dec 13 00:17:29 [interruptThread] shutdown: going to close listening sockets...
Mon Dec 13 00:17:29 [interruptThread] closing listening socket: 22
Mon Dec 13 00:17:29 [interruptThread] closing listening socket: 23
Mon Dec 13 00:17:29 [interruptThread] closing listening socket: 24
Mon Dec 13 00:17:29 [interruptThread] closing listening socket: 25
Mon Dec 13 00:17:29 [interruptThread] shutdown: going to flush oplog...
Mon Dec 13 00:17:29 [interruptThread] shutdown: going to close sockets...
Mon Dec 13 00:17:29 [interruptThread] shutdown: waiting for fs preallocator...
Mon Dec 13 00:17:29 [interruptThread] shutdown: closing all files...
Mon Dec 13 00:17:29 [conn3] end connection 127.0.0.1:51839
Mon Dec 13 00:17:29 closeAllFiles() finished
Mon Dec 13 00:17:29 [interruptThread] shutdown: removing fs lock...
Mon Dec 13 00:17:29 dbexit: really exiting now
[shardsvr4 pri]
Mon Dec 13 00:17:10 [conn10] insert wb.message 439ms
Mon Dec 13 00:17:10 [conn10] insert wb.message 162ms
Mon Dec 13 00:17:12 [conn10] insert wb.message 253ms
Mon Dec 13 00:17:12 [conn5] query admin.$cmd ntoreturn:1 command:
reslen:81 188ms
Mon Dec 13 00:17:13 [conn10] insert wb.message 101ms
Mon Dec 13 00:17:18 [conn10] insert wb.message 223ms
Mon Dec 13 00:17:29 [conn7] end connection 10.210.208.29:48747
Mon Dec 13 00:17:29 [conn9] end connection 10.210.208.27:37248
Mon Dec 13 00:17:29 got kill or ctrl c or hup signal 1 (Hangup), will terminate after current cmd ends
Mon Dec 13 00:17:29 [interruptThread] now exiting
Mon Dec 13 00:17:29 dbexit:
Mon Dec 13 00:17:29 [interruptThread] shutdown: going to close listening sockets...
Mon Dec 13 00:17:29 [interruptThread] closing listening socket: 39
Mon Dec 13 00:17:29 [interruptThread] closing listening socket: 40
Mon Dec 13 00:17:29 [interruptThread] closing listening socket: 41
Mon Dec 13 00:17:29 [interruptThread] closing listening socket: 42
Mon Dec 13 00:17:29 [interruptThread] shutdown: going to flush oplog...
Mon Dec 13 00:17:29 [interruptThread] shutdown: going to close sockets...
Mon Dec 13 00:17:29 [interruptThread] shutdown: waiting for fs preallocator...
Mon Dec 13 00:17:29 [interruptThread] shutdown: closing all files...
Mon Dec 13 00:17:29 [conn3] end connection 127.0.0.1:40654
Mon Dec 13 00:17:29 closeAllFiles() finished
Mon Dec 13 00:17:29 [interruptThread] shutdown: removing fs lock...
Mon Dec 13 00:17:29 dbexit: really exiting now
[shardsvr5 sec]
Sun Dec 12 19:16:16 [initandlisten] connection accepted from 10.210.208.12:44809 #17
Sun Dec 12 19:16:23 [initandlisten] connection accepted from 10.210.208.12:44821 #18
Mon Dec 13 00:17:29 [conn2] end connection 10.210.208.29:48221
Mon Dec 13 00:17:29 got kill or ctrl c or hup signal 1 (Hangup), will terminate after current cmd ends
Mon Dec 13 00:17:29 [interruptThread] now exiting
Mon Dec 13 00:17:29 dbexit:
Mon Dec 13 00:17:29 [interruptThread] shutdown: going to close listening sockets...
Mon Dec 13 00:17:29 [interruptThread] closing listening socket: 39
Mon Dec 13 00:17:29 [interruptThread] closing listening socket: 40
Mon Dec 13 00:17:29 [interruptThread] closing listening socket: 41
Mon Dec 13 00:17:29 [interruptThread] closing listening socket: 42
Mon Dec 13 00:17:29 [interruptThread] shutdown: going to flush oplog...
Mon Dec 13 00:17:29 [interruptThread] shutdown: going to close sockets...
Mon Dec 13 00:17:29 [interruptThread] shutdown: waiting for fs preallocator...
Mon Dec 13 00:17:29 [interruptThread] shutdown: closing all files...
Mon Dec 13 00:17:29 [conn5] end connection 127.0.0.1:58885
Mon Dec 13 00:17:29 [rs_sync] MessagingPort flush send() errno:9 Bad file descriptor 10.210.208.29:50050
Mon Dec 13 00:17:29 [rs_sync] caught exception (socket exception) in destructor (~PiggyBackData)
Mon Dec 13 00:17:29 [rs_sync] replSet syncThread: 10278 dbclient error communicating with server
Mon Dec 13 00:17:29 closeAllFiles() finished
Mon Dec 13 00:17:29 [interruptThread] shutdown: removing fs lock...
Mon Dec 13 00:17:29 dbexit: really exiting now
[shardsvr5 pri]
Mon Dec 13 00:17:15 [conn9] getmore local.oplog.rs cid:4329212966760209332 getMore: { ts:
} bytes:20 nreturned:0 3000ms
Mon Dec 13 00:17:27 [conn9] getmore local.oplog.rs cid:4329212966760209332 getMore: { ts:
} bytes:20 nreturned:0 3021ms
Mon Dec 13 00:17:29 got kill or ctrl c or hup signal 1 (Hangup), will terminate after current cmd ends
Mon Dec 13 00:17:29 [interruptThread] now exiting
Mon Dec 13 00:17:29 dbexit:
Mon Dec 13 00:17:29 [interruptThread] shutdown: going to close listening sockets...
Mon Dec 13 00:17:29 [interruptThread] closing listening socket: 21
Mon Dec 13 00:17:29 [interruptThread] closing listening socket: 22
Mon Dec 13 00:17:29 [interruptThread] closing listening socket: 23
Mon Dec 13 00:17:29 [interruptThread] closing listening socket: 24
Mon Dec 13 00:17:29 [interruptThread] shutdown: going to flush oplog...
Mon Dec 13 00:17:29 [interruptThread] shutdown: going to close sockets...
Mon Dec 13 00:17:29 [interruptThread] shutdown: waiting for fs preallocator...
Mon Dec 13 00:17:29 [interruptThread] shutdown: closing all files...
Mon Dec 13 00:17:29 [conn4] end connection 127.0.0.1:45056
Mon Dec 13 00:17:29 closeAllFiles() finished
Mon Dec 13 00:17:29 [interruptThread] shutdown: removing fs lock...
Mon Dec 13 00:17:29 dbexit: really exiting now
Mon Dec 13 00:17:29 [conn8] end connection 10.210.208.28:36734
[shardsvr4 sec]
Sun Dec 12 19:16:23 [initandlisten] connection accepted from 10.210.208.12:37386 #17
Sun Dec 12 19:17:20 [initandlisten] connection accepted from 10.210.208.24:44125 #18
Mon Dec 13 00:17:29 got kill or ctrl c or hup signal 1 (Hangup), will terminate after current cmd ends
Mon Dec 13 00:17:29 [interruptThread] now exiting
Mon Dec 13 00:17:29 dbexit:
Mon Dec 13 00:17:29 [interruptThread] shutdown: going to close listening sockets...
Mon Dec 13 00:17:29 [interruptThread] closing listening socket: 21
Mon Dec 13 00:17:29 [interruptThread] closing listening socket: 22
Mon Dec 13 00:17:29 [interruptThread] closing listening socket: 23
Mon Dec 13 00:17:29 [interruptThread] closing listening socket: 24
Mon Dec 13 00:17:29 [interruptThread] shutdown: going to flush oplog...
Mon Dec 13 00:17:29 [interruptThread] shutdown: going to close sockets...
Mon Dec 13 00:17:29 [interruptThread] shutdown: waiting for fs preallocator...
Mon Dec 13 00:17:29 [interruptThread] shutdown: closing all files...
Mon Dec 13 00:17:29 [conn3] end connection 127.0.0.1:59354
Mon Dec 13 00:17:29 [conn4] end connection 10.210.208.27:37102
Mon Dec 13 00:17:29 [conn1] end connection 10.210.208.28:56130
Mon Dec 13 00:17:29 [rs_sync] MessagingPort flush send() errno:9 Bad file descriptor 10.210.208.28:50040
Mon Dec 13 00:17:29 [rs_sync] caught exception (socket exception) in destructor (~PiggyBackData)
Mon Dec 13 00:17:29 [rs_sync] replSet syncThread: 10278 dbclient error communicating with server
Mon Dec 13 00:17:29 [conn5] end connection 10.210.208.11:35309
Mon Dec 13 00:17:29 [conn6] end connection 10.210.208.11:59503
Mon Dec 13 00:17:29 closeAllFiles() finished
Mon Dec 13 00:17:29 [interruptThread] shutdown: removing fs lock...
Mon Dec 13 00:17:29 dbexit: really exiting now