- Type: Bug
- Resolution: Duplicate
- Priority: Major - P3
- Affects Version/s: 2.0.2
- Component/s: None
- Environment: Debian Linux with official 10gen mongodb 2.0.2 debs
- Operating System: Linux
Setup is 3 shards. One collection was being rebalanced, or had just been rebalanced, to the 3rd shard (xxxB2).
The primary of this shard had been stepped down with rs.stepDown(), and no new primary had been elected yet.
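A minimal sketch of the sequence that preceded the crash, assuming the primary of shard xxxB2 was mongo1:27027 (hostnames taken from the log below); run from a mongo shell connected to that primary:

// Ask the current primary of shard xxxB2 to step down for 60 seconds;
// the shell's connection is dropped and the replica set starts an election.
rs.stepDown(60)

// Reconnect and confirm that no member has been elected PRIMARY yet
// (the data-bearing members still report SECONDARY).
rs.status()

// While the set has no primary, mongos keeps routing requests to this
// shard; shortly afterwards it crashed with signal 11, as shown in the log below.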
Mongos crashed on one machine, leaving this in the log:
Mon Jan 30 16:30:28 [ReplicaSetMonitorWatcher] ReplicaSetMonitor::_checkConnection: mongo1:27027
{ setName: "xxxB2", ismaster: false, secondary: true, hosts: [ "mongo1:27027" ], passives: [ "mongo4:27027" ], arbiters: [ "task2:27027" ], me: "mongo1:27027", maxBsonObjectSize: 16777216, ok: 1.0 }Mon Jan 30 16:30:28 [ReplicaSetMonitorWatcher] ReplicaSetMonitor::_checkConnection: mongo4:27027
{ setName: "xxxB2", ismaster: false, secondary: true, hosts: [ "mongo1:27027" ], passives: [ "mongo4:27027" ], arbiters: [ "task2:27027" ], passive: true, me: "mongo4:27027", maxBsonObjectSize: 16777216, ok: 1.0 }Mon Jan 30 16:30:30 [conn38697] end connection 172.16.49.111:35486
Mon Jan 30 16:30:30 [mongosMain] connection accepted from 172.16.49.111:35489 #38698
Mon Jan 30 16:30:34 [conn38698] end connection 172.16.49.111:35489
Mon Jan 30 16:30:34 [mongosMain] connection accepted from 172.16.49.111:35492 #38699
Mon Jan 30 16:30:38 [Balancer] SyncClusterConnection connecting to [config1:27019]
Mon Jan 30 16:30:38 [Balancer] SyncClusterConnection connecting to [config2:27019]
Mon Jan 30 16:30:38 [Balancer] SyncClusterConnection connecting to [config3:27019]
Mon Jan 30 16:30:38 [Balancer] could not acquire lock 'balancer/mongo2:27020:1327595773:1804289383' (another update won)
Mon Jan 30 16:30:38 [Balancer] distributed lock 'balancer/mongo2:27020:1327595773:1804289383' was not acquired.
Mon Jan 30 16:30:40 [ReplicaSetMonitorWatcher] ReplicaSetMonitor::_checkConnection: mongo1:27027
Mon Jan 30 16:30:40 [ReplicaSetMonitorWatcher] ReplicaSetMonitor::_checkConnection: mongo4:27027
{ setName: "xxxB2", ismaster: false, secondary: true, hosts: [ "mongo1:27027" ], passives: [ "mongo4:27027" ], arbiters: [ "task2:27027" ], passive: true, me: "mongo4:27027", maxBsonObjectSize: 16777216, ok: 1.0 }Mon Jan 30 16:30:45 [conn38699] ~ScopedDbConnection: _conn != null
Mon Jan 30 16:30:45 [conn38699] DBException in process: socket exception
Mon Jan 30 16:30:45 [conn38699] SocketException handling request, closing client connection: 9001 socket exception [2] server [172.16.49.111:35492]
Mon Jan 30 16:30:46 [mongosMain] connection accepted from 172.16.49.111:35496 #38700
Mon Jan 30 16:30:47 [conn38700] end connection 172.16.49.111:35496
Mon Jan 30 16:30:47 [mongosMain] connection accepted from 172.16.49.111:35497 #38701
Mon Jan 30 16:30:47 [conn38701] Socket recv() errno:104 Connection reset by peer 172.16.49.111:27027
Mon Jan 30 16:30:47 [conn38701] SocketException: remote: 172.16.49.111:27027 error: 9001 socket exception [1] server [172.16.49.111:27027]
Mon Jan 30 16:30:47 [conn38701] DBClientCursor::init call() failed
Mon Jan 30 16:30:47 [conn38701] end connection 172.16.49.111:35497
Mon Jan 30 16:30:47 [mongosMain] connection accepted from 172.16.49.111:35498 #38702
Mon Jan 30 16:30:47 [conn38702] got not master for: mongo1:27027
Received signal 11
Backtrace: