[SERVER-4821] Mongos signal 11 (sigsegv) after DBClientCursor::init failure Created: 31/Jan/12  Updated: 25/Jul/14  Resolved: 02/Feb/12

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 2.0.2
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Johnny Boy Assignee: Unassigned
Resolution: Duplicate Votes: 0
Labels: mongos
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Debian Linux with official 10gen mongodb 2.0.2 debs


Operating System: Linux
Participants:

 Description   

The setup is 3 shards. One collection was being rebalanced to the 3rd shard (xxxB2), or had just finished being rebalanced to it.
The primary of this shard had been stepped down with rs.stepDown(), and no new primary had been elected yet.
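
For illustration, the "no primary" state can be read off the isMaster replies that ReplicaSetMonitor logged for xxxB2: both members report ismaster: false. A minimal Python sketch of that check follows. The helper name `has_primary` and the dict layout are hypothetical, only a subset of the logged fields is copied, and this is not how mongos itself decides.

```python
def has_primary(replies):
    """Return True if any isMaster reply reports ismaster: true."""
    return any(reply.get("ismaster", False) for reply in replies)


# Subset of the fields from the two _checkConnection replies in the log.
replies = [
    {"setName": "xxxB2", "ismaster": False, "secondary": True,
     "me": "mongo1:27027"},
    {"setName": "xxxB2", "ismaster": False, "secondary": True,
     "passive": True, "me": "mongo4:27027"},
]

if not has_primary(replies):
    print("xxxB2 has no primary")
```

With the two replies above, neither member claims to be primary, which matches the "got not master for: mongo1:27027" line logged just before the crash.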

Mongos crashed on one machine, leaving this in the log:

Mon Jan 30 16:30:28 [ReplicaSetMonitorWatcher] ReplicaSetMonitor::_checkConnection: mongo1:27027

{ setName: "xxxB2", ismaster: false, secondary: true, hosts: [ "mongo1:27027" ], passives: [ "mongo4:27027" ], arbiters: [ "task2:27027" ], me: "mongo1:27027", maxBsonObjectSize: 16777216, ok: 1.0 }

Mon Jan 30 16:30:28 [ReplicaSetMonitorWatcher] ReplicaSetMonitor::_checkConnection: mongo4:27027

{ setName: "xxxB2", ismaster: false, secondary: true, hosts: [ "mongo1:27027" ], passives: [ "mongo4:27027" ], arbiters: [ "task2:27027" ], passive: true, me: "mongo4:27027", maxBsonObjectSize: 16777216, ok: 1.0 }

Mon Jan 30 16:30:30 [conn38697] end connection 172.16.49.111:35486
Mon Jan 30 16:30:30 [mongosMain] connection accepted from 172.16.49.111:35489 #38698
Mon Jan 30 16:30:34 [conn38698] end connection 172.16.49.111:35489
Mon Jan 30 16:30:34 [mongosMain] connection accepted from 172.16.49.111:35492 #38699
Mon Jan 30 16:30:38 [Balancer] SyncClusterConnection connecting to [config1:27019]
Mon Jan 30 16:30:38 [Balancer] SyncClusterConnection connecting to [config2:27019]
Mon Jan 30 16:30:38 [Balancer] SyncClusterConnection connecting to [config3:27019]
Mon Jan 30 16:30:38 [Balancer] could not acquire lock 'balancer/mongo2:27020:1327595773:1804289383' (another update won)
Mon Jan 30 16:30:38 [Balancer] distributed lock 'balancer/mongo2:27020:1327595773:1804289383' was not acquired.
Mon Jan 30 16:30:40 [ReplicaSetMonitorWatcher] ReplicaSetMonitor::_checkConnection: mongo1:27027

{ setName: "xxxB2", ismaster: false, secondary: true, hosts: [ "mongo1:27027" ], passives: [ "mongo4:27027" ], arbiters: [ "task2:27027" ], me: "mongo1:27027", maxBsonObjectSize: 16777216, ok: 1.0 }

Mon Jan 30 16:30:40 [ReplicaSetMonitorWatcher] ReplicaSetMonitor::_checkConnection: mongo4:27027

{ setName: "xxxB2", ismaster: false, secondary: true, hosts: [ "mongo1:27027" ], passives: [ "mongo4:27027" ], arbiters: [ "task2:27027" ], passive: true, me: "mongo4:27027", maxBsonObjectSize: 16777216, ok: 1.0 }

Mon Jan 30 16:30:45 [conn38699] ~ScopedDbConnection: _conn != null
Mon Jan 30 16:30:45 [conn38699] DBException in process: socket exception
Mon Jan 30 16:30:45 [conn38699] SocketException handling request, closing client connection: 9001 socket exception [2] server [172.16.49.111:35492]
Mon Jan 30 16:30:46 [mongosMain] connection accepted from 172.16.49.111:35496 #38700
Mon Jan 30 16:30:47 [conn38700] end connection 172.16.49.111:35496
Mon Jan 30 16:30:47 [mongosMain] connection accepted from 172.16.49.111:35497 #38701
Mon Jan 30 16:30:47 [conn38701] Socket recv() errno:104 Connection reset by peer 172.16.49.111:27027
Mon Jan 30 16:30:47 [conn38701] SocketException: remote: 172.16.49.111:27027 error: 9001 socket exception [1] server [172.16.49.111:27027]
Mon Jan 30 16:30:47 [conn38701] DBClientCursor::init call() failed
Mon Jan 30 16:30:47 [conn38701] end connection 172.16.49.111:35497
Mon Jan 30 16:30:47 [mongosMain] connection accepted from 172.16.49.111:35498 #38702
Mon Jan 30 16:30:47 [conn38702] got not master for: mongo1:27027
Received signal 11
Backtrace:



 Comments   
Comment by Eliot Horowitz (Inactive) [ 02/Feb/12 ]

See SERVER-4699

Comment by Johnny Boy [ 31/Jan/12 ]

172.16.49.111 is mongo1

Generated at Thu Feb 08 03:07:05 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.