[SERVER-3303] uncaught exception: error: { "$err" : "socket exception", "code" : 11002 } Created: 20/Jun/11  Updated: 29/Feb/12  Resolved: 22/Nov/11

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 1.8.2
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Michael Conigliaro Assignee: Greg Studer
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Ubuntu on EC2


Issue Links:
Duplicate
duplicates SERVER-3040 Too many ReplicaSetMonitor::_checkCon... Closed
Operating System: ALL
Participants:

 Description   

Started a server today and couldn't connect to the database:

$ mongo
MongoDB shell version: 1.8.2
connecting to: test
> use example
switched to db example
> show collections
Mon Jun 20 22:24:43 uncaught exception: error: { "$err" : "socket exception", "code" : 11002 }

This was in mongos.log:

Mon Jun 20 21:56:28 /usr/bin/mongos db version v1.8.2-rc3, pdfile version 4.5 starting (--help for usage)
Mon Jun 20 21:56:28 git version: 2d7719228787c9c8100456bc70bf860ec2885732
Mon Jun 20 21:56:28 build sys info: Linux bs-linux64.10gen.cc 2.6.21.7-2.ec2.v1.2.fc8xen #1 SMP Fri Nov 20 17:48:28 EST 2009 x86_64 BOOST_LIB_VERSION=1_41
Mon Jun 20 21:56:28 config string : mongodb-config03.example.com:27019,mongodb-config02.example.com:27019,mongodb-config01.example.com:27019
Mon Jun 20 21:56:28 creating new connection to:mongodb-config03.example.com:27019
Mon Jun 20 21:56:28 BackgroundJob starting: ConnectBG
Mon Jun 20 21:56:28 creating new connection to:mongodb-config02.example.com:27019
Mon Jun 20 21:56:28 BackgroundJob starting: ConnectBG
Mon Jun 20 21:56:28 creating new connection to:mongodb-config01.example.com:27019
Mon Jun 20 21:56:28 BackgroundJob starting: ConnectBG
Mon Jun 20 21:56:28 SyncClusterConnection connecting to [mongodb-config03.example.com:27019]
Mon Jun 20 21:56:28 BackgroundJob starting: ConnectBG
Mon Jun 20 21:56:28 SyncClusterConnection connecting to [mongodb-config02.example.com:27019]
Mon Jun 20 21:56:28 BackgroundJob starting: ConnectBG
Mon Jun 20 21:56:28 SyncClusterConnection connecting to [mongodb-config01.example.com:27019]
Mon Jun 20 21:56:28 BackgroundJob starting: ConnectBG
Mon Jun 20 21:56:28 MaxChunkSize: value: 200
Mon Jun 20 21:56:28 BackgroundJob starting: Balancer
Mon Jun 20 21:56:28 [Balancer] about to contact config servers and shards
Mon Jun 20 21:56:28 [websvr] web admin interface listening on port 28017
Mon Jun 20 21:56:28 [websvr] fd limit hard:65535 soft:65535 max conn: 52428
Mon Jun 20 21:56:28 [mongosMain] waiting for connections on port 27017
Mon Jun 20 21:56:28 [mongosMain] fd limit hard:65535 soft:65535 max conn: 52428
Mon Jun 20 21:56:28 BackgroundJob starting: cursorTimeout
Mon Jun 20 21:56:29 BackgroundJob starting: ConnectBG
Mon Jun 20 21:56:34 [Balancer] error connecting to seed mongodb04.example.com:27018: couldn't connect to server mongodb04.example.com:27018
Mon Jun 20 21:56:34 BackgroundJob starting: ConnectBG
Mon Jun 20 21:56:39 [Balancer] error connecting to seed mongodb05.example.com:27018: couldn't connect to server mongodb05.example.com:27018
Mon Jun 20 21:56:39 BackgroundJob starting: ReplicaSetMonitorWatcher
Mon Jun 20 21:56:39 [ReplicaSetMonitorWatcher] starting
Mon Jun 20 21:56:39 BackgroundJob starting: ConnectBG
Mon Jun 20 21:56:44 [Balancer] error connecting to seed mongodb02.example.com:27018: couldn't connect to server mongodb02.example.com:27018
Mon Jun 20 21:56:44 BackgroundJob starting: ConnectBG
Mon Jun 20 21:56:47 [Balancer] ReplicaSetMonitor::_checkConnection: mongodb01.example.com:27018

{ setName: "1", ismaster: true, secondary: false, hosts: [ "mongodb01.example.com:27018", "mongodb02.example.com:27018" ], arbiters: [ "mongodb03.example.com:27018" ], maxBsonObjectSize: 16777216, ok: 1.0 }

Mon Jun 20 21:56:47 BackgroundJob starting: ConnectBG
Mon Jun 20 21:56:47 [Balancer] updated set (1) to: 1/mongodb01.example.com:27018,mongodb02.example.com:27018
Mon Jun 20 21:56:47 BackgroundJob starting: ConnectBG
Mon Jun 20 21:56:47 [Balancer] ReplicaSetMonitor::_checkConnection: mongodb08.example.com:27018

{ setName: "3", ismaster: false, secondary: true, hosts: [ "mongodb08.example.com:27018", "mongodb07.example.com:27018" ], arbiters: [ "mongodb09.example.com:27018" ], primary: "mongodb07.example.com:27018", maxBsonObjectSize: 16777216, ok: 1.0 }

Mon Jun 20 21:56:47 BackgroundJob starting: ConnectBG
Mon Jun 20 21:56:47 [Balancer] updated set (3) to: 3/mongodb08.example.com:27018,mongodb07.example.com:27018
Mon Jun 20 21:56:48 BackgroundJob starting: ConnectBG
Mon Jun 20 21:56:48 [Balancer] ReplicaSetMonitor::_checkConnection: mongodb11.example.com:27018

{ setName: "4", ismaster: false, secondary: true, hosts: [ "mongodb11.example.com:27018", "mongodb10.example.com:27018" ], arbiters: [ "mongodb12.example.com:27018" ], primary: "mongodb10.example.com:27018", maxBsonObjectSize: 16777216, ok: 1.0 }

Mon Jun 20 21:56:48 BackgroundJob starting: ConnectBG
Mon Jun 20 21:56:48 [Balancer] updated set (4) to: 4/mongodb11.example.com:27018,mongodb10.example.com:27018
Mon Jun 20 21:56:48 [Balancer] _check : 1/mongodb01.example.com:27018,mongodb02.example.com:27018
Mon Jun 20 21:56:48 [Balancer] ReplicaSetMonitor::_checkConnection: mongodb01.example.com:27018

{ setName: "1", ismaster: true, secondary: false, hosts: [ "mongodb01.example.com:27018", "mongodb02.example.com:27018" ], arbiters: [ "mongodb03.example.com:27018" ], maxBsonObjectSize: 16777216, ok: 1.0 }

Mon Jun 20 21:56:48 BackgroundJob starting: ConnectBG
Mon Jun 20 21:56:48 [Balancer] _check : 2/
Mon Jun 20 21:56:50 [Balancer] User Assertion: 10009:ReplicaSetMonitor no master found for set: 2
Mon Jun 20 21:56:50 [Balancer] warning: could not initialize balancer, please check that all shards and config servers are up
Mon Jun 20 21:56:50 [Balancer] will retry to initialize balancer in one minute
Mon Jun 20 21:56:59 [ReplicaSetMonitorWatcher] checking replica set: 1
Mon Jun 20 21:56:59 [ReplicaSetMonitorWatcher] ReplicaSetMonitor::_checkConnection: mongodb01.example.com:27018

{ setName: "1", ismaster: true, secondary: false, hosts: [ "mongodb01.example.com:27018", "mongodb02.example.com:27018" ], arbiters: [ "mongodb03.example.com:27018" ], maxBsonObjectSize: 16777216, ok: 1.0 }

Mon Jun 20 21:56:59 [ReplicaSetMonitorWatcher] checking replica set: 2
Mon Jun 20 21:56:59 [ReplicaSetMonitorWatcher] _check : 2/
Mon Jun 20 21:57:01 [ReplicaSetMonitorWatcher] checking replica set: 3
Mon Jun 20 21:57:01 [ReplicaSetMonitorWatcher] _check : 3/mongodb08.example.com:27018,mongodb07.example.com:27018
Mon Jun 20 21:57:01 [ReplicaSetMonitorWatcher] ReplicaSetMonitor::_checkConnection: mongodb08.example.com:27018

{ setName: "3", ismaster: false, secondary: true, hosts: [ "mongodb08.example.com:27018", "mongodb07.example.com:27018" ], arbiters: [ "mongodb09.example.com:27018" ], primary: "mongodb07.example.com:27018", maxBsonObjectSize: 16777216, ok: 1.0 }

Mon Jun 20 21:57:01 [ReplicaSetMonitorWatcher] ReplicaSetMonitor::_checkConnection: mongodb07.example.com:27018

{ setName: "3", ismaster: true, secondary: false, hosts: [ "mongodb07.example.com:27018", "mongodb08.example.com:27018" ], arbiters: [ "mongodb09.example.com:27018" ], maxBsonObjectSize: 16777216, ok: 1.0 }

Mon Jun 20 21:57:01 [ReplicaSetMonitorWatcher] checking replica set: 4
Mon Jun 20 21:57:01 [ReplicaSetMonitorWatcher] _check : 4/mongodb11.example.com:27018,mongodb10.example.com:27018
Mon Jun 20 21:57:01 [ReplicaSetMonitorWatcher] ReplicaSetMonitor::_checkConnection: mongodb11.example.com:27018

{ setName: "4", ismaster: false, secondary: true, hosts: [ "mongodb11.example.com:27018", "mongodb10.example.com:27018" ], arbiters: [ "mongodb12.example.com:27018" ], primary: "mongodb10.example.com:27018", maxBsonObjectSize: 16777216, ok: 1.0 }

Mon Jun 20 21:57:01 [ReplicaSetMonitorWatcher] ReplicaSetMonitor::_checkConnection: mongodb10.example.com:27018

{ setName: "4", ismaster: true, secondary: false, hosts: [ "mongodb10.example.com:27018", "mongodb11.example.com:27018" ], arbiters: [ "mongodb12.example.com:27018" ], maxBsonObjectSize: 16777216, ok: 1.0 }

Mon Jun 20 21:57:21 [ReplicaSetMonitorWatcher] checking replica set: 1
Mon Jun 20 21:57:21 [ReplicaSetMonitorWatcher] ReplicaSetMonitor::_checkConnection: mongodb01.example.com:27018

{ setName: "1", ismaster: true, secondary: false, hosts: [ "mongodb01.example.com:27018", "mongodb02.example.com:27018" ], arbiters: [ "mongodb03.example.com:27018" ], maxBsonObjectSize: 16777216, ok: 1.0 }

Mon Jun 20 21:57:21 [ReplicaSetMonitorWatcher] checking replica set: 2
Mon Jun 20 21:57:21 [ReplicaSetMonitorWatcher] _check : 2/
Mon Jun 20 21:57:23 [ReplicaSetMonitorWatcher] checking replica set: 3
Mon Jun 20 21:57:23 [ReplicaSetMonitorWatcher] ReplicaSetMonitor::_checkConnection: mongodb07.example.com:27018

{ setName: "3", ismaster: true, secondary: false, hosts: [ "mongodb07.example.com:27018", "mongodb08.example.com:27018" ], arbiters: [ "mongodb09.example.com:27018" ], maxBsonObjectSize: 16777216, ok: 1.0 }

Mon Jun 20 21:57:23 [ReplicaSetMonitorWatcher] checking replica set: 4
Mon Jun 20 21:57:23 [ReplicaSetMonitorWatcher] ReplicaSetMonitor::_checkConnection: mongodb10.example.com:27018

{ setName: "4", ismaster: true, secondary: false, hosts: [ "mongodb10.example.com:27018", "mongodb11.example.com:27018" ], arbiters: [ "mongodb12.example.com:27018" ], maxBsonObjectSize: 16777216, ok: 1.0 }

Mon Jun 20 21:57:43 [ReplicaSetMonitorWatcher] checking replica set: 1
Mon Jun 20 21:57:43 [ReplicaSetMonitorWatcher] ReplicaSetMonitor::_checkConnection: mongodb01.example.com:27018

{ setName: "1", ismaster: true, secondary: false, hosts: [ "mongodb01.example.com:27018", "mongodb02.example.com:27018" ], arbiters: [ "mongodb03.example.com:27018" ], maxBsonObjectSize: 16777216, ok: 1.0 }

Mon Jun 20 21:57:43 [ReplicaSetMonitorWatcher] checking replica set: 2
Mon Jun 20 21:57:43 [ReplicaSetMonitorWatcher] _check : 2/
Mon Jun 20 21:57:45 [ReplicaSetMonitorWatcher] checking replica set: 3
Mon Jun 20 21:57:45 [ReplicaSetMonitorWatcher] ReplicaSetMonitor::_checkConnection: mongodb07.example.com:27018

{ setName: "3", ismaster: true, secondary: false, hosts: [ "mongodb07.example.com:27018", "mongodb08.example.com:27018" ], arbiters: [ "mongodb09.example.com:27018" ], maxBsonObjectSize: 16777216, ok: 1.0 }

Mon Jun 20 21:57:45 [ReplicaSetMonitorWatcher] checking replica set: 4
Mon Jun 20 21:57:45 [ReplicaSetMonitorWatcher] ReplicaSetMonitor::_checkConnection: mongodb10.example.com:27018

{ setName: "4", ismaster: true, secondary: false, hosts: [ "mongodb10.example.com:27018", "mongodb11.example.com:27018" ], arbiters: [ "mongodb12.example.com:27018" ], maxBsonObjectSize: 16777216, ok: 1.0 }

Bouncing mongos fixed this.
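
For anyone triaging a similar log, here is a minimal sketch (not a mongos tool) that scans log lines for the two symptoms visible above: a `_check : <set>/` line with an empty host list, and a "no master found for set" assertion. The regular expressions are assumptions derived only from the lines quoted in this report, and the function name `find_problem_sets` is hypothetical; other mongos versions may format these messages differently.

```python
import re

# Patterns assumed from the mongos.log lines quoted in this report.
CHECK_RE = re.compile(r"_check : ([^/\s]+)/(\S*)\s*$")
NO_MASTER_RE = re.compile(r"no master found for set: (\S+)")


def find_problem_sets(log_lines):
    """Return (sets whose _check host list is empty, sets reported with no master)."""
    empty_hosts = set()
    no_master = set()
    for line in log_lines:
        m = CHECK_RE.search(line)
        if m and m.group(2) == "":
            empty_hosts.add(m.group(1))
        m = NO_MASTER_RE.search(line)
        if m:
            no_master.add(m.group(1))
    return empty_hosts, no_master


# Sample lines copied from the log above.
sample = [
    "Mon Jun 20 21:56:48 [Balancer] _check : 1/mongodb01.example.com:27018,mongodb02.example.com:27018",
    "Mon Jun 20 21:56:48 [Balancer] _check : 2/",
    "Mon Jun 20 21:56:50 [Balancer] User Assertion: 10009:ReplicaSetMonitor no master found for set: 2",
]
empty_hosts, no_master = find_problem_sets(sample)
print(sorted(empty_hosts), sorted(no_master))
```

On the excerpt above this singles out set 2, the only set for which none of the seeds (mongodb04, mongodb05) could be reached.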



 Comments   
Comment by Raja MM [ 06/Dec/11 ]

The mongos may be down on your server.

Comment by Greg Studer [ 22/Nov/11 ]

Just noticed that this looks like a dup of SERVER-3040 - the messages seem normal for waiting for replica set, if the rs master was down at that point, as seems to be the case. Let me know if you have any more info here though...

Comment by Michael Conigliaro [ 23/Jun/11 ]

I'm pretty sure this went on for more than a few minutes, because I doubt I would have noticed otherwise. Unfortunately, I don't remember which server this occurred on, so it would be tough to find the log again. However, I'm pretty sure that the above messages were simply repeated over and over. I just grabbed the top part of the log (from when the server was started) so you could see everything that happened before the errors.

All servers should be identical (they're managed via Chef).

Comment by Greg Studer [ 21/Jun/11 ]

Was this a repeatable event? Did you try connecting multiple times, and were the mongos and your interaction with it only active for ~1 minute (or is this just the top of the log file)? From what's here, it seems like mongos is just waiting for a refresh of the seed data from the config server, which takes a minute or more. It's a different story if this mongos had been running for a long time in this state.

Also, do you have any special configuration set up for mongodb04/05, or are all your servers identical?
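
One rough way to answer the "how long was it in this state" question from a log file is sketched below. It measures, per replica set, the time between the first and last `_check : <set>/` line that has an empty host list. This is only an illustrative sketch: the function name `stuck_duration` is hypothetical, the patterns are assumed from the log lines in the description, and because the log timestamps carry no year, the year (2011, from the issue's creation date) is passed in as an assumption.

```python
import re
from datetime import datetime

# Patterns assumed from the mongos.log format quoted in the description.
TS_RE = re.compile(r"^\w{3} (\w{3} +\d+ \d\d:\d\d:\d\d) ")
EMPTY_CHECK_RE = re.compile(r"_check : ([^/\s]+)/\s*$")


def stuck_duration(log_lines, year=2011):
    """Seconds between the first and last empty-host '_check' line, per set name."""
    seen = {}
    for line in log_lines:
        ts = TS_RE.match(line)
        chk = EMPTY_CHECK_RE.search(line)
        if not (ts and chk):
            continue
        # The log omits the year, so it is supplied by the caller.
        when = datetime.strptime(f"{year} {ts.group(1)}", "%Y %b %d %H:%M:%S")
        first, _ = seen.get(chk.group(1), (when, when))
        seen[chk.group(1)] = (first, when)
    return {name: (last - first).total_seconds()
            for name, (first, last) in seen.items()}


# Empty-host lines for set 2, copied from the description.
sample = [
    "Mon Jun 20 21:56:48 [Balancer] _check : 2/",
    "Mon Jun 20 21:56:59 [ReplicaSetMonitorWatcher] _check : 2/",
    "Mon Jun 20 21:57:21 [ReplicaSetMonitorWatcher] _check : 2/",
    "Mon Jun 20 21:57:43 [ReplicaSetMonitorWatcher] _check : 2/",
]
print(stuck_duration(sample))
```

For the excerpt posted here the span is under a minute, which is why the full log would be needed to tell a slow startup apart from a mongos that stayed stuck.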
