[SERVER-6178] Cannot use mongos if subset of config servers can't read from or write to disk Created: 22/Jun/12  Updated: 11/Jul/16  Resolved: 05/Jul/12

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 2.0.1, 2.0.2, 2.0.3, 2.0.4, 2.0.5, 2.0.6
Fix Version/s: 2.0.7, 2.2.0-rc0

Type: Bug Priority: Major - P3
Reporter: Matthew Barlocker Assignee: Greg Studer
Resolution: Done Votes: 0
Labels: configsrv, mongos
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Any


Issue Links:
Related
related to SERVER-5064 mongos can't start when one config se... Closed
is related to SERVER-6313 config server timeouts not used in al... Closed
Operating System: Linux
Participants:

 Description   

This bug does not normally affect our MongoDB deployment. However, when AWS lost power to one of our EBS volumes, it became very apparent that we could not start any new mongos processes, so our production system came down.

Basics

While it is not easy to get AWS to lose power to EBS volumes, it is very easy to reproduce this bug using NFS and iptables. We'll use one NFS server and one NFS client. The client will run all instances of mongod and mongos. The NFS server will host a single share that the client will use as the data directory for a single config server.

NFS Server Setup

sudo apt-get install nfs-kernel-server
sudo mkdir /srv/nfs/mongo
sudo vi /etc/exports

# /etc/exports
/srv/nfs/mongo <IP of NFS client>/32(rw,sync,no_subtree_check,no_root_squash)

sudo /etc/init.d/nfs-kernel-server restart

NFS Client Setup

sudo apt-get install nfs-common
sudo mkdir -p /nfs/mongo
sudo vi /etc/fstab

<IP of NFS server>:/srv/nfs/mongo /nfs/mongo nfs4 _netdev,auto 0 0

sudo mount /nfs/mongo

Mongo Setup (on same server as NFS Client)

sudo mkdir /db/a1
sudo mkdir /db/a2
sudo mkdir /db/a3
sudo mkdir /db/b1
sudo mkdir /db/b2
sudo mkdir /db/b3
sudo mkdir /db/c1
sudo ln -s /nfs/mongo/c2 /db/c2
sudo mkdir /db/c3

sudo mkdir /var/run/mongo
sudo mkdir /db/logs

/usr/bin/mongod --configsvr --smallfiles --fork --port 27050 --dbpath /db/c1 --logpath /db/logs/c1.log --logappend --pidfilepath /var/run/mongo/c1.pid --maxConns 1024
/usr/bin/mongod --configsvr --smallfiles --fork --port 27051 --dbpath /db/c2 --logpath /db/logs/c2.log --logappend --pidfilepath /var/run/mongo/c2.pid --maxConns 1024
/usr/bin/mongod --configsvr --smallfiles --fork --port 27052 --dbpath /db/c3 --logpath /db/logs/c3.log --logappend --pidfilepath /var/run/mongo/c3.pid --maxConns 1024

/usr/bin/mongod --shardsvr --smallfiles --fork --port 27150 --dbpath /db/a1 --logpath /db/logs/a1.log --logappend --pidfilepath /var/run/mongo/a1.pid --maxConns 1024 --replSet a
/usr/bin/mongod --shardsvr --smallfiles --fork --port 27151 --dbpath /db/a2 --logpath /db/logs/a2.log --logappend --pidfilepath /var/run/mongo/a2.pid --maxConns 1024 --replSet a
/usr/bin/mongod --shardsvr --smallfiles --fork --port 27152 --dbpath /db/a3 --logpath /db/logs/a3.log --logappend --pidfilepath /var/run/mongo/a3.pid --maxConns 1024 --replSet a

/usr/bin/mongod --shardsvr --smallfiles --fork --port 27250 --dbpath /db/b1 --logpath /db/logs/b1.log --logappend --pidfilepath /var/run/mongo/b1.pid --maxConns 1024 --replSet b
/usr/bin/mongod --shardsvr --smallfiles --fork --port 27251 --dbpath /db/b2 --logpath /db/logs/b2.log --logappend --pidfilepath /var/run/mongo/b2.pid --maxConns 1024 --replSet b
/usr/bin/mongod --shardsvr --smallfiles --fork --port 27252 --dbpath /db/b3 --logpath /db/logs/b3.log --logappend --pidfilepath /var/run/mongo/b3.pid --maxConns 1024 --replSet b

sleep 10

echo "rs.initiate({_id: 'a', members: [{_id: 0, host: 'localhost:27150', priority: 2},{_id: 1, host: 'localhost:27151', priority: 1},{_id: 2, host: 'localhost:27152', priority: 0}]})" | mongo localhost:27150
echo "rs.initiate({_id: 'b', members: [{_id: 0, host: 'localhost:27250', priority: 2},{_id: 1, host: 'localhost:27251', priority: 1},{_id: 2, host: 'localhost:27252', priority: 0}]})" | mongo localhost:27250

sleep 30

echo "db.runCommand({addshard: 'a/localhost:27150'})" | mongo admin
echo "db.runCommand({addshard: 'b/localhost:27250'})" | mongo admin

In a different terminal (one that can be tied up):

/usr/bin/mongos --configdb localhost:27050,localhost:27051,localhost:27052 --fork --logpath /var/log/mongos.log --logappend --port 27017 --maxConns 1024

Notice that mongos starts normally.

Baseline

Connect, using mongo, to the mongos process. Insert some items. Find some items. Do whatever. Notice it all works as expected.

Kill the storage associated with one of the mongod config servers. On the NFS Server:

sudo iptables -I INPUT -s <IP of NFS client>/32 -j DROP

Connect, reconnect, etc. using the mongos process. Notice it all still works as expected.

Bug Manifestation

Kill the mongos process (Ctrl-C should be fine). After it's down, start it up again using the same command as before.

/usr/bin/mongos --configdb localhost:27050,localhost:27051,localhost:27052 --fork --logpath /var/log/mongos.log --logappend --port 27017 --maxConns 1024

Notice that mongos will hang for a minute, and then die.

Expected Outcome

Even though mongos connected successfully to the config server with the downed data store, it should time out on its operations and treat that config server as a downed server; this should result in a successful start of mongos.
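The expected startup behavior above can be sketched as a toy Python model (illustrative names only, not mongos internals): probe each config server with a bounded timeout, count a hung server as down rather than fatal, and succeed as long as at least one config server answers.

```python
# Toy model of the expected startup check (not mongos source code):
# a per-host failure, including a read timeout from a hung disk, marks
# that config server as down instead of aborting startup.

def startup_check(hosts, probe):
    """probe(host) -> True if reachable; may raise TimeoutError/OSError."""
    reachable, down = [], []
    for host in hosts:
        try:
            if probe(host):
                reachable.append(host)
            else:
                down.append(host)
        except (TimeoutError, OSError):
            # A config server whose storage hangs looks like a timeout
            # here; treat it as a downed server and keep going.
            down.append(host)
    if not reachable:
        raise RuntimeError("no config servers reachable")
    return reachable, down
```

With a probe that times out on one of three hosts, startup still succeeds with the remaining two, which is the outcome this ticket asks for.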



 Comments   
Comment by Greg Studer [ 05/Jul/12 ]

Created a new ticket to track the changes to config server timeouts once mongos has successfully started, linked above.

Comment by Matthew Barlocker [ 28/Jun/12 ]

Yes, as far as I can remember that message looks similar to the one I ran into.

Comment by auto [ 28/Jun/12 ]

Author: Greg Studer <greg@10gen.com>, 2012-06-28T15:22:10-07:00

Message: SERVER-6178 catch dbexceptions not socket exceptions during startup consistency check

Backport of commit 29253bec3ba365668d503ca015c4e9a7f4cc3f0d .

Signed-off-by: Tad Marshall <tad@10gen.com>
Branch: v2.0
https://github.com/mongodb/mongo/commit/9dc4c3fb5b7c1851812752d548fd8813afddde26

Comment by auto [ 28/Jun/12 ]

Author: Greg Studer <greg@10gen.com>, 2012-06-28T12:32:50-07:00

Message: SERVER-6178 catch dbexceptions not socket exceptions during startup consistency check

blah
Branch: master
https://github.com/mongodb/mongo/commit/29253bec3ba365668d503ca015c4e9a7f4cc3f0d

Comment by Greg Studer [ 28/Jun/12 ]

A secondary issue is that certain connections to the config server don't time out after 30s. This leads to problems: it is possible to connect to the mongos (once it starts up, which takes ~2 mins), but certain operations can hang when trying to reload config data.
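The missing receive timeout can be illustrated with a short Python sketch (again illustrative, not the mongos C++ networking code): without a timeout, a recv() on a peer that accepts connections but never replies blocks forever; with one set, the caller gets control back and can mark the server as down.

```python
import socket

# Demonstrates the kind of receive timeout the comment says is missing
# on some config-server connections: a peer that accepts the connection
# but never sends data would otherwise block recv() indefinitely.

def recv_with_timeout(sock, nbytes, timeout):
    """Return up to nbytes from sock, or None if nothing arrives in time."""
    sock.settimeout(timeout)
    try:
        return sock.recv(nbytes)
    except socket.timeout:
        return None  # caller can treat the silent server as down
```

A silent peer (simulated with one half of a socketpair) makes the call return None after the timeout instead of hanging, which is the behavior the ticket argues all config-server connections should have.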

Comment by Greg Studer [ 28/Jun/12 ]

The issue above seems to be that we catch SocketExceptions, not DBExceptions, in checkConfigServersConsistent in mongos main. If NFS is down, the hung config server produces a cursor error (a DBException), which slips through the handler and terminates mongos.

Mongos does start up after changing this to catch DBException. However, since the affected config server continues to accept new connections (it just doesn't return data after accepting), the startup process still incurs repeated 30s timeouts and is very slow. Avoiding that would require some way of marking the server as "bad" even when it responds successfully to non-disk operations, which is tricky.
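The catch-clause change described in this comment can be modeled with a toy Python class hierarchy (not the actual C++ source): SocketException derives from DBException, so a handler for the base class also covers cursor errors that are DBExceptions but not socket errors.

```python
# Toy model of the fix: before the change, the consistency check caught
# only SocketException, so a cursor-level DBException (thrown when the
# NFS-backed config server hangs) escaped and killed mongos. Catching
# the base DBException covers both cases.

class DBException(Exception):
    pass

class SocketException(DBException):
    pass

def check_config_servers_consistent(op):
    """Run one consistency-check operation; survive any DB-level error."""
    try:
        op()
    except DBException:  # previously: except SocketException
        return "skipped unreachable config server"
    return "ok"
```

Both a plain DBException from a cursor and a SocketException are now absorbed, so startup proceeds instead of terminating.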

Comment by Greg Studer [ 28/Jun/12 ]

Note - the problem is not reproduced if all packets to the config server (port 3000x) are simply dropped; it is only reproduced if packets to NFS (port 2049) are dropped.

Comment by Greg Studer [ 28/Jun/12 ]

Hmm... tried to reproduce on my end (Ubuntu 10.10). While I'm able to successfully hang the config server, I'm not able to reproduce the mongos crash when the hanging config server is second in the mongos list. I am able to reproduce a seemingly similar error when the downed config server is first in the list:

Thu Jun 28 14:49:08 /home/greg/Downloads/mongodb-linux-x86_64-2.0.4/bin/mongos db version v2.0.4, pdfile version 4.5 starting (--help for usage)
Thu Jun 28 14:49:08 git version: 329f3c47fe8136c03392c8f0e548506cb21f8ebf
Thu Jun 28 14:49:08 build info: Linux ip-10-110-9-236 2.6.21.7-2.ec2.v1.2.fc8xen #1 SMP Fri Nov 20 17:48:28 EST 2009 x86_64 BOOST_LIB_VERSION=1_41
Thu Jun 28 14:49:08  config string : localhost:30001,localhost:30000,localhost:30002
Thu Jun 28 14:49:08 creating new connection to:localhost:30001
Thu Jun 28 14:49:08 BackgroundJob starting: ConnectBG
Thu Jun 28 14:49:08 connected connection!
 
(wait 30 secs)
 
Thu Jun 28 14:49:38 Socket recv() timeout  127.0.0.1:30001
Thu Jun 28 14:49:38 SocketException: remote: 127.0.0.1:30001 error: 9001 socket exception [3] server [127.0.0.1:30001] 
Thu Jun 28 14:49:38 DBClientCursor::init call() failed
Thu Jun 28 14:49:38 User Assertion: 10276:DBClientBase::findN: transport error: localhost:30001 query: { getlasterror: 1 }
uncaught exception in mongos main:
10276 DBClientBase::findN: transport error: localhost:30001 query: { getlasterror: 1 }
 CursorCache at shutdown -  sharded: 0 passthrough: 0

Is this similar to what you saw?

Comment by Matthew Barlocker [ 27/Jun/12 ]

Unfortunately, I shut down the servers that I used to duplicate the issue, and the logs have already rotated out for my production servers. My only way of getting the logs is to reproduce using the steps given above.

Comment by Greg Studer [ 27/Jun/12 ]

We'll start trying to reproduce this issue on our side - thanks for the detailed bug report. There's plenty of information for us to get started, but it would also be helpful if you could post a sample mongos log (especially at high verbosity, -vvvvv), if it's easy for you to do so, to verify that we're reproducing the same problem.

Generated at Thu Feb 08 03:10:57 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.