[SERVER-14863] Mongos ReplicaSetMonitorWatcher continues to monitor drained/removed shard Created: 12/Aug/14  Updated: 06/Dec/22  Resolved: 25/Jul/19

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 2.4.10, 2.6.3, 2.7.5
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Victor Hooi Assignee: [DO NOT USE] Backlog - Sharding Team
Resolution: Done Votes: 9
Labels: ShardingRoughEdges
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
Assigned Teams:
Sharding
Operating System: ALL
Steps To Reproduce:

Create a sharded cluster with multiple mongos processes connected to it. For example, using mlaunch:

mlaunch init --sharded 2 --replicaset 3 --mongos 4
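
Before continuing, you can optionally confirm that both shards registered with the cluster (the names shard01 and shard02 below assume mlaunch's default naming and ports):

> ./mongo --port 27017
mongos> sh.status()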

Enable sharding on a database and shard a collection:

> ./mongo
MongoDB shell version: 2.7.5-pre-
connecting to: test
mongos> sh.enableSharding("foo")
{ "ok" : 1 }
mongos> sh.shardCollection("foo.dummydata", {name: "hashed"})
{ "collectionsharded" : "foo.dummydata", "ok" : 1 }

Insert some dummy data:

mongos> for (var i = 1; i <= 2000; i++) db.dummydata.insert( { name : i, foo: "Lorem ipsum dolor sic amet." } )
WriteResult({ "nInserted" : 1 })

Increase the log verbosity on each of your mongos processes to level 4:

> ./mongo --port 27017
MongoDB shell version: 2.7.5-pre-
connecting to: 127.0.0.1:27017/test
mongos> use admin
switched to db admin
mongos> db.runCommand({setParameter:1, logLevel: 4})
{ "was" : 0, "ok" : 1 }
mongos> quit()
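
If you want to confirm the new verbosity before quitting (the D NETWORK "checking replica set" lines only appear at elevated log levels), you can read the parameter back in the same session:

mongos> db.adminCommand({getParameter: 1, logLevel: 1})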

Start the draining process:

mongos> use admin
switched to db admin
mongos> db.runCommand({removeShard:"shard02"})
{
        "msg" : "draining started successfully",
        "state" : "started",
        "shard" : "shard02",
        "ok" : 1
}
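
While draining is in progress, an easy way to track it (not part of the original steps) is to re-issue the same command, which reports state "ongoing" along with the remaining chunk count, or to count the chunks still owned by the shard in the config database:

mongos> db.runCommand({removeShard: "shard02"})
mongos> use config
switched to db config
mongos> db.chunks.count({shard: "shard02"})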

After the chunks have finished draining, run removeShard a second time to complete the removal of the shard:

mongos> use admin
switched to db admin
mongos> db.runCommand({removeShard: "shard02"})
{
        "msg" : "removeshard completed successfully",
        "state" : "completed",
        "shard" : "shard02",
        "ok" : 1
}
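
At this point the shard should be gone from the cluster metadata, which any mongos can confirm via listShards (an optional check; only shard01 should be listed):

mongos> db.adminCommand({listShards: 1})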

Run flushRouterConfig on each mongos:

> ./mongo --port 27018
MongoDB shell version: 2.7.5-pre-
connecting to: 127.0.0.1:27018/test
mongos> use admin
switched to db admin
mongos> db.adminCommand({flushRouterConfig: 1})
{ "flushed" : true, "ok" : 1 }
mongos> quit()
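
Separately from the config metadata, each mongos keeps its own map of replica-set connection strings. As an optional check from the same admin session, getShardMap shows whether this particular mongos still holds an entry for shard02 after the flush:

mongos> db.adminCommand("getShardMap")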

The mongos on which you performed the drain now checks only shard01:

> grep "checking replica set" mongos_27017.log | tail -n 10
2014-08-12T12:20:03.373+1000 D NETWORK  [ReplicaSetMonitorWatcher] checking replica set: shard01
2014-08-12T12:20:13.377+1000 D NETWORK  [ReplicaSetMonitorWatcher] checking replica set: shard01
2014-08-12T12:20:23.382+1000 D NETWORK  [ReplicaSetMonitorWatcher] checking replica set: shard01
2014-08-12T12:20:33.387+1000 D NETWORK  [ReplicaSetMonitorWatcher] checking replica set: shard01
2014-08-12T12:20:43.392+1000 D NETWORK  [ReplicaSetMonitorWatcher] checking replica set: shard01
2014-08-12T12:20:53.396+1000 D NETWORK  [ReplicaSetMonitorWatcher] checking replica set: shard01
2014-08-12T12:21:03.401+1000 D NETWORK  [ReplicaSetMonitorWatcher] checking replica set: shard01
2014-08-12T12:21:13.406+1000 D NETWORK  [ReplicaSetMonitorWatcher] checking replica set: shard01
2014-08-12T12:21:23.413+1000 D NETWORK  [ReplicaSetMonitorWatcher] checking replica set: shard01
2014-08-12T12:21:33.417+1000 D NETWORK  [ReplicaSetMonitorWatcher] checking replica set: shard01

However, the other mongos processes will still have a ReplicaSetMonitorWatcher checking for shard02:

> grep "checking replica set" mongos_27018.log | tail -n 10
2014-08-12T12:22:33.512+1000 D NETWORK  [ReplicaSetMonitorWatcher] checking replica set: shard02
2014-08-12T12:22:33.515+1000 D NETWORK  [ReplicaSetMonitorWatcher] checking replica set: shard01
2014-08-12T12:22:43.520+1000 D NETWORK  [ReplicaSetMonitorWatcher] checking replica set: shard02
2014-08-12T12:22:43.523+1000 D NETWORK  [ReplicaSetMonitorWatcher] checking replica set: shard01
2014-08-12T12:22:53.527+1000 D NETWORK  [ReplicaSetMonitorWatcher] checking replica set: shard02
2014-08-12T12:22:53.535+1000 D NETWORK  [ReplicaSetMonitorWatcher] checking replica set: shard01
2014-08-12T12:23:03.539+1000 D NETWORK  [ReplicaSetMonitorWatcher] checking replica set: shard02
2014-08-12T12:23:03.541+1000 D NETWORK  [ReplicaSetMonitorWatcher] checking replica set: shard01
2014-08-12T12:23:13.546+1000 D NETWORK  [ReplicaSetMonitorWatcher] checking replica set: shard02
2014-08-12T12:23:13.549+1000 D NETWORK  [ReplicaSetMonitorWatcher] checking replica set: shard01

After a restart, the affected mongos processes no longer monitor the removed shard:

> mlaunch stop 27018
1 node stopped.
> mlaunch start 27018
launching: /Users/victorhooi/code/mongo/mongos on port 27018
> cd data/mongos/
> grep "checking replica set" mongos_27018.log | tail -n 10
2014-08-12T14:01:30.956+1000 D NETWORK  [ReplicaSetMonitorWatcher] checking replica set: shard01
2014-08-12T14:01:40.962+1000 D NETWORK  [ReplicaSetMonitorWatcher] checking replica set: shard01
2014-08-12T14:01:50.974+1000 D NETWORK  [ReplicaSetMonitorWatcher] checking replica set: shard01
2014-08-12T14:02:00.980+1000 D NETWORK  [ReplicaSetMonitorWatcher] checking replica set: shard01
2014-08-12T14:02:10.985+1000 D NETWORK  [ReplicaSetMonitorWatcher] checking replica set: shard01
2014-08-12T14:02:20.991+1000 D NETWORK  [ReplicaSetMonitorWatcher] checking replica set: shard01
2014-08-12T14:02:30.997+1000 D NETWORK  [ReplicaSetMonitorWatcher] checking replica set: shard01
2014-08-12T14:02:41.002+1000 D NETWORK  [ReplicaSetMonitorWatcher] checking replica set: shard01
2014-08-12T14:02:51.007+1000 D NETWORK  [ReplicaSetMonitorWatcher] checking replica set: shard01
2014-08-12T14:03:01.014+1000 D NETWORK  [ReplicaSetMonitorWatcher] checking replica set: shard01


Description

We have a MongoDB sharded cluster with two shards. We have multiple mongos processes connected to this cluster.

Through one of the mongos processes, we initiate a drain and removal of one of the shards. We also run flushRouterConfig on the other mongos processes.

The other mongos processes continue to have ReplicaSetMonitorWatchers that check for the removed shard. A restart of the mongos seems to be the only way to get it to recognise that the shard has been removed.

I have tested the above behaviour against 2.4.10, 2.6.3, and 2.7.5 (Git version c184143fa4d8a4fdf4fdc684404d4aad3e55794b).



Comments
Comment by Benjamin Caimano (Inactive) [ 25/Jul/19 ]

The ReplicaSetMonitorWatcher no longer exists.

Comment by David Murphy [ 18/Aug/14 ]

I think we should approach this more globally. If the topology of the cluster's nodes changes (that is, removeShard or addShard was executed), it should bump the config version in such a way that every mongos is forced to reload the config. That would prevent only some mongos processes from knowing about the change, right?

Comment by Greg Studer [ 13/Aug/14 ]

Well, rejecting connections via a firewall approach wouldn't require any processes to be stopped, but it isn't very elegant.

Comment by Victor Hooi [ 13/Aug/14 ]

greg_10gen Thanks for that. I can confirm that if you take all of shard02 down (i.e. stop the entire replica set), the mongos does stop monitoring it after a while. I stopped the replica set at 13:50 (GMT+10). Below are the mongos logs from afterwards:

2014-08-13T13:51:18.166+1000 W          [ReplicaSetMonitorWatcher] No primary detected for set shard02
...
2014-08-13T13:55:28.430+1000 I          [ReplicaSetMonitorWatcher] All nodes for set shard02 are down. This has happened for 25 checks in a row. Polling will stop after 5 more failed checks
...
2014-08-13T13:56:08.475+1000 W          [ReplicaSetMonitorWatcher] No primary detected for set shard02
2014-08-13T13:56:08.476+1000 I          [ReplicaSetMonitorWatcher] All nodes for set shard02 are down. This has happened for 29 checks in a row. Polling will stop after 1 more failed checks
...
2014-08-13T13:56:18.487+1000 D          [ReplicaSetMonitorWatcher] User Assertion: 13328:connection pool: connect failed oswin-rmbp.local:27025 : couldn't connect to server oswin-rmbp.local:27025 (10.211.55.2), connection attempt failed
2014-08-13T13:56:18.487+1000 W          [ReplicaSetMonitorWatcher] No primary detected for set shard02
2014-08-13T13:56:18.487+1000 I          [ReplicaSetMonitorWatcher] All nodes for set shard02 are down. This has happened for 30 checks in a row. Polling will stop after 0 more failed checks
2014-08-13T13:56:18.488+1000 I          [ReplicaSetMonitorWatcher] Replica set shard02 was down for 30 checks in a row. Stopping polled monitoring of the set.
2014-08-13T13:56:18.488+1000 D NETWORK  [ReplicaSetMonitorWatcher] Removing ReplicaSetMonitor for shard02 from replica set table
2014-08-13T13:56:18.488+1000 D NETWORK  [ReplicaSetMonitorWatcher] Removing connections from all pools for host: shard02
2014-08-13T13:56:18.488+1000 D NETWORK  [ReplicaSetMonitorWatcher] checking replica set: shard01
2014-08-13T13:56:18.488+1000 D NETWORK  [ReplicaSetMonitorWatcher] Starting new refresh of replica set shard01
2014-08-13T13:56:18.488+1000 D NETWORK  [ReplicaSetMonitorWatcher] polling for status of connection to 10.211.55.2:27021, no events
2014-08-13T13:56:18.489+1000 D NETWORK  [ReplicaSetMonitorWatcher] Updating host oswin-rmbp.local:27021 based on ismaster reply: { setName: "shard01", setVersion: 1, ismaster: true, secondary: false, hosts: [ "oswin-rmbp.local:27021", "oswin-rmbp.local:27023", "oswin-rmbp.local:27022" ], primary: "oswin-rmbp.local:27021", me: "oswin-rmbp.local:27021", maxBsonObj

So this is another way to do it. However, I suspect this isn't ideal either, and is probably just as intrusive as needing to restart the mongos. Are you aware of any way to remove a shard and stop the monitoring, without needing to restart or terminate any processes?

Comment by Greg Studer [ 12/Aug/14 ]

If the replica set is taken offline (or firewalled), after 5 minutes the mongos should stop monitoring it. Was this the case in your tests?

Agree though that mongos could be smarter, and if the shard is being repurposed it may not be useful to shut it down for 5 minutes before reusing it.
