[SERVER-8273] Mongos view of Replica Set can become stale forever if members change completely between refresh Created: 22/Jan/13  Updated: 19/Apr/18  Resolved: 19/Apr/18

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 2.2.1
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: James Blackburn Assignee: Randolph Tan
Resolution: Done Votes: 1
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Linux RHEL 5.5


Attachments: Text File replicaset-errors.txt     File test.patch    
Operating System: Linux
Participants:

 Description   

The ReplicaSetMonitor refreshes its view of the replica set every 10 seconds, or when a new operation is requested on a replica set connection that had previously errored out. The problem arises when the set's membership changes so completely that none of the current members appear in the member list the ReplicaSetMonitor already holds. The direct consequence is that the monitor has no way to contact any of the new members.
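
The failure mode can be illustrated with a small, self-contained sketch (plain Python, purely conceptual; this is not the mongos C++ code, and the class and host names are illustrative): the monitor can only refresh through hosts it already knows about, so once every known host has been replaced there is nothing left to ask.

# Illustrative only: models why a seed-list-based monitor goes stale forever
# once all of its known members have been replaced. Not mongos source code.
class ReplicaSetMonitorSketch(object):
    def __init__(self, seed_hosts):
        # The monitor starts from the seed list recorded in config.shards.
        self.known_hosts = set(seed_hosts)

    def refresh(self, reachable_members):
        # Refresh works by asking a *known* host for the current member list.
        contactable = self.known_hosts & set(reachable_members)
        if not contactable:
            # None of the hosts we know about exist any more, so we can
            # never learn the new member list: the view is stale forever.
            return False
        self.known_hosts = set(reachable_members)
        return True

monitor = ReplicaSetMonitorSketch({"cn54:27118", "cn55:27118"})
# The set is reconfigured so that *every* member name changes at once:
live_members = {"cn54-ib:27118", "cn55-ib:27118"}
print(monitor.refresh(live_members))  # False: no overlap, refresh cannot proceed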

The only workaround for this issue is to manually edit the config.shards collection and restart all mongos processes.
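
As a rough sketch of that workaround, assuming a current pymongo and the host names from this report (not an endorsed procedure; see the caution in the comments about hand-editing config.shards):

from pymongo import MongoClient

# Hypothetical sketch of the manual workaround described above. Host names are
# taken from this report; adjust for your own cluster, and treat this as a
# last resort since hand-editing config.shards is easy to get wrong.
config = MongoClient("cn53:27117")   # one of the legacy (SCCC) config servers

shard = config.config.shards.find_one({"_id": "rs1"})
print("before:", shard["host"])      # e.g. "rs1/cn54:27118,cn55:27118"

config.config.shards.update_one(
    {"_id": "rs1"},
    {"$set": {"host": "rs1/cn54-ib:27118,cn55-ib:27118"}},
)

# With three SCCC config servers the same edit has to be applied to each of
# them, and every mongos must then be restarted so it rebuilds its monitors.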

Attached is a test script, test.patch, that demonstrates this problem.

Original bug report:

We had an issue whereby our mongo config servers didn't notice when the host names in one shard were changed.

We have two shards:
rs0: cn14:27118,cn53:27118
rs1: cn54:27118,cn55:27118

These were changed in the replicaset to:
rs0: cn14-ib:27118,cn53-ib:27118
rs1: cn54-ib:27118,cn55-ib:27118

the '-ib' interfaces are different interfaces on the host with the same name (infiniband).
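
For context, a rename like this is normally applied through a replica set reconfig. Below is a minimal sketch of that step, assuming pymongo against the primary of this 2.2-era set (where the active config is stored in local.system.replset); it is illustrative and not necessarily how the change was made here.

from pymongo import MongoClient

# Hypothetical sketch of how a member rename such as cn54 -> cn54-ib is
# typically applied: rewrite the member host strings in the current config
# and submit a replSetReconfig against the primary.
primary = MongoClient("cn54:27118")

cfg = primary.local.system.replset.find_one({"_id": "rs1"})
for member in cfg["members"]:
    host, port = member["host"].split(":")
    member["host"] = "{0}-ib:{1}".format(host, port)
cfg["version"] += 1                  # every reconfig must bump the config version

primary.admin.command("replSetReconfig", cfg)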

Both replica sets, rs0 and rs1, appeared to be healthy and in sync. However, only rs0 was updated in the config servers' shards collection!

The entire cluster was rebooted over the weekend, and two days later the config.shards collection had still not learned the new hostnames of rs1. Killing and restarting the config servers and mongos processes since then hasn't helped either.

That said, the cluster appeared to work fine until we enabled sharding on a database. At that point the mongos processes and pymongo clients started failing (see the attached assertions and backtraces).

The error seen in pymongo:

  File "/users/is/ahlpypi/egg_cache/p/pymongo-2.4.1_1-py2.6-linux-x86_64.egg/pymongo/cursor.py", line 814, in next
    if len(self.__data) or self._refresh():
  File "/users/is/ahlpypi/egg_cache/p/pymongo-2.4.1_1-py2.6-linux-x86_64.egg/pymongo/cursor.py", line 763, in _refresh
    self.__uuid_subtype))
  File "/users/is/ahlpypi/egg_cache/p/pymongo-2.4.1_1-py2.6-linux-x86_64.egg/pymongo/cursor.py", line 720, in __send_message
    self.__uuid_subtype)
  File "/users/is/ahlpypi/egg_cache/p/pymongo-2.4.1_1-py2.6-linux-x86_64.egg/pymongo/helpers.py", line 104, in _unpack_response
    error_object["$err"])
OperationFailure: database error: can't find shard for: cn54-ib:27118

The error seen on the mongos:

Tue Jan 22 15:29:17 [conn270] warning: db exception when initializing on rs1:rs1/cn54:27118,cn55:27118, current connection state is { state: { conn: "rs1/cn54-ib:27118,cn55-ib:27118", vinfo: "mongoose.centaur @ 6|1||50fe9e767e7521213b281407", cursor: "(none)", count: 0, done: false }, retryNext: false, init: false, finish: false, errored: false } :: caused by :: 13129 can't find shard for: cn54-ib:27118
Tue Jan 22 15:29:17 [conn270] AssertionException while processing op type : 2004 to : mongoose.centaur :: caused by :: 13129 can't find shard for: cn54-ib:27118
Tue Jan 22 15:29:19 [conn270] ERROR: can't get TCP_KEEPINTVL: errno:92 Protocol not available
Tue Jan 22 15:29:19 [conn270] Assertion: 13129:can't find shard for: cn54-ib:27118
0x80de91 0x7d76e9 0x7d786c 0x7688a8 0x763e64 0x769f76 0x76c21a 0x7704ec 0x56b7df 0x58249a 0x5889d9 0x7814a2 0x75a81b 0x4ffe41 0x7fc0b1 0x3e1c60673d 0x3e1bed3f6d
 /opt/ahl/releases/mongodb/2.2.1-1.ahl/bin/mongos(_ZN5mongo15printStackTraceERSo+0x21) [0x80de91]
 /opt/ahl/releases/mongodb/2.2.1-1.ahl/bin/mongos(_ZN5mongo11msgassertedEiPKc+0x99) [0x7d76e9]
 /opt/ahl/releases/mongodb/2.2.1-1.ahl/bin/mongos [0x7d786c]
 /opt/ahl/releases/mongodb/2.2.1-1.ahl/bin/mongos(_ZN5mongo15StaticShardInfo4findERKSs+0x358) [0x7688a8]
 /opt/ahl/releases/mongodb/2.2.1-1.ahl/bin/mongos(_ZN5mongo5Shard5resetERKSs+0x34) [0x763e64]
 /opt/ahl/releases/mongodb/2.2.1-1.ahl/bin/mongos(_ZN5mongo17checkShardVersionEPNS_12DBClientBaseERKSsN5boost10shared_ptrIKNS_12ChunkManagerEEEbi+0x906) [0x769f76]
 /opt/ahl/releases/mongodb/2.2.1-1.ahl/bin/mongos(_ZN5mongo14VersionManager19checkShardVersionCBEPNS_15ShardConnectionEbi+0x6a) [0x76c21a]
 /opt/ahl/releases/mongodb/2.2.1-1.ahl/bin/mongos(_ZN5mongo15ShardConnection11_finishInitEv+0xfc) [0x7704ec]
 /opt/ahl/releases/mongodb/2.2.1-1.ahl/bin/mongos(_ZN5mongo27ParallelSortClusteredCursor28setupVersionAndHandleSlaveOkEN5boost10shared_ptrINS_23ParallelConnectionStateEEERKNS_5ShardENS2_IS5_EERKNS_15NamespaceStringERKSsNS2_IKNS_12ChunkManagerEEE+0x19f) [0x56b7df]
 /opt/ahl/releases/mongodb/2.2.1-1.ahl/bin/mongos(_ZN5mongo27ParallelSortClusteredCursor9startInitEv+0xdea) [0x58249a]
 /opt/ahl/releases/mongodb/2.2.1-1.ahl/bin/mongos(_ZN5mongo27ParallelSortClusteredCursor8fullInitEv+0x9) [0x5889d9]
 /opt/ahl/releases/mongodb/2.2.1-1.ahl/bin/mongos(_ZN5mongo13ShardStrategy7queryOpERNS_7RequestE+0x472) [0x7814a2]
 /opt/ahl/releases/mongodb/2.2.1-1.ahl/bin/mongos(_ZN5mongo7Request7processEi+0x1fb) [0x75a81b]
 /opt/ahl/releases/mongodb/2.2.1-1.ahl/bin/mongos(_ZN5mongo21ShardedMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortEPNS_9LastErrorE+0x71) [0x4ffe41]
 /opt/ahl/releases/mongodb/2.2.1-1.ahl/bin/mongos(_ZN5mongo3pms9threadRunEPNS_13MessagingPortE+0x411) [0x7fc0b1]
 /lib64/libpthread.so.0 [0x3e1c60673d]
 /lib64/libc.so.6(clone+0x6d) [0x3e1bed3f6d]
Tue Jan 22 15:29:19 [conn270] warning: db exception when initializing on rs1:rs1/cn54:27118,cn55:27118, current connection state is { state: { conn: "rs1/cn54-ib:27118,cn55-ib:27118", vinfo: "mongoose.centaur @ 6|1||50fe9e767e7521213b281407", cursor: "(none)", count: 0, done: false }, retryNext: false, init: false, finish: false, errored: false } :: caused by :: 13129 can't find shard for: cn54-ib:27118
Tue Jan 22 15:29:19 [conn270] AssertionException while processing op type : 2004 to : mongoose.centaur :: caused by :: 13129 can't find shard for: cn54-ib:27118

We ended up fixing this by manually changing the rs1 location in the config.shards collection.

According to this newsgroup posting by Eliot, this shouldn't be necessary:
https://groups.google.com/d/topic/mongodb-user/9L3LM5nK5aI/discussion



 Comments   
Comment by Gregory McKeon (Inactive) [ 19/Apr/18 ]

This is also an issue in drivers. We don't see a good solution beyond using SRV records in 3.6, so we're closing this as Works as Designed (WaD) for now.

Comment by Randolph Tan [ 27/Feb/13 ]

My apologies, I got sidetracked when I found the issue. You are correct that the ERROR log message was relevant. It only appears when mongos tries to update the config server with a new seed list. Newer mongos versions log a more detailed message explaining why the update failed. We don't encourage changing the config.shards collection manually, as it is easy to make mistakes.

Comment by James Blackburn [ 16/Feb/13 ]

"The direct consequence is that the monitor has no way to contact any of the new members."

Note that this isn't the case in this bug report. The new interface cn54-ib is an interface on the same host as cn54, and the mongods are contactable through both the old and the new DNS names. Looking at the logs, there appears to be some internal confusion: the mongos processes clearly see the hosts at the new addresses, they're just unable to update the config db.

Also, we did a lot of googling and couldn't find any documentation on recovering from a situation like this. It would be great if there were a doc saying whether it is safe to change config.shards manually.

Comment by Eliot Horowitz (Inactive) [ 16/Feb/13 ]

Not sure there is any real solution to this.
Lots of hacky partial ones, but nothing that makes sense to me.

Comment by James Blackburn [ 30/Jan/13 ]

It's happened again. We've removed a host from one of the shards, and the config servers haven't noticed.

Meanwhile the mongos logs are full of this:

Wed Jan 30 15:22:07 [WriteBackListener-cn17-ib:27118] WriteBackListener exception : socket exception [CONNECT_ERROR] for cn17-ib
Wed Jan 30 15:22:09 [WriteBackListener-cn17-ib:27118] WriteBackListener exception : socket exception [CONNECT_ERROR] for cn17-ib
Wed Jan 30 15:22:12 [WriteBackListener-cn17-ib:27118] WriteBackListener exception : socket exception [CONNECT_ERROR] for cn17-ib
Wed Jan 30 15:22:16 [WriteBackListener-cn17-ib:27118] WriteBackListener exception : socket exception [CONNECT_ERROR] for cn17-ib
Wed Jan 30 15:22:21 [WriteBackListener-cn17-ib:27118] WriteBackListener exception : socket exception [CONNECT_ERROR] for cn17-ib
Wed Jan 30 15:22:27 [WriteBackListener-cn17-ib:27118] WriteBackListener exception : socket exception [CONNECT_ERROR] for cn17-ib

What's the best way to fix this?

Comment by James Blackburn [ 23/Jan/13 ]

I've attached the output of:

grep -i -E 'replica|rs0|rs1' 

on that log file.

Comment by James Blackburn [ 23/Jan/13 ]

Looking in one of the mongos logs from last Thursday afternoon:

 
Thu Jan 17 16:50:47 [mongosMain] SyncClusterConnection connecting to [cn53:27117]
Thu Jan 17 16:50:47 [mongosMain] SyncClusterConnection connecting to [cn54:27117]
Thu Jan 17 16:50:47 [mongosMain] SyncClusterConnection connecting to [cn55:27117]
Thu Jan 17 16:50:47 [Balancer] about to contact config servers and shards
Thu Jan 17 16:50:47 [mongosMain] waiting for connections on port 27119
Thu Jan 17 16:50:47 [websvr] admin web console waiting for connections on port 28119
Thu Jan 17 16:50:47 [Balancer] SyncClusterConnection connecting to [cn53:27117]
Thu Jan 17 16:50:47 [Balancer] SyncClusterConnection connecting to [cn54:27117]
Thu Jan 17 16:50:47 [Balancer] SyncClusterConnection connecting to [cn55:27117]
Thu Jan 17 16:50:47 [Balancer] starting new replica set monitor for replica set rs0 with seed of cn14:27118,cn53:27118
Thu Jan 17 16:50:47 [Balancer] successfully connected to seed cn14:27118 for replica set rs0
Thu Jan 17 16:50:47 [Balancer] changing hosts to { 0: "cn14-ib:27118", 1: "cn53-ib:27118" } from rs0/
Thu Jan 17 16:50:47 [Balancer] trying to add new host cn14-ib:27118 to replica set rs0
Thu Jan 17 16:50:47 [Balancer] successfully connected to new host cn14-ib:27118 in replica set rs0
Thu Jan 17 16:50:47 [Balancer] trying to add new host cn53-ib:27118 to replica set rs0
Thu Jan 17 16:50:47 [Balancer] successfully connected to new host cn53-ib:27118 in replica set rs0
Thu Jan 17 16:50:47 [Balancer] scoped connection to cn53:27117,cn54:27117,cn55:27117 not being returned to the pool
Thu Jan 17 16:50:47 [Balancer] ERROR: RSChangeWatcher: could not update config db for set: rs0 to: rs0/cn14-ib:27118,cn53-ib:27118
Thu Jan 17 16:50:47 [Balancer] successfully connected to seed cn53:27118 for replica set rs0
Thu Jan 17 16:50:47 [Balancer] Primary for replica set rs0 changed to cn14-ib:27118
Thu Jan 17 16:50:47 [Balancer] replica set monitor for replica set rs0 started, address is rs0/cn14-ib:27118,cn53-ib:27118
Thu Jan 17 16:50:47 [ReplicaSetMonitorWatcher] starting
Thu Jan 17 16:50:47 [Balancer] starting new replica set monitor for replica set rs1 with seed of cn54:27118,cn55:27118
Thu Jan 17 16:50:47 [Balancer] successfully connected to seed cn54:27118 for replica set rs1
Thu Jan 17 16:50:47 [Balancer] changing hosts to { 0: "cn54-ib:27118", 1: "cn55-ib:27118" } from rs1/
Thu Jan 17 16:50:47 [Balancer] trying to add new host cn54-ib:27118 to replica set rs1
Thu Jan 17 16:50:47 [Balancer] successfully connected to new host cn54-ib:27118 in replica set rs1
Thu Jan 17 16:50:47 [Balancer] trying to add new host cn55-ib:27118 to replica set rs1
Thu Jan 17 16:50:47 [Balancer] successfully connected to new host cn55-ib:27118 in replica set rs1
Thu Jan 17 16:50:47 [Balancer] SyncClusterConnection connecting to [cn53:27117]
Thu Jan 17 16:50:47 [Balancer] SyncClusterConnection connecting to [cn54:27117]
Thu Jan 17 16:50:47 [Balancer] SyncClusterConnection connecting to [cn55:27117]
Thu Jan 17 16:50:47 [Balancer] scoped connection to cn53:27117,cn54:27117,cn55:27117 not being returned to the pool
Thu Jan 17 16:50:47 [Balancer] ERROR: RSChangeWatcher: could not update config db for set: rs1 to: rs1/cn54-ib:27118,cn55-ib:27118
Thu Jan 17 16:50:47 [Balancer] successfully connected to seed cn55:27118 for replica set rs1
Thu Jan 17 16:50:47 [Balancer] Primary for replica set rs1 changed to cn54-ib:27118
Thu Jan 17 16:50:47 [Balancer] replica set monitor for replica set rs1 started, address is rs1/cn54-ib:27118,cn55-ib:27118
Thu Jan 17 16:50:47 [Balancer] config servers and shards contacted successfully
Thu Jan 17 16:50:47 [Balancer] balancer id: dlondbahls98.maninvestments.com:27119 started at Jan 17 16:50:47
Thu Jan 17 16:50:47 [Balancer] created new distributed lock for balancer on cn53:27117,cn54:27117,cn55:27117 ( lock timeout : 900000, ping interval : 30000, process : 0 )

Perhaps:

ERROR: RSChangeWatcher: could not update config db for set: rs1 to: rs1/cn54-ib:27118,cn55-ib:27118

is relevant?

Comment by James Blackburn [ 22/Jan/13 ]

Thanks!

The cluster is attached to MMS under the HAL group name. Some of the earlier pings no longer appear, but if you have history of older pings from this morning or before, you can see the problem.

Comment by Scott Hernandez (Inactive) [ 22/Jan/13 ]

Great, thanks for the info and the report. We have made some changes that I believe will improve this in 2.4, but they may not be back-portable, or may not have been back-ported.

I am having one of the devs active on those changes check whether this case was covered by them.

Comment by James Blackburn [ 22/Jan/13 ]

Yes, the replica sets were reconfigured in the middle of last week.

Moreover, the entire cluster was restarted over the weekend for BIOS firmware upgrades.

Today we also killed and restarted the mongos processes a few times, hoping they would pick up the change.

Oddly, it did pick up the change to one of the replica sets last week.

Comment by Scott Hernandez (Inactive) [ 22/Jan/13 ]

Did you restart any mongos instances since the replica set reconfig?
