[SERVER-21182] Fix replica set monitor ownership Created: 22/Oct/15  Updated: 12/Aug/19  Resolved: 12/Aug/19

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Crystal Horn Assignee: Benjamin Caimano (Inactive)
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Participants:

 Description   

Currently, there is a 1-1 relationship between RemoteCommandTargeter and ReplicaSetMonitor. Because of this, if the RSM becomes unusable because none of its hosts are reachable, the targeter will keep using that unusable RSM forever. This may happen if shards are inaccessible in the beginning.

What saves us in this case is that the balancer loop periodically (every 30 sec) reloads the shard registry, which recreates the targeters and installs new RSMs.

We should make the RemoteCommandTargeter own the ReplicaSetMonitor and introduce a polling thread in the ShardRegistry so that we have finer control over its behaviour.
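
To make the proposal concrete, here is a minimal sketch of the ownership idea in Python. It is not the server's C++ code: the class names are borrowed from this ticket, but the `probe` callable, the `usable` flag, and the `find_host`/`refresh` methods are assumptions made up for illustration. The `refresh` method stands in for one iteration of the proposed polling thread.

```python
# Illustrative model only: hypothetical stand-ins, not MongoDB's real C++ types.


class ReplicaSetMonitor:
    """Tracks which hosts of a replica set are reachable."""

    def __init__(self, hosts, probe):
        self.hosts = list(hosts)
        self._probe = probe  # callable(host) -> bool, assumed for this sketch
        self.usable = True

    def scan(self):
        """Probe every host; mark the monitor unusable if none respond."""
        reachable = [h for h in self.hosts if self._probe(h)]
        if not reachable:
            # Models the problem in the description: once unusable, this
            # monitor instance never recovers on its own.
            self.usable = False
        return reachable


class RemoteCommandTargeter:
    """Owns its monitor, so it can replace one that has gone bad."""

    def __init__(self, hosts, probe):
        self._hosts = list(hosts)
        self._probe = probe
        self._monitor = ReplicaSetMonitor(self._hosts, self._probe)

    def find_host(self):
        """Return a reachable host, or None if the monitor cannot supply one."""
        if not self._monitor.usable:
            return None
        reachable = self._monitor.scan()
        return reachable[0] if reachable else None

    def refresh(self):
        """One polling step: install a fresh monitor if the current one is unusable."""
        if not self._monitor.usable:
            self._monitor = ReplicaSetMonitor(self._hosts, self._probe)


# Usage: shards start out unreachable, then one host comes up.
up = set()
targeter = RemoteCommandTargeter(["a:27017", "b:27017"], lambda h: h in up)
first = targeter.find_host()    # no host reachable; monitor becomes unusable
up.add("b:27017")
stuck = targeter.find_host()    # still nothing: the stale monitor is reused
targeter.refresh()              # what a polling thread would do periodically
recovered = targeter.find_host()
```

In this model the targeter recovers as soon as `refresh` runs, without waiting for a registry-wide reload to recreate it. How often a real polling thread should run, and whether it belongs in the ShardRegistry as suggested above, is left open here.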



 Comments   
Comment by Benjamin Caimano (Inactive) [ 12/Aug/19 ]

There is currently one RSM to many RCTs/DBClientRSes, so the relationship is no longer 1-1. The RSM no longer becomes permanently unusable if none of its hosts are reachable. It wouldn't surprise me if it did do that at one point, but it hasn't been the case at least since 3.6. It has also done expedited scanning since early 4.1. I'd say we don't have to worry about this ticket any more.

Comment by Benjamin Caimano (Inactive) [ 25/Jul/19 ]

Stealing to service arch and investigating

Comment by Andy Schwerin [ 09/Nov/15 ]

Kal, please put a more complete description on this and move to 3.1 Desired.

Generated at Thu Feb 08 03:56:34 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.