- Type: Bug
- Resolution: Done
- Priority: Major - P3
- Affects Version/s: None
- Component/s: Sharding
- Fully Compatible
- ALL
- Sharding 2016-09-19
(copied to CRM)
ISSUE DESCRIPTION AND IMPACT
As a workaround for SERVER-23192, MongoDB 3.2.10 introduced an option where a node never stops monitoring a replica set once it has started, no matter how long it appears to be down for. Using this option means you can encounter problems if you remove a shard then add back a shard with the same replica set name.
This parameter is set to false by default and can be changed by executing the following command (shown here setting it to true):
db.adminCommand( {setParameter: 1, 'timeOutMonitoringReplicaSets': true} )
DIAGNOSIS AND AFFECTED VERSIONS
This option is included in MongoDB 3.2.10 and subsequent releases of MongoDB 3.2. Please note that it is not included in MongoDB 3.4.
REMEDIATION AND WORKAROUNDS
If the operator wishes to re-add the shard using different hosts at a later date, the operator has two choices:
- Restart all the affected nodes.
- Toggle the timeOutMonitoringReplicaSets server parameter introduced in SERVER-25516 from false to true on each affected node. Once the shard is discovered, switch timeOutMonitoringReplicaSets back to false. This process usually takes about two minutes.
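The toggle described in the second option can be sketched as a small helper. This is only an illustration under stated assumptions: the function names timeoutMonitoringCommand and toggleTimeoutMonitoring are hypothetical, and the ticket does not say how to detect that the shard has been discovered, so the wait step is left to a caller-supplied function.

```javascript
// Hypothetical helper: builds the setParameter command document shown in this ticket.
function timeoutMonitoringCommand(enabled) {
  return { setParameter: 1, timeOutMonitoringReplicaSets: enabled };
}

// Hypothetical helper: toggles the parameter on a single node.
// `db` is any object exposing adminCommand(), as the mongo shell's `db` does.
// `waitForDiscovery` is supplied by the caller and should block until the
// re-added shard is visible again (an assumption; the ticket only says this
// usually takes about two minutes).
function toggleTimeoutMonitoring(db, waitForDiscovery) {
  const results = [];
  // Step 1: switch the parameter from false to true.
  results.push(db.adminCommand(timeoutMonitoringCommand(true)));
  // Step 2: wait until the shard has been discovered.
  waitForDiscovery();
  // Step 3: switch the parameter back to false.
  results.push(db.adminCommand(timeoutMonitoringCommand(false)));
  return results;
}
```

In a mongo shell session connected to an affected node, this might be invoked as toggleTimeoutMonitoring(db, function () { sleep(120000); }), assuming a fixed two-minute wait is acceptable. It would need to be repeated on each affected node.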
Original description
As a workaround for SERVER-23192 on 3.2 we can introduce an option where we never stop monitoring a replica set once we've started, no matter how long it appears to be down for. Using this option means you can encounter problems if you remove a shard then add back a shard with the same replica set name.
- related to SERVER-23192 mongos and shards will become unusable if contact is lost with all CSRS config server nodes for more than 30 consecutive failed attempts to contact (Closed)