[SERVER-35649] Nodes removed due to isSelf failure should re-attempt to find themselves Created: 18/Jun/18  Updated: 08/Jan/24  Resolved: 21/Sep/20

Status: Closed
Project: Core Server
Component/s: Networking, Replication
Affects Version/s: 3.4.13
Fix Version/s: 4.8.0, 4.4.2, 4.2.13, 4.0.24

Type: New Feature Priority: Minor - P4
Reporter: Owen Allen Assignee: A. Jesse Jiryu Davis
Resolution: Fixed Votes: 3
Labels: former-quick-wins, gm-ack
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: Text File mongod.log    
Issue Links:
Backports
Depends
Related
related to SERVER-41031 After an unreachable node is added an... Open
related to SERVER-54121 Single replica set node removed due t... Open
related to SERVER-62699 Replica set fails to restart after sh... Backlog
related to SERVER-48480 Abort initial sync upon transition to... Closed
related to SERVER-40159 Add retry logic for name resolution f... Closed
is related to SERVER-62699 Replica set fails to restart after sh... Backlog
is related to SERVER-48178 Finding self in reconfig may be inter... Closed
is related to SERVER-51163 Mark nodes returning InvalidReplicaSe... Closed
is related to SERVER-40159 Add retry logic for name resolution f... Closed
Backwards Compatibility: Minor Change
Backport Requested:
v4.4, v4.2, v4.0
Sprint: Repl 2020-08-24, Repl 2020-09-07, Repl 2020-09-21, Repl 2020-10-05
Participants:
Case:
Linked BF Score: 40

 Description   

We have some dev/staging environments which are locally hosted in our office building. They are entirely for internal usage, so their uptime isn't critical. We have recently been experiencing power outages that cause all 3 members of the replica set to go down and then, when power is restored, come back up at the same time. This has happened about 5 times now, and each time the replica set comes back up both the primary and secondary end up in the REMOVED state and never recover unless we manually restart one of the mongod processes.

mongo-dev1 rs.status()

{
        "state" : 10,
        "stateStr" : "REMOVED",
        "uptime" : 199841,
        "optime" : {
                "ts" : Timestamp(1529137449, 1),
                "t" : NumberLong(590)
        },
        "optimeDate" : ISODate("2018-06-16T08:24:09Z"),
        "ok" : 0,
        "errmsg" : "Our replica set config is invalid or we are not a member of it",
        "code" : 93,
        "codeName" : "InvalidReplicaSetConfig"
}

mongo-dev2 rs.status()

{
        "state" : 10,
        "stateStr" : "REMOVED",
        "uptime" : 199879,
        "optime" : {
                "ts" : Timestamp(1529137449, 1),
                "t" : NumberLong(590)
        },
        "optimeDate" : ISODate("2018-06-16T08:24:09Z"),
        "ok" : 0,
        "errmsg" : "Our replica set config is invalid or we are not a member of it",
        "code" : 93,
        "codeName" : "InvalidReplicaSetConfig"
}

mongo-dev1 show log rs

2018-06-16T09:10:25.236+0000 I REPL     [replExecDBWorker-0] New replica set config in use: { _id: "dev_cluster1", version: 140719, protocolVersion: 1, members: [ { _id: 0, host: "mongo-dev1.220office.local:27017", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 2.0, tags: {}, slaveDelay: 0, votes: 1 }, { _id: 3, host: "utility-dev1.220office.local:27017", arbiterOnly: true, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: 0, votes: 1 }, { _id: 4, host: "mongo-dev2.2
2018-06-16T09:10:25.236+0000 I REPL     [replExecDBWorker-0] transition to REMOVED
2018-06-17T21:28:09.139+0000 I REPL     [ReplicationExecutor] Member utility-dev1.220office.local:27017 is now in state ARBITER

mongo-dev1 rs.conf()

{
        "_id" : "dev_cluster1",
        "version" : 140719,
        "protocolVersion" : NumberLong(1),
        "members" : [
                {
                        "_id" : 0,
                        "host" : "mongo-dev1.220office.local:27017",
                        "arbiterOnly" : false,
                        "buildIndexes" : true,
                        "hidden" : false,
                        "priority" : 2,
                        "tags" : {                        },
                        "slaveDelay" : NumberLong(0),
                        "votes" : 1
                },
                {
                        "_id" : 3,
                        "host" : "utility-dev1.220office.local:27017",
                        "arbiterOnly" : true,
                        "buildIndexes" : true,
                        "hidden" : false,
                        "priority" : 1,
                        "tags" : {                        },
                        "slaveDelay" : NumberLong(0),
                        "votes" : 1
                },
                {
                        "_id" : 4,
                        "host" : "mongo-dev2.220office.local:27017",
                        "arbiterOnly" : false,
                        "buildIndexes" : true,
                        "hidden" : false,
                        "priority" : 1,
                        "tags" : {                        },
                        "slaveDelay" : NumberLong(0),
                        "votes" : 1
                }
        ],
        "settings" : {
                "chainingAllowed" : true,
                "heartbeatIntervalMillis" : 2000,
                "heartbeatTimeoutSecs" : 10,
                "electionTimeoutMillis" : 10000,
                "catchUpTimeoutMillis" : 60000,
                "getLastErrorModes" : {                },
                "getLastErrorDefaults" : {
                        "w" : 1,
                        "wtimeout" : 0
                }
        }
}

As I read the documentation, I can't find much information about the REMOVED state. In our setup, since mongo-dev1 has a priority of 2 and mongo-dev2 has a priority of 1, I would expect mongo-dev1 to be elected as the primary after the reboot.

Is this a bug, or are we doing something wrong? If it's a bug, what is the proper procedure when all members of the replica set come online at the same time?



 Comments   
Comment by Githook User [ 11/Mar/21 ]

Author:

{'name': 'Xuerui Fa', 'email': 'xuerui.fa@mongodb.com', 'username': 'XueruiFa'}

Message: SERVER-35649 Retry DNS failures during reconfig

(cherry picked from commit 418c61279986a5eaddc66a16b5e288556ad1f6d3)

SERVER-51163 Mark nodes returning InvalidReplicaSetConfig in heartbeats as down

(cherry picked from commit 28c6948a2a02760a69aaee3875c4b2a427528a5a)
Branch: v4.0
https://github.com/mongodb/mongo/commit/6156b718630344212bea269a28ed5bacf96a9b86

Comment by Steven Vannelli [ 01/Mar/21 ]

Requesting a 4.0 backport to resolve BF-20251

Comment by Githook User [ 04/Feb/21 ]

Author:

{'name': 'A. Jesse Jiryu Davis', 'email': 'jesse@mongodb.com', 'username': 'ajdavis'}

Message: SERVER-35649 Retry DNS failures during reconfig

(cherry picked from commit 418c61279986a5eaddc66a16b5e288556ad1f6d3)
Branch: v4.2
https://github.com/mongodb/mongo/commit/d1871f0fb5f61159aecc76865a4b8a3f5d0159fd

Comment by Githook User [ 15/Oct/20 ]

Author:

{'name': 'A. Jesse Jiryu Davis', 'email': 'jesse@mongodb.com', 'username': 'ajdavis'}

Message: SERVER-35649 Retry DNS failures during reconfig

(cherry picked from commit 418c61279986a5eaddc66a16b5e288556ad1f6d3)

Conflicts:
    etc/backports_required_for_multiversion_tests.yml
    jstests/replsets/disallow_adding_initialized_node1.js

SERVER-51163 Mark nodes returning InvalidReplicaSetConfig in heartbeats as down

(cherry picked from commit 28c6948a2a02760a69aaee3875c4b2a427528a5a)
Branch: v4.4
https://github.com/mongodb/mongo/commit/7b05685be5aae841a73c434af59e321ce8e8db83

Comment by Githook User [ 21/Sep/20 ]

Author:

{'name': 'A. Jesse Jiryu Davis', 'email': 'jesse@mongodb.com', 'username': 'ajdavis'}

Message: SERVER-35649 Retry DNS failures during reconfig
Branch: master
https://github.com/mongodb/mongo/commit/418c61279986a5eaddc66a16b5e288556ad1f6d3

Comment by Nick Brewer [ 29/Jun/18 ]

owenallenaz,

Since DNS infrastructure typically has some level of geographic redundancy, it's not a hard requirement that replica set nodes be able to resolve themselves without DNS - however it's necessary in a case like this where the DNS resolvers are coming online at roughly the same time as (or perhaps slightly after) the mongod nodes.

"Is it possible to have nodes that are in REMOVED state, especially if it's due to isSelf failure, check back rather than requiring manual intervention?"

I'm passing this along to our replication team as a feature request.

Regards,
Nick

Comment by Owen Allen [ 25/Jun/18 ]

In that situation, it seems like it would be considered best practice (possibly called out in the docs) that replset members should be able to locate themselves without DNS.

Is it possible to have nodes that are in REMOVED state, especially if it's due to isSelf failure, check back rather than requiring manual intervention?

Comment by Spencer Brody (Inactive) [ 25/Jun/18 ]

The first time a mongod logs a new replica set config, whether from a reconfig or at startup when loading the config from disk, the first thing it does is try to locate itself in the config. In order to do this, it iterates through every member in the config and runs a simple command (called 'isSelf') against each member, with a unique ID to identify that it is the sender. When the node receives an isSelf command that it can determine was from itself, because it contains the same unique ID that it sent, it then knows which name in the config corresponds to itself.

If the node can't resolve its own hostname during this process, then the node will never be able to send the command to itself, will thus conclude that it isn't part of the config, and will then transition to the REMOVED state. At that point it won't re-attempt to find itself in the config until it is restarted, or it receives a new config via a replSetReconfig.
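
To make the sequence concrete, here is a rough sketch of that lookup loop in shell-style JavaScript. This is illustrative only: the real logic is C++ inside mongod, and sendIsSelf is a placeholder for the network round trip that carries the node's unique ID to each listed member.

function findSelfInConfig(config, myUniqueId, sendIsSelf) {
    for (const member of config.members) {
        let reply;
        try {
            // DNS resolution of member.host happens inside this call; if the
            // node cannot resolve its own hostname, the attempt against its
            // own entry fails just like an attempt against an unreachable peer.
            reply = sendIsSelf(member.host, myUniqueId);
        } catch (e) {
            continue;   // unreachable or unresolvable: try the next member
        }
        if (reply && reply.receivedId === myUniqueId) {
            return member._id;   // this config entry is "me"
        }
    }
    // No member echoed our ID back: the node concludes it is not in the
    // config and transitions to REMOVED, with no retry until it restarts
    // or receives a new config via replSetReconfig.
    return null;
}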

Comment by Nick Brewer [ 25/Jun/18 ]

owenallenaz

"One of our dev's theory was that it had to do with the mongo boxes coming up at the same time that our DNS server came up (since the power outage claimed it too)"

This is a good point - in most of the instances where I've seen a node enter the REMOVED state, it's simply due to its inability to connect to other members of a set, often due to a change in IP address or port. A DNS blackout would have a similar effect. 

To test this a bit more, you could take your resolvers out of the equation entirely and hard-code the correct entries into /etc/hosts, assuming that your nodes have static addresses. I suspect that when the addresses are added in this way, the replica set members will be able to find each other on simultaneous startup. 
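
For example, assuming the nodes have static addresses (the IPs below are placeholders, not the real ones), each member's /etc/hosts could carry entries like:

192.0.2.11   mongo-dev1.220office.local
192.0.2.12   mongo-dev2.220office.local
192.0.2.13   utility-dev1.220office.local

With those entries in place, resolving the other members' hostnames at boot no longer depends on the DNS resolvers being up.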

Nick

Comment by Owen Allen [ 25/Jun/18 ]

Unfortunately I was too slow to respond and our log files have rolled since then. One of our dev's theory was that it had to do with the mongo boxes coming up at the same time that our DNS server came up (since the power outage claimed it too), and since the initial DNS queries would have failed (our replset uses hostnames rather than IPs), it affects the repl set config in a way that it doesn't recover from. From my local tests this appears to replicate the existing condition.

I have attached a log file from that test. The first few reboots were attempts to replicate all 3 boxes coming up at once (without DNS interruption); in those cases they booted up with no issue. In the last case, I altered the local resolv.conf so that it was no longer able to resolve intranet DNS entries (such as the other members of the replset). If you look for where the REMOVED statement is, you'll see the before and after chatter.

Now, if I correct the resolv.conf and am able to locate intranet resources, Mongo does not correct the issue; it remains in REMOVED status forever. Basically, this signals to me that if Mongo boots up when DNS is down, it can remain broken forever. Reading the log entry, perhaps I am setting up the mongo box incorrectly, since it states "2018-06-25T18:10:38.907+0000 I REPL [replExecDBWorker-1] This node is not a member of the config". Perhaps I need to ensure that the boxes are able to locate themselves even if DNS is down.

mongod.log
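
For anyone reproducing the same blackout, a minimal sketch of the resolv.conf change used in a test like this is to point the box at a resolver that cannot answer for the intranet zone (the address below is a placeholder):

# /etc/resolv.conf during the blackout test
nameserver 192.0.2.53

Restoring the original nameserver afterwards is what demonstrates the issue: resolution works again, but the node stays in REMOVED until mongod is restarted.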

Comment by Nick Brewer [ 19/Jun/18 ]

Hi,

I'd like to see the full log messages starting from the time that the machines are brought back up.

Thanks,
Nick
