[SERVER-47613] Invariant in processReplSetRequestVotes Created: 17/Apr/20  Updated: 29/Oct/23  Resolved: 22/Apr/20

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: None
Fix Version/s: 4.0.19, 4.2.7, 4.4.0-rc3, 4.7.0

Type: Bug Priority: Major - P3
Reporter: A. Jesse Jiryu Davis Assignee: A. Jesse Jiryu Davis
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v4.4, v4.2, v4.0
Participants:
Linked BF Score: 16

 Description   

If a node receives a heartbeat reconfig and can't find itself in the config due to a network issue, it sets TopologyCoordinator::_selfIndex to -1. It logs a message like:

Cannot find self in new replica set configuration; I must be removed{"error":{"code":74,"codeName":"NodeNotFound","errmsg":"No host described in new configuration with {version: 3, term: 1} for replica set server7781-configRS maps to this node"}}

If TopologyCoordinator::processReplSetRequestVotes then receives a request with the correct config term and version, it passes the check added in SERVER-46387, and goes on to check whether _selfConfig().isArbiter(). The node crashes with an invariant in _selfConfig() because _selfIndex is -1.

The root cause is a network problem that prevents the node from finding itself in the config. We've observed mysterious DNS issues in EC2 that sometimes prevent mongod from resolving its own address in repl::isSelf(); perhaps the build failure I'm debugging is an example of that. Regardless, we must prevent any scenario that uses -1 as a member index.
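
To illustrate the failure mode and the kind of guard that avoids it, here is a minimal standalone sketch. It is not the code from the actual fix: ToyTopology, MemberConfig, VoteResponse, and processRequestVotes are hypothetical stand-ins for the real TopologyCoordinator members, and the vote logic is reduced to a single term comparison. The only point is that the "am I in the config?" check has to run before anything dereferences the self index.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for one member entry in a replica set config.
struct MemberConfig {
    std::string host;
    bool arbiter;
};

// Hypothetical, heavily simplified model of the state described above.
struct ToyTopology {
    std::vector<MemberConfig> members;
    int selfIndex = -1;  // -1 models "this node could not find itself in the config"

    struct VoteResponse {
        bool granted;
        std::string reason;
    };

    // Models an accessor that trips an invariant when selfIndex is -1.
    const MemberConfig& selfConfig() const {
        assert(selfIndex >= 0 && selfIndex < static_cast<int>(members.size()));
        return members[selfIndex];
    }

    VoteResponse processRequestVotes(long long candidateTerm, long long myTerm) const {
        // Guard first: a node that is not in the config must refuse to vote
        // before any code path reaches selfConfig().
        if (selfIndex == -1) {
            return {false, "not voting because this node is not in the config"};
        }
        if (candidateTerm < myTerm) {
            return {false, "candidate's term is lower than mine"};
        }
        // Safe now: selfIndex is known to be a valid member index.
        if (selfConfig().arbiter) {
            // Arbiter-specific checks are omitted in this sketch.
            return {true, "granted by arbiter (extra checks omitted)"};
        }
        return {true, "granted"};
    }
};
```

With selfIndex left at -1, processRequestVotes returns a refusal instead of reaching selfConfig(); moving the guard below the arbiter check would reproduce the crash described above.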



 Comments   
Comment by Githook User [ 25/Apr/20 ]

Author:

{'name': 'A. Jesse Jiryu Davis', 'email': 'jesse@mongodb.com', 'username': 'ajdavis'}

Message: SERVER-47613 Fix invariant when a removed member votes

(cherry picked from commit cc814e4c87c1ae20ef7c0840344496043dbdf18d)
Branch: v4.0
https://github.com/mongodb/mongo/commit/be0eb178a1c599497a307a5b114d112343fab9c1

Comment by Githook User [ 24/Apr/20 ]

Author:

{'name': 'A. Jesse Jiryu Davis', 'email': 'jesse@mongodb.com', 'username': 'ajdavis'}

Message: SERVER-47613 Fix invariant when a removed member votes

(cherry picked from commit cc814e4c87c1ae20ef7c0840344496043dbdf18d)

Conflicts:
    src/mongo/db/repl/replication_coordinator_impl.cpp
    src/mongo/db/repl/replication_coordinator_impl_test.cpp
    src/mongo/db/repl/topology_coordinator_v1_test.cpp
Branch: v4.2
https://github.com/mongodb/mongo/commit/fe855c5d116316c50905cf5900c23801fd0c6f68

Comment by Githook User [ 24/Apr/20 ]

Author:

{'name': 'A. Jesse Jiryu Davis', 'email': 'jesse@mongodb.com', 'username': 'ajdavis'}

Message: SERVER-47613 Fix invariant when a removed member votes

(cherry picked from commit cc814e4c87c1ae20ef7c0840344496043dbdf18d)
Branch: v4.4
https://github.com/mongodb/mongo/commit/111907e47c8b1b3fb2de31308ff3328cb13b8bd3

Comment by Githook User [ 21/Apr/20 ]

Author:

{'name': 'A. Jesse Jiryu Davis', 'email': 'jesse@mongodb.com', 'username': 'ajdavis'}

Message: SERVER-47613 Fix invariant when a removed member votes
Branch: master
https://github.com/mongodb/mongo/commit/cc814e4c87c1ae20ef7c0840344496043dbdf18d
