[SERVER-72774] A node in quiesce mode can win election
Created: 12/Jan/23  Updated: 29/Oct/23  Resolved: 10/Feb/23

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 5.0.14
Fix Version/s: 7.0.0-rc0, 5.0.16, 6.0.6

Type: Bug
Priority: Major - P3
Reporter: Dmitry Ryabtsev
Assignee: Wenbin Zhu
Resolution: Fixed
Votes: 0
Labels: repl-shortlist
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Problem/Incident
Related
Assigned Teams: Replication
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested: v6.3, v6.2, v6.0, v5.0, v4.4
Sprint: Repl 2023-02-06, Repl 2023-02-20
Participants:
Case:

 Description   

A replica set member that is in quiesce mode can still win an election:

{"t":{"$date":"2023-01-04T02:45:08.421+00:00"},"s":"I",  "c":"COMMAND",  "id":4695400, "ctx":"conn67","msg":"Terminating via shutdown command","attr":{"force":true,"timeoutSecs":15}}
{"t":{"$date":"2023-01-04T02:45:08.421+00:00"},"s":"I",  "c":"REPL",     "id":4794602, "ctx":"conn67","msg":"Attempting to enter quiesce mode"}
{"t":{"$date":"2023-01-04T02:45:08.421+00:00"},"s":"I",  "c":"REPL",     "id":4695102, "ctx":"conn67","msg":"Entering quiesce mode for shutdown","attr":{"quiesceTimeMillis":14999}}
 
{"t":{"$date":"2023-01-04T02:45:12.066+00:00"},"s":"I",  "c":"ELECTION", "id":21450,   "ctx":"ReplCoord-0","msg":"Election succeeded, assuming primary role","attr":{"term":158}}
{"t":{"$date":"2023-01-04T02:45:12.066+00:00"},"s":"I",  "c":"REPL",     "id":21358,   "ctx":"ReplCoord-0","msg":"Replica set state transition","attr":{"newState":"PRIMARY","oldState":"SECONDARY"}}
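The sequence above can be checked for mechanically. The following is a minimal sketch, not an official tool: it scans structured log lines for the "Entering quiesce mode for shutdown" message (id 4695102) and reports any "Election succeeded" message (id 21450) that falls inside the quiesce window. The function name and return shape are assumptions made for illustration; the log ids, field names, and sample lines are taken from the excerpt above.

```python
import json
from datetime import datetime, timedelta

QUIESCE_ENTER_ID = 4695102  # "Entering quiesce mode for shutdown"
ELECTION_WON_ID = 21450     # "Election succeeded, assuming primary role"

# Two of the log lines quoted in this ticket.
SAMPLE = '''
{"t":{"$date":"2023-01-04T02:45:08.421+00:00"},"s":"I",  "c":"REPL",     "id":4695102, "ctx":"conn67","msg":"Entering quiesce mode for shutdown","attr":{"quiesceTimeMillis":14999}}
{"t":{"$date":"2023-01-04T02:45:12.066+00:00"},"s":"I",  "c":"ELECTION", "id":21450,   "ctx":"ReplCoord-0","msg":"Election succeeded, assuming primary role","attr":{"term":158}}
'''


def elections_won_during_quiesce(lines):
    """Return (term, ms_into_quiesce) for each election won inside a quiesce window.

    Hypothetical helper: assumes all lines come from a single node's log,
    in chronological order.
    """
    hits = []
    quiesce_start = None
    quiesce_end = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        entry = json.loads(line)
        ts = datetime.fromisoformat(entry["t"]["$date"])
        if entry["id"] == QUIESCE_ENTER_ID:
            quiesce_start = ts
            quiesce_end = ts + timedelta(
                milliseconds=entry["attr"]["quiesceTimeMillis"]
            )
        elif entry["id"] == ELECTION_WON_ID and quiesce_start is not None:
            if quiesce_start <= ts <= quiesce_end:
                # Integer division of timedeltas avoids float rounding.
                elapsed_ms = (ts - quiesce_start) // timedelta(milliseconds=1)
                hits.append((entry["attr"]["term"], elapsed_ms))
    return hits


if __name__ == "__main__":
    print(elections_won_during_quiesce(SAMPLE.splitlines()))  # [(158, 3645)]
```

For the excerpt in this ticket, the election for term 158 is won 3645 ms into a 14999 ms quiesce period.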

However, a node in quiesce mode rejects its peers' sync source requests with ShutdownInProgress, so the peers cannot sync from the newly elected primary:

{"t":{"$date":"2023-01-04T02:45:12.978+00:00"},"s":"I",  "c":"REPL",     "id":3873117, "ctx":"BackgroundSync","msg":"Choosing primary as sync source","attr":{"primary":"atlas-mbh24j-shard-00-01.nt8am.mongodb.net:27017"}}
{"t":{"$date":"2023-01-04T02:45:12.978+00:00"},"s":"I",  "c":"CONNPOOL", "id":22576,   "ctx":"ReplCoordExternNetwork","msg":"Connecting","attr":{"hostAndPort":"atlas-mbh24j-shard-00-01.nt8am.mongodb.net:27017"}}
{"t":{"$date":"2023-01-04T02:45:12.996+00:00"},"s":"I",  "c":"REPL",     "id":5579707, "ctx":"ReplCoordExtern-0","msg":"Denylisting candidate due to error","attr":{"candidate":"atlas-mbh24j-shard-00-01.nt8am.mongodb.net:27017","error":{"code":91,"codeName":"ShutdownInProgress","errmsg":"The server is in quiesce mode and will shut down","remainingQuiesceTimeMillis":10428},"denylistDurationSeconds":10,"denylistUntil":{"$date":"2023-01-04T02:45:22.996Z"}}}

This can lead to oplog divergence and, later, a rollback when the shut-down node is restarted and rejoins the replica set.
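The behavior requested by this ticket can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyNode`, its fields, and `try_become_primary` are hypothetical names, not MongoDB's replication coordinator API, and the ticket does not say where in the election path the actual fix places its check.

```python
from dataclasses import dataclass


@dataclass
class ToyNode:
    """Hypothetical stand-in for a replica set member (not MongoDB code)."""
    name: str
    state: str = "SECONDARY"
    in_quiesce_mode: bool = False
    term: int = 0

    def enter_quiesce_mode(self):
        # Models the node starting its shutdown quiesce period.
        self.in_quiesce_mode = True

    def try_become_primary(self, votes_received, voters_total):
        """Assume the primary role only with a strict majority and outside quiesce mode."""
        if self.in_quiesce_mode:
            # Assumed guard: a node that is about to shut down must not
            # take over as primary, even if it collected enough votes.
            return False
        if votes_received * 2 <= voters_total:
            return False
        self.term += 1
        self.state = "PRIMARY"
        return True


if __name__ == "__main__":
    node = ToyNode("node-1", term=157)
    node.enter_quiesce_mode()
    # Two of three votes would normally be a majority.
    print(node.try_become_primary(votes_received=2, voters_total=3))  # False
    print(node.state)  # SECONDARY
```

Without the quiesce check, the same call would move the node to PRIMARY at term 158, which matches the state transition logged in this ticket.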



 Comments   
Comment by Githook User [ 25/Mar/23 ]

Author:

{'name': 'Wenbin Zhu', 'email': 'wenbin.zhu@mongodb.com', 'username': 'WenbinZhu'}

Message: SERVER-72774 Prevent a node in quiesce mode to win election.

(cherry picked from commit 6b19e54d461bab075ade6e3e05767a881ee37597)
Branch: v5.0
https://github.com/mongodb/mongo/commit/48f111ef3858426b6b9ce3e307718acec8bdc080

Comment by Githook User [ 24/Mar/23 ]

Author:

{'name': 'Wenbin Zhu', 'email': 'wenbin.zhu@mongodb.com', 'username': 'WenbinZhu'}

Message: SERVER-72774 Prevent a node in quiesce mode to win election.

(cherry picked from commit 6b19e54d461bab075ade6e3e05767a881ee37597)
Branch: v6.0
https://github.com/mongodb/mongo/commit/2cebbf7bc8b7dc0bccc8d77ac7dce8352dfe79c9

Comment by Githook User [ 10/Feb/23 ]

Author:

{'name': 'Wenbin Zhu', 'email': 'wenbin.zhu@mongodb.com', 'username': 'WenbinZhu'}

Message: SERVER-72774 Prevent a node in quiesce mode to win election.
Branch: master
https://github.com/mongodb/mongo/commit/6b19e54d461bab075ade6e3e05767a881ee37597

Generated at Thu Feb 08 06:22:45 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.