[SERVER-53431] Server should respond running operations with appropriate topologyVersion on stepdown
Created: 17/Dec/20  Updated: 29/Oct/23  Resolved: 26/Jan/21

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 4.9.0, 4.4.5, 4.2.16

Type: Bug
Priority: Major - P3
Reporter: Jason Chan
Assignee: Matthew Russotto
Resolution: Fixed
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
is depended on by SERVER-76581 Server can return stale topologyVersi... Closed
Related
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v4.4, v4.2
Sprint: Repl 2021-01-25, Repl 2021-02-08
Participants:
Linked BF Score: 0

 Description   

1. We kill operations at the beginning of stepdown: calling AutoGetRstlForStepUpStepDown starts the killOp thread.
2. We start killing user operations before we disable writes on the primary and before we transition the server to SECONDARY (these are the two steps that update the server description and trigger a topologyVersion bump).
3. The error response for a killed operation therefore carries a topologyVersion that has not been incremented yet.
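
The ordering in the three steps above can be illustrated with a toy model. This is a minimal sketch, not server code: `ToyPrimary`, its fields, and the starting counter value of 1 are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ToyPrimary:
    """Hypothetical stand-in for a primary node; not MongoDB server code."""
    counter: int = 1          # topologyVersion counter (assumed start value)
    writable: bool = True
    state: str = "PRIMARY"
    killed_responses: list = field(default_factory=list)

    def kill_user_operations(self, op_ids):
        # Each killed operation's error response carries whatever counter
        # value is current at the moment of the kill.
        for op_id in op_ids:
            self.killed_responses.append(
                {"op": op_id, "topologyVersion": self.counter})

    def step_down_as_described(self, op_ids):
        self.kill_user_operations(op_ids)  # steps 1 and 2: kill first
        self.writable = False              # first bump: writes disabled
        self.counter += 1
        self.state = "SECONDARY"           # second bump: transition complete
        self.counter += 1


node = ToyPrimary()
node.step_down_as_described(["op-1", "op-2"])
stale = [r["topologyVersion"] for r in node.killed_responses]
print(stale, node.counter)  # [1, 1] 3
```

Both killed operations report counter 1 even though the node ends the stepdown at counter 3, which is the mismatch step 3 describes.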

Since the topologyVersion is not incremented, the driver will try to reselect the same server to run the command even though it may still be in the process of stepping down.
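
To make the driver-side consequence concrete, here is a minimal sketch that assumes the driver ignores an error whose topologyVersion is not newer than the one it already holds for that server. The function names and the string process IDs are hypothetical; real drivers carry an ObjectId `processId` and a 64-bit `counter`, and their exact logic may differ from this sketch.

```python
def compare_topology_version(current, incoming):
    """Return -1, 0 or 1 as `current` is older than, equal to, or newer
    than `incoming`. Illustrative only."""
    if current is None or incoming is None:
        return -1  # missing information: assume incoming is newer
    if current["processId"] != incoming["processId"]:
        return -1  # different server process: assume incoming is newer
    a, b = current["counter"], incoming["counter"]
    return (a > b) - (a < b)


def is_stale_error(known, error_tv):
    # An error is treated as stale when it is not newer than what the
    # driver already knows about the server.
    return compare_topology_version(known, error_tv) >= 0


known = {"processId": "p1", "counter": 1}
error_before_bump = {"processId": "p1", "counter": 1}
error_after_bump = {"processId": "p1", "counter": 2}

print(is_stale_error(known, error_before_bump))  # True
print(is_stale_error(known, error_after_bump))   # False
```

Under this assumption, an error carrying the un-incremented counter looks stale, so the driver keeps its existing view of the server as a selectable primary.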

We can consider adding an extra increment to the topologyVersion before scheduling the killOps (we already increment the topologyVersion twice as part of stepdown: once when we disable writes, and once when we complete the transition to secondary). An alternative is to delay the killOps logic until the topologyVersion has been properly incremented.



 Comments   
Comment by Githook User [ 20/Jul/21 ]

Author: Matthew Russotto <matthew.russotto@mongodb.com> (mtrussotto)

Message: SERVER-53431 Server should report itself not writable during stepdown

(cherry picked from commit d73b402b349498d799d4d4458cff9b0c4cea5fb6)
Branch: v4.2
https://github.com/mongodb/mongo/commit/28efeba497f86e7d6c32cba7adeaf9ca04e14704

Comment by Githook User [ 17/Feb/21 ]

Author: Matthew Russotto <matthew.russotto@mongodb.com> (mtrussotto)

Message: SERVER-53431 Server should respond running operations with appropriate topologyVersion on stepdown

(cherry picked from commit a83036b85dd4b120663a560109f623534fa240ca)
Branch: v4.4
https://github.com/mongodb/mongo/commit/d73b402b349498d799d4d4458cff9b0c4cea5fb6

Comment by Matthew Russotto [ 26/Jan/21 ]

Fixed by moving the first increment of the counter so that it happens before we enqueue the RSTL lock rather than after (the increment is moved, not duplicated). While we are waiting for the lock to be enqueued, we will report that we are not a writable primary. If stepdown fails, we will increment the counter again to report that we are writable.
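
The reordering described in this comment can be sketched as follows. This is a toy model under stated assumptions, not the committed change: `ToyStepdown`, the `rstl_acquired` flag, and the starting counter of 1 are hypothetical.

```python
class ToyStepdown:
    """Hypothetical model of the reordered stepdown; not server code."""

    def __init__(self):
        self.counter = 1       # topologyVersion counter (assumed start)
        self.writable = True
        self.state = "PRIMARY"

    def _bump(self):
        self.counter += 1

    def step_down(self, rstl_acquired, running_ops):
        # First increment now happens before the RSTL is requested, and
        # the node reports itself as not writable from this point on.
        self.writable = False
        self._bump()
        # Operations killed while waiting see the bumped counter.
        errors = [(op, self.counter) for op in running_ops]
        if not rstl_acquired:
            # Stepdown failed: bump again to report writable once more.
            self.writable = True
            self._bump()
            return errors
        # Stepdown succeeded: second bump on the transition to SECONDARY.
        self.state = "SECONDARY"
        self._bump()
        return errors


ok = ToyStepdown()
ok_errors = ok.step_down(rstl_acquired=True, running_ops=["op-1"])
failed = ToyStepdown()
failed_errors = failed.step_down(rstl_acquired=False, running_ops=["op-1"])
print(ok_errors, ok.counter, ok.state)              # [('op-1', 2)] 3 SECONDARY
print(failed_errors, failed.counter, failed.writable)  # [('op-1', 2)] 3 True
```

In both paths the killed operation reports a counter higher than the one in effect before stepdown began, and the successful path still performs two increments in total.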

Comment by Githook User [ 26/Jan/21 ]

Author: Matthew Russotto <matthew.russotto@mongodb.com> (mtrussotto)

Message: SERVER-53431 Server should respond running operations with appropriate topologyVersion on stepdown
Branch: master
https://github.com/mongodb/mongo/commit/a83036b85dd4b120663a560109f623534fa240ca

Comment by Matthew Russotto [ 19/Jan/21 ]

After looking at the code, I recall the issue: we can't disable writes until we take the RSTL, and we can't take the RSTL until we kill the operations, so we kill operations in a loop until we get the RSTL. If we increment the topology version first, there is still a window in which a new operation can start and then be killed with the same topology version, unless we increment the version once per loop iteration. Since it's a 64-bit counter, that actually seems reasonable.
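
The once-per-loop idea floated in this comment can be sketched as below. It illustrates the idea only; whether the committed fix takes this shape is not stated in this comment. `kill_loop` and its argument layout are hypothetical: each inner list holds the operations that started since the previous kill attempt, and the RSTL is assumed to be acquired after the last attempt.

```python
def kill_loop(start_counter, arrivals_per_attempt):
    """Toy kill loop that bumps the counter once per iteration."""
    counter = start_counter
    responses = []
    for arrivals in arrivals_per_attempt:
        # Operations in this batch started while `counter` was current.
        started_at = counter
        counter += 1  # one increment per loop iteration
        for op in arrivals:
            responses.append(
                {"op": op, "started_at": started_at, "killed_with": counter})
    return counter, responses


final_counter, responses = kill_loop(1, [["a", "b"], ["c"], []])
print(final_counter)   # 4
print(responses[2])    # {'op': 'c', 'started_at': 2, 'killed_with': 3}
```

Because every iteration bumps the counter before killing, an operation that slipped in between attempts is always killed with a version newer than the one it started under.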

Comment by Matthew Russotto [ 19/Jan/21 ]

It seems wrong that we kill operations before disabling writes, but there was a lot of complexity around this and it may be unavoidable.

Generated at Thu Feb 08 05:30:54 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.