[SERVER-57167] Prevent throwing on session creation due to stepdown before stepdown completes Created: 24/May/21  Updated: 27/Oct/23  Resolved: 19/Aug/21

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: None

Type: Task
Priority: Major - P3
Reporter: Haley Connelly
Assignee: Luis Osta (Inactive)
Resolution: Gone away
Votes: 0
Labels: sharding-wfbf-day
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Related
is related to SERVER-34608 Drivers may still see ismaster=true f... Closed
is related to SERVER-38456 killSessionsLocalKillTransactions mus... Closed
is related to SERVER-52564 Deadlock between step down and MongoD... Closed
Participants:
Linked BF Score: 39

 Description   

Issue:
Since SERVER-52564, session checkout throws InterruptedDueToReplStateChange when a stepdown is in progress. Because this retryable error is thrown before the stepdown completes, a client can exhaust its retries against the node that is stepping down before the new primary steps up and the command can be retargeted.

Ideally, the command would be retargeted upon retry.
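
The retry-exhaustion problem can be illustrated with a toy model. This is only a sketch, not server or driver code: NodeSteppingDown, run_with_retries, and make_send are hypothetical names, and "attempt index" stands in for elapsed time.

```python
class NodeSteppingDown(Exception):
    """Hypothetical stand-in for the retryable InterruptedDueToReplStateChange error."""


def run_with_retries(send, max_retries):
    """Call send(attempt) until it succeeds or the retry budget is used up."""
    for attempt in range(max_retries + 1):
        try:
            return send(attempt)
        except NodeSteppingDown:
            if attempt == max_retries:
                raise  # retries exhausted while the old primary is still stepping down


def make_send(stepdown_done_at):
    """Hypothetical target: every attempt before `stepdown_done_at` hits the old primary."""
    def send(attempt):
        if attempt < stepdown_done_at:
            raise NodeSteppingDown("stepdown still in progress")
        return "ok from new primary"
    return send
```

With a budget of one retry, make_send(1) succeeds on the retry, while make_send(5) raises because every attempt lands before the stepdown has finished.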

Proposal: Instead of bubbling the exception immediately back to the caller, wait for the stepdown to complete before throwing to the top layer. 

This could be done by catching the initial InterruptedDueToReplStateChange exception thrown, blocking until the RSTL is released in the catch block, then calling checkForInterrupt() and throwing. Thus, by the time the session checkout throws, the stepdown has had time to complete.
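
A minimal sketch of the proposed control flow, assuming a toy node in which a threading.Event stands in for "the RSTL has been released". ToyNode and all of its members are hypothetical and do not correspond to real server classes; only the exception name and the catch, block, check, rethrow order come from the proposal.

```python
import threading


class InterruptedDueToReplStateChange(Exception):
    """Toy stand-in for the server error of the same name."""


class ToyNode:
    def __init__(self):
        self.is_primary = True
        self.stepdown_in_progress = False
        self.rstl_released = threading.Event()  # assumption: models RSTL release
        self.rstl_released.set()

    def begin_stepdown(self):
        self.rstl_released.clear()
        self.stepdown_in_progress = True

    def finish_stepdown(self):
        self.is_primary = False
        self.stepdown_in_progress = False
        self.rstl_released.set()

    def _raw_checkout(self):
        # Behaviour described in the issue: throw as soon as a stepdown is in progress.
        if self.stepdown_in_progress:
            raise InterruptedDueToReplStateChange("stepdown in progress")
        return "session"

    def check_for_interrupt(self):
        # Toy analogue of checkForInterrupt(): throws once the node is no longer primary.
        if not self.is_primary:
            raise InterruptedDueToReplStateChange("no longer primary")

    def checkout_session(self, timeout=5.0):
        try:
            return self._raw_checkout()
        except InterruptedDueToReplStateChange:
            # Proposal: block until the RSTL is released, then check and throw.
            self.rstl_released.wait(timeout)
            self.check_for_interrupt()
            raise  # fall back to the original error if the check did not throw
```

In this model the caller only sees the error after finish_stepdown() has run, so a retry issued at that point would no longer find the node reporting itself as primary.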



 Comments   
Comment by Luis Osta (Inactive) [ 19/Aug/21 ]

Fixed by SERVER-53431: https://jira.mongodb.org/browse/SERVER-53431

Comment by Matthew Russotto [ 19/Aug/21 ]

luis.osta

Yes, I believe SERVER-53431 should cover this; by the time the driver gets the retryable error, the server should no longer report itself as primary.

Comment by Luis Osta (Inactive) [ 19/Aug/21 ]

matthew.russotto, since SERVER-53431 (https://jira.mongodb.org/browse/SERVER-53431) has been backported, does this mean that this fix is no longer necessary?

Comment by Matthew Russotto [ 08/Jun/21 ]

Yes, before SERVER-53431 we would kill ops while still reporting isMaster/isWritablePrimary true for a short period, but I think it didn't cause noticeable problems until the faster topology change notification project went in (which meant we saw the race every time), which is why that didn't get backported to 4.2. Backporting SERVER-53431 to 4.2 will likely fix BF-20963.

Comment by Max Hirschhorn [ 08/Jun/21 ]

Usually we avoid drivers exhausting retries by returning isWritablePrimary: false from the hello command (isMaster: false from isMaster for older versions) for the affected node (even if the primary is in a state where it can technically still accept writes), rather than blocking anything on the stepdown completing. This is done with the _waitingForRSTLAtStepDown variable, here

https://github.com/mongodb/mongo/blob/f4e79552c206e31be0ed2a68d3fee7426a2f9f7e/src/mongo/db/repl/replication_coordinator_impl.cpp#L2553

matthew.russotto, is the behavior you're describing new with SERVER-53431? If so, that could explain why we had only been seeing it on the 4.2 branch despite the changes from SERVER-52564 also being present on the 4.4 and master branches. My knowledge may be stale from when I had filed SERVER-34608 while working on integrating retryable writes into the mongo shell.

Comment by Matthew Russotto [ 07/Jun/21 ]

Usually we avoid drivers exhausting retries by returning isWritablePrimary: false from the hello command (isMaster: false from isMaster for older versions) for the affected node (even if the primary is in a state where it can technically still accept writes), rather than blocking anything on the stepdown completing. This is done with the _waitingForRSTLAtStepDown variable, here

https://github.com/mongodb/mongo/blob/f4e79552c206e31be0ed2a68d3fee7426a2f9f7e/src/mongo/db/repl/replication_coordinator_impl.cpp#L2553

This seems to already occur before blocking session checkout, so I'm not sure why we wouldn't retarget already?
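
As a rough illustration of the mechanism described above, the hello response can be modeled as reporting "not writable" whenever the node is waiting for the RSTL at stepdown, even if it could technically still accept writes. This is a simplified sketch: ToyReplState and hello_response are hypothetical names, and the actual logic in replication_coordinator_impl.cpp may differ.

```python
from dataclasses import dataclass


@dataclass
class ToyReplState:
    can_accept_writes: bool
    # Assumption: mirrors the _waitingForRSTLAtStepDown variable mentioned above.
    waiting_for_rstl_at_stepdown: bool


def hello_response(state):
    """Report isWritablePrimary: false as soon as a stepdown is waiting for the RSTL."""
    writable = state.can_accept_writes and not state.waiting_for_rstl_at_stepdown
    return {"isWritablePrimary": writable}
```

A driver that sees isWritablePrimary: false stops targeting the node, so its retries go elsewhere instead of being spent on the node that is stepping down.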

Comment by Haley Connelly [ 24/May/21 ]

These changes should be backported at least to v4.2.
