[SERVER-46558] Bgsync stops all index builds even before transitioning to rollback state and causes secondary replication to hang Created: 03/Mar/20  Updated: 29/Oct/23  Resolved: 18/Mar/20

Status: Closed
Project: Core Server
Component/s: Storage
Affects Version/s: None
Fix Version/s: 4.4.0-rc0, 4.7.0

Type: Bug
Priority: Major - P3
Reporter: Suganthi Mani
Assignee: Louis Williams
Resolution: Fixed
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
is depended on by SERVER-46823 Enable default for index commit quoru... Closed
Related
related to SERVER-46976 Enable commit quorum in rollback_wait... Closed
related to SERVER-48419 Extend rollback to recover resumable ... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v4.4
Sprint: Execution Team 2020-03-23
Participants:
Linked BF Score: 24

 Description   

Bgsync aborts index builds even before the node transitions to the rollback state. The side effect is severe, because the node is still eligible to run for election and become primary. To see one notable consequence, consider a 3-node replica set (node A is the primary, node B is secondary1, and node C is secondary2) where the thread pool size is 1.

1) Node A (primary for term 10) starts the index build 'x_1' using the indexBuildsCoordinator thread pool and generates a startIndexBuild oplog entry, which replicates to both secondaries.
2) Node B and node C, on receiving the startIndexBuild oplog entry, start the index build (also using the indexBuildsCoordinator thread pool).
3) Node A hits a network partition and gets disconnected from node B and node C.
4) Node A receives some writes W1 at term 10, then sees that it has lost its majority and steps down.
5) Node C gets elected and becomes primary for term 11. Node A then rejoins the network and sees that its sync source, say node C (the new primary), has diverged from its oplog. So it gets into this code path and starts aborting the index build. Since node A hasn't yet transitioned to rollback, it is free to run for election; let's assume it wins the election after receiving a vote from node B.

As a result of step 5, node A will no longer run the real rollback step. This is because, when node A becomes primary, it stops the oplog fetcher service, so this check or [this|https://github.com/mongodb/mongo/blob/17984db6c531594c00bf226804d9ab7ed6225643/src/mongo/db/repl/rollback_impl.cpp#L190] check might fail, causing the node not to roll back any oplog entries.

Problems:
1) The index build on the secondaries becomes orphaned.
2) Since the index build on node A got aborted, node A is free to start a new index build, say 'y_1'. If the secondaries receive the startIndexBuild oplog entry for index 'y_1', they wait for an indexBuildsCoordinator thread to become available, which blocks secondary replication.
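
Problem 2 can be illustrated with a toy model. The sketch below is not server code: it uses Python's standard ThreadPoolExecutor with one worker as a stand-in for a secondary's thread pool of size 1, and the names simulate_secondary and build_index are hypothetical. The orphaned 'x_1' build holds the only worker while waiting for a signal, so 'y_1' cannot start until that signal arrives.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def simulate_secondary():
    """Toy model of a secondary whose index build pool has a single thread."""
    pool = ThreadPoolExecutor(max_workers=1)  # stand-in for a pool of size 1
    x1_signal = threading.Event()  # commit/abort signal that 'x_1' waits for

    def build_index(name, wait_for=None):
        # An orphaned build holds its thread until a signal arrives.
        if wait_for is not None:
            wait_for.wait()
            return f"{name}: aborted"
        return f"{name}: built"

    x1 = pool.submit(build_index, "x_1", x1_signal)  # orphaned build
    y1 = pool.submit(build_index, "y_1")             # queued behind 'x_1'

    # 'y_1' cannot run while 'x_1' occupies the only worker thread.
    blocked = not y1.done()

    # Only once 'x_1' is released does 'y_1' get a thread.
    x1_signal.set()
    results = (x1.result(timeout=5), y1.result(timeout=5))
    pool.shutdown()
    return blocked, results
```

In this model the signal is eventually delivered so that the function terminates. In the scenario described above, nothing delivers it, so the queued build (and the replication that depends on it) would wait indefinitely.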

Solution: We should abort index builds only after the node has transitioned to the rollback state and we are sure that the entries are going to get rolled back. This applies to both rollback via recoverToStableTimestamp and rollback via refetch.
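
The ordering argument can be sketched as a toy model. This is an illustration only, not the actual fix: Node, win_election, buggy_rollback, and fixed_rollback are hypothetical names, and the rule that only a secondary can enter rollback is an assumption made for the sketch. The interleave hook models an election that happens before the node reaches the rollback state.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    state: str = "SECONDARY"
    active_index_builds: set = field(default_factory=set)

    def try_transition_to_rollback(self) -> bool:
        # Assumption for this sketch: only a secondary can enter rollback.
        if self.state != "SECONDARY":
            return False
        self.state = "ROLLBACK"
        return True


def win_election(node):
    node.state = "PRIMARY"


def buggy_rollback(node, interleave=lambda n: None):
    # Abort first, transition later: an election can slip in between.
    node.active_index_builds.clear()
    interleave(node)
    return node.try_transition_to_rollback()


def fixed_rollback(node, interleave=lambda n: None):
    # Transition first; abort only when rollback is certain to proceed.
    interleave(node)
    if not node.try_transition_to_rollback():
        return False
    node.active_index_builds.clear()
    return True
```

With win_election passed as the hook, buggy_rollback leaves the node as primary with its index build already aborted and no rollback performed, while fixed_rollback leaves the build untouched because the transition never happened.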

P.S.: I noticed this failure frequently in my patch builds. Currently, because index builds are generating a high volume of timeout errors, the BF for this issue is getting lost.



 Comments   
Comment by Githook User [ 18/Mar/20 ]

Author:

{'name': 'Louis Williams', 'username': 'louiswilliams', 'email': 'louis.williams@mongodb.com'}

Message: SERVER-46558 Abort index builds only after transitioning to rollback and when it is guaranteed not to fail

(cherry picked from commit 03dc2fefa0fb1c77d2caeb6dd166166276bc5b15)
Branch: v4.4
https://github.com/mongodb/mongo/commit/760af00cd3ec03c7304e73d0cde61b14e2fe7902

Comment by Githook User [ 18/Mar/20 ]

Author:

{'email': 'louis.williams@mongodb.com', 'name': 'Louis Williams', 'username': 'louiswilliams'}

Message: SERVER-46558 Abort index builds only after transitioning to rollback and when it is guaranteed not to fail
Branch: master
https://github.com/mongodb/mongo/commit/03dc2fefa0fb1c77d2caeb6dd166166276bc5b15

Generated at Thu Feb 08 05:11:49 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.