[SERVER-42650] Remove stale comments mentioned in the RollbackTest for "transitionToSteadyStateOperations" state. Created: 06/Aug/19 Updated: 27/Oct/23 Resolved: 06/Jan/20 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Task | Priority: | Major - P3 |
| Reporter: | Suganthi Mani | Assignee: | Judah Schvimer |
| Resolution: | Gone away | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||||||||||||||||||
| Participants: | |||||||||||||||||||||||||
| Description |
|
The comments mentioned here in "transitionToSteadyStateOperations" state is no longer valid with the tiebreaker model. To be noted, we can transition to "kSteadyStateOps" only from "kSyncSourceOpsDuringRollback". When the RollbackTest was having arbiter (
As a result, the comment mentioned here is valid as node S can get vote from arbiter and become primary. But, with the tiebreaker setup, the network topology for state "kSyncSourceOpsDuringRollback" is like
Over in this model, there is noway, node S can escape from rollback because we make sure in "kSyncSourceOpsBeforeRollback" state that node S & P1's oplog entries are diverged. And, the node P1 writes an oplog entry in the higher term than node S. So, no way node S can get "yes" vote from node P1 for replSetRequestVote cmd unless the node S has rolled back. This ticket should remove this stale comment and the if-else check. Also, update the RollbackTest such that the rollbackId check is executed for all cases in transitionToSteadyStateOperations(). |
| Comments |
| Comment by Suganthi Mani [ 06/Jan/20 ] |
|
Marking it as gone away, as the original intention of the ticket (stale comment removal) got fixed by |
| Comment by Suganthi Mani [ 03/Jan/20 ] |
|
FYI, Original stale comment has been removed by |
| Comment by Andy Schwerin [ 18/Oct/19 ] |
|
Only depends on closed tickets. Unblocking. |
| Comment by Judah Schvimer [ 08/Aug/19 ] |
|
Hmm, that actually seems like a real bug since it means we can roll back without letting other nodes know. I'm not sure if this could cause problems before since it requires restart, but it almost certainly could with resumable initial sync where the sync source is allowed to restart, and the initial syncing node must find out if the sync source rolls back (by checking the RBID). We need to increment the rollback id before truncating the oplog. |
| Comment by Suganthi Mani [ 08/Aug/19 ] |
|
I am thinking of below case for enableMajorityReadConcern= false which uses RVR. 1) syncFixup() // Truncates the oplog. When the node restarts, there is a chance we may not go into ROLLBACK state. And, if we plan to check for rollback id in transitionToSteadyStateOperations() for all cases, then it might be a problem. |
| Comment by Judah Schvimer [ 07/Aug/19 ] |
|
I would rather not change the behavior to enable this test, which is not critical. I agree that we could do it before syncFixUp() though. I don't follow how rollback via refetch (RVR) would lose the rollback id (RBID). The RBID is still persisted for RVR and we're still guaranteeing the node has to go through rollback for the test fixture to make progress. Can you please clarify? |
| Comment by Suganthi Mani [ 06/Aug/19 ] |
|
quick note, for RTT (enableMajorityReadConcern = true), rollbackID info is persisted in local.system.rolback.id collection before the start of the real rollback process. So, any restarts on rollbacked node refresh the rollback id info from local.system.rolback.id collection at startup. For rollbackViaRefetch, the incremented rollback id is persisted only after the real rollback step. So, if there is an unclean shutdown happens before persisting the rollback id, the node might lose the latest rollback id info. So, this ticket might be a problem for enableMajorityReadConcern= false. For rollbackViaRefetch, I am not seeing any good reason for wrapping the incrementRollbackID() call in ON_BLOCK_EXIT. To make the behavior consistent with RTT, we should call incrementRollbackID() before syncFixUp(). judah.schvimer, any thoughts. |
| Comment by Suganthi Mani [ 06/Aug/19 ] |
|
This ticket should go in after |