[SERVER-48778] Make reconstruct_prepared_transactions_initial_sync.js robust to election failures. Created: 15/Jun/20  Updated: 29/Oct/23  Resolved: 22/Jun/20

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: None
Fix Version/s: 4.4.0-rc11, 4.2.9, 4.7.0

Type: Bug Priority: Major - P3
Reporter: Suganthi Mani Assignee: Suganthi Mani
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
Related
related to SERVER-47612 Elections not robust in remove_newly_... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v4.4, v4.2
Sprint: Repl 2020-06-15, Repl 2020-06-29
Participants:
Linked BF Score: 18

 Description   

Using replTest.stepUp in the test requires the replica set to have a primary before each attempt, including retries, to run the replSetStepUp cmd. That guarantee cannot hold when a test runs with a high electionTimeoutMillis (24 hrs), because the nodes can't start an election on their own during the test; an election happens only if the jstest starts one explicitly using the "replSetStepDown" cmd.

Although SERVER-47612 (on master) made storing the last vote document during the replSetRequestVotes cmd resilient to a concurrent step down, the solution for this ticket should be more generic and handle any replSetStepUp failure. So, we should make the test use stepUpNoAwaitReplication instead of the stepUp method.
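
The idea of retrying replSetStepUp without first requiring an existing primary can be illustrated with a stubbed node. This is only a sketch under stated assumptions: makeFakeNode and stepUpWithRetries are hypothetical names invented for illustration, the stub stands in for a real shell connection, and the code does not reproduce the actual ReplSetTest helpers. It runs in plain JavaScript rather than the mongo shell.

```javascript
// Hypothetical stand-in for a replica set node. It fails the first
// `failuresBeforeSuccess` replSetStepUp attempts, then becomes primary.
function makeFakeNode(failuresBeforeSuccess) {
    let attempts = 0;
    let isPrimary = false;
    return {
        adminCommand(cmd) {
            if (cmd.replSetStepUp) {
                attempts++;
                if (attempts <= failuresBeforeSuccess) {
                    // Simulated election failure (message text is illustrative).
                    return {ok: 0, errmsg: "simulated election failure"};
                }
                isPrimary = true;
                return {ok: 1};
            }
            if (cmd.replSetGetStatus) {
                // Member state 1 is PRIMARY, 2 is SECONDARY.
                return {ok: 1, myState: isPrimary ? 1 : 2};
            }
            return {ok: 0, errmsg: "unknown command"};
        },
        attempts() {
            return attempts;
        },
    };
}

// Hypothetical helper: keep issuing replSetStepUp until the node reports
// PRIMARY. It never checks for an existing primary between attempts.
function stepUpWithRetries(node, maxAttempts) {
    for (let i = 0; i < maxAttempts; i++) {
        const res = node.adminCommand({replSetStepUp: 1});
        if (res.ok === 1 && node.adminCommand({replSetGetStatus: 1}).myState === 1) {
            return true;
        }
    }
    return false;
}

const node = makeFakeNode(2);
const steppedUp = stepUpWithRetries(node, 5);
```

Here the first two step-up attempts fail and the third succeeds, so a failed election only costs another attempt instead of blocking on a primary that will never appear under a 24 hr election timeout.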



 Comments   
Comment by Githook User [ 30/Jun/20 ]

Author:

{'name': 'Suganthi Mani', 'email': 'suganthi.mani@mongodb.com', 'username': 'smani87'}

Message: SERVER-48778 Make reconstruct_prepared_transactions_initial_sync.js robust to election failures.

(cherry picked from commit e94ca2a3bc234c8f340330217d89da6e73d1f026)
Branch: v4.2
https://github.com/mongodb/mongo/commit/a25456e68d46b60d499993f23725a5db83014e0d

Comment by Githook User [ 22/Jun/20 ]

Author:

{'name': 'Suganthi Mani', 'email': 'suganthi.mani@mongodb.com', 'username': 'smani87'}

Message: SERVER-48778 Make reconstruct_prepared_transactions_initial_sync.js robust to election failures.

(cherry picked from commit e94ca2a3bc234c8f340330217d89da6e73d1f026)
Branch: v4.4
https://github.com/mongodb/mongo/commit/9563c3590163e7ac8e4510e5e6398e3d74b4fea3

Comment by Githook User [ 22/Jun/20 ]

Author:

{'name': 'Suganthi Mani', 'email': 'suganthi.mani@mongodb.com', 'username': 'smani87'}

Message: SERVER-48778 Make reconstruct_prepared_transactions_initial_sync.js robust to election failures.
Branch: master
https://github.com/mongodb/mongo/commit/e94ca2a3bc234c8f340330217d89da6e73d1f026

Comment by Suganthi Mani [ 22/Jun/20 ]

william.schultz
We have not yet backported SERVER-47612 to MongoDB 4.4, and that's the reason we have BF-17895. My inclination is for this test fix to be more general rather than relying on the server-side fix (SERVER-47612). FYI, I also think the SERVER-47612 fix is brittle, and I made a comment on that.

Comment by William Schultz (Inactive) [ 15/Jun/20 ]

suganthi.mani It seems that SERVER-47612 addressed the issue of step up failing due to the replSetRequestVotes command being interrupted. Is my understanding correct that BF-17895 should no longer be occurring in practice (due to the SERVER-47612 fix), but SERVER-48778 is filed to address a more general issue, e.g. if the initial step up ever fails for any reason?

Generated at Thu Feb 08 05:18:03 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.