[SERVER-72491] Force Delay on ContinuousInitialSync Hook Node Stepup Created: 03/Jan/23  Updated: 29/Oct/23  Resolved: 04/Jan/23

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 6.3.0-rc0

Type: Bug Priority: Major - P3
Reporter: Sean Zimmerman Assignee: Sean Zimmerman
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Repl 2023-01-09
Participants:
Linked BF Score: 13

 Description   

Problem

In BF-27059 we see a bug where a retryable write fails on a fresh initial sync node because it tries to retry operations that are not in the node's oplog. This build failure has been occurring consistently for over a month, with a 33% failure rate.

Solution & Acceptance Criteria

The current ContinuousInitialSync hook has a random chance to either promote the initial sync node immediately after sync or wait before promotion. The bug only occurs on immediate promotion, so the solution is to remove the random chance and always wait before promotion.
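
A minimal sketch of the behaviour change described above, not the hook's actual code. Every name here (`promotion_delay_before`, `promotion_delay_after`, `WAIT_BEFORE_PROMOTION_SECS`) is hypothetical, and the 50/50 split and the 5 second wait are assumptions made for illustration; the ticket does not state either value.

```python
import random

# Hypothetical constant: the real wait time is not given in this ticket.
WAIT_BEFORE_PROMOTION_SECS = 5.0


def promotion_delay_before(rng: random.Random) -> float:
    """Old behaviour as described: randomly promote immediately or wait first."""
    # Assumed 50/50 split; the ticket only says "random chance".
    if rng.random() < 0.5:
        return 0.0  # immediate promotion, the case that triggers the bug
    return WAIT_BEFORE_PROMOTION_SECS


def promotion_delay_after() -> float:
    """New behaviour: always wait before stepping up the synced node."""
    return WAIT_BEFORE_PROMOTION_SECS
```

With the random branch gone, every initial sync cycle includes the wait, which is consistent with the lower number of syncs per suite run noted under Impact.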

Impact

This change reduces the total number of initial syncs done during the suite (from ~100 to ~50 on average). That should still be sufficient to test initial sync, and the count can be raised in the future by adjusting the wait time before promotion.



 Comments   
Comment by Githook User [ 04/Jan/23 ]

Author: seanzimm <sean.zimmerman@mongodb.com>

Message: SERVER-72491: Remove random chance to immediately promote initial sync node
Branch: master
https://github.com/mongodb/mongo/commit/d1a67f12332e4ab92e43b24f192267a1f10574c3

Generated at Thu Feb 08 06:21:58 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.