[SERVER-76356] Is it possible to lower the amount of time a node running for election waits for responses? Created: 20/Apr/23  Updated: 27/Oct/23  Resolved: 01/May/23

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Task Priority: Major - P3
Reporter: Rohan Sharan Assignee: Vishnu Kaushik
Resolution: Gone away Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Gantt Dependency
Assigned Teams:
Replication
Sprint: Repl 2023-05-01, Repl 2023-05-15
Participants:

 Description   

In a mongosync BF I'm investigating in a v5.0 kill sharded suite, the progress endpoint times out after 30 seconds. This happens because the primary is killed and a secondary then fails to start up, spending a full 30 seconds waiting for the result of an election. The old primary has been killed, so the candidate gets no vote from it. The other secondary never responds either, yet the candidate still waits the full 30 seconds for that response. You can see this behavior here.

The testing fixture waiting on the progress endpoint times out after 30 seconds, and the secondary running for election also waits 30 seconds for a response, so the fixture gives up at the same moment the election wait ends. Is there a parameter I can configure so that the election fails after less than 30 seconds have elapsed?
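For reference, one election-related timeout that is configurable is `settings.electionTimeoutMillis` in the replica set configuration. Whether that setting governs the 30-second wait described above is not confirmed anywhere in this ticket, so the following is only a sketch of how such a value could be changed from a Python test fixture. The helper names `with_election_timeout` and `apply_election_timeout` are hypothetical, and the sketch assumes PyMongo is installed and the URI points at the current primary.

```python
import copy


def with_election_timeout(config, timeout_ms):
    """Return a copy of a replica set config document with a new
    settings.electionTimeoutMillis and an incremented version."""
    if timeout_ms <= 0:
        raise ValueError("timeout_ms must be positive")
    new_config = copy.deepcopy(config)
    # Create the settings sub-document if the config does not have one.
    new_config.setdefault("settings", {})["electionTimeoutMillis"] = timeout_ms
    # A reconfig is expected to carry a higher version than the current config.
    new_config["version"] = config["version"] + 1
    return new_config


def apply_election_timeout(uri, timeout_ms):
    """Hypothetical helper: fetch the current config from the node at `uri`
    and reconfigure it with the new timeout. Assumes `uri` targets the primary."""
    from pymongo import MongoClient  # imported lazily; only needed against a live cluster

    client = MongoClient(uri)
    current = client.admin.command("replSetGetConfig")["config"]
    client.admin.command("replSetReconfig", with_election_timeout(current, timeout_ms))
```

Keeping the config transformation in a pure function means it can be checked without a running cluster; only `apply_election_timeout` talks to the server.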



 Comments   
Comment by Vishnu Kaushik [ 01/May/23 ]

Discussed with Rohan; this ticket is a no-op now. The change will be made on the mongosync test infra end.

Comment by Vishnu Kaushik [ 01/May/23 ]

I'm discussing with rohan.sharan@mongodb.com the possibility of speeding up the suite in general instead of making changes on the server end. I found many "Slow query" messages on the original failure where the query took longer than 10 seconds (some even took 30 seconds). I think that if that is fixed, this error will go away.

Generated at Thu Feb 08 06:32:29 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.