[SERVER-21937] Calculate write concern and distlock timeouts for talking to the config servers relative to the electionTimeout for the config replica set
Created: 17/Dec/15  Updated: 06/Dec/22  Resolved: 13/Apr/18

Status: Closed
Project: Core Server
Component/s: Replication, Sharding
Affects Version/s: None
Fix Version/s: None

Type: Bug
Priority: Major - P3
Reporter: Spencer Brody (Inactive)
Assignee: [DO NOT USE] Backlog - Sharding Team
Resolution: Won't Fix
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Assigned Teams:
Sharding
Operating System: ALL
Participants:

 Description   

As part of SERVER-21050 we increased the distlock and write concern timeouts for config server operations so that they are long enough to allow proper retries after a config server failure with the default election timeout of 10 seconds. For people on fast, reliable networks who turn down their election timeout, however, these values may be too long, leading to slower error detection and slower execution of retry logic. We should calculate these timeouts in terms of the election timeout, or at least allow them to be controlled independently so that people can lower them at the same time that they lower their election timeout.
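
A minimal sketch of what the proposal could look like, written in Python for illustration only (it is not the server's code). The names `derive_config_timeouts`, `ConfigTimeouts`, and both multipliers are hypothetical; the only value taken from this ticket is the 10 second default election timeout. Each timeout is derived from the election timeout unless an explicit override is supplied, which covers both options described above.

```python
from dataclasses import dataclass
from typing import Optional

# Default election timeout mentioned in this ticket (10 seconds).
DEFAULT_ELECTION_TIMEOUT_MS = 10_000

# Hypothetical multipliers, chosen only for illustration.
WRITE_CONCERN_FACTOR = 1.5
DIST_LOCK_FACTOR = 2.0


@dataclass(frozen=True)
class ConfigTimeouts:
    write_concern_timeout_ms: int
    dist_lock_timeout_ms: int


def derive_config_timeouts(
    election_timeout_ms: int = DEFAULT_ELECTION_TIMEOUT_MS,
    write_concern_override_ms: Optional[int] = None,
    dist_lock_override_ms: Optional[int] = None,
) -> ConfigTimeouts:
    """Scale both timeouts with the election timeout unless overridden."""
    if election_timeout_ms <= 0:
        raise ValueError("election_timeout_ms must be positive")

    # An explicit override wins; otherwise derive from the election timeout.
    if write_concern_override_ms is not None:
        write_concern_ms = write_concern_override_ms
    else:
        write_concern_ms = int(election_timeout_ms * WRITE_CONCERN_FACTOR)

    if dist_lock_override_ms is not None:
        dist_lock_ms = dist_lock_override_ms
    else:
        dist_lock_ms = int(election_timeout_ms * DIST_LOCK_FACTOR)

    return ConfigTimeouts(write_concern_ms, dist_lock_ms)
```

With these assumed multipliers, lowering the election timeout from 10 seconds to 2 seconds would shrink both derived timeouts by the same factor of five, while an override leaves the other timeout derived as before.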



 Comments   
Comment by Kaloian Manassiev [ 17/Dec/15 ]

When you say that these longer timeouts may "lead to slower error detection", do you mean in the pathological cases where a primary cannot be elected at all? Because in the normal case, where a primary is elected in less than 10 seconds, the timeouts we selected make no difference.

For the pathological case, they do not make error detection by clients much slower either, because we have an upper bound on the number of retries, and in the absence of a primary each retry fails very quickly.

So, in my opinion, there is no need to link the distributed lock timeout and the wait-for-config-write-concern timeout to replication's election timeout.
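
The bounded-retry argument in this comment can be sketched as follows, again in Python and purely as an illustration. `NoPrimaryError` and `run_with_bounded_retries` are hypothetical names, and the attempt limit of 3 is an assumption, not the server's actual value. The point is that when every attempt fails immediately because there is no primary, the caller sees the error after a fixed number of fast attempts, regardless of how long the configured timeout is.

```python
class NoPrimaryError(Exception):
    """Raised by the hypothetical operation when no primary is available."""


def run_with_bounded_retries(operation, max_attempts=3):
    """Call operation() until it succeeds or max_attempts is exhausted."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    last_error = None
    for _ in range(max_attempts):
        try:
            return operation()
        except NoPrimaryError as exc:
            # With no primary the attempt fails right away, so the total
            # delay is bounded by the attempt count, not by a long timeout.
            last_error = exc
    raise last_error
```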
