  1. Documentation
  2. DOCS-14177

Provide better clarity on which timeout setting results in an election for failover

Details

    Description

      There is a lot of confusion over which replica set configuration timeout setting causes secondaries to call for an election. Specifically, the documentation is unclear about the role of the following two parameters:

      • settings.heartbeatTimeoutSecs
      • settings.electionTimeoutMillis

      See docs page: https://docs.mongodb.com/manual/reference/replica-configuration/#rsconf.settings.electionTimeoutMillis

      According to https://groups.google.com/g/mongodb-user/c/RwLZvRV7DAg, for replication protocol version 1 (pv1), which as per https://docs.mongodb.com/manual/reference/replica-set-protocol-versions/ has been the default since 3.2 and is the only protocol version supported from version 4.0 onward, "the only knob that controls failover sensitivity in pv1 is electionTimeoutMillis" and "In v1, you can expect the timeout to be at most electionTimeoutMillis".

      This needs to be made clearer in the docs at https://docs.mongodb.com/manual/reference/replica-configuration/#rsconf.settings.electionTimeoutMillis for the properties "settings.heartbeatTimeoutSecs" and "settings.electionTimeoutMillis".
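
      To make the request concrete, here is a minimal sketch of the kind of example the docs could carry for tuning failover sensitivity in pv1. It only builds the new configuration document and touches nothing but settings.electionTimeoutMillis. The helper name with_election_timeout and the sample config values are hypothetical, and the version bump mirrors what the shell's rs.reconfig() helper does.

```python
import copy


def with_election_timeout(config: dict, election_timeout_ms: int) -> dict:
    """Return a copy of a replica set config with a new electionTimeoutMillis.

    Hypothetical helper for illustration; it does not talk to a server.
    """
    if election_timeout_ms <= 0:
        raise ValueError("election_timeout_ms must be positive")

    # Work on a copy so the caller's current config is left untouched.
    new_config = copy.deepcopy(config)

    # In pv1 this is the setting the ticket identifies as the failover knob.
    new_config.setdefault("settings", {})["electionTimeoutMillis"] = election_timeout_ms

    # A reconfig needs a higher version than the current config;
    # the shell's rs.reconfig() helper increments it in the same way.
    new_config["version"] = config["version"] + 1
    return new_config


# Sample config with made-up values, shaped like the output of replSetGetConfig.
current = {
    "_id": "rs0",
    "version": 3,
    "members": [{"_id": 0, "host": "host0.example.net:27017"}],
    "settings": {"heartbeatTimeoutSecs": 10, "electionTimeoutMillis": 10000},
}
proposed = with_election_timeout(current, 5000)
```

      Assuming a driver such as PyMongo, the current config could be fetched with the replSetGetConfig command and the proposed one applied with replSetReconfig. heartbeatTimeoutSecs is deliberately left unchanged here, since the claim under discussion is that it does not control failover sensitivity in pv1.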

      At the moment, the docs do say "NOTE For pv1, settings.electionTimeoutMillis has a greater influence on whether the secondary members call for an election than the settings.heartbeatTimeoutSecs". Unfortunately, this statement is vague and gives the reader nothing concrete to act on.

      Also of note is the core server source code README for replication, https://github.com/mongodb/mongo/blob/master/src/mongo/db/repl/README.md#user-content-communication, which talks about "Check the liveness of the other nodes (heartbeats)". Again, this is a bit vague, so I suggest talking to the core replication developers who authored it about providing a far better description of what heartbeatTimeoutSecs is for and how, if at all, it should be used.

      Scope of changes

      Specify that electionTimeoutMillis is the only knob that controls failover sensitivity in pv1.

          People

            rea.rustagi@mongodb.com Rea Rustagi
            paul.done@mongodb.com Paul Done

            Dates

              Created:
              Updated:
              Resolved:
              30 weeks ago