  1. Core Server
  2. SERVER-35383

Increase electionTimeoutMillis for the ContinuousStepdown hook used in stepdown suites

    • Fully Compatible
    • v4.0, v3.6
    • TIG 2018-07-02, TIG 2018-07-16
    • 27
    • 2

      The electionTimeoutMillis parameter for the ContinuousStepdown hook, used in the concurrency stepdown suites, is currently set to 5000 (5 seconds). We should increase it, per the captured discussion:

      > > On 2018/05/30 22:09:12, maxh wrote:
      > > > [note] As mentioned in SERVER-34666, I don't think we should shorten the
      > > > election timeout as it can lead to an election happening that isn't
      > > > initiated by the StepdownThread due to heartbeats being delayed. I'm okay
      > > > with keeping it as-is for now because it is consistent with the replica
      > > > set configuration the JavaScript version would have used; however, I'd
      > > > like for there to be a follow-up SERVER ticket to change it.
      > > >
      > > > https://jira.mongodb.org/browse/SERVER-34666?focusedCommentId=1873407&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-1873407
      > >
      > > For the followup ticket, do we just want to remove this value and use the
      > > default, or set it to a higher timeout?
      >
      > I'm not sure - I'd like to get some input from Judah on it. I'm currently
      > wondering if we really need to avoid setting the election timeout to 24 hours
      > when all_nodes_electable=true. We're going to use the replSetStepUp command in
      > the Python version of the StepdownThread to cause one of the secondaries to
      > run for election anyway. If for some reason the replSetStepUp command fails,
      > then the former primary will try and step back up after 10 seconds on its own
      > anyway.
      >
      > https://github.com/mongodb/mongo/blob/r4.1.0/buildscripts/resmokelib/testing/fixtures/replicaset.py#L149-L154

      If you only want elections to come from the StepdownThread, then I'd recommend
      setting the election timeout to 24 hours. The replSetStepUp command should still
      work, and if it fails for some reason, then no other node will try to run for
      election. There's no real difference between the default 10 seconds and the
      current 5 seconds except for the amount of flakiness you'd expect (not the
      existence of flakiness that we're trying to remove completely).
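
As a rough illustration of the 24-hour suggestion above, the following sketch builds a replica set configuration document whose settings.electionTimeoutMillis is 24 hours, plus the replSetStepUp command document that a stepdown thread could send to a chosen secondary. The helper names, the set name "rs0", and the host list are hypothetical and only for illustration; this is not how resmoke.py constructs its configuration.

```python
from datetime import timedelta

# 24 hours expressed in milliseconds. The intent is that no node calls an
# election on its own during a test run, so elections only happen when the
# stepdown thread asks for one.
ELECTION_TIMEOUT_MILLIS = int(timedelta(hours=24).total_seconds() * 1000)


def build_replset_config(name, hosts, election_timeout_millis=ELECTION_TIMEOUT_MILLIS):
    """Return a replica set config document (hypothetical helper)."""
    return {
        "_id": name,
        "members": [{"_id": i, "host": host} for i, host in enumerate(hosts)],
        "settings": {"electionTimeoutMillis": election_timeout_millis},
    }


def step_up_command():
    """Command document asking the receiving node to run for election."""
    return {"replSetStepUp": 1}


# Hypothetical three-node set on local ports.
config = build_replset_config(
    "rs0", ["localhost:20000", "localhost:20001", "localhost:20002"]
)
print(config["settings"])  # {'electionTimeoutMillis': 86400000}
```

With a driver such as PyMongo, the config document would be passed to replSetInitiate or replSetReconfig, and the step-up document would be sent as an admin command to the secondary that should become primary.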

            Assignee:
            max.hirschhorn@mongodb.com Max Hirschhorn
            Reporter:
            jonathan.abrahams Jonathan Abrahams
            Votes:
            0
            Watchers:
            4