Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-66023

Do not constantly reset election and liveness timers

    XMLWordPrintable

Details

    • Improvement
    • Status: Closed
    • Major - P3
    • Resolution: Fixed
    • None
    • 6.0.1, 5.0.11, 6.1.0-rc0
    • None
    • None
    • Fully Compatible
    • v6.0, v5.3, v5.0, v4.4
    • Repl 2022-05-02, Repl 2022-05-16, Repl 2022-05-30
    • 135

    Description

      We cancel the election timeout on secondaries whenever the primary liveness is updated, which potentially happens on every oplog batch. We cancel the liveness timeout on primaries whenever the oldest secondary liveness is updated, which potentially happens on every replSetUpdatePosition. It turns out cancelling a timer, at least on Linux, is quite expensive (likely system call overhead), and we do this in the replication lock, which increases contention on that already-hot mutex.

      We can greatly reduce this with a class which handles "cancel and reschedule" by keeping track of the latest time of the reschedule, and then when the timeout occurs, reschedules at that point instead of immediately. This means we get no cancels and one reschedule every timeout interval (not every miniscule bump forward of the timer)

      Attachments

        Issue Links

          Activity

            People

              matthew.russotto@mongodb.com Matthew Russotto
              matthew.russotto@mongodb.com Matthew Russotto
              Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: