DelayableTimeoutCallback ignores earlier timeout values


    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • Replication
    • ALL
    • Repl 2026-03-02
    • 200

       

      When a replica set member receives a new configuration that reduces the election timeout, the node may not run for election at the new (sooner) time. Instead, it continues to use the old (longer) timeout, causing significant delays in failover scenarios.

      Location: https://github.com/mongodb/mongo/blob/master/src/mongo/db/repl/replication_coordinator_impl.cpp#L4670

      PostMemberStateUpdateAction _setCurrentRSConfig(...) {
          // ... config installation ...
          
          _cancelCatchupTakeover(lk);
          _cancelPriorityTakeover(lk);
          _cancelAndRescheduleElectionTimeout(lk);  // Called here
          
          // ...
      } 

      Location: https://github.com/mongodb/mongo/blob/master/src/mongo/db/repl/replication_coordinator_impl_heartbeat.cpp#L1262-L1277

      void ReplicationCoordinatorImpl::_cancelAndRescheduleElectionTimeout(WithLock lk) {
          // ...
          
          if (wasActive && doNotReschedule) {
              // Only explicitly cancel if NOT rescheduling
              _handleElectionTimeoutCallback.cancel();
          }
          
          if (doNotReschedule)
              return;
          
          // Calculate new timeout from NOW
          auto requestedWhen = now + _rsConfig.unsafePeek().getElectionTimeoutPeriod();
          Milliseconds upperBound = Milliseconds(_getElectionOffsetUpperBound(lk));
          
          // This does NOT cancel the old callback!
          _handleElectionTimeoutCallback.delayUntilWithJitter(lk, requestedWhen, upperBound);
      }
       

      The Bug

      Location: https://github.com/mongodb/mongo/blob/master/src/mongo/db/repl/delayable_timeout_callback.cpp#L93-L108

      Status DelayableTimeoutCallback::_delayUntil(WithLock lk, Date_t when) {
          if (!_cbHandle) {
              // No timeout active - schedule new one
              return _reschedule(lk, when);
          }
          if (when == _nextCall) {
              // Same time - nothing to update
              return Status::OK();
          }
          _nextCall = when;  // ⚠️ Just updates the target time
          return Status::OK(); // ⚠️ Does NOT cancel/reschedule the callback!
      }
      

      The old callback remains scheduled in the executor at its original time. When it eventually fires, it checks:

      Location: https://github.com/mongodb/mongo/blob/master/src/mongo/db/repl/delayable_timeout_callback.cpp#L134-L145

      void DelayableTimeoutCallback::_handleTimeout(...) {
          // ...
          if (_nextCall > now) {
              // Too early - reschedule for the new time
              _reschedule(lk, _nextCall);
              return;
          }
          // It's time (or past time) - execute callback
          _callback(args);
      }
      

      Why This is a Problem

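      • Moving the timeout LATER works: the old callback fires early, sees now < _nextCall, and reschedules itself for the new time.
      • Moving the timeout SOONER is broken: the old callback is still scheduled for the original (later) time, so when it finally fires it sees now >= _nextCall and runs the callback immediately, but by then the new deadline has already passed.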
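
The asymmetry can be reproduced with a small stand-alone model. This is only a sketch: ModelTimeout, runUntilFired, and fireTimeAfterUpdate are hypothetical names, times are plain milliseconds, the executor is reduced to a single pending wake-up, and the 10 s / 20 s / 2 s values are illustrative rather than taken from a real deployment.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical stand-in for DelayableTimeoutCallback, not the real class.
struct ModelTimeout {
    std::optional<int64_t> scheduledWake;  // when the executor will wake us
    int64_t nextCall = 0;                  // when the callback should run

    // Mirrors _delayUntil: only a call with no pending wake-up schedules one;
    // later calls just overwrite the target time.
    void delayUntil(int64_t when) {
        if (!scheduledWake) {
            scheduledWake = when;
        }
        nextCall = when;
    }

    // Mirrors _handleTimeout at a wake-up. Returns the time the callback ran,
    // or nullopt if it was too early and a new wake-up was scheduled instead.
    std::optional<int64_t> handleTimeout(int64_t now) {
        scheduledWake.reset();
        if (nextCall > now) {
            scheduledWake = nextCall;
            return std::nullopt;
        }
        return now;
    }
};

// Jumps the simulated clock from wake-up to wake-up until the callback runs.
int64_t runUntilFired(ModelTimeout& t) {
    int64_t now = 0;
    while (true) {
        assert(t.scheduledWake && *t.scheduledWake >= now);
        now = *t.scheduledWake;
        if (auto fired = t.handleTimeout(now)) {
            return *fired;
        }
    }
}

// Schedules `first`, moves the target to `second` before anything fires,
// and reports when the callback actually runs.
int64_t fireTimeAfterUpdate(int64_t first, int64_t second) {
    ModelTimeout t;
    t.delayUntil(first);
    t.delayUntil(second);
    return runUntilFired(t);
}
```

In this model, fireTimeAfterUpdate(10000, 20000) runs the callback at 20000, as requested, while fireTimeAfterUpdate(10000, 2000) still runs it at 10000, which is 8000 ms after the new deadline.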

      Potential Fixes

      Option 1: Use scheduleAt() logic in _cancelAndRescheduleElectionTimeout()

      Modify _cancelAndRescheduleElectionTimeout() to detect when the new timeout is earlier than the currently recorded one and, in that case, explicitly cancel and reschedule:

      https://github.com/mongodb/mongo/blob/master/src/mongo/db/repl/replication_coordinator_impl_heartbeat.cpp#L1239-L1295
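
One possible shape for the Option 1 check, sketched against a simplified model rather than the real classes. ModelTimeout and rescheduleTimeout are hypothetical names, times are plain milliseconds, and the model's scheduleAt() is assumed to cancel any pending wake-up before scheduling a fresh one.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical model, not the real DelayableTimeoutCallback: the executor is
// reduced to one pending wake-up time.
struct ModelTimeout {
    std::optional<int64_t> scheduledWake;  // pending executor wake-up
    int64_t nextCall = 0;                  // requested callback time
    int cancels = 0;                       // how often a wake-up was dropped

    // Lazy path, as in _delayUntil: reuse a pending wake-up if there is one.
    void delayUntil(int64_t when) {
        if (!scheduledWake) {
            scheduledWake = when;
        }
        nextCall = when;
    }

    // Eager path: drop any pending wake-up and schedule a fresh one.
    void scheduleAt(int64_t when) {
        if (scheduledWake) {
            ++cancels;
        }
        scheduledWake = when;
        nextCall = when;
    }
};

// Caller-side policy: pay for a cancel only when the target moves earlier
// than the one currently recorded; otherwise keep the cheap lazy update.
void rescheduleTimeout(ModelTimeout& t, int64_t when) {
    if (t.scheduledWake && when < t.nextCall) {
        t.scheduleAt(when);
    } else {
        t.delayUntil(when);
    }
    // The pending wake-up must never be later than the requested time.
    assert(t.scheduledWake && *t.scheduledWake <= t.nextCall);
}
```

Comparing against the recorded target (nextCall) rather than the pending wake-up is slightly conservative: it can cancel a wake-up that would still have fired in time, but it only needs state the caller can plausibly read, and it keeps later-moving updates free of cancellations.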

      Option 3: Always cancel and reschedule during reconfig

      Simply always cancel the existing callback when installing a new config:

      void ReplicationCoordinatorImpl::_cancelAndRescheduleElectionTimeout(WithLock lk) {
          // ...
          if (wasActive) {
              _handleElectionTimeoutCallback.cancel();
          }
          
          if (doNotReschedule)
              return;
          
          // Now schedule fresh
          auto requestedWhen = now + _rsConfig.unsafePeek().getElectionTimeoutPeriod();
          Milliseconds upperBound = Milliseconds(_getElectionOffsetUpperBound(lk));
          _handleElectionTimeoutCallback.delayUntilWithJitter(lk, requestedWhen, upperBound);
      }
      

            Assignee:
            Moustafa Maher
            Reporter:
            Moustafa Maher