- Type: Bug
- Resolution: Unresolved
- Priority: Major - P3
- Affects Version/s: None
- Component/s: None
- Replication
- ALL
- Repl 2026-03-02
- 200
When a replica set member receives a new configuration that reduces the election timeout, the node may not run for election at the new (sooner) time. Instead, it continues to use the old (longer) timeout, causing significant delays in failover scenarios.
PostMemberStateUpdateAction _setCurrentRSConfig(...) {
// ... config installation ...
_cancelCatchupTakeover(lk);
_cancelPriorityTakeover(lk);
_cancelAndRescheduleElectionTimeout(lk); // Called here
// ...
}
void ReplicationCoordinatorImpl::_cancelAndRescheduleElectionTimeout(WithLock lk) {
// ...
if (wasActive && doNotReschedule) {
// Only explicitly cancel if NOT rescheduling
_handleElectionTimeoutCallback.cancel();
}
if (doNotReschedule)
return;
// Calculate new timeout from NOW
auto requestedWhen = now + _rsConfig.unsafePeek().getElectionTimeoutPeriod();
// This does NOT cancel the old callback!
_handleElectionTimeoutCallback.delayUntilWithJitter(lk, requestedWhen, upperBound);
}
The Bug
Status DelayableTimeoutCallback::_delayUntil(WithLock lk, Date_t when) {
if (!_cbHandle) {
// No timeout active - schedule new one
return _reschedule(lk, when);
}
if (when == _nextCall) {
// Same time - nothing to do
return Status::OK();
}
_nextCall = when; // ⚠️ Just updates the target time
return Status::OK(); // ⚠️ Does NOT cancel/reschedule the callback!
}
The old callback remains scheduled in the executor at its original time. When it eventually fires, it checks:
void DelayableTimeoutCallback::_handleTimeout(...) {
// ...
if (_nextCall > now) {
// Too early - reschedule for the new time
_reschedule(lk, _nextCall);
return;
}
// It's time (or past time) - execute callback
_callback(args);
}
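The interaction between _delayUntil() and _handleTimeout() can be reproduced with a small toy model. This is only a sketch under stated assumptions: FakeExecutor, LazyTimeout and simulateLazyRetarget are hypothetical names invented for illustration, time is a plain millisecond counter, and jitter is ignored. It is not the server's implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <utility>

// Hypothetical stand-in for the task executor: a manual clock plus a queue of wakeups.
class FakeExecutor {
public:
    using Task = std::function<void()>;

    std::int64_t now() const { return _now; }

    void scheduleAt(std::int64_t when, Task task) {
        _queue.emplace(when, std::move(task));
    }

    // Advance the clock to `deadline`, running every wakeup that comes due on the way.
    void runUntil(std::int64_t deadline) {
        while (!_queue.empty() && _queue.begin()->first <= deadline) {
            auto it = _queue.begin();
            _now = std::max(_now, it->first);
            Task task = std::move(it->second);
            _queue.erase(it);
            task();
        }
        _now = std::max(_now, deadline);
    }

private:
    std::int64_t _now = 0;
    std::multimap<std::int64_t, Task> _queue;
};

// Toy model of the lazy pattern: while a wakeup is pending, delayUntil() only
// moves the target time; the target is re-checked when the wakeup fires.
class LazyTimeout {
public:
    explicit LazyTimeout(FakeExecutor& exec) : _exec(exec) {}

    void delayUntil(std::int64_t when) {
        if (!_scheduled) {
            _reschedule(when);
            return;
        }
        _nextCall = when;  // the pending wakeup is left where it is
    }

    std::int64_t firedAt() const { return _firedAt; }  // -1 if it never fired

private:
    void _reschedule(std::int64_t when) {
        _nextCall = when;
        _scheduled = true;
        _exec.scheduleAt(when, [this] { _handleTimeout(); });
    }

    void _handleTimeout() {
        _scheduled = false;
        if (_nextCall > _exec.now()) {
            _reschedule(_nextCall);  // too early: try again at the new target
            return;
        }
        _firedAt = _exec.now();  // on time or late: run the callback now
    }

    FakeExecutor& _exec;
    bool _scheduled = false;
    std::int64_t _nextCall = 0;
    std::int64_t _firedAt = -1;
};

// Arms the timeout for oldDeadlineMs at t=0, retargets it to newDeadlineMs at
// retargetAtMs, and returns the time at which the callback actually ran.
std::int64_t simulateLazyRetarget(std::int64_t oldDeadlineMs,
                                  std::int64_t retargetAtMs,
                                  std::int64_t newDeadlineMs) {
    assert(retargetAtMs < oldDeadlineMs);  // retarget while the wakeup is still pending
    FakeExecutor exec;
    LazyTimeout timeout(exec);
    timeout.delayUntil(oldDeadlineMs);
    exec.runUntil(retargetAtMs);
    timeout.delayUntil(newDeadlineMs);
    exec.runUntil(std::max(oldDeadlineMs, newDeadlineMs) + 1);
    return timeout.firedAt();
}
```

In this model, simulateLazyRetarget(10000, 1000, 12000) returns 12000, while simulateLazyRetarget(10000, 1000, 3000) returns 10000: the earlier target is only noticed when the original wakeup fires.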
Why This is a Problem
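- Moving the timeout LATER: Works fine. The old callback fires early, sees now < _nextCall, and reschedules itself for the new time.
- Moving the timeout SOONER: Broken. The old callback stays scheduled at its original (later) time. When it finally fires, it sees now >= _nextCall and executes immediately, but the new deadline has already passed.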
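The two cases above reduce to a simple rule: the callback runs at whichever is later, the already-pending wakeup or the new target. The helpers below are a hypothetical illustration of that arithmetic (the names are not from the server code, and jitter is ignored), with all times given as millisecond offsets from a common starting point.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Time at which the callback actually runs when the pending wakeup is not
// cancelled: the wakeup cannot move earlier, so the later of the two wins.
inline std::int64_t effectiveFireTimeMs(std::int64_t pendingWakeupMs,
                                        std::int64_t newTargetMs) {
    assert(pendingWakeupMs >= 0 && newTargetMs >= 0);
    return std::max(pendingWakeupMs, newTargetMs);
}

// How long past the new target the callback runs (0 when the target moved later).
inline std::int64_t extraDelayMs(std::int64_t pendingWakeupMs,
                                 std::int64_t newTargetMs) {
    return effectiveFireTimeMs(pendingWakeupMs, newTargetMs) - newTargetMs;
}
```

For example, with a wakeup pending at 10000 ms and a new target of 3000 ms, extraDelayMs(10000, 3000) is 7000, so the election timeout would fire 7 seconds late. With a new target of 12000 ms the extra delay is 0.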
Potential Fixes
Option 1: Use scheduleAt() logic in _cancelAndRescheduleElectionTimeout()
Modify _cancelAndRescheduleElectionTimeout() to detect when moving backwards and explicitly cancel:
Option 3: Always cancel and reschedule during reconfig
Simply always cancel the existing callback when installing a new config:
void ReplicationCoordinatorImpl::_cancelAndRescheduleElectionTimeout(WithLock lk) {
// ...
if (wasActive) {
_handleElectionTimeoutCallback.cancel();
}
if (doNotReschedule)
return;
// Now schedule fresh
auto requestedWhen = now + _rsConfig.unsafePeek().getElectionTimeoutPeriod();
Milliseconds upperBound = Milliseconds(_getElectionOffsetUpperBound(lk));
_handleElectionTimeoutCallback.delayUntilWithJitter(lk, requestedWhen, upperBound);
}
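The effect of always cancelling can be checked with a toy model as well. As before, this is a sketch under assumptions rather than the server's code: FakeExecutor, CancellingTimeout and simulateCancelAndReschedule are hypothetical names, time is a millisecond counter, and jitter is left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <utility>

// Hypothetical executor with a manual clock and cancellable wakeups.
class FakeExecutor {
public:
    using Task = std::function<void()>;
    using Handle = std::uint64_t;

    std::int64_t now() const { return _now; }

    Handle scheduleAt(std::int64_t when, Task task) {
        const Handle handle = ++_lastHandle;
        _queue.emplace(when, Entry{handle, std::move(task)});
        return handle;
    }

    // Remove a pending wakeup; a handle that already ran is ignored.
    void cancel(Handle handle) {
        for (auto it = _queue.begin(); it != _queue.end(); ++it) {
            if (it->second.handle == handle) {
                _queue.erase(it);
                return;
            }
        }
    }

    // Advance the clock to `deadline`, running every wakeup that comes due on the way.
    void runUntil(std::int64_t deadline) {
        while (!_queue.empty() && _queue.begin()->first <= deadline) {
            auto it = _queue.begin();
            _now = std::max(_now, it->first);
            Task task = std::move(it->second.task);
            _queue.erase(it);
            task();
        }
        _now = std::max(_now, deadline);
    }

private:
    struct Entry {
        Handle handle;
        Task task;
    };
    std::int64_t _now = 0;
    Handle _lastHandle = 0;
    std::multimap<std::int64_t, Entry> _queue;
};

// Toy model of the proposed behaviour: every retarget cancels the pending
// wakeup and schedules a fresh one at the new time.
class CancellingTimeout {
public:
    explicit CancellingTimeout(FakeExecutor& exec) : _exec(exec) {}

    void cancelAndReschedule(std::int64_t when) {
        if (_pending) {
            _exec.cancel(_handle);
        }
        _pending = true;
        _handle = _exec.scheduleAt(when, [this] {
            _pending = false;
            _firedAt = _exec.now();
        });
    }

    std::int64_t firedAt() const { return _firedAt; }  // -1 if it never fired

private:
    FakeExecutor& _exec;
    FakeExecutor::Handle _handle = 0;
    bool _pending = false;
    std::int64_t _firedAt = -1;
};

// Arms the timeout for oldDeadlineMs at t=0, retargets it to newDeadlineMs at
// retargetAtMs, and returns the time at which the callback actually ran.
std::int64_t simulateCancelAndReschedule(std::int64_t oldDeadlineMs,
                                         std::int64_t retargetAtMs,
                                         std::int64_t newDeadlineMs) {
    assert(retargetAtMs < oldDeadlineMs);  // retarget while the wakeup is still pending
    FakeExecutor exec;
    CancellingTimeout timeout(exec);
    timeout.cancelAndReschedule(oldDeadlineMs);
    exec.runUntil(retargetAtMs);
    timeout.cancelAndReschedule(newDeadlineMs);
    exec.runUntil(std::max(oldDeadlineMs, newDeadlineMs) + 1);
    return timeout.firedAt();
}
```

Here simulateCancelAndReschedule(10000, 1000, 3000) returns 3000 and simulateCancelAndReschedule(10000, 1000, 12000) returns 12000, so both directions land on the new target. The trade-off in this sketch is one extra cancel and schedule per retarget, which is presumably the cost the lazy design was avoiding.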