Type: Bug
Resolution: Fixed
Priority: Major - P3
Fix Version/s: 3.2.18, 3.4.11, 3.6.0-rc0
Affects Version/s: None
Component/s: Replication
Labels: None
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested: v3.4, v3.2
Sprint: Repl 2017-10-02
Linked BF Score: 0
Confidence Status: None
Work Order: 3
CAR Domain/s: None
Aha! Reference: None
Tracking Level: None
Risk Status: None
Exec Notes: None
Goal Name(s): None
Goal Link: None

In ReplicationCoordinatorImpl::_scheduleNextLivenessUpdate_inlock(), we do not schedule a new liveness update if nextTimeout would be in the past. This is wrong; in that case we should schedule an immediate liveness update.

One scenario: we have just run our liveness check, and the earliest live member was only barely fresh ("almost stale"), so we do nothing. A short time passes before we schedule the next check, and by then that member is stale, so the computed next timeout is in the past. Because nothing gets scheduled, we then stop doing liveness checks.
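The scheduling decision described above can be sketched in isolation. This is a simplified illustration, not the actual server code: `Schedule`, `buggyNextLivenessDelay`, and `fixedNextLivenessDelay` are hypothetical names, and times are modeled as plain millisecond offsets rather than the server's date and executor types. The only assumption about the fix is what the description states, namely that a timeout in the past should lead to an immediate update.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>

using Millis = std::chrono::milliseconds;

// Hypothetical result type: whether a liveness update was scheduled, and
// how long from "now" it should fire.
struct Schedule {
    bool scheduled;
    Millis delay;
};

// Sketch of the reported behavior: if the next timeout is not in the future,
// nothing is scheduled, so the chain of liveness checks ends.
Schedule buggyNextLivenessDelay(Millis now, Millis earliestLastUpdate, Millis timeoutPeriod) {
    assert(timeoutPeriod > Millis{0});
    const Millis nextTimeout = earliestLastUpdate + timeoutPeriod;
    if (nextTimeout > now) {
        return Schedule{true, nextTimeout - now};
    }
    return Schedule{false, Millis{0}};  // no update scheduled
}

// Sketch of the behavior the ticket asks for: always schedule, clamping a
// timeout in the past to an immediate (zero-delay) update.
Schedule fixedNextLivenessDelay(Millis now, Millis earliestLastUpdate, Millis timeoutPeriod) {
    assert(timeoutPeriod > Millis{0});
    const Millis nextTimeout = earliestLastUpdate + timeoutPeriod;
    return Schedule{true, std::max(nextTimeout - now, Millis{0})};
}
```

With a 10000 ms timeout period and an earliest update at time 0, calling either function at time 9000 yields a 1000 ms delay. At time 10050, the first function schedules nothing, while the second schedules an update with zero delay.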

Assignee: Judah Schvimer
Reporter: Matthew Russotto
Participants: Githook User, Judah Schvimer, Matthew Russotto, Ramon Fernandez Marina
Votes: 0
Watchers: 5

Created: Jun 30 2017 05:45:45 PM UTC
Updated: Oct 30 2023 11:15:33 PM UTC
Resolved: Sep 15 2017 05:15:24 PM UTC
Confidence Status Last Update: 13/Sep/17 4:09 PM