Type: Bug
Resolution: Fixed
Priority: Major - P3
Fix Version/s: 4.0.13, 4.2.1, 4.3.1
Affects Version/s: None
Component/s: Replication
Labels: None
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested: v4.2, v4.0
Sprint: Repl 2019-08-26
Confidence Status: None
Work Order: 3
CAR Domain/s: None
Aha! Reference: None
Tracking Level: None
Risk Status: None
Exec Notes: None
Goal Name(s): None
Goal Link: None

SERVER-33812 attached afterClusterTime to all oplog queries. A node whose last optime has a higher timestamp but a lower term than its sync source's should roll back when it receives an empty batch; for example, the old primary is at (ts: 9, term: 1) while the new primary is at (ts: 8, term: 2). Instead, the oplog query fails with MaxTimeMSExpired under the time limit added in SERVER-35200. I believe the query times out while waiting for afterClusterTime. In production, the old primary will very likely roll back once new writes arrive with an even higher timestamp, possibly from the periodic no-op writer. However, it is still a liveness issue.
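
The scenario can be sketched with a toy model. This is not MongoDB's code: the `OpTime` class and every function name below are hypothetical. It assumes that optimes are ordered by term first and then by timestamp, and that an afterClusterTime wait looks only at the timestamp, which matches the suspicion stated above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OpTime:
    """Hypothetical stand-in for a replication optime."""
    ts: int
    term: int


def optime_less_than(a: OpTime, b: OpTime) -> bool:
    # Assumption: the term is compared first, then the timestamp.
    return (a.term, a.ts) < (b.term, b.ts)


def after_cluster_time_satisfied(source: OpTime, after_cluster_time: int) -> bool:
    # Assumption: the wait only checks the timestamp, not the term.
    return source.ts >= after_cluster_time


def simulate_oplog_query(source: OpTime, syncing_node: OpTime) -> str:
    # The syncing node sends its own last timestamp as afterClusterTime.
    if not after_cluster_time_satisfied(source, syncing_node.ts):
        # No new write arrives on the source, so the wait ends
        # only when the query's time limit expires.
        return "MaxTimeMSExpired"
    return "batch returned"


old_primary = OpTime(ts=9, term=1)
new_primary = OpTime(ts=8, term=2)

# The old primary is behind in optime order despite its higher timestamp.
print(optime_less_than(old_primary, new_primary))
# The query waits for ts 9 on a source that is only at ts 8.
print(simulate_oplog_query(new_primary, old_primary))
# Once the source writes at a higher timestamp, the query can return.
print(simulate_oplog_query(OpTime(ts=10, term=2), old_primary))
```

In this model the query is unblocked only by a later write on the sync source, which is why the periodic no-op writer hides the problem in production.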

is related to:
SERVER-42219 Oplog buffer not always empty when primary exits drain mode (Closed)

related to:
SERVER-33812 First initial sync oplog read batch fetched may be empty; do not treat as an error. (Closed)
SERVER-35200 Speed up failure detection in the OplogFetcher during steady state replication (Closed)

Assignee: Siyuan Zhou
Reporter: Siyuan Zhou
Participants: Eric Milkie, Githook User, Siyuan Zhou, Suganthi Mani
Votes: 0
Watchers: 9

Created: Aug 20 2019 12:29:13 AM UTC
Updated: Oct 29 2023 10:17:54 PM UTC
Resolved: Aug 21 2019 01:56:32 AM UTC
Confidence Status Last Update: 20/Aug/19 12:29 AM
