Loading...

XML

Word

Printable

JSON

Type: Bug
Resolution: Fixed
Priority: Major - P3
Fix Version/s: 5.1.0-rc0
Affects Version/s: None
Component/s: Sharding
Labels:
- sharding-wfbf-day

Backwards Compatibility:
Fully Compatible
Operating System:
ALL
Sprint:
Sharding 2020-04-20, Sharding 2020-05-04, Sharding 2020-05-18, Sharding 2020-07-13, Sharding 2020-06-01, Sharding 2020-06-15, Sharding 2020-06-29, Sharding 2020-07-27, Sharding 2020-08-10, Sharding 2020-08-24, Sharding 2020-09-21, Sharding 2020-10-05, Sharding 2020-10-19, Sharding 2020-11-02, Sharding 2020-11-16, Sharding 2020-11-30, Sharding 2020-12-14, Sharding 2020-12-28, Sharding 2021-01-11, Sharding 2021-01-25, Sharding 2021-02-22, Sharding 2021-03-08, Sharding 2021-03-22, Sharding 2021-04-05, Sharding 2021-04-19, Sharding 2021-05-03
Confidence Status:
None
Work Order:
3
CAR Domain/s:
None

Aha! Reference:
None
Tracking Level:
None
Risk Status:
None
Exec Notes:
None
Goal Name(s):
None
Goal Link:
None

computeOperationTime returns the replication lastCommittedOpTime for majority reads or the lastAppliedOpTime otherwise. And we have this dassert that makes sure that neither of them should be beyond the cluster time.

We have a dassert in ReplicationCoordinatorImpl::_setMyLastAppliedOpTimeAndWallTime to assert that the lastAppliedOpTime should never advance beyond the cluster time. However, we do not have a corresponding dassert for the lastCommittedOpTime.

Normally, for internal communications between replset members, we have the LogicalTimeMetadataHook to parse cluster time metadata and advance the cluster time on receiving a network message. For example, when a node receives a heartbeat response, it parses the cluster time metadata as part of the network interface hooks before handing it off to repl to process the heartbeat response. And when the node processes the heartbeat, it could advance the commit point on hearing a more recent commit point. So the assumption that the commit point is never ahead of the cluster time is normally correct because we parse the cluster time metadata first.

However, if a heartbeat response comes from an arbiter, it could contain a more recent commit point without cluster time metadata, simply because logical clock is disabled for arbiters. So theoretically, the following could happen:
1. secondary's current knowledge of the cluster time is 90
2. secondary receives a heartbeat response without cluster time metadata from an arbiter that has a commit point 100
3. secondary processes the heartbeat and advances its commit point (lastCommitted) to 100.
4. A majority read on the secondary returns operation time 100 from computeOperationTime
5. dassert is hit because the secondary's logical clock is 90, being < the operation time.

The fact that computeOperationTime returns lastCommitted for majority reads seems weird. Because for majority reads, they dont actually read at lastCommitted but the committed snapshot. And repl guarantees that the committed snapshot is never ahead of the lastApplied, (whereas lastCommitted could). So I think it is more correct for computeOperationTime to return the committed snapshot for majority reads.

This ticket should investigate whether what mentioned above could actually happen with arbiters. No matter which time we decide computeOperationTime should return for majority reads, we should add a corresponding dassert in repl like we did for the lastApplied to enforce that assumption. And we should have a targeted test for it.

- - Sort By Name
  - Sort By Date
  - Ascending
  - Descending
  - Thumbnails
  - List
  - Download All

test.js
2 kB
Sep 30 2021 09:44:40 PM UTC

related to

SERVER-35663 Replication recovery does not update the logical clock

Closed

Assignee:: Randolph Tan
Reporter:: Lingzhi Deng
Participants:: Githook User, Lingzhi Deng, Randolph Tan, Vivian Ge
Votes:: 0 Vote for this issue
Watchers:: 8 Start watching this issue

Created:: Feb 21 2020 07:55:12 PM UTC
Updated:: Oct 29 2023 10:11:55 PM UTC
Resolved:: Oct 04 2021 07:32:53 PM UTC

Details

Description

Attachments

Attachments

Issue Links

Forms

Activity

People

Dates