[SERVER-46308] Investigate dependency between commit point (lastCommitted) and cluster time Created: 21/Feb/20 Updated: 29/Oct/23 Resolved: 04/Oct/21 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Sharding |
| Affects Version/s: | None |
| Fix Version/s: | 5.1.0-rc0 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Lingzhi Deng | Assignee: | Randolph Tan |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | sharding-wfbf-day | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Attachments: |
|
||||||||
| Issue Links: |
|
||||||||
| Backwards Compatibility: | Fully Compatible | ||||||||
| Operating System: | ALL | ||||||||
| Sprint: | Sharding 2020-04-20, Sharding 2020-05-04, Sharding 2020-05-18, Sharding 2020-07-13, Sharding 2020-06-01, Sharding 2020-06-15, Sharding 2020-06-29, Sharding 2020-07-27, Sharding 2020-08-10, Sharding 2020-08-24, Sharding 2020-09-21, Sharding 2020-10-05, Sharding 2020-10-19, Sharding 2020-11-02, Sharding 2020-11-16, Sharding 2020-11-30, Sharding 2020-12-14, Sharding 2020-12-28, Sharding 2021-01-11, Sharding 2021-01-25, Sharding 2021-02-22, Sharding 2021-03-08, Sharding 2021-03-22, Sharding 2021-04-05, Sharding 2021-04-19, Sharding 2021-05-03 | ||||||||
| Participants: | |||||||||
| Description |
|
computeOperationTime returns the replication lastCommittedOpTime for majority reads or the lastAppliedOpTime otherwise. And we have this dassert that makes sure that neither of them should be beyond the cluster time. We have a dassert in ReplicationCoordinatorImpl::_setMyLastAppliedOpTimeAndWallTime to assert that the lastAppliedOpTime should never advance beyond the cluster time. However, we do not have a corresponding dassert for the lastCommittedOpTime. Normally, for internal communications between replset members, we have the LogicalTimeMetadataHook to parse cluster time metadata and advance the cluster time on receiving a network message. For example, when a node receives a heartbeat response, it parses the cluster time metadata as part of the network interface hooks before handing it off to repl to process the heartbeat response. And when the node processes the heartbeat, it could advance the commit point on hearing a more recent commit point. So the assumption that the commit point is never ahead of the cluster time is normally correct because we parse the cluster time metadata first. However, if a heartbeat response comes from an arbiter, it could contain a more recent commit point without cluster time metadata, simply because logical clock is disabled for arbiters. So theoretically, the following could happen: The fact that computeOperationTime returns lastCommitted for majority reads seems weird. Because for majority reads, they dont actually read at lastCommitted but the committed snapshot. And repl guarantees that the committed snapshot is never ahead of the lastApplied, (whereas lastCommitted could). So I think it is more correct for computeOperationTime to return the committed snapshot for majority reads. This ticket should investigate whether what mentioned above could actually happen with arbiters. No matter which time we decide computeOperationTime should return for majority reads, we should add a corresponding dassert in repl like we did for the lastApplied to enforce that assumption. And we should have a targeted test for it. |
| Comments |
| Comment by Vivian Ge (Inactive) [ 06/Oct/21 ] |
|
Updating the fixversion since branching activities occurred yesterday. This ticket will be in rc0 when it’s been triggered. For more active release information, please keep an eye on #server-release. Thank you! |
| Comment by Githook User [ 04/Oct/21 ] |
|
Author: {'name': 'Randolph Tan', 'email': 'randolph@10gen.com', 'username': 'renctan'}Message: |
| Comment by Randolph Tan [ 30/Sep/21 ] |
|
Attached test.js demonstrating the scenario as described. |