[SERVER-35420] Remove stable optime candidates list from ReplicationCoordinator
Created: 05/Jun/18  Updated: 06/Dec/22  Resolved: 22/Jan/19

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: None
Fix Version/s: None

Type: Task
Priority: Major - P3
Reporter: Spencer Brody (Inactive)
Assignee: Backlog - Replication Team
Resolution: Won't Fix
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
depends on SERVER-34489 Enable new format Unique Index via FCV Closed
Related
is related to SERVER-35344 Stable timestamp and _currentCommitte... Backlog
is related to SERVER-35714 Recover to common point on rollback i... Backlog
Assigned Teams:
Replication
Participants:

 Description   

Once we are allowed to do point-in-time (PIT) reads on secondaries at timestamps that fall in the middle of previous secondary application batches, we'll be able to get rid of ReplicationCoordinatorImpl::_stableOpTimeCandidates. At that point the stable timestamp can simply be the minimum of the lastApplied optime on this node and the replication majority commit point.
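
As a rough illustration of the proposed rule, here is a minimal Python sketch. It is not server code: the `Timestamp` type and the `stable_timestamp` helper are hypothetical names invented for this example, and timestamps are modeled as plain (seconds, increment) pairs compared lexicographically.

```python
from typing import NamedTuple


class Timestamp(NamedTuple):
    """Simplified stand-in for a timestamp: (seconds, increment)."""
    secs: int
    inc: int


def stable_timestamp(last_applied: Timestamp,
                     majority_commit_point: Timestamp) -> Timestamp:
    # Hypothetical helper: with no candidates list, the stable timestamp
    # is just the smaller of the two inputs. NamedTuple instances compare
    # field by field, so min() orders by secs first, then inc.
    return min(last_applied, majority_commit_point)


# A node that lags the commit point is bounded by its own lastApplied.
lagging = stable_timestamp(Timestamp(100, 3), Timestamp(105, 1))

# A node that is ahead of the commit point is bounded by the commit point.
caught_up = stable_timestamp(Timestamp(110, 2), Timestamp(105, 1))
```

In this sketch no list of candidate optimes has to be maintained; the computation needs only the two current values.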



 Comments   
Comment by Tess Avitabile (Inactive) [ 02/Jan/20 ]

I believe we also need this list for sets with enableMajorityReadConcern:false. Once we remove that option, I would be interested in exploring removing the candidates list. I think it would be worth testing the perf impact.

Comment by Judah Schvimer [ 02/Jan/20 ]

From conversation with milkie on the initial sync semantics design: I think we still have candidates for replica sets with only one voting node. The replication commit point can advance ahead of the all_durable timestamp, but the stable timestamp must be less than or equal to the all_durable timestamp. all_durable is not guaranteed to be a timestamp in the oplog, but the stable timestamp must be a timestamp in the oplog. The candidate list is a lot of complexity for what I'd consider an edge case. I think we could make sets with a single voting node hold the replication commit point behind the all_durable timestamp and remove the candidates list. This would add a perf penalty in that edge case, but that's almost certainly worth the complexity removal and the perf penalty of maintaining the candidates list.

Looking in the codebase, I no longer see any places where we use the candidate list outside of choosing the stable timestamp, so I think it is worth considering removing it in the future.

Comment by Gregory McKeon (Inactive) [ 22/Jan/19 ]

We now use this list in several places, so we're leaving it.

Comment by Judah Schvimer [ 26/Sep/18 ]

The stable timestamp probably has to be the lesser of the "all committed" timestamp and the "majority commit point", but we must be careful because the "all committed" timestamp can be a timestamp that's not in the oplog, which is not a valid "stable timestamp".
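
One way to picture the constraint above is the following Python sketch. It is only an assumption about how such a rule could be written, not the server's implementation: `choose_stable_timestamp` is a hypothetical helper, timestamps are simplified to integers, and the oplog is modeled as a sorted list of entry timestamps. It takes the lesser of the two bounds and then rounds down to the newest timestamp that actually appears in the oplog.

```python
from bisect import bisect_right
from typing import Optional, Sequence


def choose_stable_timestamp(oplog_timestamps: Sequence[int],
                            all_committed: int,
                            majority_commit_point: int) -> Optional[int]:
    """Return the newest oplog timestamp <= min(all_committed, commit point).

    oplog_timestamps must be sorted in ascending order. Returns None when
    no oplog entry is at or below the bound.
    """
    upper_bound = min(all_committed, majority_commit_point)
    # bisect_right gives the number of entries <= upper_bound.
    count = bisect_right(oplog_timestamps, upper_bound)
    return oplog_timestamps[count - 1] if count > 0 else None


oplog = [10, 20, 30, 40]

# "all committed" (35) is not an oplog timestamp, so round down to 30.
rounded_down = choose_stable_timestamp(oplog, all_committed=35,
                                       majority_commit_point=40)

# The commit point (20) is the tighter bound and is already in the oplog.
commit_bound = choose_stable_timestamp(oplog, all_committed=40,
                                       majority_commit_point=20)
```

Rounding down keeps the result valid as a stable timestamp in this model, at the cost of a lookup into the oplog timestamps.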

Generated at Thu Feb 08 04:39:45 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.