[SERVER-37863] Implement sleep time calculation mechanism Created: 01/Nov/18  Updated: 06/Dec/22  Resolved: 20/Feb/19

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: None
Fix Version/s: None

Type: Task Priority: Major - P3
Reporter: Maria van Keulen Assignee: Backlog - Storage Execution Team
Resolution: Won't Fix Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
depends on SERVER-37864 Implement write op rate calculation m... Closed
Assigned Teams:
Storage Execution
Sprint: Storage NYC 2018-12-03
Participants:

 Description   

Add a new mechanism that calculates the replication lag periodically, and uses this lag to determine how much time writes on the primary should sleep. This mechanism will be used to assist in throttling writes on the primary when the replication lag gets sufficiently large.
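The mechanism described above could be sketched as a pure function from the current replication lag to a sleep duration for primary writes. This is a hypothetical illustration only: the names, the proportional policy, and the thresholds are assumptions, not the server's implementation (which, per the resolution, was never pursued).

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical sketch: map the current majority replication lag to a sleep
// duration applied to writes on the primary. Thresholds are illustrative.
int64_t sleepMillisForLag(int64_t lagMillis,
                          int64_t targetLagMillis,
                          int64_t maxSleepMillis) {
    if (lagMillis <= targetLagMillis) {
        return 0;  // Lag is within budget; do not throttle.
    }
    // Sleep proportionally to how far the lag exceeds the target, capped
    // so a transient spike cannot stall writes indefinitely.
    int64_t excess = lagMillis - targetLagMillis;
    return std::min(excess, maxSleepMillis);
}
```

A caller would invoke this with a periodically recalculated lag value and sleep for the returned duration before acquiring the global lock for a write.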



 Comments   
Comment by Maria van Keulen [ 20/Feb/19 ]

Closing this ticket, since it is specific to a flow control implementation that we chose not to pursue.

Comment by Maria van Keulen [ 26/Nov/18 ]

Per a conversation with Andy, we do not want to use OpTimes to calculate replication lag because certain user errors would be able to significantly skew the apparent calculated lag. The lag might appear to be 0 for weeks or briefly spike to be over a year. Using the wall clock times added in SERVER-34598 would avoid these errors.

Comment by Judah Schvimer [ 26/Nov/18 ]

maria.vankeulen, I expect the equivalent millisecond granularity optimes will be stored in the TopologyCoordinator as well for consistency and easy subtraction.

Comment by Maria van Keulen [ 26/Nov/18 ]

judah.schvimer We had originally planned to calculate the replication lag using these optimes, but due to various issues with optimes, we have decided to implement SERVER-34598 as part of this project and rely on those wall clock times to calculate the lag instead.


Comment by Judah Schvimer [ 26/Nov/18 ]

The topology coordinator has the majority commit point and the last applied optime. We should be able to calculate majority commit point lag simply by subtracting these two values.
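The subtraction described above is a one-liner once wall clock times are available. This is an illustrative sketch under the assumption (per SERVER-34598 and the discussion above) that the TopologyCoordinator tracks wall clock times for both the last applied optime and the majority commit point; the names are hypothetical.

```cpp
#include <chrono>

using WallTime = std::chrono::system_clock::time_point;
using Millis = std::chrono::milliseconds;

// Illustrative only: given wall clock times for the last applied optime and
// the majority commit point, the majority commit point lag is a single
// subtraction, computable in constant time.
Millis majorityCommitPointLag(WallTime lastAppliedWall,
                              WallTime majorityCommitWall) {
    return std::chrono::duration_cast<Millis>(lastAppliedWall -
                                              majorityCommitWall);
}
```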

Comment by Maria van Keulen [ 01/Nov/18 ]

I understand your point. It seems that having this occur in a separate thread is not strictly necessary.

Comment by Andy Schwerin [ 01/Nov/18 ]

While I'm in favor of not recalculating the interval on every write, it seems like calling a function to fetch the interval, which may, if it chooses, recalculate the interval, would serve the desired purpose without requiring a thread or periodic task.
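The suggestion above can be sketched as a small cache that recalculates lazily on the caller's thread when the cached value is stale, so no dedicated thread or periodic runner is needed. All names here are hypothetical, and this is only one way to realize the idea under the assumptions stated in the comment.

```cpp
#include <chrono>
#include <functional>
#include <mutex>

// Sketch: a fetch function that recalculates the sleep interval only when
// its cached value is older than the recalculation period. Callers that hit
// a fresh cache pay only a mutex acquisition and a clock read.
class SleepIntervalCache {
public:
    SleepIntervalCache(std::chrono::milliseconds recalcPeriod,
                       std::function<std::chrono::milliseconds()> calcFn)
        : _recalcPeriod(recalcPeriod), _calcFn(std::move(calcFn)) {}

    std::chrono::milliseconds get() {
        std::lock_guard<std::mutex> lk(_mutex);
        auto now = std::chrono::steady_clock::now();
        if (now - _lastCalc >= _recalcPeriod) {
            _cached = _calcFn();  // Recalculate lazily, on the caller's thread.
            _lastCalc = now;
        }
        return _cached;
    }

private:
    std::mutex _mutex;
    std::chrono::milliseconds _recalcPeriod;
    std::function<std::chrono::milliseconds()> _calcFn;
    // Default-constructed (epoch) time point forces the first call to
    // recalculate.
    std::chrono::steady_clock::time_point _lastCalc{};
    std::chrono::milliseconds _cached{0};
};
```

The trade-off versus a periodic runner is that the interval is only as fresh as the most recent write, which is acceptable here since throttling only matters while writes are occurring.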

Comment by Maria van Keulen [ 01/Nov/18 ]

I have updated the ticket description to hopefully clarify the intent of this work.

Comment by Maria van Keulen [ 01/Nov/18 ]

The throttling is presently planned to happen at the global lock level (before the global lock is acquired for a write operation), so the intent for putting replication lag calculation in its own thread was to not clutter up the global lock acquisition code with lag calculation handling. Additionally, the lag does not need to be calculated every time the global lock is acquired, so we discussed that a periodic runner thread would be a good way to calculate it at specific intervals.

Comment by Andy Schwerin [ 01/Nov/18 ]

I'm not clear why you need a thread for this. The topology coordinator keeps an up-to-date table with the information needed to derive the current majority replication lag in constant time, I believe.
