[SERVER-65054] Avoid slow insert batches blocking replication Created: 29/Mar/22  Updated: 10/Jan/24

Status: Open
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Improvement
Priority: Major - P3
Reporter: Matthew Russotto
Assignee: Backlog - Replication Team
Resolution: Unresolved
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-62193 Improve insert_vector secondary perfo... Open
related to SERVER-78556 Return default of internalInsertMaxBa... Open
related to SERVER-70155 Add duration of how long an oplog slo... Closed
is related to SERVER-59776 50% regression in single multi-update Closed
is related to SERVER-61185 Use prefix_search for unique index lo... Closed
Assigned Teams:
Replication
Sprint: Repl 2023-03-06, Repl 2023-03-20, Repl 2023-04-03, Repl 2023-04-17, Repl 2023-05-01, Repl 2023-05-15
Participants:
Case:

 Description   

We now have a tunable maximum batch size for inserts (typically 500). While a batch is being inserted, replication of any operation started after the batch is blocked, because the insert holds an oplog hole open. Normally this is fine, but if the inserts are slow, this can cause overall replication lag. If we could prevent or detect this (e.g. by breaking up an insert batch once it has been running for too long), we could avoid the lag.
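
A rough illustration of the "break up the batch when it gets long" idea, written in Python for brevity rather than as server code. Everything named here (insert_with_time_budget, insert_one, commit, max_batch_seconds, the injected clock) is hypothetical and only models the behavior: a sub-batch is committed, which stands in for closing the oplog hole, as soon as either the size limit or a time budget is reached.

```python
import time
from typing import Callable, List, Sequence


def insert_with_time_budget(
    docs: Sequence[dict],
    insert_one: Callable[[dict], None],
    commit: Callable[[List[dict]], None],
    max_batch_size: int = 500,
    max_batch_seconds: float = 0.005,  # assumed value, for illustration only
    clock: Callable[[], float] = time.monotonic,
) -> int:
    """Insert docs in sub-batches bounded by size and elapsed time.

    Returns the number of commits performed. Each commit models the
    point at which the oplog hole held by the insert would be closed.
    """
    commits = 0
    pending: List[dict] = []
    started = 0.0
    for doc in docs:
        if not pending:
            # The timer starts when a new sub-batch starts.
            started = clock()
        insert_one(doc)
        pending.append(doc)
        too_big = len(pending) >= max_batch_size
        too_slow = clock() - started >= max_batch_seconds
        if too_big or too_slow:
            commit(pending)
            commits += 1
            pending = []
    if pending:
        # Flush whatever is left after the last document.
        commit(pending)
        commits += 1
    return commits
```

With fast inserts the time budget never triggers and the batch is limited only by max_batch_size, so the common case is unchanged. With slow inserts the batch is split early, which bounds how long any single sub-batch can hold replication back.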



 Comments   
Comment by Geert Bosch [ 30/Jun/23 ]

Both settings were related to the amount of work we'd aim to do in a single transaction. However, that reasoning considered only the amount of work done under a transaction, and did not anticipate lengthening the time for which the critical section for advancing the allDurable timestamp would be held. The fix in SERVER-78556 should be safe and expedient.

Comment by Vishnu Kaushik [ 20/Mar/23 ]

We were trying to use the new groupOplogEntries feature built in PM-2780, which lets us group the oplog entries for multiple operations into a single applyOps oplog entry (see the code review I put up previously). However, it seems the new feature doesn't support retryable writes.

For that reason we are going to use the timer approach outlined above instead.

Comment by Louis Williams [ 20/Sep/22 ]

No objections. This is what we're already doing in the BatchedDeleteStage.

Comment by Judah Schvimer [ 19/Sep/22 ]

I'm curious whether anyone watching this ticket objects to matthew.russotto@mongodb.com's suggestion in the description to start a timer when we start applying the batch and break up the batch after some (configurable) period of time.

Comment by Judah Schvimer [ 04/Apr/22 ]

Transactionalizing inserts would reduce the amount of time an oplog hole was open, which might help.
