[SERVER-45696] Lengthy continuous backup can result in lengthy stall on first checkpoint after backup Created: 22/Jan/20  Updated: 06/Dec/22  Resolved: 29/Jun/21

Status: Closed
Project: Core Server
Component/s: Storage
Affects Version/s: 4.2.2
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Bruce Lucas (Inactive) Assignee: Backlog - Storage Engines Team
Resolution: Done Votes: 5
Labels: KP42
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Problem/Incident
Related
related to WT-5587 Limit how many checkpoints are droppe... Closed
Assigned Teams:
Storage Engines
Operating System: ALL
Participants:
Case:

 Description   

Customers using continuous backups have observed lengthy stalls on the first checkpoint after the backup. Observed symptoms include a lengthy stall in FTDC and a stall in secondaries fetching the oplog, which results in increasing replication lag and, due to flow control, a complete stall in writes on the primary.



 Comments   
Comment by Haribabu Kommi [ 29/Jun/21 ]

With a couple of WT backup fixes that went in, all of the known backup issues are resolved. The PM-1773 performance test, run with a backup in the background, doesn't show much overhead on the running system while the backup is running, other than increased disk I/O usage.

Closing this ticket as there is no more work to be done in the WT backup.

Comment by Daniel Pasette (Inactive) [ 07/Apr/20 ]

WT-5587 was committed and released in 4.2.4 as a mitigation, and more work is scheduled.

Comment by David Dana [ 07/Apr/20 ]

jocelyn.del-prado, is this something that the Storage Engines team can prioritize sooner rather than later?

Generated at Thu Feb 08 05:09:29 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.