[SERVER-36496] Cache pressure issues during oplog replay in initial sync Created: 07/Aug/18  Updated: 02/Oct/23  Resolved: 02/Oct/23

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Bruce Lucas (Inactive) Assignee: Backlog - Replication Team
Resolution: Won't Fix Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Problem/Incident
is caused by WT-4016 Measure and improve lookaside perform... Closed
Related
related to SERVER-33191 Cache-full hangs on 3.6 Closed
related to SERVER-34900 initial sync uses different batch lim... Closed
related to SERVER-34938 Secondary slowdown or hang due to con... Closed
related to SERVER-34942 Stuck with cache full during oplog re... Closed
related to SERVER-36238 replica set startup fails in wt_cache... Closed
Assigned Teams:
Replication
Operating System: ALL
Participants:

 Description   

During oplog replay in initial sync, we don't advance the oldest timestamp. This can pin a large amount of data in the cache. Resulting symptoms include:

  • slow initial sync due to the resulting cache churn
  • growth of the lookaside (cache overflow) table and its corresponding file, WiredTigerLAS.wt
  • in extreme cases, cache-full hangs. These were believed to be fixed by SERVER-33191, but we should stay alert for new occurrences of this issue.
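
To illustrate why a stalled oldest timestamp pins data, here is a minimal toy model in Python. It is only a sketch under simplifying assumptions, not WiredTiger's actual API or eviction logic: the names ToyVersionCache, replay, and advance_every are hypothetical, and the model assumes a version may be discarded only when a newer version exists at or before the oldest timestamp.

```python
from collections import defaultdict


class ToyVersionCache:
    """Toy timestamped multi-version store (hypothetical; not WiredTiger's API)."""

    def __init__(self):
        self.oldest_timestamp = 0
        # key -> list of (timestamp, value) in increasing timestamp order
        self.versions = defaultdict(list)

    def apply(self, key, value, timestamp):
        """Record a new version of `key` written at `timestamp`."""
        self.versions[key].append((timestamp, value))

    def set_oldest_timestamp(self, timestamp):
        """Advance the oldest timestamp and drop versions no reader can need."""
        self.oldest_timestamp = max(self.oldest_timestamp, timestamp)
        for key in list(self.versions):
            chain = self.versions[key]
            # Keep the newest version at or before the oldest timestamp,
            # plus every version after it. If no version qualifies, keep all.
            keep_from = 0
            for i, (ts, _) in enumerate(chain):
                if ts <= self.oldest_timestamp:
                    keep_from = i
            self.versions[key] = chain[keep_from:]

    def pinned_versions(self):
        """Total number of versions that must stay resident."""
        return sum(len(chain) for chain in self.versions.values())


def replay(num_ops, num_keys, advance_every=None):
    """Replay `num_ops` writes; optionally advance the oldest timestamp."""
    cache = ToyVersionCache()
    for ts in range(1, num_ops + 1):
        cache.apply(key=ts % num_keys, value=ts, timestamp=ts)
        if advance_every and ts % advance_every == 0:
            cache.set_oldest_timestamp(ts)
    return cache.pinned_versions()


if __name__ == "__main__":
    print("never advanced:", replay(1000, 10))
    print("advanced every 100 ops:", replay(1000, 10, advance_every=100))
```

In this model, replaying 1000 writes over 10 keys without ever advancing the oldest timestamp leaves all 1000 versions pinned, while advancing it every 100 operations leaves only 10. In a real system, the pinned versions would have to stay in cache or spill to the lookaside table.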


 Comments   
Comment by Lingzhi Deng [ 02/Oct/23 ]

We think this problem goes away with the WT history store.
