[SERVER-34941] Add testing to cover cases where timestamps cause cache pressure
Created: 10/May/18  Updated: 29/Oct/23  Resolved: 27/Jul/18

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: 3.6.4
Fix Version/s: 3.6.7, 4.0.1, 4.1.2

Type: Bug
Priority: Major - P3
Reporter: Bruce Lucas (Inactive)
Assignee: Benety Goh
Resolution: Fixed
Votes: 0
Labels: SWKB, nyc
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: PNG File stuck-recovery.png    
Issue Links:
Backports
Depends
Duplicate
is duplicated by SERVER-35191 Stuck with cache full during rollback Closed
is duplicated by SERVER-35339 Complete recovery failure after uncle... Closed
Problem/Incident
Related
related to SERVER-33191 Cache-full hangs on 3.6 Closed
related to SERVER-35191 Stuck with cache full during rollback Closed
related to SERVER-36238 replica set startup fails in wt_cache... Closed
related to SERVER-51430 update log message search in recovery... Closed
is related to SERVER-34938 Secondary slowdown or hang due to con... Closed
is related to SERVER-36495 Cache pressure issues during recovery... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v4.0, v3.6
Sprint: Storage NYC 2018-07-30
Participants:
Case:

 Description   

During oplog application in recovery, we don't advance the oldest timestamp. This can pin a large amount of data in the cache, and the node can get stuck with the cache full.
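
The mechanism can be illustrated with a toy model in Python. This is only a sketch and is not WiredTiger or server code: the class `ToyCache`, its methods, the `replay` helper, and the capacity of 10 versions are all hypothetical. It assumes that, for each key, the cache must keep the newest version at or before the oldest timestamp plus every newer version, so nothing can be discarded while the oldest timestamp stays put.

```python
from collections import defaultdict


class ToyCache:
    """Toy cache that must retain update history newer than `oldest`."""

    def __init__(self, capacity):
        self.capacity = capacity            # max versions held (hypothetical unit)
        self.oldest = 0                     # stand-in for the oldest timestamp
        self.versions = defaultdict(list)   # key -> [(timestamp, value)], ascending

    def size(self):
        return sum(len(history) for history in self.versions.values())

    def set_oldest_timestamp(self, ts):
        # In this model the oldest timestamp only moves forward.
        self.oldest = max(self.oldest, ts)

    def evict(self):
        # Per key, keep the newest version at or before `oldest` and
        # everything newer; older history can be discarded.
        for history in self.versions.values():
            visible = [i for i, (ts, _) in enumerate(history) if ts <= self.oldest]
            if visible:
                del history[:visible[-1]]

    def apply_update(self, key, ts, value):
        self.versions[key].append((ts, value))
        if self.size() > self.capacity:
            self.evict()
        # True means eviction could not get back under capacity ("stuck").
        return self.size() > self.capacity


def replay(ops, capacity, advance_oldest):
    """Apply ops in timestamp order; return the timestamp where the cache
    gets stuck, or None if replay completes."""
    cache = ToyCache(capacity)
    for ts, (key, value) in enumerate(ops, start=1):
        if cache.apply_update(key, ts, value):
            return ts
        if advance_oldest:
            cache.set_oldest_timestamp(ts)
    return None


ops = [("doc", i) for i in range(100)]
print(replay(ops, capacity=10, advance_oldest=False))  # 11
print(replay(ops, capacity=10, advance_oldest=True))   # None
```

With the oldest timestamp left at its initial value, every replayed update stays pinned and the toy cache fills on the 11th operation; advancing it after each operation lets old history be trimmed and replay completes.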



 Comments   
Comment by Benety Goh [ 27/Jul/18 ]

The reproduction script shows that the stuck cache issue was resolved between 3.6.5 and 3.6.6. There is a performance regression reported in SERVER-36221 that we will continue to investigate.

Comment by Githook User [ 20/Jul/18 ]

Author: Benety Goh (benety) <benety@mongodb.com>

Message: SERVER-34941 add test to fill wiredtiger cache during recovery oplog application

(cherry picked from commit 6b0147211e26239fb15e06fa5555bd3f701d8669)
Branch: v3.6
https://github.com/mongodb/mongo/commit/20a3ee6ac2b728bba70c7383e7f2dbb6c429f565

Comment by Githook User [ 20/Jul/18 ]

Author: Benety Goh (benety) <benety@mongodb.com>

Message: SERVER-34941 use 4.0 storage recovery logging component in recovery_wt_cache_full.js

(cherry picked from commit 472d4ecaf989b239e324ef12b39357802d96f607)
Branch: v4.0
https://github.com/mongodb/mongo/commit/e1092255d6168d90639f3f170537e42d09ab6c2e

Comment by Githook User [ 20/Jul/18 ]

Author: Benety Goh (benety) <benety@mongodb.com>

Message: SERVER-34941 remove test logic to update min valid. this is not required as of 4.0

(cherry picked from commit 461184c1467fb6c130638b27bf1d71962c7e830b)
Branch: v4.0
https://github.com/mongodb/mongo/commit/e3119769fc8130428c035a1d8bf68e67282e8e8a

Comment by Githook User [ 20/Jul/18 ]

Author: Benety Goh (benety) <benety@mongodb.com>

Message: SERVER-34941 add test to fill wiredtiger cache during recovery oplog application

(cherry picked from commit 6b0147211e26239fb15e06fa5555bd3f701d8669)
Branch: v4.0
https://github.com/mongodb/mongo/commit/5c17751d900c7ebc6cdb4eefbd2e1797555baf43

Comment by Githook User [ 20/Jul/18 ]

Author: Benety Goh (benety) <benety@mongodb.com>

Message: SERVER-34941 use 4.0 storage recovery logging component in recovery_wt_cache_full.js
Branch: master
https://github.com/mongodb/mongo/commit/472d4ecaf989b239e324ef12b39357802d96f607

Comment by Githook User [ 19/Jul/18 ]

Author: Benety Goh (benety) <benety@mongodb.com>

Message: SERVER-34941 remove test logic to update min valid. this is not required as of 4.0
Branch: master
https://github.com/mongodb/mongo/commit/461184c1467fb6c130638b27bf1d71962c7e830b

Comment by Githook User [ 19/Jul/18 ]

Author: Benety Goh (benety) <benety@mongodb.com>

Message: SERVER-34941 add test to fill wiredtiger cache during recovery oplog application
Branch: master
https://github.com/mongodb/mongo/commit/6b0147211e26239fb15e06fa5555bd3f701d8669

Comment by Alexander Gorrod [ 18/May/18 ]

bruce.lucas Do you have a workload we can use to reproduce this issue? Failing that, could you attach diagnostic data from a scenario where the symptom occurred?

Comment by Gregory McKeon (Inactive) [ 15/May/18 ]

Sending this to storage, since they're more familiar with the oldest timestamp semantics on 3.6. We're happy to assist if there's repl work here.

Generated at Thu Feb 08 04:38:20 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.