[SERVER-32547] Mongo stalls while evicting dirty cache Created: 04/Jan/18  Updated: 09/Feb/18  Resolved: 18/Jan/18

Status: Closed
Project: Core Server
Component/s: WiredTiger
Affects Version/s: 3.6.1
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Bob Potter Assignee: Kelsey Schubert
Resolution: Duplicate Votes: 0
Labels: RF
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: PNG File SERVER-32547_oplog_trunc.png     PNG File Screen Shot 2018-01-15 at 5.18.20 pm.png     PNG File lookaside.png     PNG File question.png     PNG File t2-initial-look.png    
Issue Links:
Duplicate
duplicates SERVER-32139 Oplog truncation creates large amount... Closed
Related
related to WT-3766 Lookaside sweep for obsolete updates Closed
Operating System: ALL
Participants:

 Description   

We are having an issue where mongo becomes unresponsive for minutes at a time. During these periods, it is evicting dirty data from the cache.

The pattern we are seeing is that performance is generally acceptable during normal operation. However, we have a large amount of data we want to delete from the cluster, and starting that deletion process triggers the stalling behavior. Once the stalling starts, it seems to recur even after we've stopped the deletions. Restarting the server processes makes the problem go away until we start the deletions again.

I'm uncertain if this is related, but we previously had an issue (SERVER-31141) with stalls when the disk cache reached the 95% threshold. WT-3079 appeared to fix that issue.

I have logs and diagnostic data I can add to the upload portal.

Let me know if any other information would be helpful.



 Comments   
Comment by Kelsey Schubert [ 18/Jan/18 ]

Hi bpot,

Thanks for uploading the diagnostic.data; it was invaluable in diagnosing this issue. The issue you are encountering is described in SERVER-32139 and will be resolved when WT-3805 is completed. Please feel free to review these tickets for additional information and watch them for updates.

Kind regards,
Kelsey

Comment by Bob Potter [ 04/Jan/18 ]

Data is uploaded.

Comment by Kelsey Schubert [ 04/Jan/18 ]

Hi bpot,

Thank you for opening a new ticket describing this issue. I've created a secure upload portal for you to provide the files.

Thanks again,
Kelsey
