[SERVER-81803] Understanding cache eviction failures Created: 19/Sep/23 Updated: 03/Oct/23 Resolved: 03/Oct/23 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Question | Priority: | Major - P3 |
| Reporter: | Yuvaraj Anbarasan | Assignee: | Unassigned |
| Resolution: | Done | Votes: | 2 |
| Labels: | external-user | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Attachments: |
|
| Participants: |
| Description |
|
For some time now, our production sharded cluster has had an issue where the WiredTiger cache on the primary node of each shard uses application threads to evict pages, resulting in read and write failures. This happens when the update bytes exceed 10% of the cache size. We have already tried increasing the eviction thread count to 20 and the cache size to 70%, but did not notice any improvement in cache eviction.
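The 10% condition described above can be checked from the cache counters. The following is a minimal sketch, assuming the counters come from the `wiredTiger.cache` section of `db.serverStatus()` and that the stat names used here ("maximum bytes configured", "bytes allocated for updates", "tracked dirty bytes in the cache") match your server version. The function name `cache_pressure` and the sample numbers are hypothetical and are not taken from this ticket.

```python
def cache_pressure(cache_stats, updates_trigger_pct=10.0):
    """Summarise update/dirty bytes as a percentage of the configured cache.

    cache_stats: dict shaped like serverStatus()["wiredTiger"]["cache"]
                 (stat names are assumptions; verify them on your version).
    updates_trigger_pct: threshold quoted in the ticket (10% of cache size).
    """
    max_bytes = cache_stats["maximum bytes configured"]
    update_bytes = cache_stats["bytes allocated for updates"]
    dirty_bytes = cache_stats["tracked dirty bytes in the cache"]

    updates_pct = 100.0 * update_bytes / max_bytes
    dirty_pct = 100.0 * dirty_bytes / max_bytes
    return {
        "updates_pct": updates_pct,
        "dirty_pct": dirty_pct,
        # True when update bytes are at or above the threshold, i.e. the
        # point at which the ticket reports application-thread eviction.
        "over_updates_trigger": updates_pct >= updates_trigger_pct,
    }


# Hypothetical sample values, chosen only to make the arithmetic easy to follow.
sample = {
    "maximum bytes configured": 1000,
    "bytes allocated for updates": 120,
    "tracked dirty bytes in the cache": 50,
}
result = cache_pressure(sample)
print(result)
```

Sampling this periodically on each primary would show whether update bytes stay near the 2.5% to 3% range seen on the recovered node or climb back toward the threshold.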
Recently one of our nodes recovered from this issue on its own, and since then the update bytes percentage of that node has stayed under the limit (consistently between 2.5% and 3%). The following are the observations that we made after the auto recovery,
We also restarted the mongod process on a few of the nodes, after which the update bytes on those nodes are within the limit. It would be great to have your suggestions or views on this issue, which would help us understand it better and fix it.
Thank you |
| Comments |
| Comment by Eric Sedor [ 03/Oct/23 ] |
|
In general for this issue we'd like to encourage you to start by asking our community for help by posting on the MongoDB Developer Community Forums. Briefly I can suggest that I think the configuration options discussed in If the discussion there leads you to suspect a bug in the MongoDB server, then we'd want to investigate it as a possible bug here in the SERVER project. We'll be happy to leverage the WT team for such an issue. I'll close this ticket for now but we can reopen it if there is a bug. Note that the contents of the diagnostic.data directory of your dbPath will be critical to do so. Sincerely, |
| Comment by Yuvaraj Anbarasan [ 21/Sep/23 ] |
|
Hi vamsi.krishna@mongodb.com, I think it won't be possible to share the FTDC file. We're still checking with our infra team regarding this. I have attached some of the metrics in the thread. If any other metrics are required, we can share them. |
| Comment by Vamsi Boyapati [ 20/Sep/23 ] |
|
Could you attach the FTDC data to this ticket? |
| Comment by Yuvaraj Anbarasan [ 20/Sep/23 ] |