Loading...

XML

Word

Printable

JSON

Type: Question
Resolution: Done
Priority: Major - P3
Fix Version/s: None
Affects Version/s: None
Component/s: None
Labels:
- external-user

Confidence Status:
None
Work Order:
3

Aha! Reference:
None
Tracking Level:
None
Risk Status:
None
Exec Notes:
None
Goal Name(s):
None
Goal Link:
None

Our production sharded cluster is facing issues for a while now where wiredtiger cache of primary nodes of each shard are using application threads to evict pages from cache, resulting in read and write failures. This happens when the update bytes breaches 10% of the cache size. We already tried increasing the eviction thread count to 20 and cache size to 70% but didn’t notice any improvements in cache evictions.

Recently one of our nodes auto recovered from this issue and since then the update bytes percentage of that node is under the limit (constantly within 2.5% - 3%). The following are the observations that we made after the auto recovery,

Decrease in checkpoint blocked page evictions.

Decrease in cache eviction gave up due to detecting an out of order tombstone ahead of the selected on disk update.
Change in eviction walk strategy - before the recovery the strategy was only dirty pages. Currently it’s only clean pages.
Decrease in eviction failures due to failure during reconciliation.
Increase in pages read from disk to cache

We also restarted the mongo process on few of the nodes after which we see the update bytes are within the limit. It would be great to have your suggestions or views regarding this issue which would help us to understand the issue better and fix it.

Thank you

- - Sort By Name
  - Sort By Date
  - Ascending
  - Descending
  - Thumbnails
  - List
  - Download All

cache_eviction_gave_up_due_to_detecting_an_out_of_order_tombstone_ahead_of_the_selected_on_disk_update.png
239 kB
Sep 19 2023 05:42:46 PM UTC
checkpoint_blocked_page_evictions.png
267 kB
Sep 19 2023 05:42:46 PM UTC
eviciton_strategy_dirty_pages.png
157 kB
Sep 19 2023 05:42:46 PM UTC
eviction-strategy_clean_pages.png
152 kB
Sep 19 2023 05:42:46 PM UTC
in_memory_page_splits.png
185 kB
Sep 19 2023 05:42:46 PM UTC
update_bytes_percentage.png
149 kB
Sep 19 2023 05:42:46 PM UTC

Assignee:: Unassigned
Reporter:: Yuvaraj Anbarasan
Participants:: Eric Sedor, Vamsi Boyapati, Yuvaraj Anbarasan
Votes:: 2 Vote for this issue
Watchers:: 9 Start watching this issue

Created:: Sep 19 2023 05:41:52 PM UTC
Updated:: Oct 03 2023 03:30:38 PM UTC
Resolved:: Oct 03 2023 03:28:47 PM UTC

Details

Description

Attachments

Attachments

Activity

People

Dates