- Type: Improvement
- Resolution: Duplicate
- Priority: Major - P3
- Affects Version/s: None
- Component/s: None
The current heuristic for append-only tables (such as the MongoDB oplog) evicts recently written pages from memory to avoid polluting the cache. As a result, when such a page is read it must first be fetched from disk (or the OS cache) and then decompressed before results can be returned. On a system with a high write load, this adds work to the critical path for getting oplog data to downstream nodes in a replica set.
One solution we discussed in person would be to keep the uncompressed page images for the oplog in cache after they have been written to disk. MongoDB could then periodically inform WiredTiger (WT) of the lowest position that is still "interesting", so that pages older than that position can still be evicted rapidly. "Interesting" can be loosely defined as the position of the furthest-behind replica that is still up, possibly also taking into account the positions of any oplog-tailing cursors created by users.
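To make the proposal concrete, here is a minimal toy model of the retention policy, written in Python purely for illustration. It is not WiredTiger or MongoDB code, and every name in it (`OplogPageCache`, `page_written`, `set_lowest_interesting`, `lowest_interesting`) is hypothetical. It assumes oplog positions are comparable integers and that each cached page is tagged with the position of its last entry.

```python
from collections import OrderedDict


def lowest_interesting(replica_positions, cursor_positions=()):
    """Return the lowest position anyone still needs, or None if nobody is reading.

    replica_positions: positions of the replicas that are still up (hypothetical input).
    cursor_positions: positions of user oplog-tailing cursors (optional).
    """
    candidates = list(replica_positions) + list(cursor_positions)
    return min(candidates) if candidates else None


class OplogPageCache:
    """Toy model: written pages stay in memory until they fall below the low-water mark."""

    def __init__(self):
        # Maps the position of the last entry on a page to its uncompressed image.
        # Pages are appended in increasing position order, so the oldest is first.
        self._pages = OrderedDict()
        self._low_water = 0

    def page_written(self, last_position, image):
        """Retain the uncompressed image after the page has been written to disk."""
        self._pages[last_position] = image

    def set_lowest_interesting(self, position):
        """Advance the low-water mark and evict pages that lie entirely below it."""
        # The mark never moves backwards.
        self._low_water = max(self._low_water, position)
        evicted = []
        while self._pages:
            oldest_last_position = next(iter(self._pages))
            if oldest_last_position >= self._low_water:
                break  # this page may still hold an entry a reader needs
            self._pages.popitem(last=False)
            evicted.append(oldest_last_position)
        return evicted

    def cached_positions(self):
        return list(self._pages)
```

In this sketch a page is kept whenever its last entry is at or above the low-water mark, so a lagging replica or tailing cursor never has to page its next entries back in from disk. How the real position would be represented and communicated is left open by the ticket.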
- depends on: WT-2737 Scrub dirty pages rather than evicting them (Closed)
- duplicates: WT-2764 Optimize checkpoints to reduce throughput disruption (Closed)
- is related to: SERVER-18081 Tailing the oplog requires paging in the recently added entries under WiredTiger (Closed)