WiredTiger uses several thresholds to manage its cache. In particular, with default settings, all application operations are throttled once dirty content reaches 20% of the cache.
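To make the threshold concrete, here is a minimal sketch of the check in Python. It is not WiredTiger's code: the function and constant names are hypothetical, and the only value taken from the description above is the 20% default.

```python
# Hypothetical model of the dirty-cache threshold described above.
# Only the 20% default comes from the ticket text; the names are invented.
DIRTY_TRIGGER_PCT = 20


def over_dirty_trigger(dirty_bytes: int, cache_bytes: int,
                       trigger_pct: int = DIRTY_TRIGGER_PCT) -> bool:
    """Return True when dirty content is at or above trigger_pct of the cache."""
    if cache_bytes <= 0:
        raise ValueError("cache_bytes must be positive")
    if dirty_bytes < 0:
        raise ValueError("dirty_bytes must be non-negative")
    # Integer arithmetic avoids floating-point rounding at the boundary.
    return dirty_bytes * 100 >= trigger_pct * cache_bytes


GIB = 1 << 30
# With a 10 GiB cache, the threshold sits at 2 GiB of dirty content.
print(over_dirty_trigger(1 * GIB, 10 * GIB))
print(over_dirty_trigger(2 * GIB, 10 * GIB))
```

Under this model, every application operation is throttled once the function returns True, whether or not the operation itself creates dirty content.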
However, this behavior combines with MongoDB's replication machinery to create a vicious cycle where heavy update workloads generate a lot of cache pressure on the primary. Secondaries can only apply the oplog as fast as they can read it from the primary, so some replication lag is common during heavy write workloads.
With readConcern majority always on in MongoDB 3.6, replication lag generates further cache pressure on the primary, which must maintain history for majority reads. This, in turn, can slow down the secondaries' reads of the oplog when the primary is overwhelmed by updates.
Further, once lookaside eviction is required, pages can be evicted from cache and then read back with their history, which leaves them marked dirty. This adds to cache pressure on primaries, and in particular increases the amount of dirty content in cache.
Investigate throttling only update operations when the dirty cache limit is reached, while allowing reads to proceed. Further, investigate the situations that cause oplog reads to block, and try adjusting the behavior so that oplog reads are favored and keep making progress.
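The proposed policy can be sketched as a small decision function in Python. This only illustrates the idea under investigation, not an existing WiredTiger interface: the `OpKind` enum, the `must_throttle` function, and the `updates_only` flag are all hypothetical names.

```python
from enum import Enum


class OpKind(Enum):
    """Hypothetical classification of application operations."""
    READ = "read"
    OPLOG_READ = "oplog_read"
    UPDATE = "update"


def must_throttle(op: OpKind, dirty_bytes: int, cache_bytes: int,
                  trigger_pct: int = 20, *, updates_only: bool = True) -> bool:
    """Decide whether an operation is throttled at the dirty cache limit.

    updates_only=False models the behavior described in the ticket, where
    all operations are throttled; updates_only=True models the proposal,
    where only updates are throttled and reads proceed.
    """
    over_limit = dirty_bytes * 100 >= trigger_pct * cache_bytes
    if not over_limit:
        return False
    if not updates_only:
        return True
    return op is OpKind.UPDATE


# At 25% dirty, the proposal lets oplog reads proceed and throttles updates.
print(must_throttle(OpKind.OPLOG_READ, 25, 100))
print(must_throttle(OpKind.UPDATE, 25, 100))
```

The reasoning behind the split is that reads do not add dirty content, so stalling them does not relieve dirty-cache pressure. Stalling oplog reads in particular can increase replication lag, which feeds the cycle described above.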
WT-5135 Change lookaside file inserts to use cursor.insert
- related to
WT-3773 Reconciling a page with modify records and lookaside can choose wrong values
- links to