  1. WiredTiger
  2. WT-12798

Checkpoint time is high when the cache size is large

    • Type: Improvement
    • Resolution: Unresolved
    • Priority: Major - P3
    • None
    • Affects Version/s: None
    • Component/s: None
    • Storage Engines
    • StorEng - Defined Pipeline

      MongoDB runs a checkpoint every minute. Ideally, a checkpoint on a healthy database should finish within that minute. However, we have seen cases where a customer with a large cache experiences long checkpoint times even with a modest percentage of dirty content. For example, if the cache size is 300GB and the dirty fill ratio is 5%, the checkpoint needs to write 15GB of data. This may make the checkpoint take longer and cause other problems in the system.
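      The arithmetic in the example above can be sketched as follows. This is a rough estimate only: it assumes the checkpoint has to write all dirty content, ignores compression and concurrent eviction, and the function name and the 60-second interval parameter are illustrative, not part of WiredTiger.

```python
def checkpoint_write_estimate(cache_gb, dirty_percent, interval_s=60):
    """Estimate dirty data volume and the write rate needed to
    finish a checkpoint within one checkpoint interval.

    Assumption: every dirty byte in cache is written by the checkpoint.
    """
    dirty_gb = cache_gb * dirty_percent / 100
    required_gb_per_s = dirty_gb / interval_s
    return dirty_gb, required_gb_per_s


# The case from the ticket: 300GB cache, 5% dirty.
dirty_gb, rate = checkpoint_write_estimate(300, 5)
print(f"{dirty_gb:.0f}GB dirty, {rate:.2f}GB/s sustained to finish in 60s")
```

      Under these assumptions, the 300GB case needs a sustained write rate of about 0.25GB/s just to complete within the one-minute interval.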

      WiredTiger has several mechanisms to help reduce the work checkpoint needs to do:

      • Checkpoint scrub eviction: the checkpoint waits for eviction to reduce the amount of dirty content below the dirty target.
      • Application threads may help evict dirty data if the dirty fill ratio exceeds the dirty trigger level.

      However, none of the above helps in this case because the dirty fill ratio is still low and these mechanisms are not activated.
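      A simplified model of why neither mechanism engages is sketched below. It only encodes the threshold comparisons described above; the function name and the threshold values in the example are assumptions for illustration and do not reflect WiredTiger's actual eviction logic.

```python
def active_mechanisms(dirty_percent, dirty_target, dirty_trigger):
    """Return which dirty-content mechanisms would engage in this
    simplified threshold model (all arguments are percentages)."""
    active = []
    if dirty_percent > dirty_target:
        # Scrub eviction only has work to do above the dirty target.
        active.append("checkpoint scrub eviction")
    if dirty_percent > dirty_trigger:
        # Application threads are only recruited above the dirty trigger.
        active.append("application-thread eviction")
    return active


# Illustrative thresholds: target 5%, trigger 20% (assumed values).
print(active_mechanisms(5, dirty_target=5, dirty_trigger=20))
print(active_mechanisms(25, dirty_target=5, dirty_trigger=20))
```

      In this model a cache sitting at or below the target activates nothing, even though on a 300GB cache that still leaves many gigabytes for the checkpoint to write.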

      We can adjust the dirty target and dirty trigger settings to activate these mechanisms, but doing so may cause side effects, such as increased latency when application threads are pulled into eviction.
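      As a sketch of what such an adjustment might look like, the helper below builds a WiredTiger configuration string using the `eviction_dirty_target` and `eviction_dirty_trigger` keys. The helper name and the percentages used are hypothetical and are not recommendations; suitable values depend on the workload.

```python
def eviction_dirty_config(target_percent, trigger_percent):
    """Build a WiredTiger config fragment for the dirty thresholds.

    Sanity check (an assumption of this sketch): the target must be
    below the trigger, and both must be percentages between 1 and 99.
    """
    if not 1 <= target_percent < trigger_percent <= 99:
        raise ValueError("expected 1 <= target < trigger <= 99")
    return (
        f"eviction_dirty_target={target_percent},"
        f"eviction_dirty_trigger={trigger_percent}"
    )


# Hypothetical values chosen only to show the format.
print(eviction_dirty_config(2, 10))
```

      A string of this form could be supplied when opening or reconfiguring the WiredTiger connection, or in MongoDB through the `wiredTigerEngineRuntimeConfig` server parameter. Lowering the thresholds starts eviction of dirty data earlier, at the cost of the latency risk noted above.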

      We need to investigate optimizations and settings that reduce checkpoint time on databases with a large cache.

            Assignee:
            backlog-server-storage-engines [DO NOT USE] Backlog - Storage Engines Team
            Reporter:
            chenhao.qu@mongodb.com Chenhao Qu
            Votes:
            0
            Watchers:
            8

              Created:
              Updated: