WiredTiger / WT-7190

Limit eviction of non-history store pages when checkpoint is operating on history store

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major - P3
    • Fix Version/s: WT10.0.1, 4.4.7, 5.0.0-rc0
    • Affects Version/s: None
    • Component/s: None
    • 8
    • Sprint: Storage - Ra 2021-03-22, Storage - Ra 2021-04-05, Storage - Ra 2021-04-19, Storage - Ra 2021-05-03

      The attached test is a heavy update workload in a PSA replica set (emrc:t, S node down), which stresses the history store. It is configured with a 2 GB cache, which would be typical of a machine with ~4 GB of memory.

      • Resident memory usage rises, reaching ~4 GB, at which point I abruptly terminated mongod at B to simulate the OOM crash that would occur (it did not hit actual OOM because the test was run on a machine with >>4 GB of memory).
      • During recovery oplog application after B ("opcountersRepl update") we see a similar (actually somewhat worse) pattern, and would again hit OOM and crash around C, before recovery completes.
      • The increase in resident memory is due to accumulation of pageheap_free_bytes, indicative of memory fragmentation.
      • The step-function increases in fragmentation occur during checkpoints, when we allow dirty cache content to rise above 140% of the configured cache size; a way to observe this directly from WiredTiger statistics is sketched after this list.
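
      As a rough way to watch for that overshoot outside of mongod, WiredTiger exposes the tracked dirty byte count through a statistics cursor. The sketch below is my own illustration, not part of the attached repro; the WT_HOME path is a placeholder and the 2 GB figure just matches the test's cache size:

          /*
           * Sketch: poll WiredTiger's tracked dirty bytes against a 2 GB cache.
           * Illustration only; the attached repro drives this through mongod.
           */
          #include <inttypes.h>
          #include <stdio.h>
          #include <wiredtiger.h>

          int
          main(void)
          {
              WT_CONNECTION *conn;
              WT_CURSOR *stat;
              WT_SESSION *session;
              const char *desc, *pvalue;
              int64_t dirty;

              /* Statistics must be enabled for the statistics cursor to work. */
              if (wiredtiger_open("WT_HOME", NULL,
                    "create,cache_size=2GB,statistics=(all)", &conn) != 0)
                  return (1);
              if (conn->open_session(conn, NULL, NULL, &session) != 0)
                  return (1);
              if (session->open_cursor(session, "statistics:", NULL, NULL, &stat) != 0)
                  return (1);

              /* "cache: tracked dirty bytes in the cache" */
              stat->set_key(stat, WT_STAT_CONN_CACHE_BYTES_DIRTY);
              if (stat->search(stat) == 0 &&
                  stat->get_value(stat, &desc, &pvalue, &dirty) == 0)
                  printf("%s: %" PRId64 " (%.1f%% of cache)\n", desc, dirty,
                      100.0 * dirty / (2.0 * 1024 * 1024 * 1024));

              (void)stat->close(stat);
              return (conn->close(conn, NULL) == 0 ? 0 : 1);
          }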

      Fragmentation can occur when a large amount of memory is allocated in small regions, such as the update structures associated with dirty content, and is then freed but cannot be reused for large structures such as pages read from disk. We put mechanisms in place to limit this fragmentation by capping dirty cache at 20% and update structures at 10% of cache, and I suspect that by allowing dirty cache content to greatly exceed these limits we are creating excessive memory fragmentation.
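
      For reference, those caps map onto WiredTiger's eviction configuration knobs (my assumption about which mechanisms are meant); the minimal sketch below spells them out explicitly at wiredtiger_open time, using the test's 2 GB cache and the stated 20% / 10% limits, which are also the defaults:

          /*
           * Sketch: the dirty-cache and update-structure limits described above,
           * expressed as WiredTiger eviction triggers (percentages of cache_size).
           */
          #include <wiredtiger.h>

          int
          open_capped_cache(WT_CONNECTION **connp)
          {
              return (wiredtiger_open("WT_HOME", NULL,
                  "create,"
                  "cache_size=2GB,"              /* cache size used by the test */
                  "eviction_dirty_trigger=20,"   /* dirty content: 20% of cache */
                  "eviction_updates_trigger=10", /* update structures: 10% of cache */
                  connp));
          }

      If that mapping is right, letting dirty content climb past 140% of the cache during checkpoints is far beyond eviction_dirty_trigger, which is exactly the overshoot the graphs show.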

      Note that WT-6924 was put in place to eliminate the very large spikes of dirty content (many times the cache size) we were seeing previously, but I'm not sure why we are still allowing dirty content to entirely fill the cache (and more) rather than limiting it to 20% as we normally do.

      Attachments:
        1. repro1.js (1 kB)
        2. repro.sh (2 kB)
        3. image-2021-04-07-15-49-57-323.png (209 kB)
        4. image-2021-04-07-15-49-30-344.png (118 kB)
        5. image-2021-04-06-15-06-04-825.png (138 kB)
        6. image-2021-03-31-14-29-38-862.png (443 kB)
        7. image-2021-03-09-13-09-29-379.png (415 kB)
        8. image-2021-03-09-13-09-22-280.png (415 kB)
        9. fragmentation.png (192 kB)

            Assignee:
            haseeb.bokhari@mongodb.com Haseeb Bokhari (Inactive)
            Reporter:
            bruce.lucas@mongodb.com Bruce Lucas (Inactive)
            Votes:
            0
            Watchers:
            6
