WiredTiger / WT-10892

Session configuration to allow operations to skip cache eviction

    • Nick - 2024-04-30

      Not all operations should participate in cache eviction. Having application threads perform eviction is WiredTiger's throttling mechanism, but it may prevent MongoDB from freeing up resources on our end.

      We have a special bypass in our throttling mechanism (aka tickets) for important operations that would otherwise stall the system indefinitely, and I believe we need to make similar exceptions in WiredTiger.

      We have many tasks in MongoDB that perform storage engine reads and writes whose delay makes the system behave worse under load. The most notable case is committing and aborting multi-document MongoDB transactions. Performing eviction after committing or aborting prevents us from freeing up resources in MongoDB. We do not take tickets for these operations, so they also bypass MongoDB's write throttling. In the worst case, when the system is under load, we can have hundreds of application threads trying to perform eviction without making any progress.

      To a lesser extent, the same applies to the journal flusher. It performs storage engine reads and can delay write acknowledgment if it falls behind. There are many other internal operations important for availability that, if caught in eviction, can make the system behave much worse than if we only blocked application threads.

      My proposal is that we have a configuration option for a WT_SESSION to skip eviction when the eviction is due to dirty data or updates, but not clean data. It still makes sense for an operation to evict a clean page in order to make room for a page that it wants to read.
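
      The proposed behavior can be sketched as a small decision function. This is only a model of the semantics described above, not WiredTiger code: the names session_cfg_t, skip_dirty_update_eviction, evict_reason_t and session_should_evict are all hypothetical and do not correspond to an existing WiredTiger option or API.

```c
#include <stdbool.h>

/* Hypothetical: why the cache is asking an application thread to help evict. */
typedef enum {
    EVICT_REASON_CLEAN,  /* room is needed and clean pages can be discarded */
    EVICT_REASON_DIRTY,  /* dirty data exceeds its threshold */
    EVICT_REASON_UPDATES /* in-memory updates exceed their threshold */
} evict_reason_t;

/* Hypothetical per-session setting; not an existing WiredTiger option. */
typedef struct {
    bool skip_dirty_update_eviction;
} session_cfg_t;

/*
 * Decide whether a session's thread should take part in eviction.
 * Clean-page eviction is always allowed, since the operation may need the
 * room to read its own page. Eviction driven by dirty data or updates is
 * skipped only when the session has opted out.
 */
static bool
session_should_evict(const session_cfg_t *cfg, evict_reason_t reason)
{
    if (reason == EVICT_REASON_CLEAN)
        return true;
    return !cfg->skip_dirty_update_eviction;
}
```

      Under this sketch, a session used for committing or aborting a multi-document transaction would set the flag and would only ever be asked to evict clean pages, while ordinary application sessions would leave it unset and continue to be throttled as they are today.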

            Assignee:
            backlog-server-storage-engines [DO NOT USE] Backlog - Storage Engines Team
            Reporter:
            louis.williams@mongodb.com Louis Williams
            Votes:
            0
            Watchers:
            19