Ignore eviction target/trigger in follower mode


    • Storage Engines, Storage Engines - Transactions
    • SE Transactions - 2025-10-10
    • 3

      During internal testing of disaggregated storage, we have hit several issues where a standby node running in follower mode has stalled due to cache pressure.

      The typical scenario is that oplog application inserts/updates records that land in the ingest tables on the follower node. Because the follower can't write to shared storage, this dirty data has to remain in cache until it can either be pruned (after picking up a new checkpoint) or until it can be written into the shared table (after stepping up to become the primary/leader).

      But we currently use the same cache eviction targets and triggers in follower mode as in leader mode, even though we can't evict dirty data from a follower. This means that inserting records equal to 10% of the cache (the default update trigger) will cause a follower to stall: all of the oplog applier threads get pulled in to help with eviction, but since nothing can actually be evicted they get stuck, and the node effectively hangs.
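      For reference, these thresholds are all exposed through the wiredtiger_open cache configuration. Below is a minimal sketch of an open call that spells them out explicitly; the percentages are illustrative (only the updates trigger of 10% reflects the default mentioned above), and the home path and cache size are placeholders.

      #include <stdlib.h>
      #include <wiredtiger.h>

      int
      main(void)
      {
          WT_CONNECTION *conn;
          /* Eviction thresholds, each expressed as a percentage of the cache. */
          const char *config =
              "create,cache_size=1GB,"
              "eviction_target=80,eviction_trigger=95,"
              "eviction_dirty_target=5,eviction_dirty_trigger=20,"
              "eviction_updates_target=2,eviction_updates_trigger=10";

          if (wiredtiger_open("WT_HOME", NULL, config, &conn) != 0)
              return (EXIT_FAILURE);

          /* ... drive oplog-style inserts/updates to build up dirty content ... */

          return (conn->close(conn, NULL) == 0 ? EXIT_SUCCESS : EXIT_FAILURE);
      }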

      This ticket is intended as a short-term fix to enable more standby testing. For now, we should not try to evict dirty or update content on a follower node. That risks filling (or overfilling) the cache with dirty data, so we should also add a failure mode where we panic if the cache is full of dirty data; better to have a clear failure with a clear cause than to have the system mysteriously hang.

      Definition of done:

      • Application threads are not used to help with dirty or update eviction when WiredTiger is in follower mode.
        • This could be implemented by dynamically adjusting the trigger/target values when the system switches between follower and leader modes, or by changing the checks for using application threads for eviction, or something else (see the sketch after this list).
      • There are no changes to clean eviction behavior. WT should still evict clean pages if the cache is full.
      • If the cache has a large amount of dirty or update content (95%?) WT should log a clear message about the problem and panic.
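
      Below is a standalone sketch of the second approach above (changing the check that drafts application threads into dirty/update eviction), combined with the panic on runaway dirty content. The struct, the function names, and the 10% / 95% constants are invented for illustration only and are not WiredTiger's actual internals.

      #include <inttypes.h>
      #include <stdbool.h>
      #include <stdint.h>
      #include <stdio.h>
      #include <stdlib.h>

      /* The "95%?" threshold from the definition of done above. */
      #define DIRTY_PANIC_PCT 95

      struct cache_state {
          uint64_t cache_bytes; /* Configured cache size. */
          uint64_t dirty_bytes; /* Dirty plus update content pinned in cache. */
          bool is_follower;     /* True while running in follower mode. */
      };

      /*
       * Decide whether an application (oplog applier) thread should help with
       * dirty/update eviction. In follower mode the answer is always no: that
       * content can't be written to shared storage, so the thread would stall.
       * Clean eviction is untouched and handled elsewhere.
       */
      static bool
      app_thread_should_evict_dirty(const struct cache_state *c)
      {
          if (c->is_follower)
              return (false);
          /* Leader mode: unchanged; 10 stands in for the update trigger. */
          return (c->dirty_bytes * 100 >= c->cache_bytes * 10);
      }

      /*
       * Fail loudly if dirty/update content overruns a follower's cache:
       * a clear panic with a clear cause beats a mysterious hang.
       */
      static void
      follower_dirty_panic_check(const struct cache_state *c)
      {
          if (!c->is_follower)
              return;
          if (c->dirty_bytes * 100 >= c->cache_bytes * DIRTY_PANIC_PCT) {
              fprintf(stderr,
                  "follower cache overrun: %" PRIu64 " of %" PRIu64
                  " bytes are dirty/update content and cannot be evicted\n",
                  c->dirty_bytes, c->cache_bytes);
              abort();
          }
      }

      int
      main(void)
      {
          struct cache_state c = {
              .cache_bytes = 1024 * 1024, .dirty_bytes = 200 * 1024, .is_follower = true};

          printf("draft app thread for dirty eviction? %s\n",
              app_thread_should_evict_dirty(&c) ? "yes" : "no");
          follower_dirty_panic_check(&c); /* No panic at roughly 20% dirty. */
          return (EXIT_SUCCESS);
      }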

      Note: the long-term fix here isn't obvious (there are a variety of ways we could relieve pressure on the follower, but they also have downstream consequences for things like failover time and the efficiency of ingest draining during checkpoint pickup). Hence this ticket to allow more testing while we consider a more holistic solution.

            Assignee:
            Chenhao Qu
            Reporter:
            Keith Smith
            Votes:
            0
            Watchers:
            6
