Core Server / SERVER-74085

Ensure queries that spill to TemporaryRecordStores checkpoint their data

    • Type: Improvement
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • Storage Execution
    • 120

      Queries that spill to the storage engine should help pay the cost of spilling and should ensure their data actually reaches disk. My proposal is that queries that spill to disk periodically checkpoint the data for their temporary table, which the WiredTiger checkpoint() API supports.

      More detail:

      We hit a bug in WT (WT-10576) when we try to force-drop a TemporaryRecordStore (TRS) that has uncommitted data. This can happen when a query stage (e.g. hash agg) spills using the storage engine inside a multi-document transaction and the lifetime of the storage transaction exceeds the lifetime of the table.

      We tried to fix this by not using "force" to drop these temporary tables. However, because these temporary record stores are not yet included in any checkpoint, the drop fails for up to 1 minute, until the next checkpoint completes and persists the data to disk.

      This raised a different question: if these TRSs are not actually spilled to disk, what is the value of using them? We are essentially polluting the storage engine cache and creating more work for the next checkpoint, which could hurt performance for the rest of the system. If we make these queries pay the cost of spilling, we would probably see fewer performance issues elsewhere, and the tables would also get dropped faster.
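
The proposal above (a spilling query periodically checkpoints its own temporary table) could look roughly like the toy model below. This is only a sketch under stated assumptions, not server code: `SpillingRecordStore`, `checkpoint_fn`, and the byte threshold are all hypothetical names chosen for illustration. In a real implementation the callback would presumably issue a WiredTiger checkpoint restricted to the temporary table, and the right trigger (bytes, records, or time) is an open design question.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class SpillingRecordStore:
    """Toy stand-in for a temporary record store (hypothetical, not MongoDB's class)."""

    # Hypothetical hook: would wrap a targeted storage-engine checkpoint.
    checkpoint_fn: Callable[[], None]
    # Assumed policy: checkpoint once this many bytes are unpersisted.
    checkpoint_every_bytes: int = 1024
    unpersisted_bytes: int = 0
    records: List[bytes] = field(default_factory=list)

    def insert(self, record: bytes) -> None:
        """Spill one record; the spilling query pays for a checkpoint when due."""
        self.records.append(record)
        self.unpersisted_bytes += len(record)
        if self.unpersisted_bytes >= self.checkpoint_every_bytes:
            self.checkpoint()

    def checkpoint(self) -> None:
        """Persist everything spilled so far and reset the counter."""
        self.checkpoint_fn()
        self.unpersisted_bytes = 0


# Usage: 25 records of 10 bytes each with a 100-byte threshold.
checkpoints: List[int] = []
store = SpillingRecordStore(
    checkpoint_fn=lambda: checkpoints.append(len(store.records)),
    checkpoint_every_bytes=100,
)
for _ in range(25):
    store.insert(b"x" * 10)
```

With this policy the query itself triggers the checkpoints as it spills, so at most `checkpoint_every_bytes` of its data is ever waiting on the next system-wide checkpoint, which is the property that would let a non-force drop succeed sooner.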

            Assignee:
            backlog-server-execution [DO NOT USE] Backlog - Storage Execution Team
            Reporter:
            louis.williams@mongodb.com Louis Williams
            Votes:
            0
            Watchers:
            9
