Op-scoped QueryKnobConfiguration: single consistent snapshot per query


    • Type: Task
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • Query Execution
    • QE 2026-05-11, QE 2026-04-13, QE 2026-05-25
    • 200

      Each ExpressionContext currently owns an independent `QueryKnobConfiguration` snapshot, computed lazily on first read. This produces two kinds of cross-expCtx inconsistency within a single query:

      1. Pre-propagation initialization. Inner sub-pipeline expCtxs (`$lookup` / `$unionWith` / `$graphLookup`) are constructed without `QuerySettings`; settings are propagated at exec time via `setQuerySettingsIfNotPresent`. If anything reads knobs on the inner expCtx during optimization, the snapshot is computed against empty settings and locked in via `DeferredFn`. The later propagation then hits tassert 8827100 when it tries to install the real settings. This was the trigger for BF-42949, via SERVER-124146.

      2. `setParameter` racing the snapshot. Even with identical `QuerySettings`, knobs derived from server-parameter atomics are re-sampled per expCtx on first knob access. A concurrent `setParameter` that lands between the parent's first read and a sub-pipeline's first read produces divergent snapshots within the same query. The class doc-comment promises consistency throughout the query lifetime, but that guarantee actually holds only per expCtx.
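
      A minimal, self-contained sketch of both failure modes, intended only as an illustration and not as the server code. Every name in it (`gHypotheticalParam`, `KnobSnapshot`, `LazyContext`, `installSettings`) is hypothetical, and the settings are reduced to a single `int`. The `false` return from `installSettings` stands in for the tassert described in item 1.

```cpp
#include <atomic>
#include <cassert>
#include <optional>

// Hypothetical stand-in for a server parameter backed by an atomic.
inline std::atomic<int> gHypotheticalParam{1};

// Hypothetical, heavily simplified snapshot: one value derived from query
// settings and one sampled from the server parameter.
struct KnobSnapshot {
    int fromSettings;
    int fromServerParam;
};

// Simplified model of a context that computes its snapshot on first read.
class LazyContext {
public:
    // Models late settings propagation. Returns false where the real code
    // would tassert: the snapshot was already computed from empty settings.
    bool installSettings(int settings) {
        if (_snapshot) {
            return false;
        }
        _settings = settings;
        return true;
    }

    // Lazily samples the settings and the server parameter, then caches them.
    const KnobSnapshot& knobs() {
        if (!_snapshot) {
            _snapshot = KnobSnapshot{_settings.value_or(0), gHypotheticalParam.load()};
        }
        return *_snapshot;
    }

private:
    std::optional<int> _settings;
    std::optional<KnobSnapshot> _snapshot;
};

// Item 1: a knob read before propagation locks in the empty settings.
inline bool earlyReadBlocksPropagation() {
    LazyContext inner;
    (void)inner.knobs();                // read during "optimization"
    return !inner.installSettings(42);  // propagation arrives too late
}

// Item 2: a parameter change between two first reads splits the snapshots.
inline bool lazySnapshotsDiverge() {
    gHypotheticalParam.store(1);
    LazyContext parent;
    LazyContext child;
    const int parentValue = parent.knobs().fromServerParam;  // parent's first read
    gHypotheticalParam.store(2);                             // simulated setParameter
    const int childValue = child.knobs().fromServerParam;    // sub-pipeline's first read
    return parentValue != childValue;
}
```

      In this model both helper functions return `true`, which corresponds to the two inconsistencies listed above.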

      Goal. One `QueryKnobConfiguration` snapshot per operation, computed eagerly when settings are installed and visible to every expCtx in the tree. Both inconsistencies disappear: pre-propagation reads return the same snapshot as post-propagation reads, and a concurrent `setParameter` cannot land between two readers within the same op.

      Fix space. There are two reasonable mechanisms (see the comments below for the full discussion):

      • share the snapshot across an op's ExpressionContext tree via a refcounted holder co-owned by the expCtxs; or
      • migrate the canonical strong reference onto `ClientCursor` for cursor-bound ops, making "op-scoped state lives with the operation" a typed invariant.

      Either mechanism deletes the per-expCtx `DeferredFn<QueryKnobConfiguration>` field and the tassert at `expression_context.h:913`.
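
      As a rough illustration of the first mechanism, the sketch below computes one snapshot eagerly and lets every context in the tree co-own it through a `std::shared_ptr`. All names (`gOpScopedParam`, `OpKnobSnapshot`, `makeOpSnapshot`, `TreeContext`, `makeChild`) are hypothetical and the settings are again reduced to a single `int`; this is a sketch under those assumptions, not the proposed patch.

```cpp
#include <atomic>
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical stand-in for a server parameter backed by an atomic.
inline std::atomic<int> gOpScopedParam{1};

// Hypothetical, heavily simplified op-scoped snapshot.
struct OpKnobSnapshot {
    int fromSettings;
    int fromServerParam;
};

// Computes the snapshot eagerly, at the moment settings are installed.
// The server parameter is sampled exactly once per operation.
inline std::shared_ptr<const OpKnobSnapshot> makeOpSnapshot(int settings) {
    return std::make_shared<OpKnobSnapshot>(
        OpKnobSnapshot{settings, gOpScopedParam.load()});
}

// Simplified context that co-owns the operation's snapshot.
class TreeContext {
public:
    explicit TreeContext(std::shared_ptr<const OpKnobSnapshot> snapshot)
        : _snapshot(std::move(snapshot)) {}

    // Sub-pipeline contexts share the parent's snapshot instead of resampling.
    TreeContext makeChild() const {
        return TreeContext(_snapshot);
    }

    const OpKnobSnapshot& knobs() const {
        return *_snapshot;
    }

private:
    std::shared_ptr<const OpKnobSnapshot> _snapshot;
};

// A parameter change after installation is invisible to every reader in the op.
inline bool sharedSnapshotStaysConsistent() {
    gOpScopedParam.store(1);
    TreeContext parent(makeOpSnapshot(42));
    TreeContext child = parent.makeChild();
    const int parentValue = parent.knobs().fromServerParam;
    gOpScopedParam.store(2);  // simulated concurrent setParameter
    const OpKnobSnapshot& childKnobs = child.knobs();
    return childKnobs.fromServerParam == parentValue && childKnobs.fromSettings == 42;
}
```

      The second mechanism would presumably keep the same immutable snapshot but move the owning reference onto the cursor, with contexts reaching it through the operation rather than co-owning it. That variant is not shown here.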

            Assignee:
            Catalin Sumanaru
            Reporter:
            Catalin Sumanaru
            Votes:
            0
            Watchers:
            3
