Investigate finer-grained query shape granularity for insert command

    • Type: Task
    • Resolution: Unresolved
    • Priority: Minor - P4
    • None
    • Affects Version/s: None
    • Component/s: None
    • None
    • Query Integration
    • None
    • None
    • None
    • None
    • None
    • None
    • None

      Overview

      The current InsertCmdShape (introduced in SERVER-122050) hashes only on the target namespace; the documents payload is serialized as a fixed ?array<?object> placeholder and HashValue is a no-op. As a result, all inserts into the same namespace – regardless of batch size – accumulate into a single queryStats store entry.

      Concern raised

      PR review on SERVER-122054 raised that this makes per-shape metrics like totalExecMicros hard to interpret: a single-document insert and a 10k-document insert share one bucket despite very different execution costs.

      Options to evaluate

      • Bucket by document count (e.g. log-scale buckets 1, 2-10, 11-100, 101-1000, 1000+) so workload sizes do not blur together.
      • Per-statement shaping parallel to how update is keyed per-statement.
      • Add a non-shape metric (e.g. totalDocsInserted) – gives workload context without splitting shapes.
      • Status quo – match find convention (shape captures intent, not workload size); document the trade-off.

      Action

      Discuss in the weekly Query Stats sync. Decide whether to take action now or defer until customer feedback indicates need.

            Assignee:
            Chi-I Huang
            Reporter:
            Chi-I Huang
            Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

              Created:
              Updated: