Type: Task
Resolution: Unresolved
Priority: Minor - P4
Affects Version/s: None
Component/s: None
Query Integration
Overview
The current InsertCmdShape (introduced in SERVER-122050) hashes only on the target namespace; the documents payload is serialized as a fixed ?array<?object> placeholder and HashValue is a no-op. As a result, all inserts into the same namespace – regardless of batch size – accumulate into a single queryStats store entry.
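The collapse described above can be illustrated with a small Python sketch. This is not the server implementation: `insert_shape_key`, `record_insert`, and the in-memory `store` are hypothetical names made up for illustration. Only the `?array<?object>` placeholder and the idea that the payload contributes nothing to the hash come from the description.

```python
import hashlib
from collections import defaultdict

# Fixed placeholder that stands in for the documents payload in the shape.
DOCUMENTS_PLACEHOLDER = "?array<?object>"


def insert_shape_key(namespace: str, documents: list) -> str:
    """Hypothetical shape key: only the namespace varies.

    `documents` is deliberately ignored, mirroring a payload whose
    contribution to the hash is a no-op.
    """
    shape = f"insert|{namespace}|documents={DOCUMENTS_PLACEHOLDER}"
    return hashlib.sha256(shape.encode("utf-8")).hexdigest()


# Hypothetical stand-in for the queryStats store: one entry per shape key.
store = defaultdict(lambda: {"execCount": 0, "totalExecMicros": 0})


def record_insert(namespace: str, documents: list, exec_micros: int) -> str:
    key = insert_shape_key(namespace, documents)
    entry = store[key]
    entry["execCount"] += 1
    entry["totalExecMicros"] += exec_micros
    return key


# A single-document insert and a 10k-document insert into the same namespace.
small_key = record_insert("test.coll", [{"a": 1}], exec_micros=120)
large_key = record_insert(
    "test.coll", [{"a": i} for i in range(10_000)], exec_micros=450_000
)
```

Both calls return the same key, so the two inserts land in one entry and their execution times are summed together. The timings are invented values chosen only to make the contrast visible.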
Concern raised
A PR review on SERVER-122054 pointed out that this makes per-shape metrics such as totalExecMicros hard to interpret: a single-document insert and a 10k-document insert share one bucket despite having very different execution costs.
Options to evaluate
- Bucket by document count (e.g. log-scale buckets 1, 2-10, 11-100, 101-1000, 1001+) so workload sizes do not blur together.
- Per-statement shaping, parallel to how update is keyed per statement.
- Add a non-shape metric (e.g. totalDocsInserted) – gives workload context without splitting shapes.
- Status quo – match find convention (shape captures intent, not workload size); document the trade-off.
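To make the first and third options concrete, here is a rough Python sketch comparing them. All names (`doc_count_bucket`, `record_bucketed`, `record_with_doc_metric`, and both stores) are hypothetical and exist only for illustration. The bucket boundaries follow the example ranges in the list, with the top bucket labelled 1001+ so that it does not overlap 101-1000.

```python
from collections import defaultdict


def doc_count_bucket(n_docs: int) -> str:
    """Map a batch size to a log-scale bucket label (hypothetical helper)."""
    if n_docs < 1:
        raise ValueError("an insert must contain at least one document")
    if n_docs == 1:
        return "1"
    if n_docs <= 10:
        return "2-10"
    if n_docs <= 100:
        return "11-100"
    if n_docs <= 1000:
        return "101-1000"
    return "1001+"


def new_entry() -> dict:
    return {"execCount": 0, "totalExecMicros": 0, "totalDocsInserted": 0}


# Bucketing option: the bucket label becomes part of the key, so shapes split.
bucketed_store = defaultdict(new_entry)

# Non-shape metric option: one entry per namespace, plus a document counter.
single_store = defaultdict(new_entry)


def record_bucketed(namespace: str, n_docs: int, exec_micros: int) -> None:
    entry = bucketed_store[(namespace, doc_count_bucket(n_docs))]
    entry["execCount"] += 1
    entry["totalExecMicros"] += exec_micros
    entry["totalDocsInserted"] += n_docs


def record_with_doc_metric(namespace: str, n_docs: int, exec_micros: int) -> None:
    entry = single_store[(namespace,)]
    entry["execCount"] += 1
    entry["totalExecMicros"] += exec_micros
    entry["totalDocsInserted"] += n_docs


# Same two inserts recorded under both schemes (timings are invented values).
for n_docs, micros in [(1, 120), (10_000, 450_000)]:
    record_bucketed("test.coll", n_docs, micros)
    record_with_doc_metric("test.coll", n_docs, micros)
```

Under bucketing the two inserts produce two entries, which keeps per-entry timings comparable but increases the number of store entries per namespace. With the added counter there is still one entry, and a reader can divide totalExecMicros by totalDocsInserted to get a rough per-document cost, though the distribution of batch sizes is lost.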
Action
Discuss in the weekly Query Stats sync. Decide whether to take action now or defer until customer feedback indicates need.
- is related to
  - SERVER-122050 Update etc/extensions.yml (Closed)
  - SERVER-122054 Collect query stats for inserts in standalone (In Progress)
- related to
  - SERVER-122054 Collect query stats for inserts in standalone (In Progress)