Type: Task
Resolution: Unresolved
Priority: Major - P3
Affects Version/s: None
Component/s: None
Query Execution
Context
On a sharded cluster, the SBE/Express primary returns kNotHandled when the post-image no longer lives
on the shard that observes the update, for example after moveChunk, moveCollection, or reshardCollection.
PrimaryWithFallback then calls the Aggregation fallback. This is the genuine production trigger for
the aggregation cell and the only place where the primary->fallback handoff is observable. It cannot be
reached on a replica set: the replset metrics test exercises the Aggregation executor via the flag-off
path, but not the transition.
Scope
Test-only; the executor wiring is done in the per-engine tickets. Add a dedicated test with its own
ShardingTest that controls placement and asserts metric deltas after each DDL op.
Acceptance / test matrix
Add jstests/sharding/query/change_streams/change_stream_metrics_update_lookup_fallback.js. For each of
moveChunk, moveCollection, reshardCollection (and any other op that relocates the post-image or changes
the UUID): place a doc, open a collection change stream, perform the op so that the post-image becomes
remote or the UUID changes, update the doc, then assert the deltas below. Sum the metrics across shards
via FixtureHelpers.mapOnEachShardNode, following the split_large_event.js pattern, because mongos exposes
no per-process metrics:
- updateLookup.sbe.notHandled +1 (primary declined), and
- updateLookup.aggregation.found +1 (fallback handled it).
Also assert the deleted/absent post-image case routes to aggregation.notFound.
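The delta check above could be factored into small pure helpers, sketched below. This is only an illustration: readCounter, sumCounter and checkDeltas are hypothetical names, and the nesting of the counters (updateLookup.sbe.notHandled, updateLookup.aggregation.found) inside each per-node snapshot is assumed from the metric names in this ticket, not taken from the server. In the real test the before/after snapshots would presumably be gathered on every shard node with FixtureHelpers.mapOnEachShardNode; the sample data here is made up.

```javascript
// Hypothetical helpers; names and metrics layout are assumptions for illustration.

// Reads a dotted path such as "updateLookup.sbe.notHandled" from a nested object.
// A missing segment counts as 0, so a counter that was never bumped does not throw.
function readCounter(metrics, dottedPath) {
    let node = metrics;
    for (const key of dottedPath.split(".")) {
        if (node === null || typeof node !== "object" || !(key in node)) {
            return 0;
        }
        node = node[key];
    }
    return Number(node);
}

// Sums one counter over an array of per-node metric snapshots.
function sumCounter(perNodeMetrics, dottedPath) {
    return perNodeMetrics.reduce((total, m) => total + readCounter(m, dottedPath), 0);
}

// Compares the cluster-wide delta (after - before) of each counter with the
// expected value and throws on the first mismatch.
function checkDeltas(before, after, expectedDeltas) {
    for (const [path, expected] of Object.entries(expectedDeltas)) {
        const actual = sumCounter(after, path) - sumCounter(before, path);
        if (actual !== expected) {
            throw new Error(`${path}: expected delta ${expected}, got ${actual}`);
        }
    }
}

// Made-up snapshots for two shard nodes, taken before and after one DDL op + update.
const before = [
    {updateLookup: {sbe: {notHandled: 2}, aggregation: {found: 5}}},
    {updateLookup: {}},
];
const after = [
    {updateLookup: {sbe: {notHandled: 2}, aggregation: {found: 5}}},
    {updateLookup: {sbe: {notHandled: 1}, aggregation: {found: 1}}},
];

// The handoff shows up on whichever shard observed the update, so only the sum is checked.
checkDeltas(before, after, {
    "updateLookup.sbe.notHandled": 1,
    "updateLookup.aggregation.found": 1,
});
```

Summing before comparing keeps the assertion independent of which shard ends up observing the update after the DDL op, which is the reason the ticket asks for cluster-wide totals.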
Dependencies
Depends on the Aggregation and SBE wiring tickets — both must be wired before the handoff is
observable in metrics.
Notes
Heavier on cluster/DDL orchestration than metric plumbing; good once comfortable with ShardingTest and
change-stream sharding behaviour.