Type: Task
Resolution: Unresolved
Priority: Major - P3
Affects Version/s: None
Component/s: None
Summary
Comprehensive jstest matrix for the SERVER-66949 family: a sharded deleteMany / updateMany / upsert that batch-retries on StaleConfig under-reports n / deletedCount / modifiedCount / upsertedCount. Drivers and end users see incorrect acknowledgement counts, which breaks audit pipelines.
File
jstests/sharding/count_correctness_under_concurrent_moveChunk.js (295 lines): cross-product matrix over command × staleness inducer × targeting, 24 assertions total, with one idempotency-replay tail check.
| Axis | Values |
|---|---|
| command | deleteMany, updateMany, upsert(insert path via no-match _id), upsert(update path via matching _id) |
| staleness inducer | explicit moveChunk via fresh router, balancer-tagged migration (zone-driven), no-op self-move that bumps collection version |
| targeting | single-shard predicate ($lt 0), broadcast across 3 shards |
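The cross-product over the three axes can be sketched as follows. This is a minimal illustration, not the code in the test file: the axis labels and the `buildMatrix` helper are hypothetical names chosen to mirror the table.

```javascript
// Hypothetical axis labels mirroring the table; the real test may name them differently.
const commands = ["deleteMany", "updateMany", "upsertInsertPath", "upsertUpdatePath"];
const inducers = ["explicitMoveChunk", "zoneDrivenMigration", "noopSelfMove"];
const targetings = ["singleShard", "broadcast"];

// Cartesian product: extend every partial cell with every value of the next axis.
function buildMatrix(axes) {
  return axes.reduce(
    (cells, axis) => cells.flatMap((cell) => axis.map((value) => [...cell, value])),
    [[]]
  );
}

const matrix = buildMatrix([commands, inducers, targetings]).map(
  ([command, inducer, targeting]) => ({ command, inducer, targeting })
);

console.log(matrix.length); // 4 * 3 * 2 = 24
```

Generating the cells from the axes, rather than listing them by hand, keeps the cell count in step with the table if an axis value is added later.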
Test design
Two-mongos topology (s0 fresh, s1 stale) so ground truth is always read through an up-to-date routing table and never conflated with the SUT's stale cache. Per-cell assertions: deletedCount/modifiedCount/upsertedCount must equal ground truth from the fresh router AND the side-channel observation must agree (e.g. Σv post-updateMany, doc presence/absence post-upsert).
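As a rough sketch of the two-part per-cell assertion for updateMany, assume (hypothetically) that the update is `{$inc: {v: 1}}`, so the change in Σv equals the number of documents actually modified. The helper names and the in-memory document arrays below are illustrative stand-ins for reads through the fresh router.

```javascript
// Sum of the v field over a snapshot of documents.
function sumV(docs) {
  return docs.reduce((sum, doc) => sum + doc.v, 0);
}

// Returns a list of failure messages; an empty list means the cell passed.
// Assumes the update under test was {$inc: {v: 1}} (an illustrative choice).
function checkUpdateManyCell({ before, after, reportedModifiedCount, expectedModifiedCount }) {
  const failures = [];
  if (reportedModifiedCount !== expectedModifiedCount) {
    failures.push(
      `modifiedCount ${reportedModifiedCount} != ground truth ${expectedModifiedCount}`
    );
  }
  const delta = sumV(after) - sumV(before);
  if (delta !== expectedModifiedCount) {
    failures.push(`side channel: sum(v) moved by ${delta}, expected ${expectedModifiedCount}`);
  }
  return failures;
}

// Example: all 5 documents were modified, but the stale router acknowledged only 3.
const before = [0, 0, 0, 0, 0].map((v, _id) => ({ _id, v }));
const after = before.map((doc) => ({ ...doc, v: doc.v + 1 }));
const underReported = checkUpdateManyCell({
  before,
  after,
  reportedModifiedCount: 3,
  expectedModifiedCount: 5,
});
console.log(underReported.length); // 1: only the ack-count check fails
```

Checking the side channel separately distinguishes an under-reported ack (data changed, count wrong) from a write that genuinely did not apply.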
Following the precedent in multi_writes_on_placement_change.js, a single-shard updateMany against a stale router can legitimately surface QueryPlanKilled. The test catches that specific error code and skips the count assertion for that cell; any other error still fails the test, so a real failure is not masked.
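The catch-and-skip behaviour might look roughly like this. It is a sketch under assumptions: `runCell`, `runCommand`, and `assertCount` are hypothetical names, and the error is matched on a `codeName` property as server errors are typically surfaced to shell code.

```javascript
// Runs one matrix cell. QueryPlanKilled is treated as a legitimate outcome and
// the count assertion is skipped; every other error propagates to the caller.
function runCell(runCommand, assertCount) {
  let result;
  try {
    result = runCommand();
  } catch (e) {
    if (e.codeName === "QueryPlanKilled") {
      return { status: "skipped", reason: e.codeName };
    }
    throw e; // Unknown errors must still fail the test.
  }
  assertCount(result);
  return { status: "checked" };
}

// Example with stand-in commands in place of real shell calls.
const killed = runCell(
  () => {
    throw Object.assign(new Error("plan killed"), { codeName: "QueryPlanKilled" });
  },
  () => {}
);
console.log(killed.status); // "skipped"
```

Returning an explicit `skipped` status, instead of silently passing, lets a test report how many cells were actually asserted.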
The idempotency tail check defends against the inverse failure, where a retry leaks a duplicate acknowledgement and the reported count is too high.
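One way to read the tail check, shown here against an in-memory stand-in rather than a real cluster: replaying the same deleteMany must acknowledge zero documents, and the acknowledged counts across both runs must add up to the number of documents that originally matched. The function names are hypothetical, and the actual test may replay differently.

```javascript
// In-memory stand-in for deleteMany: returns the surviving docs and the ack count.
function deleteManyInMemory(docs, predicate) {
  const kept = docs.filter((doc) => !predicate(doc));
  return { docs: kept, deletedCount: docs.length - kept.length };
}

// Runs the delete, replays it, and reports whether the acks are consistent.
function replayCheck(docs, predicate) {
  const expected = docs.filter(predicate).length;
  const first = deleteManyInMemory(docs, predicate);
  const replay = deleteManyInMemory(first.docs, predicate);
  return {
    firstCount: first.deletedCount,
    replayCount: replay.deletedCount,
    consistent: replay.deletedCount === 0 && first.deletedCount === expected,
  };
}

const docs = [-2, -1, 0, 1, 2].map((x, _id) => ({ _id, x }));
const outcome = replayCheck(docs, (doc) => doc.x < 0);
console.log(outcome); // { firstCount: 2, replayCount: 0, consistent: true }
```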
Verify
- node --input-type=module --check: clean.
is related to
- SERVER-66949 The reported count of documents deleted by a targeted deleteMany lower than what is actually deleted
- In Code Review