Add $merge stepdown-idempotency jstest matrix across whenMatched × whenNotMatched modes

    • Type: Task
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None

      Summary

      SERVER-99827 currently excludes $merge aggregation tests from config-transition passthrough suites because no test grid maps which whenMatched × whenNotMatched combinations leave the target collection bit-identical when a forced primary stepdown lands mid-pipeline. As internal counter materialisation migrates onto $merge (SERVER-123687 and related), passthrough authors need a concrete safe/unsafe table so they can lift exclusions selectively instead of relying on the current blanket skip.

      Proposal

      Add jstests/sharding/merge_stepdown_idempotency_matrix.js (~266 lines) that enumerates the full cross-product of whenMatched ∈ {replace, merge, keepExisting, fail, pipeline} against whenNotMatched ∈ {insert, fail, discard}, drives each cell through a real mid-pipeline stepdown, and prints a structured pass/unsafe verdict.
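      As a rough illustration of the cross-product, the cells and their $merge stages could be generated along the following lines. This is a sketch, not the proposed test file: the helper names enumerateCells and mergeStageFor, the targetColl parameter, and the choice of on: "_id" are assumptions made for the example. The pipeline cell is expressed as an update pipeline (an array), since that is how $merge accepts a custom whenMatched; the applyCount increment shown is one possible shape for the counter described under Test shape.

```javascript
// Mode names as listed in the proposal.
const WHEN_MATCHED = ["replace", "merge", "keepExisting", "fail", "pipeline"];
const WHEN_NOT_MATCHED = ["insert", "fail", "discard"];

// Cross-product of the two mode lists: 5 x 3 = 15 cells.
function enumerateCells() {
    const cells = [];
    for (const whenMatched of WHEN_MATCHED) {
        for (const whenNotMatched of WHEN_NOT_MATCHED) {
            cells.push({whenMatched, whenNotMatched});
        }
    }
    return cells;
}

// Build the $merge stage for one cell (hypothetical helper).
// "pipeline" is not passed as a string: $merge takes an array of update
// stages as whenMatched, so that cell gets an illustrative pipeline that
// increments applyCount on the existing target document.
function mergeStageFor(cell, targetColl) {
    const whenMatched = cell.whenMatched === "pipeline"
        ? [{$set: {applyCount: {$add: [{$ifNull: ["$applyCount", 0]}, 1]}}}]
        : cell.whenMatched;
    return {
        $merge: {
            into: targetColl,
            on: "_id",  // assumption: the matrix matches on _id
            whenMatched,
            whenNotMatched: cell.whenNotMatched,
        },
    };
}
```

      Keeping stage construction separate from cell enumeration means the same cell list can drive both the baseline run and the stepdown-replay run.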

      Test shape

      • Fixture: three-node ReplSetTest (no ShardingTest required; the semantics under test are local to the $merge writer's retry, not router / chunk migration). Placement under jstests/sharding/ follows SERVER-99827's stated home for stepdown-during-aggregation coverage.
      • For each mode cell:
        1. Seed deterministic source (500 docs) and target (half-overlapping prefix so both matched and unmatched paths fire).
        2. Capture baseline: single-run aggregation against a stable primary.
        3. Capture stepdown-replay: enable hangWhileBuildingDocumentSourceMergeBatch on current primary, launch agg in parallel shell, await failpoint, force election via ReplSetTest.stepUp on a secondary, release failpoint, await primary agreement, re-issue same pipeline.
        4. Fingerprint target as (count, checksum, applySum).
        5. Classify pass (fingerprints equal) or unsafe (diverge → retry not idempotent for this mode).
      • Pipeline mode increments a per-document applyCount so double-application is detectable in the fingerprint even if count matches.
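      Steps 4 and 5 above can be sketched in plain JavaScript over an array of target documents. The ticket does not specify the checksum algorithm, so this example assumes an FNV-1a-style 32-bit hash over each document's JSON form, numeric _id values, and applySum defined as the sum of applyCount across documents; fnv1a, fingerprint, and classify are hypothetical names.

```javascript
// FNV-1a-style 32-bit hash over UTF-16 code units. This stands in for
// whatever checksum the real test uses (assumption).
function fnv1a(str) {
    let h = 0x811c9dc5;
    for (let i = 0; i < str.length; i++) {
        h ^= str.charCodeAt(i);
        h = Math.imul(h, 0x01000193) >>> 0;
    }
    return h;
}

// Reduce the target collection's documents to (count, checksum, applySum).
function fingerprint(docs) {
    // Sort by _id so the fingerprint does not depend on scan order.
    // Assumes numeric _id values, as in a deterministic seed.
    const sorted = [...docs].sort((a, b) => a._id - b._id);
    const checksum = fnv1a(sorted.map((d) => JSON.stringify(d)).join("\n"));
    const applySum = sorted.reduce((sum, d) => sum + (d.applyCount || 0), 0);
    return {count: sorted.length, checksum, applySum};
}

// Compare baseline and stepdown-replay fingerprints and produce a verdict.
function classify(baseline, replay) {
    const diverged = ["count", "checksum", "applySum"]
        .filter((key) => baseline[key] !== replay[key]);
    return diverged.length === 0
        ? {outcome: "pass"}
        : {outcome: "unsafe", reason: "diverged on " + diverged.join(", ")};
}
```

      In this sketch a document whose applyCount was incremented twice changes both checksum and applySum while count stays the same, which is the double-application case the pipeline cell is meant to expose.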

      Coverage notes

      • Four cells (keepExisting × {fail, discard}, fail × {fail, discard}) are rejected at parse time per the documented mode matrix; the test marks them specInvalid: true and surfaces outcome: "n/a".

      • The remaining eleven cells run end-to-end.
      • Test does NOT fail on unsafe verdicts — the deliverable is the table itself; passthrough authors consume the printed jsTestLog summary to decide which exclusions in SERVER-99827 can be lifted.
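      The coverage split can be illustrated with a small table builder. This is a sketch under assumptions: buildMatrix and summaryLine are hypothetical names, "pending" is a placeholder outcome for cells that have not run yet, and the summary line format is invented for the example rather than taken from the test.

```javascript
const MATCHED_MODES = ["replace", "merge", "keepExisting", "fail", "pipeline"];
const NOT_MATCHED_MODES = ["insert", "fail", "discard"];

// The four combinations the ticket lists as rejected at parse time.
const SPEC_INVALID = new Set([
    "keepExisting/fail",
    "keepExisting/discard",
    "fail/fail",
    "fail/discard",
]);

// One row per cell; spec-invalid cells are pre-filled with outcome "n/a",
// the rest start as "pending" until a pass/unsafe verdict is recorded.
function buildMatrix() {
    return MATCHED_MODES.flatMap((whenMatched) =>
        NOT_MATCHED_MODES.map((whenNotMatched) => {
            const specInvalid = SPEC_INVALID.has(`${whenMatched}/${whenNotMatched}`);
            return {
                whenMatched,
                whenNotMatched,
                specInvalid,
                outcome: specInvalid ? "n/a" : "pending",
            };
        }));
}

// Render one row of the printed summary (format is illustrative only).
function summaryLine(row) {
    const reason = row.reason ? ` (${row.reason})` : "";
    return `${row.whenMatched} x ${row.whenNotMatched}: ${row.outcome}${reason}`;
}
```

      Because unsafe verdicts are recorded rather than asserted, a runner built on this shape would only throw on infrastructure errors, not on a cell's outcome.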

      Verification

      • node --input-type=module --check < jstests/sharding/merge_stepdown_idempotency_matrix.js exits 0 (matches neighbor range_deletion_ordering_with_stepdown.js).
      • Imports ReplSetTest and configureFailPoint from canonical jstests/libs/ helpers.
      • Test tags: requires_replication, requires_majority_read_concern.

      Follow-ups

      • Once the table lands, owners of SERVER-99827 can selectively re-enable passthrough inclusion for cells reported as pass, and file targeted follow-ups for any cell reported as unsafe together with its captured reason string.
      • Sharded variant: same matrix can be re-driven inside ShardingTest with rs-backed shards to cover the routing-layer retry path; structure is unchanged.

      Related

      • SERVER-99827 — excludes $merge tests from config-transition suites

            Assignee:
            Unassigned
            Reporter:
            Mehar Grewal
            Votes:
            0
            Watchers:
            1

              Created:
              Updated: