Make combination of implicit_change_stream_v2.js and txn overrides more robust

    • Type: Improvement
    • Resolution: Unresolved
    • Priority: Minor - P4
    • Affects Version/s: None
    • Component/s: None
    • Query Execution

      Overview

      The v2 change stream passthrough suites (e.g. change_streams_multi_stmt_txn_mongos_passthrough_v2, change_streams_multi_stmt_txn_sharded_collections_passthrough_v2) load both network_error_and_txn_override.js and implicit_change_stream_v2.js. The interaction between these two overrides is fragile and causes tests to fail with error 8027900: "attempting to continue transaction that was not started".

      Root Cause

      implicit_change_stream_v2.js is registered as the outermost runCommand override (loaded last). When it intercepts the first change stream aggregate command, featureFlagEnabled is null, so it calls FeatureFlagUtil.isPresentAndEnabled() to check the ChangeStreamPreciseShardTargeting feature flag. That check drives a nested adminCommand({hello: 1}) back through the inner part of the override chain, which includes txnRunCommandOverride (network_error_and_txn_override.js).

      At that point, txnRunCommandOverride still holds stale client-side transaction state from a just-committed transaction (e.g. an insert in a before() hook, txnNumber: 0). It sees the hello command, which cannot run inside a transaction, and tries to abort what it believes is the current transaction before proceeding. The abortTransaction command fails on the server because the transaction was already committed, and the failure surfaces as error 8027900.

      Repro Pattern

      Any test that (a) does at least one insert before the first watch() call and (b) runs in a v2 txn passthrough suite will hit this error on the first watch(). Specifically:

      1. Suite loads: enable_sessions.js, txn_passthrough_cmd_massage.js, network_error_and_txn_override.js, implicit_filter_eot_changestreams.js, implicit_change_stream_v2.js
      2. Test before() runs testColl.insert({...}) – txn override wraps it in txnNumber: 0, commits it
      3. Test calls testColl.watch([]) – implicit_change_stream_v2.js intercepts, checks feature flag via adminCommand({hello: 1})
      4. That adminCommand goes through txnRunCommandOverride, which tries to abort txnNumber: 0 (stale state) → server returns error 8027900
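
      The sequence above can be modelled in plain JavaScript without a server. The sketch below is a toy model, not the code of the real overrides: makeServer, makeTxnOverride, makeChangeStreamV2Override and the way the stale state is kept are all hypothetical and exist only to show how a nested hello issued by the outer layer reaches an inner layer that still remembers txnNumber: 0.

```javascript
// Toy model of the override chain. Every name below is hypothetical.
const ERR_TXN_NOT_STARTED = 8027900; // error code quoted in this ticket

// Fake server: remembers committed txnNumbers and rejects aborting them.
function makeServer() {
  const committed = new Set();
  return function run(cmd) {
    if (cmd.commitTransaction) {
      committed.add(cmd.txnNumber);
      return {ok: 1};
    }
    if (cmd.abortTransaction && committed.has(cmd.txnNumber)) {
      return {
        ok: 0,
        code: ERR_TXN_NOT_STARTED,
        errmsg: "attempting to continue transaction that was not started",
      };
    }
    return {ok: 1};
  };
}

// Inner layer: wraps inserts in a transaction and aborts before hello.
function makeTxnOverride(next) {
  let nextTxnNumber = 0;
  let clientTxn = null; // client-side view of the "current" transaction
  return function txnRunCommand(cmd) {
    if (cmd.insert) {
      const txnNumber = nextTxnNumber++;
      clientTxn = txnNumber;
      const res = next({...cmd, txnNumber});
      next({commitTransaction: 1, txnNumber});
      // Assumption of the model: clientTxn is not cleared, so it is stale.
      return res;
    }
    if (cmd.hello && clientTxn !== null) {
      // hello cannot run in a transaction, so abort the "current" one first.
      const res = next({abortTransaction: 1, txnNumber: clientTxn});
      clientTxn = null;
      if (!res.ok) return res;
    }
    return next(cmd);
  };
}

// Outer layer: on the first aggregate, looks up the feature flag via hello.
function makeChangeStreamV2Override(next) {
  let featureFlagEnabled = null;
  return function changeStreamRunCommand(cmd) {
    if (cmd.aggregate && featureFlagEnabled === null) {
      const res = next({hello: 1}); // nested command through the inner chain
      if (!res.ok) throw Object.assign(new Error(res.errmsg), {code: res.code});
      featureFlagEnabled = true;
    }
    return next(cmd);
  };
}

const runCommand = makeChangeStreamV2Override(makeTxnOverride(makeServer()));
runCommand({insert: "testColl", documents: [{_id: 1}]}); // before() hook

let observedCode = null;
try {
  runCommand({aggregate: "testColl", pipeline: [{$changeStream: {}}]}); // watch()
} catch (e) {
  observedCode = e.code;
}
console.log(observedCode);
```

      In this model the first aggregate never reaches the server: the nested hello triggers the abort of the already-committed transaction and the error propagates out of the outer layer.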

      Proposed Fix

      The feature flag check in implicit_change_stream_v2.js should bypass the transaction override layer. Options include:

      • Cache the feature flag result before any test runs (e.g. in a global setup step), so the nested adminCommand is never issued mid-command-processing
      • Use a direct connection (bypassing the session/override chain) for the feature flag lookup
      • Make txnRunCommandOverride aware that it is being called from within the processing of another command and skip the abort-on-non-txn-command logic in that case
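
      As an illustration of the third option only, here is a minimal sketch of a re-entrancy guard on a toy model of the two layers. It assumes a shared counter (nesting.depth) that the outer layer increments around its nested command and that the inner layer consults before aborting. All names (nesting, toyServer, guardedTxnOverride, flagCheckingOverride) are hypothetical and do not come from the real overrides.

```javascript
// Toy model only. All names are hypothetical.
const nesting = {depth: 0}; // shared between the two override layers

// Fake server: aborting an already-committed txnNumber fails with 8027900.
function toyServer() {
  const committed = new Set();
  return function run(cmd) {
    if (cmd.commitTransaction) {
      committed.add(cmd.txnNumber);
      return {ok: 1};
    }
    if (cmd.abortTransaction && committed.has(cmd.txnNumber)) {
      return {ok: 0, code: 8027900};
    }
    return {ok: 1};
  };
}

// Inner layer: same abort-before-hello rule, but skipped for nested commands.
function guardedTxnOverride(next) {
  let staleTxn = null;
  return function (cmd) {
    if (cmd.insert) {
      staleTxn = 0; // modelled stale state: never cleared after the commit
      const res = next({...cmd, txnNumber: 0});
      next({commitTransaction: 1, txnNumber: 0});
      return res;
    }
    if (cmd.hello && staleTxn !== null && nesting.depth === 0) {
      const res = next({abortTransaction: 1, txnNumber: staleTxn});
      staleTxn = null;
      if (!res.ok) return res;
    }
    return next(cmd);
  };
}

// Outer layer: marks the feature flag lookup as a nested command.
function flagCheckingOverride(next) {
  let featureFlagEnabled = null;
  return function (cmd) {
    if (cmd.aggregate && featureFlagEnabled === null) {
      nesting.depth++;
      let res;
      try {
        res = next({hello: 1});
      } finally {
        nesting.depth--; // always restore, even if the nested call throws
      }
      if (!res.ok) return res;
      featureFlagEnabled = true;
    }
    return next(cmd);
  };
}

const guardedRunCommand = flagCheckingOverride(guardedTxnOverride(toyServer()));
guardedRunCommand({insert: "testColl", documents: [{_id: 1}]});
const watchResult = guardedRunCommand({
  aggregate: "testColl",
  pipeline: [{$changeStream: {}}],
});
console.log(watchResult.ok);
```

      Note that the guard in this sketch leaves the stale state in place, so a later non-nested hello would still attempt the abort. The first two options avoid the nested command reaching the transaction layer at all, which may make them the less invasive choices.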

            Assignee:
            Denis Grebennicov
            Reporter:
            Romans Kasperovics