- Type: Improvement
- Resolution: Unresolved
- Priority: Minor - P4
- Affects Version/s: None
- Component/s: None
- Query Execution
Overview
The v2 change stream passthrough suites (e.g. change_streams_multi_stmt_txn_mongos_passthrough_v2, change_streams_multi_stmt_txn_sharded_collections_passthrough_v2) load both network_error_and_txn_override.js and implicit_change_stream_v2.js. The interaction between these two overrides is fragile and causes tests to fail with error 8027900: "attempting to continue transaction that was not started".
Root Cause
implicit_change_stream_v2.js is registered as the outermost runCommand override (loaded last). When it intercepts the first change stream aggregate command, featureFlagEnabled is null, so it calls FeatureFlagUtil.isPresentAndEnabled() to check the ChangeStreamPreciseShardTargeting feature flag. That check drives a nested adminCommand({hello: 1}) back through the inner part of the override chain, which includes txnRunCommandOverride (network_error_and_txn_override.js).
At that point, txnRunCommandOverride still holds stale client-side transaction state from a just-committed transaction (e.g. an insert in a before() hook, txnNumber: 0). It sees the hello command, which cannot run inside a transaction, and tries to abort the current transaction before proceeding. The abortTransaction command fails on the server because the transaction was already committed, and that failure surfaces as error 8027900.
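The layering described above can be illustrated with a small standalone model. This is only a sketch: registerOverride, baseRunCommand and the trace array are hypothetical stand-ins, not the real jstests override API, and the inner override here just forwards commands. It shows that an override loaded last sits outermost, and that a command it issues itself still passes through the overrides loaded before it.

```javascript
// Simplified model of a runCommand override chain (not the real jstests API).
const trace = [];

// Innermost layer: stands in for the real connection.
function baseRunCommand(dbName, cmdObj) {
    trace.push(`server:${Object.keys(cmdObj)[0]}`);
    return {ok: 1};
}

let chain = baseRunCommand;

// Hypothetical registration helper: each new override wraps everything
// registered so far, so the last one registered is the outermost.
function registerOverride(name, fn) {
    const inner = chain;
    chain = (dbName, cmdObj) => {
        trace.push(`${name}:${Object.keys(cmdObj)[0]}`);
        return fn(inner, dbName, cmdObj);
    };
}

// Loaded first, so it ends up on the inside.
registerOverride("txnOverride", (inner, dbName, cmdObj) => inner(dbName, cmdObj));

// Loaded last, so it is outermost. On first use it issues a nested command
// through the layers beneath it, as the feature flag check is described to do.
let featureFlagEnabled = null;
registerOverride("changeStreamV2", (inner, dbName, cmdObj) => {
    if (featureFlagEnabled === null) {
        featureFlagEnabled = inner("admin", {hello: 1}).ok === 1;
    }
    return inner(dbName, cmdObj);
});

chain("test", {aggregate: "coll", pipeline: [{$changeStream: {}}], cursor: {}});
console.log(trace.join("\n"));
```

In this model the hello reaches txnOverride before the aggregate that triggered it does, which is the ordering the root cause depends on.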
Repro Pattern
Any test that (a) does at least one insert before the first watch() call and (b) runs in a v2 txn passthrough suite will hit this error on the first watch(). Specifically:
- Suite loads: enable_sessions.js, txn_passthrough_cmd_massage.js, network_error_and_txn_override.js, implicit_filter_eot_changestreams.js, implicit_change_stream_v2.js
- Test before() runs testColl.insert({...}) – txn override wraps it in txnNumber: 0, commits it
- Test calls testColl.watch([]) – implicit_change_stream_v2.js intercepts, checks feature flag via adminCommand({hello: 1})
- That adminCommand goes through txnRunCommandOverride, which tries to abort txnNumber: 0 (stale state) → server returns error 8027900
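The steps above can be mimicked with a toy client and server, runnable without a cluster. Everything here is an assumption made for illustration: serverRunCommand, clientTxn and the logic inside this txnRunCommandOverride are simplified stand-ins, and the sketch simply assumes the override's client-side state is not cleared after the commit, since the ticket does not say how the state becomes stale.

```javascript
// Toy model of the repro steps; names and logic are illustrative only.
const TXN_NOT_STARTED = 8027900;

// Fake server: tracks which txnNumbers are currently open.
const openTxns = new Set();
function serverRunCommand(cmdObj) {
    const name = Object.keys(cmdObj)[0];
    const txn = cmdObj.txnNumber;
    if (txn !== undefined) {
        if (cmdObj.startTransaction) {
            openTxns.add(txn);
        } else if (!openTxns.has(txn)) {
            const err = new Error("attempting to continue transaction that was not started");
            err.code = TXN_NOT_STARTED;
            throw err;
        }
        if (name === "commitTransaction" || name === "abortTransaction") {
            openTxns.delete(txn);
        }
    }
    return {ok: 1};
}

// Client-side state kept by the (simplified) transaction override.
const clientTxn = {txnNumber: -1, active: false};

function txnRunCommandOverride(cmdObj) {
    const name = Object.keys(cmdObj)[0];
    if (name === "hello") {
        // hello cannot run inside a transaction, so abort whatever the
        // client believes is still in progress.
        if (clientTxn.active) {
            serverRunCommand({abortTransaction: 1, txnNumber: clientTxn.txnNumber});
            clientTxn.active = false;
        }
        return serverRunCommand(cmdObj);
    }
    // Wrap the operation in its own transaction and commit it.
    // Assumption: 'active' is deliberately left true to model the stale state.
    clientTxn.txnNumber += 1;
    clientTxn.active = true;
    serverRunCommand({...cmdObj, txnNumber: clientTxn.txnNumber, startTransaction: true});
    return serverRunCommand({commitTransaction: 1, txnNumber: clientTxn.txnNumber});
}

// before() hook: insert is wrapped in txnNumber 0 and committed.
txnRunCommandOverride({insert: "coll", documents: [{_id: 1}]});

// First watch(): the nested feature flag check sends hello through the override.
let observedCode = null;
try {
    txnRunCommandOverride({hello: 1});
} catch (e) {
    observedCode = e.code;
}
console.log(observedCode);
```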
Proposed Fix
The feature flag check in implicit_change_stream_v2.js should bypass the transaction override layer. Options include:
- Cache the feature flag result before any test runs (e.g. in a global setup step), so the nested adminCommand is never issued mid-command-processing
- Use a direct connection (bypassing the session/override chain) for the feature flag lookup
- Make txnRunCommandOverride aware that it is being called from within the processing of another command and skip the abort-on-non-txn-command logic in that case
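As a rough illustration of how the first and third options might look, the sketch below uses toy stand-ins rather than the real override code. runNested, nestedDepth, isFlagEnabled and the sent array are hypothetical names introduced here, and the real overrides would need their own way of sharing the nesting signal. The guard skips the abort while another override is issuing its own command, and the cached result keeps the lookup to a single call.

```javascript
// Toy stand-ins for the real overrides; not the actual jstests code.
const sent = [];  // names of commands that reach the fake server

// Stale client-side state left behind by a committed insert.
const clientTxn = {txnNumber: 0, active: true};

function serverRunCommand(cmdObj) {
    sent.push(Object.keys(cmdObj)[0]);
    return {ok: 1};
}

// Third option: a hypothetical nesting counter that an outer override
// increments while it issues commands of its own.
let nestedDepth = 0;
function runNested(fn) {
    nestedDepth++;
    try {
        return fn();
    } finally {
        nestedDepth--;
    }
}

function txnRunCommandOverride(cmdObj) {
    const name = Object.keys(cmdObj)[0];
    // Only abort for non-transaction commands issued at the top level.
    if (name === "hello" && clientTxn.active && nestedDepth === 0) {
        serverRunCommand({abortTransaction: 1, txnNumber: clientTxn.txnNumber});
        clientTxn.active = false;
    }
    return serverRunCommand(cmdObj);
}

// First option (in part): resolve the flag once and reuse the cached value.
let featureFlagEnabled = null;
function isFlagEnabled() {
    if (featureFlagEnabled === null) {
        const res = runNested(() => txnRunCommandOverride({hello: 1}));
        featureFlagEnabled = res.ok === 1;
    }
    return featureFlagEnabled;
}

isFlagEnabled();  // first watch(): one nested hello, no abort attempted
isFlagEnabled();  // later calls are served from the cache
console.log(sent);
```

Calling isFlagEnabled() from a setup step, before any transaction state exists, would correspond most closely to the first option. The guard alone leaves the stale state in place, so it avoids the failing abort without addressing why the state was stale.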
- depends on: SERVER-52253 Enable feature flag for Improved change stream handling of cluster topology changes (Closed)
- is related to: SERVER-52253 Enable feature flag for Improved change stream handling of cluster topology changes (Closed)