- Type: Bug
- Resolution: Fixed
- Priority: Major - P3
- Affects Version/s: 8.0.4
- Component/s: None
- None
- Cluster Scalability
- Fully Compatible
- ALL
- Cluster Scalability 2025-03-17
In AF-1468, an exception handler in FindAndModifyCmd::_handleWouldChangeOwningShardErrorRetryableWriteLegacy() tries to implicitly abort a transaction that has not actually been started, causing an invariant to fail.
This erroneous exception handler, herein called EEH, is in a specific findAndModify code path. The path is executed when the feature flag gFeatureFlagUpdateDocumentShardKeyUsingTransactionApi is disabled and the findAndModify command is run without a shard key, as a retryable write (that is, not as part of a user-started transaction), with an update that would change the owning shard. Specifically, this code path handles the WouldChangeOwningShard error from the original shard by starting a new internal transaction to delete the document from the original owning shard and insert the modified document into the new owning shard.
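The conditions above can be restated as a small predicate. This is only an illustrative sketch: the `WriteMode` enum, the `CommandState` struct, and `takesLegacyRetryableWritePath()` are hypothetical names introduced here and are not part of the server code.

```cpp
// Hypothetical model of how the command was issued; not a server type.
enum class WriteMode { kPlainWrite, kRetryableWrite, kUserTransaction };

// Hypothetical snapshot of the state that decides which code path runs.
struct CommandState {
    bool transactionApiFlagEnabled;     // models gFeatureFlagUpdateDocumentShardKeyUsingTransactionApi
    WriteMode mode;                     // how the client issued findAndModify
    bool sawWouldChangeOwningShard;     // the original shard returned WouldChangeOwningShard
};

// Returns true only for the combination described above: flag disabled,
// retryable write (so not a user-started transaction), and an update that
// would move the document to a different owning shard.
bool takesLegacyRetryableWritePath(const CommandState& state) {
    return !state.transactionApiFlagEnabled &&
        state.mode == WriteMode::kRetryableWrite &&
        state.sawWouldChangeOwningShard;
}
```

For example, under these assumptions a retryable write with the flag disabled takes this path, while the same command inside a user-started transaction, or with the flag enabled, does not.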
In this code path, two uasserts are checked before the txn number is set in the TransactionRouter; if either fails, the resulting exception is handled by EEH:
- in beginOrContinueTxn() for starting a new transaction during shutdown.
- in _resetRouterStateForStartTransaction() for an incorrect readConcern setting on a command that starts a transaction. In this code path, the transaction being started is an internal transaction, and findAndModify explicitly sets the read concern to local. Therefore, this uassert never fails here and never generates an exception that EEH would catch.
Thus, the crash occurs only when findAndModify tries to start an internal transaction while the mongos is shutting down. A client command cannot cause this crash.
This code path and EEH are not executed when gFeatureFlagUpdateDocumentShardKeyUsingTransactionApi is enabled.
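The failure mode described in this ticket can be reproduced in miniature. The sketch below is a toy model, not server code: `ToyRouter`, `UassertFailure`, `InvariantFailure`, and both `run...` functions are hypothetical names. The real server terminates on an invariant failure; here it is modeled as an exception so the behavior is observable. The guarded handler shows one possible way to avoid the problem and is an assumption, not necessarily the fix that was applied.

```cpp
#include <stdexcept>

// Hypothetical stand-in for a uassert failure (a recoverable error).
struct UassertFailure : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Hypothetical stand-in for an invariant failure. The real server aborts the
// process; throwing keeps this small example observable.
struct InvariantFailure : std::logic_error {
    using std::logic_error::logic_error;
};

// Toy model of a transaction router; not the real TransactionRouter API.
class ToyRouter {
public:
    explicit ToyRouter(bool shuttingDown) : _shuttingDown(shuttingDown) {}

    void beginOrContinueTxn() {
        // The shutdown check runs before any transaction state is recorded.
        if (_shuttingDown) {
            throw UassertFailure("cannot start a new transaction during shutdown");
        }
        _txnStarted = true;
    }

    void implicitlyAbortTransaction() {
        // Aborting requires that a transaction was actually started.
        if (!_txnStarted) {
            throw InvariantFailure("abort requested but no transaction was started");
        }
        _txnStarted = false;
    }

    bool hasStartedTxn() const { return _txnStarted; }

private:
    bool _shuttingDown;
    bool _txnStarted = false;
};

// Models the erroneous handler: it aborts unconditionally, even when
// beginOrContinueTxn() failed before starting anything.
void runWithUnconditionalAbort(ToyRouter& router) {
    try {
        router.beginOrContinueTxn();
        // ... delete from the original shard, insert into the new shard ...
    } catch (const UassertFailure&) {
        router.implicitlyAbortTransaction();  // trips the invariant if nothing started
        throw;
    }
}

// One possible guard (an assumption): abort only if a transaction exists.
void runWithGuardedAbort(ToyRouter& router) {
    try {
        router.beginOrContinueTxn();
    } catch (const UassertFailure&) {
        if (router.hasStartedTxn()) {
            router.implicitlyAbortTransaction();
        }
        throw;
    }
}
```

With a router constructed as shutting down, `runWithUnconditionalAbort()` surfaces the invariant failure, while `runWithGuardedAbort()` lets the original uassert error propagate to the caller.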