SERVER-37344 implemented recoveryToken support for recovering the outcome of a sharded transaction when running commitTransaction on a recovery mongos (i.e., a mongos that has not seen that transaction and doesn't know the coordinator or the participant list).
In the case of aborting the transaction against a recovery mongos, the driver will still include the recoveryToken (SPEC-1279), but there are situations where the recovery token might still not be known, which means parts of the transaction could still remain open for up to the max transaction lifetime, potentially blocking other operations.
Since in such a case neither the participants nor the coordinator might be known (especially with read-only shard optimizations), the only deterministic way of ensuring that the transaction vestiges have been aborted is to broadcast abortTransaction to all shards in the cluster. However, this is not a scalable solution and it also opens the door to a DoS attack, so instead, as part of this ticket, we will do the next best thing:
- Make the graceful MongoS shutdown logic do a best-effort abortTransaction for all in-progress transaction routers. That way we ensure that on maintenance shutdowns we will not leave open transactions.
- Document the cases where in 4.2 we can leave transactions hanging for a minute, and the manual recovery steps that an operator might be able to take if they want to clear that state before the transactions expire. That would be the case where MongoS hard crashes after having started a transaction on a shard, but before any recovery information is returned to the driver.
- Post-4.2.0, figure out a format for the recovery token that contains the set of shards involved in the transaction so far. The issue to consider here is how large that token can get: shard ids are strings, so it is theoretically possible to exceed the BSON max size.
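
To make the token-size concern in the last item concrete, here is a rough sketch that estimates how many bytes a token would take if it carried the shard ids as a BSON array of strings. The field name `shardIds` and the token layout are assumptions made only for illustration; no format has been decided. The byte accounting follows the BSON encoding rules for documents, arrays and strings, and the limit used is the 16 MiB maximum BSON document size.

```python
# Sketch only: the "shardIds" field and this token layout are hypothetical.
BSON_MAX_DOCUMENT_SIZE = 16 * 1024 * 1024  # 16 MiB


def bson_string_array_size(values):
    """Encoded size of a BSON array of strings.

    A BSON array is a document whose keys are "0", "1", "2", ...
    """
    size = 4 + 1  # int32 total length + trailing NUL terminator
    for index, value in enumerate(values):
        key = str(index).encode("utf-8")
        payload = value.encode("utf-8")
        # type byte + key cstring (key + NUL)
        # + int32 string length + string bytes + NUL
        size += 1 + len(key) + 1 + 4 + len(payload) + 1
    return size


def hypothetical_token_size(shard_ids, field_name="shardIds"):
    """Encoded size of {<field_name>: [<shard ids>]} as a BSON document."""
    name = field_name.encode("utf-8")
    # int32 length + type byte + field name cstring + array + NUL terminator
    return 4 + 1 + len(name) + 1 + bson_string_array_size(shard_ids) + 1


def fits_in_bson(shard_ids):
    """True if the hypothetical token alone stays within the BSON limit."""
    return hypothetical_token_size(shard_ids) <= BSON_MAX_DOCUMENT_SIZE
```

The estimate grows linearly with both the number of shards and the length of each shard id, and the token also has to share the limit with the rest of the command it is attached to, so the practical budget is smaller than this check suggests.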