[SERVER-51349] An OpCtx's UncommittedCollections are retained after transaction resource stashing. Created: 05/Oct/20  Updated: 29/Oct/23  Resolved: 07/Oct/20

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: 4.9.0

Type: Task Priority: Major - P3
Reporter: Blake Oler Assignee: Daniel Gottlieb (Inactive)
Resolution: Fixed Votes: 0
Labels: PM-234-M1, PM-234-T-lifecycle
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File invariant.diff    
Issue Links:
Depends
is depended on by SERVER-51209 Fill in missing gaps in Resharding wo... Closed
is depended on by SERVER-51210 Call setInitialChunksAndZones from th... Closed
Related
is related to SERVER-51350 Move collection creation inside a rep... Closed
Backwards Compatibility: Fully Compatible
Sprint: Sharding 2020-10-19
Participants:

 Description   

When a command inside a running transaction returns a response, its (uncommitted) state is stashed in a map on the session, in a structure called TxnResources. This stash includes things like locks, the current storage engine transaction, and any collections the transaction has created.

Stashing the uncommitted collections will create a new shared pointer for the TxnResources to hold onto.

Unstashing the uncommitted collections asserts that the OperationContext does not know of any existing uncommitted collections.

The stash -> unstash cycle works so long as an operation context is destroyed after each "statement". This is the typical case, as each new request over the wire creates a new operation context. However, internal code can send commands directly into the local system. This entry point executes all of the transaction-related stashing logic, but also allows the caller to provide the same operation context for each statement.

The latter case can run into the above assertion when the transaction attempts (explicitly, or implicitly) to create a collection. This ticket is to make the stashing logic appropriately clean up the UncommittedCollections associated with an OpCtx.
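
The failure mode and the intended cleanup can be illustrated with a small standalone model. This is only a sketch under stated assumptions: the types below are toy stand-ins that borrow names from the ticket, while `stash`, `unstash`, `releaseOnStash`, and `reuseOpCtxAcrossStatements` are hypothetical helpers invented for illustration and are not the server's actual API. The unstash precondition is modeled as a boolean return value rather than a real invariant.

```cpp
#include <cassert>
#include <memory>
#include <set>
#include <string>
#include <utility>

// Toy stand-in: the set of collection names a transaction has created so far.
using UncommittedCollections = std::set<std::string>;

// Toy stand-in for an operation context (not the real mongo::OperationContext).
struct OperationContext {
    std::shared_ptr<UncommittedCollections> uncommitted =
        std::make_shared<UncommittedCollections>();
};

// Toy stand-in for the per-session stash of transaction state.
struct TxnResources {
    std::shared_ptr<UncommittedCollections> uncommitted;
};

// Stash at the end of a statement. The stash holds its own shared pointer.
// With releaseOnStash == false the opCtx also keeps pointing at the same
// collections, which is the retained state this ticket describes.
TxnResources stash(OperationContext& opCtx, bool releaseOnStash) {
    TxnResources res{opCtx.uncommitted};
    if (releaseOnStash) {
        // Hypothetical fix: the opCtx drops its reference when stashing.
        opCtx.uncommitted = std::make_shared<UncommittedCollections>();
    }
    return res;
}

// Unstash at the start of the next statement. Returns false in the situation
// where the real code would hit its assertion: the opCtx already knows of
// uncommitted collections.
bool unstash(OperationContext& opCtx, TxnResources res) {
    if (!opCtx.uncommitted->empty()) {
        return false;
    }
    opCtx.uncommitted = std::move(res.uncommitted);
    return true;
}

// Runs two statements of one transaction on the SAME opCtx, the way an
// internal caller can. Returns whether the second statement could unstash.
bool reuseOpCtxAcrossStatements(bool releaseOnStash) {
    OperationContext opCtx;
    // Statement 1 creates a collection inside the transaction.
    opCtx.uncommitted->insert("config.reshardingOperations");
    TxnResources stashed = stash(opCtx, releaseOnStash);
    // Statement 2 begins on the same opCtx.
    return unstash(opCtx, std::move(stashed));
}
```

In this model, `reuseOpCtxAcrossStatements(false)` fails the unstash precondition because the reused operation context still references the collection created by the first statement, while `reuseOpCtxAcrossStatements(true)` succeeds because the stash step released it.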

Original Description
Assumptions exist for creating collections in replica set transactions that are outside of the scope of completion for Resharding's Milestone 1. In order to avoid spending too much time trying to debug collection creation in transactions, collection creation at the beginning of a resharding operation must be done outside of a transaction.

Right now, this includes creating the config.reshardingOperations collection before attempting to insert to that collection here.

Whoever works on this ticket must also find other places where collections are written to in transactions as part of resharding, and assess whether an operation could lead to implicit collection creation. If so, the resharding collection must be optimistically created outside of a transaction.

For questions about how to implement this, and for code review, refer to Dan Gottlieb.



 Comments   
Comment by Githook User [ 07/Oct/20 ]

Author:

{'name': 'Daniel Gottlieb', 'email': 'daniel.gottlieb@mongodb.com', 'username': 'dgottlieb'}

Message: SERVER-51349: Have UncommittedCollections release its resources when stashing to TxnResources.
Branch: master
https://github.com/mongodb/mongo/commit/9616997f36141394f54e793f9e502ebd55570cc4

Comment by Blake Oler [ 05/Oct/20 ]

Here is a patch that for now triggers the invariant that led to us discovering this issue. In this patch is:

To get the invariant, run this jstest. If you uncomment the code, you will bypass this invariant, which currently leads to an unrelated stacktrace involving shard_id.h. invariant.diff

Generated at Thu Feb 08 05:25:14 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.