[SERVER-83537] Investigate removal of side transaction for multikey Created: 22/Nov/23 Updated: 20/Dec/23 Resolved: 20/Dec/23 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Jordi Olivares Provencio | Assignee: | Jordi Olivares Provencio |
| Resolution: | Won't Fix | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||
| Assigned Teams: |
Catalog and Routing
|
||||
| Sprint: | CAR Team 2023-11-27, CAR Team 2023-12-11, CAR Team 2023-12-25 | ||||
| Participants: | |||||
| Linked BF Score: | 3 | ||||
| Story Points: | 3 | ||||
| Description |
|
During recovery of a node, we replay the oplog to reconstruct the node's state. As part of this, we reconstruct the prepared transactions so that they can be committed later. If at this point the transaction needs to set the multikey flag to true, we might encounter an error with WiredTiger (WT) in the following case:
Ideally, we should avoid the side transaction and just accumulate the change in the original recovery unit. This would lead to prepare conflicts until the operation commits, which is what currently happens on the primary. On secondaries, however, this operation isn't symmetric, as explained by [...]. We should investigate whether it is safe to remove the side transaction, or whether the write can be done without risking data inconsistencies. |
| Comments |
| Comment by Jordi Olivares Provencio [ 20/Dec/23 ] |
|
As a side transaction is deemed necessary for secondary replication, the next best thing we could do is modify the side transaction to actually force the index to become multikey. However, in the absence of a new dedicated command or a new collMod option, the most compatible approach that doesn't imply an FCV change is to set multikey implicitly by inserting and then deleting a special document that enables the flag. This would functionally be a no-op, but it would trigger the multikey flag setting during oplog replication/recovery. The problem with this approach is that we can't generate such a document with confidence that it won't cause issues. If the index is specified with {{ {unique: true}}}, we must synthetically produce a value that doesn't exist anywhere in the collection; failing to do so would cause secondary replication to fail. Since we want to refactor multikey in the future so that it is no longer as special, with explicit replication via the oplog, I'm closing this ticket as Won't Fix. Fixing this would require a very large effort that would have to be scrapped altogether when we refactor multikey. |
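The insert-then-delete idea and its unique-index problem can be illustrated with a toy model. This is only a sketch, not server code: `ToyIndex`, `force_multikey`, and `DuplicateKey` are hypothetical names, and the model assumes the multikey flag is set when an array value is indexed and is not cleared when that document is deleted.

```python
class DuplicateKey(Exception):
    """Raised when a unique toy index already holds one of the keys."""


class ToyIndex:
    """Hypothetical single-field index; not MongoDB's implementation."""

    def __init__(self, unique=False):
        self.unique = unique
        self.multikey = False  # assumed sticky: set on array insert, never cleared
        self.entries = {}      # index key -> set of document ids

    @staticmethod
    def _keys(value):
        # An array value produces one index key per distinct element.
        return list(dict.fromkeys(value)) if isinstance(value, list) else [value]

    def insert(self, doc_id, value):
        keys = self._keys(value)
        if self.unique and any(self.entries.get(k) for k in keys):
            raise DuplicateKey(keys)
        for k in keys:
            self.entries.setdefault(k, set()).add(doc_id)
        if isinstance(value, list):
            self.multikey = True

    def delete(self, doc_id, value):
        for k in self._keys(value):
            self.entries[k].discard(doc_id)
            if not self.entries[k]:
                del self.entries[k]


def force_multikey(index, sentinel_array):
    """Insert and immediately delete a sentinel document with an array value."""
    index.insert("__sentinel__", sentinel_array)
    index.delete("__sentinel__", sentinel_array)


# Non-unique index: the data is unchanged, but the flag is now set.
plain = ToyIndex()
plain.insert("d1", 5)
force_multikey(plain, [0, 1])

# Unique index: the sentinel collides with an existing value and the insert fails.
unique = ToyIndex(unique=True)
unique.insert("d1", 1)
try:
    force_multikey(unique, [0, 1])
    collided = False
except DuplicateKey:
    collided = True
```

In the non-unique case the index entries end up exactly as they were and only the flag changes. In the unique case the sentinel has to avoid every value already present, which is the guarantee the comment above says cannot be given with confidence.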
| Comment by Jordi Olivares Provencio [ 13/Dec/23 ] |
|
Not using the side transaction isn't as safe as we would wish. Suppose a multi-document transaction has prepared and changed the multikey metadata. Any later operation that does an insert and modifies the multikey metadata will get a prepare conflict until the transaction commits. This effectively deadlocks the server. Note that this is an issue for secondary replication: the secondary would prepare the transaction and then block the applier thread, so replication would stop making forward progress. |
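The stall described in this comment can be sketched with a small in-order applier. This is a toy model under stated assumptions, not the server's replication code: `ToyStorage`, `apply_in_order`, the `MULTIKEY` key, and the oplog entry shapes are all hypothetical, and the commit is assumed to arrive as a later entry than the conflicting insert. A real prepare conflict makes the operation wait; the toy simply reports where the applier stops.

```python
MULTIKEY = "index_a.multikey"  # hypothetical key standing in for the multikey metadata


class PrepareConflict(Exception):
    """Raised when a write touches a key held by a prepared transaction."""


class ToyStorage:
    def __init__(self):
        self.data = {}
        self.prepared = {}  # key -> (txn_id, value) for prepared, uncommitted writes

    def prepare(self, txn_id, writes):
        for key, value in writes.items():
            self.prepared[key] = (txn_id, value)

    def commit(self, txn_id):
        for key in [k for k, (t, _) in self.prepared.items() if t == txn_id]:
            self.data[key] = self.prepared.pop(key)[1]

    def write(self, writes):
        # Check every key first so a conflicting write leaves no partial state.
        for key in writes:
            if key in self.prepared:
                raise PrepareConflict(key)
        self.data.update(writes)


def apply_in_order(oplog):
    """Apply entries strictly in order; return (applied count, blocked entry or None)."""
    storage = ToyStorage()
    for applied, entry in enumerate(oplog):
        try:
            if entry["op"] == "prepare":
                storage.prepare(entry["txn"], entry["writes"])
            elif entry["op"] == "commit":
                storage.commit(entry["txn"])
            else:
                storage.write(entry["writes"])
        except PrepareConflict:
            return applied, entry
    return len(oplog), None


# Multikey change accumulated inside the prepared transaction itself.
without_side_txn = [
    {"op": "prepare", "txn": 1, "writes": {"doc1": [1, 2], MULTIKEY: True}},
    {"op": "insert", "writes": {"doc2": [3, 4], MULTIKEY: True}},
    {"op": "commit", "txn": 1},
]

# Multikey change written and committed separately before the prepare.
with_side_txn = [
    {"op": "side_write", "writes": {MULTIKEY: True}},
    {"op": "prepare", "txn": 1, "writes": {"doc1": [1, 2]}},
    {"op": "insert", "writes": {"doc2": [3, 4], MULTIKEY: True}},
    {"op": "commit", "txn": 1},
]

stalled = apply_in_order(without_side_txn)
progressed = apply_in_order(with_side_txn)
```

In the first log the applier stops at the insert and never reaches the commit that would release the conflict. In the second log the multikey write is not held by the prepared transaction, so every entry applies.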