Details
Bug
Resolution: Won't Fix
Major - P3
Catalog and Routing
CAR Team 2023-11-27, CAR Team 2023-12-11, CAR Team 2023-12-25
Description
During recovery of a node, we replay the oplog to reconstruct the node's state. As part of this, we reconstruct prepared transactions so that they can be committed later.
If one of these transactions needs to set the multikey flag to true, we may hit an error from WiredTiger (WT) in the following case:
- The internal durable timestamp for the catalog page advances to T=2
- The prepared transaction at time T=1 attempts to set the multikey flag using a side transaction
- The side transaction fails to commit: WT returns an error to avoid data inconsistencies, because the write goes back in time and could invalidate every value from that point onwards, since readers may already have seen a stale value.
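The sequence above can be sketched with a toy model. This is not WiredTiger's API: `CatalogPage`, `StaleTimestampError`, and `replay_scenario` are hypothetical names, and the check shown (reject any commit older than the page's durable timestamp) is an assumed simplification of whatever WT actually enforces.

```python
class StaleTimestampError(Exception):
    """Raised when a commit would land behind the page's durable timestamp."""


class CatalogPage:
    """Toy stand-in for a storage page that tracks a durable timestamp."""

    def __init__(self):
        self.durable_ts = 0
        self.multikey = False

    def commit(self, ts, multikey=None):
        # Assumed rule: a write older than what is already durable is
        # rejected, because readers at or after durable_ts may already
        # have observed the old value.
        if ts < self.durable_ts:
            raise StaleTimestampError(
                f"commit at T={ts} is behind durable T={self.durable_ts}"
            )
        self.durable_ts = ts
        if multikey is not None:
            self.multikey = multikey


def replay_scenario():
    page = CatalogPage()
    # Step 1: the durable timestamp for the catalog page advances to T=2.
    page.commit(ts=2)
    try:
        # Steps 2-3: the prepared transaction at T=1 sets the multikey
        # flag through a side transaction, which then tries to commit.
        page.commit(ts=1, multikey=True)
    except StaleTimestampError:
        return "side transaction rejected", page.multikey
    return "side transaction committed", page.multikey
```

In this model the flag stays unset after the rejected commit, which is the state the recovering node is left in.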
Ideally, we should avoid the side transaction and accumulate the change in the original recovery unit instead. This would cause prepare conflicts until the operation commits, which is how the primary behaves today. On secondaries, however, the operation is not symmetric, as explained in SERVER-41766.
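A minimal sketch of the alternative, assuming a toy in-memory store: the multikey write is buffered inside the prepared transaction itself, so readers of that key hit a prepare conflict until the transaction commits. `Store`, `PreparedTxn`, `PrepareConflict`, and `demo` are hypothetical names for illustration and do not correspond to server or WT classes.

```python
class PrepareConflict(Exception):
    """Raised when a read touches a key written by a prepared transaction."""


class Store:
    def __init__(self):
        self.data = {}
        self.prepared = []

    def read(self, key):
        # Any prepared, uncommitted write to this key blocks the read.
        for txn in self.prepared:
            if key in txn.pending:
                raise PrepareConflict(key)
        return self.data.get(key)


class PreparedTxn:
    def __init__(self, store, prepare_ts):
        self.store = store
        self.prepare_ts = prepare_ts
        self.pending = {}
        store.prepared.append(self)

    def write(self, key, value):
        # Accumulate the change in this transaction; no side transaction.
        self.pending[key] = value

    def commit(self, commit_ts):
        if commit_ts < self.prepare_ts:
            raise ValueError("commit timestamp precedes prepare timestamp")
        self.store.data.update(self.pending)
        self.store.prepared.remove(self)


def demo():
    store = Store()
    store.data["multikey"] = False
    txn = PreparedTxn(store, prepare_ts=1)
    txn.write("multikey", True)
    try:
        store.read("multikey")
        conflicted = False
    except PrepareConflict:
        conflicted = True
    txn.commit(commit_ts=3)
    return conflicted, store.read("multikey")
```

The sketch does not model the primary/secondary asymmetry described in SERVER-41766; it only shows why the flag becomes visible at commit time rather than earlier.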
We should investigate whether it is safe to remove the side transaction, or whether the back-in-time write can be performed without risking data inconsistencies.