- Type: Bug
- Resolution: Unresolved
- Priority: Major - P3
- None
- Affects Version/s: None
- Component/s: None
- None
- Replication
- ALL
- v8.0
- Repl 2025-01-20, Repl 2025-02-03
- 200
In BF-35520 there was a case where a node went into rollback. The node had these oplog entries (part of a transactionally replicated vectored insert):
1. timestamp: (1730013230, 1), prevOpTime: (0, 0)
2. timestamp: (1730013230, 2), prevOpTime: (1730013230, 1)
3. timestamp: (1730013230, 3), prevOpTime: (1730013230, 2)
The node is rolling back to the stable timestamp, which is (1730013230, 1). In _restoreTxnsTableEntryFromRetryableWrites we scan the oplog (before truncation) with this filter, looking for retryable write entries (entries that have txnNumber and stmtId as top-level fields) whose timestamp is after the stable timestamp but whose prevOpTime is <= the stable timestamp. For each match, we restore the txn table entry based on that info.
We should expect entry 2 to match the filter and restore the txn table entry. However, SPM-3381 changed the oplog entry format so that inserts are batched within an applyOps entry, and the stmtId field is now nested inside the applyOps. As a result, none of the oplog entries match the filter, and we skip the step that restores the transactions table.
This results in a data inconsistency where one of the nodes does not have the correct config.transactions doc.
This only happens when a secondary (not the primary) goes into rollback. When the secondary uses WT recover to stable timestamp, the config.transactions table is not correct; we then skip the step to restore the transactions table, so the node ends rollback without the correct config.transactions table.
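The filter mismatch described above can be illustrated with a minimal Python sketch. It is not the server code: the dictionary shapes, the field name `ts`, the `txnNumber` value, and the helper names `matches_restore_filter` and `make_chain` are assumptions made for illustration. Only the three timestamps, the stable timestamp, and the top-level versus nested placement of `stmtId` come from the description.

```python
from typing import Any

# (seconds, increment); Python tuples compare lexicographically.
Timestamp = tuple[int, int]

STABLE_TS: Timestamp = (1730013230, 1)


def matches_restore_filter(entry: dict[str, Any], stable_ts: Timestamp) -> bool:
    """Simplified stand-in for the rollback filter (hypothetical helper).

    An entry matches if it has top-level txnNumber and stmtId, a timestamp
    after the stable timestamp, and a prevOpTime at or before it.
    """
    return (
        "txnNumber" in entry
        and "stmtId" in entry
        and entry["ts"] > stable_ts
        and entry["prevOpTime"] <= stable_ts
    )


def make_chain(batched: bool) -> list[dict[str, Any]]:
    """Build the three-entry chain from the ticket in either format.

    batched=False: stmtId is a top-level field (assumed older shape).
    batched=True:  stmtId is nested inside o.applyOps (assumed newer shape).
    """
    chain = []
    prev: Timestamp = (0, 0)
    for i in (1, 2, 3):
        ts = (1730013230, i)
        entry: dict[str, Any] = {"ts": ts, "prevOpTime": prev, "txnNumber": 5}
        if batched:
            entry["op"] = "c"
            entry["o"] = {"applyOps": [{"op": "i", "stmtId": i}]}
        else:
            entry["op"] = "i"
            entry["stmtId"] = i
        chain.append(entry)
        prev = ts
    return chain


old_matches = [e for e in make_chain(batched=False) if matches_restore_filter(e, STABLE_TS)]
new_matches = [e for e in make_chain(batched=True) if matches_restore_filter(e, STABLE_TS)]

print([e["ts"] for e in old_matches])
print(new_matches)
```

With top-level `stmtId`, only entry 2 straddles the stable timestamp and is selected. With `stmtId` nested under `applyOps`, the same predicate selects nothing, which corresponds to the skipped restore step.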
- is related to: SERVER-55305 Retryable write may execute more than once if primary had transitioned through rollback to stable (Closed)