[SERVER-32445] config.transactions table can get out of sync when the TransactionReaper remove entries Created: 21/Dec/17 Updated: 30/Oct/23 Resolved: 12/Mar/18 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | 3.7.1 |
| Fix Version/s: | 3.6.4, 3.7.3 |
| Type: | Task | Priority: | Major - P3 |
| Reporter: | Randolph Tan | Assignee: | Randolph Tan |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||||||||||
| Backwards Compatibility: | Fully Compatible | ||||||||||||||||
| Backport Requested: |
v3.6
|
||||||||||||||||
| Sprint: | Sharding 2018-02-12, Sharding 2018-02-26, Sharding 2018-03-12, Sharding 2018-03-26 | ||||||||||||||||
| Participants: | |||||||||||||||||
| Linked BF Score: | 0 | ||||||||||||||||
| Description |
|
Background: Updates to the config.transactions table don't generate oplog entries and are replicated differently in the secondaries. The primary store all the relevant information in the oplog of the write that would update the config.transactions table and the secondaries reconstruct the table from this. Because of how oplog application is parallelized, the order it gets applied cannot be guaranteed. Fortunately, there is a simple rule that is used: higher transaction number wins, and if tied, higher lastWriteOpTime wins. So as an optimization, the secondary simply squash all changes to the same session to a single update and apply them at the end of the batch. So the issue is when someone (like the TransactionReaper) deletes an entry in config.transactions, it will generate an oplog entry for the delete. When the secondary applies this delete oplog, the transaction is correctly deleted. But if there are updates on the transactions table for the same oplog batch, then it can "revive" back again, creating an orphan entry and making it inconsistent with the current primary. Note: that likelihood of this happening is low since the reaper only cleans up entries that are not active for more than 30 min. |
| Comments |
| Comment by Githook User [ 15/Mar/18 ] |
|
Author: {'email': 'randolph@10gen.com', 'name': 'Randolph Tan', 'username': 'renctan'}Message: Secondary replication of config.transactions table is now changed to create oplog entries of the actual updates to mirror what the primary would have done. |
| Comment by Githook User [ 12/Mar/18 ] |
|
Author: {'email': 'randolph@10gen.com', 'name': 'Randolph Tan', 'username': 'renctan'}Message: Secondary replication of config.transactions table is now changed to create oplog entries of the actual updates to mirror what the primary would have done. |