[SERVER-41163] During initial sync, failing to apply an update that's in a prepared txn hits an invariant Created: 15/May/19 Updated: 29/Oct/23 Resolved: 19/Jun/19 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | 4.2.0-rc3 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Vlad Rachev (Inactive) | Assignee: | Pavithra Vetriselvan |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | isfz, prepare_initial_sync |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Attachments: | |
| Issue Links: | |
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Backport Requested: | v4.2 |
| Steps To Reproduce: | I've attached the test. Here are the logs. |
| Sprint: | Repl 2019-06-03, Repl 2019-06-17, Repl 2019-07-01 |
| Participants: | |
| Linked BF Score: | 8 |
| Description |
|
When an update cannot be applied during initial sync because its target document is missing, we fetch the missing document, using the oplog entry to determine which document to retrieve. However, when the update is part of a prepared transaction, the oplog entry being applied is the commitTransaction command. Currently, we assert that the oplog entry can only be an insert, delete, or update, so we hit this invariant when we try to retrieve the missing document. |
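The failure mode described above can be illustrated with a minimal sketch. It is a simplified model, not the server code: oplog entries are plain dicts, and `missing_doc_id` and `CRUD_OPS` are hypothetical names chosen for illustration. The only details taken from the oplog format are the `op` codes (`i`, `u`, `d`, `c`) and the fact that updates carry their target `_id` in `o2`.

```python
# Simplified model of the failure; dict-based oplog entries and all names are hypothetical.
CRUD_OPS = {"i", "u", "d"}  # insert, update, delete


def missing_doc_id(oplog_entry):
    """Return the _id of the document an entry targets, as a fetch-missing-doc step might."""
    # Stand-in for the invariant: only insert, update, or delete entries are expected here.
    if oplog_entry["op"] not in CRUD_OPS:
        raise AssertionError("expected insert, update, or delete, got op=%r" % oplog_entry["op"])
    # Updates carry the target _id in 'o2'; inserts and deletes carry it in 'o'.
    source = oplog_entry["o2"] if oplog_entry["op"] == "u" else oplog_entry["o"]
    return source["_id"]


plain_update = {"op": "u", "ns": "test.coll", "o2": {"_id": 1}, "o": {"$set": {"x": 2}}}
commit_entry = {"op": "c", "ns": "admin.$cmd", "o": {"commitTransaction": 1}}

# A plain update names its target document directly.
print(missing_doc_id(plain_update))

# A commitTransaction command is not a CRUD entry, so the check fires.
try:
    missing_doc_id(commit_entry)
except AssertionError as exc:
    print("invariant hit:", exc)
```

The commit entry itself says nothing about which document the failed update touched, which is why the lookup cannot proceed from it.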
| Comments |
| Comment by Githook User [ 27/Jun/19 ] |
|
Author: {'name': 'Pavi Vetriselvan', 'email': 'pvselvan@umich.edu', 'username': 'pvselvan'} Message: (cherry picked from commit d3c0e4ad46fcba5aac61ecec1409e9df6e11f66e) |
| Comment by Githook User [ 19/Jun/19 ] |
|
Author: {'name': 'Pavi Vetriselvan', 'email': 'pvselvan@umich.edu', 'username': 'pvselvan'} Message: |
| Comment by Judah Schvimer [ 17/May/19 ] |
|
prepareTransaction oplog entries need to be applied in their own batch in a single write unit of work, so it doesn't make sense to extract and parallelize them. commitTransaction oplog entries for prepared transactions during steady state replication don't need to apply the operations, since they're already in "prepare". In initial sync, when we apply commitTransaction without going through prepare, we can do "extract and parallelize" safely, though.

In steady state replication it is also important to timestamp the transaction writes with the commitTimestamp, and not the commitTransaction oplog entry timestamp, which could be easy to mess up in the "extract and parallelize" approach. While initial sync doesn't really care about timestamping, we should still make a minimal effort to get it right (or document that we're not timestamping correctly during initial sync) in case we do ever care. We could also consider doing "extract and parallelize" for commitTransaction oplog entries in replication recovery, but it would be complicated by

I think in creating the implementation plan it was seen as a shortcut to not "extract and parallelize" commitTransaction oplog entries during initial sync and replication recovery, and then we never got around to optimizing it. Little did we know it wasn't just an optimization. This functionality was implemented in |
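The timestamping concern in the comment above can be sketched as follows. This is a simplified model rather than the server implementation: entries are dicts, timestamps are `(seconds, increment)` tuples, and `extract_with_commit_timestamp` is a hypothetical helper. It assumes the commit command carries a `commitTimestamp` field and the prepare entry carries the operations in an `applyOps` array.

```python
# Simplified model; timestamps are (seconds, increment) tuples and all names are hypothetical.


def extract_with_commit_timestamp(prepare_entry, commit_entry):
    """Unpack a prepared transaction's operations, stamping each with the commitTimestamp."""
    # Use the commitTimestamp carried inside the command, not the commit entry's own 'ts'.
    commit_ts = commit_entry["o"]["commitTimestamp"]
    # Copy each operation so the original prepare entry is left unmodified.
    return [dict(op, ts=commit_ts) for op in prepare_entry["o"]["applyOps"]]


prepare_entry = {
    "op": "c",
    "ts": (100, 1),
    "o": {
        "applyOps": [
            {"op": "u", "ns": "test.coll", "o2": {"_id": 1}, "o": {"$set": {"x": 2}}},
            {"op": "i", "ns": "test.coll", "o": {"_id": 2, "x": 0}},
        ],
        "prepare": True,
    },
}
commit_entry = {
    "op": "c",
    "ts": (105, 1),
    "o": {"commitTransaction": 1, "commitTimestamp": (103, 1)},
}

for op in extract_with_commit_timestamp(prepare_entry, commit_entry):
    print(op["op"], op["ts"])
```

The easy mistake in an "extract and parallelize" path would be to reuse `commit_entry["ts"]` for the extracted writes; the sketch reads the command's `commitTimestamp` instead.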
| Comment by Pavithra Vetriselvan [ 16/May/19 ] |
|
judah.schvimer, I agree! If I understand correctly, the functionality to extract applyOps operations from commit oplog entries exists here. It seems like we can make this work for commit oplog entries that are part of prepared transactions by modifying this function. Is there a reason we chose not to do this for prepare? |
| Comment by Judah Schvimer [ 15/May/19 ] |
|
I think we should be unpacking these commitTransaction oplog entries during initial sync into their own writer threads. In that case, the writer threads will see the entry as an update and not as a commitTransaction. This would also be more performant. We currently apply commitTransaction oplog entries for prepared transactions in their own batch. |
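The unpacking proposed in this comment might look roughly like the sketch below. It is an illustration under stated assumptions, not the server's applier: `target_id` and `assign_to_writers` are hypothetical helpers, entries are plain dicts, and hashing namespace plus `_id` with CRC32 is a stand-in for whatever scheme the real writer pool uses. The point is only that each writer receives ordinary insert, update, or delete entries, so a missing-document lookup sees an update rather than a commitTransaction command.

```python
import zlib

# Simplified model of "extract and parallelize"; all names here are hypothetical.


def target_id(op):
    """Return the _id an extracted CRUD operation targets."""
    # Updates carry the target _id in 'o2'; inserts and deletes carry it in 'o'.
    return op["o2"]["_id"] if op["op"] == "u" else op["o"]["_id"]


def assign_to_writers(operations, num_writers):
    """Distribute extracted operations across writers, keeping per-document order."""
    buckets = [[] for _ in range(num_writers)]
    for op in operations:
        # Same namespace and _id always map to the same writer, so operations on
        # one document stay in their original order within a single bucket.
        key = f'{op["ns"]}:{target_id(op)!r}'.encode()
        buckets[zlib.crc32(key) % num_writers].append(op)
    return buckets


# Operations as they would appear after unpacking a prepared transaction's applyOps array.
prepared_ops = [
    {"op": "i", "ns": "test.coll", "o": {"_id": 1, "x": 0}},
    {"op": "u", "ns": "test.coll", "o2": {"_id": 1}, "o": {"$set": {"x": 1}}},
    {"op": "u", "ns": "test.coll", "o2": {"_id": 2}, "o": {"$set": {"y": 5}}},
    {"op": "d", "ns": "test.coll", "o": {"_id": 3}},
]

buckets = assign_to_writers(prepared_ops, num_writers=4)
for index, bucket in enumerate(buckets):
    print(index, [(op["op"], target_id(op)) for op in bucket])
```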