[SERVER-38302] Committing or aborting prepared transactions may fail to un-pin stable timestamp Created: 28/Nov/18 Updated: 29/Oct/23 Resolved: 17/Jan/19 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | None |
| Fix Version/s: | 4.1.8 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Jack Mulrow | Assignee: | Pavithra Vetriselvan |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | todo_in_code |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: |
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Sprint: | Repl 2018-12-17, Repl 2019-01-14 |
| Participants: | |
| Linked BF Score: | 62 |
| Description |
|
When calculating the stable timestamp, the replication coordinator does not allow it to advance beyond the oplog entry timestamp of the oldest prepared transaction that has not yet committed or aborted. This timestamp is maintained in the ServerTransactionMetrics service context decoration, which is updated through the TransactionParticipant's TransactionMetricsObserver object when a transaction prepares, commits, or aborts. It looks like the metrics are updated after the commit oplog entry is written in a side transaction, so if the commit point advances to include the commit's opTime before the transaction metrics are updated, a recalculation of the stable timestamp will be triggered, but the replication coordinator will not know it can advance the stable timestamp to include the commit oplog entry. The next operation to trigger a calculation of the stable timestamp will be able to advance it to include the commit, but if nothing triggers a new calculation, anything waiting for a new committed snapshot will hang. This seems possible when aborting a prepared transaction as well. A similar problem exists with prepare, because we also commit the prepare oplog entry before updating the oldest prepared timestamp in the transaction metrics, although this should lead to incorrectly advancing the stable timestamp instead of incorrectly holding it back. |
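The ordering problem described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the server's code: `ToyTransactionMetrics`, `ToyReplCoordinator`, and both `simulate_*` functions are hypothetical names, timestamps are plain integers, and the stable timestamp is modeled as the commit point capped at the oldest prepare timestamp still tracked by the metrics.

```python
class ToyTransactionMetrics:
    """Hypothetical stand-in for ServerTransactionMetrics: tracks the
    prepare timestamps of transactions that are still prepared."""

    def __init__(self):
        self._prepare_timestamps = set()

    def on_prepare(self, prepare_ts):
        self._prepare_timestamps.add(prepare_ts)

    def on_finish(self, prepare_ts):
        # Called on commit or abort of a prepared transaction.
        self._prepare_timestamps.discard(prepare_ts)

    def oldest_prepared(self):
        return min(self._prepare_timestamps, default=None)


class ToyReplCoordinator:
    """Hypothetical model: the stable timestamp follows the commit point
    but may not pass the oldest still-prepared transaction (assumption)."""

    def __init__(self, metrics):
        self.metrics = metrics
        self.commit_point = 0
        self.stable = 0

    def recalculate_stable(self):
        candidate = self.commit_point
        oldest = self.metrics.oldest_prepared()
        if oldest is not None:
            candidate = min(candidate, oldest)
        # The stable timestamp never moves backwards.
        self.stable = max(self.stable, candidate)

    def advance_commit_point(self, ts):
        self.commit_point = max(self.commit_point, ts)
        self.recalculate_stable()


def simulate_commit_race():
    metrics = ToyTransactionMetrics()
    coord = ToyReplCoordinator(metrics)
    metrics.on_prepare(5)
    coord.advance_commit_point(5)
    # The commit oplog entry (ts 10) is written and majority committed
    # before the metrics learn that the transaction finished.
    coord.advance_commit_point(10)
    pinned = coord.stable
    metrics.on_finish(5)          # metrics updated, nothing recalculates
    still_pinned = coord.stable
    coord.recalculate_stable()    # only a later trigger un-pins it
    return pinned, still_pinned, coord.stable


def simulate_prepare_race():
    metrics = ToyTransactionMetrics()
    coord = ToyReplCoordinator(metrics)
    # The prepare oplog entry (ts 5) is written, but the commit point
    # moves to 7 before the metrics record the prepare.
    coord.advance_commit_point(7)
    metrics.on_prepare(5)
    return coord.stable
```

In this model `simulate_commit_race()` shows the stable timestamp staying at the prepare timestamp until an unrelated recalculation happens, and `simulate_prepare_race()` shows the stable timestamp moving past a prepare timestamp that the metrics had not yet recorded.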
| Comments |
| Comment by Githook User [ 16/Jan/19 ] |
|
Author: {'username': 'pvselvan', 'email': 'pvselvan@umich.edu', 'name': 'Pavi Vetriselvan'}Message: |
| Comment by Pavithra Vetriselvan [ 11/Jan/19 ] |
|
The original solution of calling into the MetricsObserver before opObserver causes BF-11787 to occur intermittently. In order to move the stable timestamp, we have to make sure the finishOpTime is recorded AND that the oplog entry is written. Since the root of the problem is that nothing triggers a new calculation of the stable timestamp, I think we should explicitly calculate it every time we update the metrics with a finishOpTime. This might be an expensive operation, but since pinning back the stable timestamp is temporary, we can remove this once we are allowed to commit/abort behind the stable timestamp. |
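The proposal in this comment can be sketched with a toy model in which recording a finishOpTime explicitly triggers a stable timestamp recalculation. Everything here is an assumption made for illustration: `PreparedTxnTracker`, `StableTimestampModel`, `on_finish_hook`, and `run_commit_with_explicit_recalc` are hypothetical names, timestamps are plain integers, and the cap on the stable timestamp is modeled as the oldest tracked prepare timestamp.

```python
class PreparedTxnTracker:
    """Hypothetical stand-in for the transaction metrics."""

    def __init__(self):
        self._prepare_timestamps = set()
        self.on_finish_hook = None  # called with the finish opTime

    def on_prepare(self, prepare_ts):
        self._prepare_timestamps.add(prepare_ts)

    def on_finish(self, prepare_ts, finish_op_time):
        # Assumed to run after the commit/abort oplog entry is written.
        self._prepare_timestamps.discard(prepare_ts)
        if self.on_finish_hook is not None:
            self.on_finish_hook(finish_op_time)

    def oldest_prepared(self):
        return min(self._prepare_timestamps, default=None)


class StableTimestampModel:
    """Hypothetical model of the stable timestamp calculation."""

    def __init__(self, tracker):
        self.tracker = tracker
        self.commit_point = 0
        self.stable = 0

    def recalculate_stable(self):
        candidate = self.commit_point
        oldest = self.tracker.oldest_prepared()
        if oldest is not None:
            candidate = min(candidate, oldest)
        self.stable = max(self.stable, candidate)

    def advance_commit_point(self, ts):
        self.commit_point = max(self.commit_point, ts)
        self.recalculate_stable()


def run_commit_with_explicit_recalc():
    tracker = PreparedTxnTracker()
    model = StableTimestampModel(tracker)
    # Recalculate every time a finishOpTime is recorded.
    tracker.on_finish_hook = lambda _finish_op_time: model.recalculate_stable()

    tracker.on_prepare(5)
    model.advance_commit_point(5)
    # Commit oplog entry at ts 10 is written and majority committed first.
    model.advance_commit_point(10)
    before = model.stable
    tracker.on_finish(5, 10)  # records finishOpTime and recalculates
    return before, model.stable
```

With the hook installed, the stable timestamp in this model advances as soon as the finish is recorded, without waiting for an unrelated operation to trigger a recalculation. The extra recalculation on every commit or abort is the cost the comment refers to.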
| Comment by Githook User [ 11/Jan/19 ] |
|
Author: {'username': 'pvselvan', 'email': 'pvselvan@umich.edu', 'name': 'Pavi Vetriselvan'}Message: |
| Comment by Githook User [ 11/Jan/19 ] |
|
Author: {'username': 'pvselvan', 'email': 'pvselvan@umich.edu', 'name': 'Pavi Vetriselvan'}Message: Revert " This reverts commit 011d0d1a5d1517f7e8f6df0ce35412e1bf256afe. |
| Comment by Githook User [ 08/Jan/19 ] |
|
Author: {'username': 'pvselvan', 'email': 'pvselvan@umich.edu', 'name': 'Pavi Vetriselvan'}Message: |
| Comment by Siyuan Zhou [ 30/Nov/18 ] |
|
In both cases of prepare and commit, I think we should call into MetricsObserver before opObserver. |