[SERVER-36811] Provide a mechanism for replication to specify the 'maximum_truncation_timestamp' for a given 'stable_timestamp' Created: 22/Aug/18 Updated: 29/Oct/23 Resolved: 17/Sep/18 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication, Storage |
| Affects Version/s: | None |
| Fix Version/s: | 4.1.4 |
| Type: | Task | Priority: | Major - P3 |
| Reporter: | Judah Schvimer | Assignee: | Daniel Gottlieb (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | prepare_durability |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Backwards Compatibility: | Fully Compatible |
| Sprint: | Storage NYC 2018-09-10, Storage NYC 2018-09-24 |
| Participants: |
| Description |
|
We must ensure that the ‘prepare’ oplog entries for any transactions that are prepared at the ‘stable timestamp’ (for replication rollback) or at the ‘last stable checkpoint timestamp’ (for startup replication recovery) are not truncated off the oplog. The same applies to any oplog entries for active transactions in the “Transactions larger than 16MB” project. If those entries are truncated, nodes will not be able to re-apply these uncommitted transactions.

Because ‘last stable checkpoint timestamp’ <= ‘stable timestamp’, we will focus on the ‘last stable checkpoint timestamp’ case: keeping the oplog entries needed at that timestamp implicitly keeps everything needed for the ‘stable timestamp’ and for the current point in time.

To do this, whenever replication tells storage about a new ‘stable timestamp’, it will also provide a ‘maximum_truncation_timestamp’. This is the latest timestamp that the storage engine is allowed to truncate off the back of the oplog while the accompanying ‘stable timestamp’ is the current ‘last stable checkpoint timestamp’. Replication will provide the timestamp that was the ‘oldest active transaction timestamp’ as of that ‘stable timestamp’. Note that this is not the current ‘oldest active transaction timestamp’, but rather an older value of it. |
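A minimal sketch of the pairing described above, written in Python purely for illustration. The class and method names (`StorageSketch`, `set_stable_timestamp`, `take_stable_checkpoint`, `may_truncate`) are hypothetical and are not the server's interface. Timestamps are modeled as plain integers. Three further assumptions that the ticket does not settle: the bound is treated as exclusive (only entries strictly older than it may be truncated), nothing is truncated before the first stable checkpoint, and a missing bound (no active transactions) falls back to the checkpoint timestamp itself.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StorageSketch:
    """Hypothetical model of the storage side; not the server's real interface."""

    # Most recent stable timestamp and the bound replication sent with it.
    stable_ts: Optional[int] = None
    stable_max_truncation_ts: Optional[int] = None
    # Values captured by the last stable checkpoint.
    checkpoint_ts: Optional[int] = None
    checkpoint_max_truncation_ts: Optional[int] = None

    def set_stable_timestamp(self, stable_ts: int,
                             max_truncation_ts: Optional[int]) -> None:
        # Replication supplies both values together; the bound is the oldest
        # active transaction timestamp as of stable_ts (None if there was none).
        self.stable_ts = stable_ts
        self.stable_max_truncation_ts = max_truncation_ts

    def take_stable_checkpoint(self) -> None:
        # The bound only takes effect once its stable timestamp is checkpointed.
        self.checkpoint_ts = self.stable_ts
        self.checkpoint_max_truncation_ts = self.stable_max_truncation_ts

    def may_truncate(self, oplog_entry_ts: int) -> bool:
        if self.checkpoint_ts is None:
            return False  # assumption: keep everything until a checkpoint exists
        bound = self.checkpoint_max_truncation_ts
        if bound is None:
            bound = self.checkpoint_ts  # assumption: fall back to the checkpoint
        return oplog_entry_ts < bound  # assumption: exclusive bound
```

In this model a newer ‘stable timestamp’ with a later bound does not loosen truncation until a checkpoint at that timestamp has been taken, which is the "older value" behavior the description calls out.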
| Comments |
| Comment by Githook User [ 17/Sep/18 ] |
|
Author: Daniel Gottlieb <daniel.gottlieb@mongodb.com> (username: dgottlieb) Message: |
| Comment by Judah Schvimer [ 04/Sep/18 ] |
|
Yes, that's correct. I think either a boost::none or a null timestamp would suffice for "no active transactions at that stable timestamp". I'll leave that to your judgement. I don't expect it to matter on our end. |
| Comment by Daniel Gottlieb (Inactive) [ 30/Aug/18 ] |
|
Actually, thinking about inMemory... inMemory also has an oplog that gets truncated. During a live rollback, inMemory will play forward from the stable timestamp, so I don't think it's sufficient to preserve oplog only relative to when data gets checkpointed. inMemory will need to update the acceptable truncation point whenever the stable timestamp moves forward, not only when the stable checkpoint timestamp advances. Does that sound correct? |
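A rough illustration of the inMemory variant raised in this comment, again in Python with hypothetical names (`InMemoryTruncationBound` is not a real server class). Here the truncation bound moves every time the stable timestamp is set, with no checkpoint step. As assumptions not settled by the ticket, timestamps are integers, the bound is exclusive, and a missing ‘oldest active transaction timestamp’ falls back to the stable timestamp.

```python
from typing import Optional


class InMemoryTruncationBound:
    """Hypothetical sketch: the bound follows the stable timestamp directly."""

    def __init__(self) -> None:
        self.bound: Optional[int] = None

    def set_stable_timestamp(self, stable_ts: int,
                             oldest_active_txn_ts: Optional[int] = None) -> None:
        # No checkpoint is involved, so the acceptable truncation point is
        # updated as soon as replication reports a new stable timestamp.
        if oldest_active_txn_ts is not None:
            self.bound = oldest_active_txn_ts
        else:
            self.bound = stable_ts  # assumption: no active transactions

    def may_truncate(self, oplog_entry_ts: int) -> bool:
        # Assumption: only entries strictly older than the bound may go.
        return self.bound is not None and oplog_entry_ts < self.bound
```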
| Comment by Daniel Gottlieb (Inactive) [ 30/Aug/18 ] |
|
judah.schvimer, did you have a preference on what value gets passed in when there were no active transactions at that stable timestamp? |
| Comment by Judah Schvimer [ 22/Aug/18 ] |
|
I discussed this with daniel.gottlieb during the design of "prepare support for transactions". |