[SERVER-48603] Rollback via refetch can result in out of order timestamps Created: 05/Jun/20 Updated: 06/Dec/22 |
|
| Status: | Backlog |
| Project: | Core Server |
| Component/s: | Replication, Storage |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Daniel Gottlieb (Inactive) | Assignee: | Backlog - Replication Team |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: | |
| Assigned Teams: | Replication |
| Operating System: | ALL |
| Participants: | |
| Description |
|
This is a hypothetical that I'm documenting for investigation. The idea was inspired by suganthi.mani's discovery in Consider the following sequence where a primary accepts some writes, rolls them back, and then as a secondary replicates writes that use the same key-space.
In this state, there are two update chains with out of order timestamps: Note the out of order updates in the RecordStore case are not alleviated by |
| Comments |
| Comment by Daniel Gottlieb (Inactive) [ 08/Jun/20 ] |
|
I don't know if that's actually a thing. alexander.gorrod? |
| Comment by Judah Schvimer [ 08/Jun/20 ] |
|
daniel.gottlieb, would it be possible to turn off durable history when running with enableMajorityReadConcern=false (eMRC=false)? |
| Comment by Daniel Gottlieb (Inactive) [ 05/Jun/20 ] |
|
I misdiagnosed the problem on the record store. Because RollbackViaRefetch does not restart the catalog, the next id doesn't reset to an earlier value in this case. So within a process lifetime the problem is limited to the _id index. That can be solved with a MongoDB-only fix: regenerate the _id index key and put an untimestamped delete on top of the update chain. |
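To make the shape of the problem concrete, here is a toy model of a single key's update chain. This is only a sketch, not the storage engine's API: the `Update` tuple, the `out_of_order_pairs` helper, and the timestamps 10/20/15 are all made up for illustration. It also assumes that an untimestamped update acts as a boundary, so timestamps below it are not compared with timestamps above it. That is my reading of why the untimestamped delete would help, not confirmed behavior.

```python
from typing import List, Optional, Tuple

# One update: (timestamp, value). A timestamp of None models an untimestamped
# write; a value of None models a delete. Purely illustrative types.
Update = Tuple[Optional[int], Optional[str]]


def out_of_order_pairs(chain: List[Update]) -> List[Tuple[int, int]]:
    """Return (older_ts, newer_ts) pairs where a newer update carries a
    smaller timestamp than the update it was applied on top of.

    `chain` is ordered oldest to newest. An untimestamped update is treated
    as a boundary that resets the comparison (an assumption of this sketch).
    """
    violations = []
    prev_ts = None
    for ts, _value in chain:
        if ts is None:
            prev_ts = None  # boundary: do not compare across it
            continue
        if prev_ts is not None and ts < prev_ts:
            violations.append((prev_ts, ts))
        prev_ts = ts
    return violations


# A write accepted as primary at ts 20, then (after rollback via refetch)
# a replicated write to the same _id index key at ts 15.
chain = [(10, "a"), (20, "b"), (15, "c")]

# Same history, with an untimestamped delete placed on top of the chain
# before the replicated write lands.
fixed = [(10, "a"), (20, "b"), (None, None), (15, "c")]

print(out_of_order_pairs(chain))
print(out_of_order_pairs(fixed))
```

In this model the first chain reports the 20-then-15 pair and the second reports nothing, which is the property the proposed fix is after.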
| Comment by Daniel Gottlieb (Inactive) [ 05/Jun/20 ] |
|
Brainstorming some solutions (including the bad ones). The common goal across all of them is either:
|