[SERVER-82305] Have dbCheck ignore prepare conflicts on secondaries Created: 18/Oct/23 Updated: 18/Nov/23 Resolved: 15/Nov/23 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | 7.3.0-rc0, 7.2.0-rc2 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Sean Zimmerman | Assignee: | Louis Williams |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | bkp |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: | |
| Assigned Teams: | Storage Execution |
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Backport Requested: | v7.2, v7.0 |
| Sprint: | Execution Team 2023-11-13, Execution Team 2023-11-27 |
| Participants: | |
| Linked BF Score: | 8 |
| Description |
|
In BF-30418 we discovered that dbCheck can hit a prepare conflict on secondaries and fail a WiredTiger invariant that a thread which encounters a prepare conflict must be killable. We feel that dbCheck needs to enforce prepare conflicts for correctness, and the oplog applier thread should remain unkillable. To fix this, we should expand PrepareConflictBehavior so that it can propagate an error instead of retrying the conflict (and running into the invariant mentioned above). This will allow dbCheck to finish with a warning that a certain key range could not be validated. |
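The difference between retrying a prepare conflict and propagating it can be sketched with a toy model. This is only an illustration under stated assumptions, not the server's code: `ConflictPolicy`, `kPropagateError`, `ReadStatus`, `RangeResult`, and `checkRange` are hypothetical names invented here, and the bounded retry loop stands in for the real retry path, which is assumed to wait for the prepared transaction to resolve.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Toy stand-in for the server's PrepareConflictBehavior. The names below,
// including kPropagateError, are hypothetical and chosen for this sketch.
enum class ConflictPolicy {
    kRetry,           // keep retrying until the conflict clears
    kPropagateError,  // surface the conflict to the caller
};

// Outcome of a single read attempt in this toy model.
enum class ReadStatus { kOk, kPrepareConflict };

struct RangeResult {
    bool validated;
    int attempts;
    std::string warning;  // non-empty only when the range was skipped
};

// Reads one key range. `readOnce` receives the 0-based attempt number.
// `maxAttempts` bounds the retry loop so that the toy model always terminates.
RangeResult checkRange(ConflictPolicy policy,
                       const std::function<ReadStatus(int)>& readOnce,
                       int maxAttempts = 5) {
    assert(maxAttempts > 0);
    for (int attempt = 0; attempt < maxAttempts; ++attempt) {
        if (readOnce(attempt) == ReadStatus::kOk) {
            return {true, attempt + 1, ""};
        }
        if (policy == ConflictPolicy::kPropagateError) {
            // Do not retry: report the range as unvalidated and move on.
            return {false, attempt + 1,
                    "prepare conflict: key range could not be validated"};
        }
    }
    return {false, maxAttempts, "gave up after repeated prepare conflicts"};
}
```

With a reader that conflicts on its first two attempts, the retry policy validates the range on the third attempt, while the propagate policy returns after the first attempt with a warning and never enters the retry path.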
| Comments |
| Comment by Githook User [ 16/Nov/23 ] |
|
Author: {'name': 'Louis Williams', 'email': 'louis.williams@mongodb.com', 'username': 'louiswilliams'} Message: (cherry picked from commit 6007ed70a67624d75509eca3d5adf7aee3d03cc9) |
| Comment by Githook User [ 15/Nov/23 ] |
|
Author: {'name': 'Louis Williams', 'email': 'louis.williams@mongodb.com', 'username': 'louiswilliams'} Message: |
| Comment by Louis Williams [ 06/Nov/23 ] |
|
I actually think it is correct to ignore prepare conflicts on the secondary. Consider the following interleaving of events that is currently leading to a problem:
In this case, we must ignore prepare conflicts on the secondary. Successfully replicating a dbCheck oplog entry guarantees that the range scanned on the primary at a specific point in time does not represent any documents in a prepared state. Therefore, for correctness, the secondary must ignore prepared updates when reading at the same point in time. |
| Comment by Louis Williams [ 01/Nov/23 ] |
|
I think there are two ways to solve this problem: I actually think solution 2 fits in better with how we want the system to behave in the future, as this would allow us to stop using ignore_prepare=force everywhere else. Solution 2, however, cannot be backported to 6.0. Considering that 6.0 is broken right now, I'm reverting |
| Comment by Sean Zimmerman [ 30/Oct/23 ] |
|
The backport of |
| Comment by Sean Zimmerman [ 18/Oct/23 ] |