[SERVER-45010] Clean shutdown after rollbackViaRefetch with eMRC=false can cause us to incorrectly overwrite unstable checkpoints Created: 06/Dec/19 Updated: 29/Oct/23 Resolved: 22/Jan/20 |
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | 4.2.1, 4.3.2 |
| Fix Version/s: | 4.2.4, 4.3.3 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | William Schultz (Inactive) | Assignee: | William Schultz (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | KP42 |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Attachments: |
| Issue Links: |
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Backport Requested: | v4.2 |
| Sprint: | Repl 2019-12-30, Repl 2020-01-13, Repl 2020-01-27 |
| Participants: | |
| Linked BF Score: | 50 |
| Description |
At the end of rollbackViaRefetch when eMRC=false, we take an unstable checkpoint before proceeding. Once that is complete, we enter RECOVERING state and try to catch up our oplog. If the server is shut down cleanly after this unstable checkpoint, though, the shutdown path explicitly takes another checkpoint, and that checkpoint is stable because we have a stable timestamp set. We do set the initialDataTimestamp ahead of the stable timestamp after rollbackViaRefetch so that no checkpoints are taken, but the logic that normally enforces that in the WTCheckpointThread is bypassed during shutdown, so we take a full stable checkpoint regardless of the initialDataTimestamp value. Taking a stable checkpoint in this state and recovering from it on restart breaks the assumptions required for the correctness of rollbackViaRefetch with eMRC=false. See |
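To make the bypass concrete, here is a minimal, self-contained sketch with illustrative names (this is not the actual WTCheckpointThread or shutdown code): the periodic checkpoint path consults initialDataTimestamp, while the shutdown path checkpoints unconditionally.

```cpp
#include <cstdint>
#include <iostream>

using Timestamp = std::uint64_t;

struct CheckpointState {
    Timestamp stableTimestamp;
    Timestamp initialDataTimestamp;
};

// Stand-in for the storage engine taking a checkpoint as of the stable timestamp.
void takeStableCheckpoint(Timestamp stable) {
    std::cout << "stable checkpoint @ " << stable << "\n";
}

// Periodic checkpoint thread: skips stable checkpoints while the data is not
// yet consistent, i.e. while stableTimestamp < initialDataTimestamp.
void periodicCheckpoint(const CheckpointState& s) {
    if (s.stableTimestamp < s.initialDataTimestamp) {
        return;  // data not yet caught up; do not persist a stable checkpoint
    }
    takeStableCheckpoint(s.stableTimestamp);
}

// Clean shutdown: takes a final checkpoint unconditionally. Because a stable
// timestamp is set, it becomes a stable checkpoint even though
// initialDataTimestamp says the data is not yet consistent.
void shutdownCheckpoint(const CheckpointState& s) {
    takeStableCheckpoint(s.stableTimestamp);
}

int main() {
    CheckpointState afterRollback{/*stableTimestamp=*/100, /*initialDataTimestamp=*/150};
    periodicCheckpoint(afterRollback);  // correctly skipped
    shutdownCheckpoint(afterRollback);  // incorrectly persists a stable checkpoint
}
```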
| Comments |
| Comment by William Schultz (Inactive) [ 08/May/20 ] | |
henrik.edin That commit originally appeared in 4.3.3, so it predated v4.4. | |
| Comment by William Schultz (Inactive) [ 01/Apr/20 ] | |
Adding a note here for future reference about why we can't take unstable checkpoints on a clean shutdown after rollback. We don't want to take stable checkpoints after rollback until we reach max(localTopOfOplog, syncSourceTopOfOplog), and we take an unstable checkpoint before leaving rollback to include all writes performed during the rollback. We also set appliedThrough to the common point so that if we crash and restart before the first stable checkpoint, we replay oplog entries from the common point during recovery. The question is whether it is safe to take additional unstable checkpoints before we reach max(localTopOfOplog, syncSourceTopOfOplog). If we did this and restarted, we would replay oplog entries from the common point, potentially re-applying entries we had already applied, which could run into oplog idempotency issues. Initial sync is allowed to fail if it encounters an idempotency issue, but startup recovery must not. So, to avoid idempotency issues during recovery, we can't take unstable checkpoints on shutdown either. | |
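To illustrate the window involved, here is a small self-contained sketch with made-up names and timestamps (not server code): startup recovery always replays from appliedThrough, so any entries already persisted past the common point by an extra unstable checkpoint would be applied a second time and would therefore have to be idempotent.

```cpp
#include <cstdint>
#include <iostream>

using Timestamp = std::uint64_t;

// Simplified model of what an unstable checkpoint persists to disk.
struct CheckpointImage {
    Timestamp appliedThrough;    // set to the common point at the end of rollback
    Timestamp dataReflectsUpTo;  // newest write actually contained in the checkpoint
};

// Startup recovery replays the oplog from appliedThrough to the top of the
// oplog. Every entry at or below dataReflectsUpTo is applied a second time,
// which is only safe if those entries are idempotent.
void startupRecovery(const CheckpointImage& img, Timestamp topOfOplog) {
    for (Timestamp ts = img.appliedThrough + 1; ts <= topOfOplog; ++ts) {
        bool reapplied = (ts <= img.dataReflectsUpTo);
        std::cout << "replay ts=" << ts
                  << (reapplied ? "  (re-applied: relies on idempotency)" : "") << "\n";
    }
}

int main() {
    // Checkpoint taken on leaving rollback: data and appliedThrough agree.
    startupRecovery({/*appliedThrough=*/10, /*dataReflectsUpTo=*/10}, /*topOfOplog=*/15);
    // Extra unstable checkpoint taken later during catch-up: appliedThrough still
    // points at the common point, but writes 11-13 are already in the data, so
    // recovery would re-apply them.
    startupRecovery({/*appliedThrough=*/10, /*dataReflectsUpTo=*/13}, /*topOfOplog=*/15);
}
```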
| Comment by Githook User [ 11/Feb/20 ] | |
Author: William Schultz <william.schultz@mongodb.com> (will62794)
Message: (cherry picked from commit 759e930c88081aa0fb86e34a3ce7b2ed190c806e)
create mode 100644 jstests/replsets/rollback_dup_ids_clean_shutdown_during_rollback.js | |
| Comment by Githook User [ 22/Jan/20 ] | |
Author: William Schultz <william.schultz@mongodb.com> (will62794)
Message: | |
| Comment by William Schultz (Inactive) [ 21/Jan/20 ] | |
The WiredTiger checkpoint logic at shutdown lives inside __wt_txn_global_shutdown. | |
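For reference, a minimal standalone sketch of that behavior against the public WiredTiger C API (this is not MongoDB's actual shutdown code): once a stable timestamp has been set, the final checkpoint taken by WT_CONNECTION::close is as of that stable timestamp unless "use_timestamp=false" is passed.

```cpp
// Build against WiredTiger, e.g.: g++ wt_shutdown_sketch.cpp -lwiredtiger
#include <wiredtiger.h>

int main() {
    WT_CONNECTION* conn = nullptr;
    // "WT_HOME" must be an existing directory; "create" initializes it.
    if (wiredtiger_open("WT_HOME", nullptr, "create", &conn) != 0)
        return 1;

    // Suppose replication has set a stable timestamp (WT timestamps are hex strings).
    conn->set_timestamp(conn, "stable_timestamp=10");

    // Clean shutdown. The default is equivalent to "use_timestamp=true": the
    // close checkpoint is taken as of the stable timestamp, i.e. it is a
    // *stable* checkpoint, with no knowledge of any higher-level
    // initialDataTimestamp restriction.
    return conn->close(conn, "use_timestamp=true");
}
```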
| Comment by Tess Avitabile (Inactive) [ 09/Jan/20 ] | |
Great! Thank you for letting me know. | |
| Comment by William Schultz (Inactive) [ 09/Jan/20 ] | |
steven.vannelli tess.avitabile@mongodb.com I have a proper repro written and a patch in progress, so I am fairly close to being ready for a code review. | |
| Comment by William Schultz (Inactive) [ 06/Dec/19 ] | |
Applying the attached diff (to base commit bb7120c13) and running
should reproduce the issue. I have also reproduced this on the 4.2 branch. |