[SERVER-45010] Clean shutdown after rollbackViaRefetch with eMRC=false can cause us to incorrectly overwrite unstable checkpoints Created: 06/Dec/19  Updated: 29/Oct/23  Resolved: 22/Jan/20

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: 4.2.1, 4.3.2
Fix Version/s: 4.2.4, 4.3.3

Type: Bug Priority: Major - P3
Reporter: William Schultz (Inactive) Assignee: William Schultz (Inactive)
Resolution: Fixed Votes: 0
Labels: KP42
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File bf-15306-repro.diff    
Issue Links:
Backports
Depends
Related
related to SERVER-46714 dbtest StorageTimestampTests suite re... Closed
related to SERVER-38925 Rollback via refetch can cause _id du... Closed
is related to SERVER-47219 Correct downgrade_after_rollback_via_... Closed
is related to WT-5466 Add ability to skip taking a checkpoi... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v4.2
Sprint: Repl 2019-12-30, Repl 2020-01-13, Repl 2020-01-27
Participants:
Linked BF Score: 50

 Description   

At the end of rollbackViaRefetch with eMRC=false (enableMajorityReadConcern=false), we take an unstable checkpoint before proceeding. Once it completes, we enter RECOVERING state and try to catch up our oplog. If the server is shut down cleanly after this unstable checkpoint is taken, however, shutdown explicitly takes another checkpoint, and that checkpoint is stable because a stable timestamp is set. After rollbackViaRefetch we do set the initialDataTimestamp ahead of the stable timestamp so that no checkpoints are taken, but the logic that normally enforces this lives in the WTCheckpointThread and is bypassed during shutdown, so we take a full stable checkpoint regardless of the initialDataTimestamp value. Taking a stable checkpoint and recovering from it on restart breaks the assumptions required for the correctness of rollbackViaRefetch with eMRC=false. See SERVER-38925 for an explanation of why these unstable checkpoints are necessary.
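
The guard stated in the fix's commit message ("Avoid taking a checkpoint on clean shutdown if stableTimestamp < initialDataTimestamp") can be sketched as follows. This is an illustrative model only, not the server's C++ code: the class name, the function name, and the integer timestamps are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Timestamps:
    """Hypothetical stand-ins for the two timestamps discussed above."""
    stable: int        # models the stable timestamp
    initial_data: int  # models initialDataTimestamp


def should_checkpoint_on_clean_shutdown(ts: Timestamps) -> bool:
    """Return False when a shutdown checkpoint must be skipped.

    Mirrors the rule in the commit message: skip the checkpoint
    if stableTimestamp < initialDataTimestamp.
    """
    return ts.stable >= ts.initial_data


# After rollbackViaRefetch, initialDataTimestamp is ahead of the stable timestamp.
after_rollback = Timestamps(stable=100, initial_data=150)
# Once the stable timestamp has passed initialDataTimestamp, checkpoints are allowed.
caught_up = Timestamps(stable=200, initial_data=150)

print(should_checkpoint_on_clean_shutdown(after_rollback))
print(should_checkpoint_on_clean_shutdown(caught_up))
```

The point of the sketch is that the same comparison the checkpoint thread normally applies also has to be applied on the shutdown path.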



 Comments   
Comment by William Schultz (Inactive) [ 08/May/20 ]

henrik.edin That commit originally appeared in 4.3.3, so it predated v4.4.

Comment by William Schultz (Inactive) [ 01/Apr/20 ]

Adding a note here for future reference about why we can't take unstable checkpoints on a clean shutdown after rollback. We don't want to take stable checkpoints after rollback until we reach max(localTopOfOplog, syncSourceTopOfOplog), and we take an unstable checkpoint before leaving rollback so that it includes all writes performed during the rollback. We also set appliedThrough to the common point so that, if we crash and restart before the first stable checkpoint, recovery replays oplog entries from the common point. The question is whether it is safe to take additional unstable checkpoints before we reach max(localTopOfOplog, syncSourceTopOfOplog). If we did this and restarted, we would replay oplog entries from the common point, potentially replaying ones we had already applied, which could run into oplog idempotency issues. Initial sync is allowed to fail if it encounters an idempotency issue, but startup recovery must not. So, to avoid idempotency issues on recovery, we can't take unstable checkpoints on shutdown either.
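
As a toy illustration of the replay hazard described in this note, the sketch below replays a small log from the "common point" against two starting states: one captured at the common point, and one captured later that already reflects the logged operations. The operation names, the dictionary "store", and the error type are all hypothetical and do not represent MongoDB's oplog format or its actual idempotency rules.

```python
class ReplayError(Exception):
    """Raised when a logged operation cannot be re-applied."""


def apply_op(state: dict, op: tuple) -> None:
    """Apply one toy operation to the in-memory state."""
    kind = op[0]
    if kind == "set":
        _, key, value = op
        state[key] = value
    elif kind == "rename":
        _, old, new = op
        # Renaming onto an existing key is rejected, so this op is not
        # safe to apply to a state that already contains its result.
        if old not in state or new in state:
            raise ReplayError(f"cannot rename {old!r} to {new!r}")
        state[new] = state.pop(old)
    else:
        raise ValueError(f"unknown op kind {kind!r}")


def replay(state: dict, log: list, start: int) -> None:
    """Replay log entries from index `start` onward."""
    for op in log[start:]:
        apply_op(state, op)


log = [("set", "a", 1), ("rename", "a", "b")]

# State captured at the common point: replay from index 0 succeeds.
from_common_point = {}
replay(from_common_point, log, 0)

# State captured later, already reflecting both ops, but replay
# still starts from the common point (index 0).
later_checkpoint = {"b": 1}
try:
    replay(later_checkpoint, log, 0)
    replay_failed = False
except ReplayError:
    replay_failed = True
```

In this model the replay start point and the checkpoint contents disagree in the second case, which is the situation an extra unstable checkpoint would create.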

Comment by Githook User [ 11/Feb/20 ]

Author: William Schultz (will62794, william.schultz@mongodb.com)

Message: SERVER-45010 Avoid taking a checkpoint on clean shutdown if stableTimestamp < initialDataTimestamp

(cherry picked from commit 759e930c88081aa0fb86e34a3ce7b2ed190c806e)

create mode 100644 jstests/replsets/rollback_dup_ids_clean_shutdown_during_rollback.js
Branch: v4.2
https://github.com/mongodb/mongo/commit/873a18d34703e037f52162af693dcdc5901cb28c

Comment by Githook User [ 22/Jan/20 ]

Author: William Schultz (will62794, william.schultz@mongodb.com)

Message: SERVER-45010 Avoid taking a checkpoint on clean shutdown if stableTimestamp < initialDataTimestamp
Branch: master
https://github.com/mongodb/mongo/commit/759e930c88081aa0fb86e34a3ce7b2ed190c806e

Comment by William Schultz (Inactive) [ 21/Jan/20 ]

The WiredTiger checkpoint logic at shutdown is inside __wt_txn_global_shutdown.

Comment by Tess Avitabile (Inactive) [ 09/Jan/20 ]

Great! Thank you for letting me know.

Comment by William Schultz (Inactive) [ 09/Jan/20 ]

steven.vannelli tess.avitabile@mongodb.com I have a proper repro written and a patch in progress, so I am fairly close to being ready for a code review.

Comment by William Schultz (Inactive) [ 06/Dec/19 ]

Applying the attached diff (to base commit bb7120c13) and running

python3 buildscripts/resmoke.py --majorityReadConcern=off jstests/replsets/rollback_dup_ids.js

should reproduce the issue. I have also reproduced this on the 4.2 branch.
