The replication rollback project is running into a problem when replication's recovery rolls forward from a point earlier than the durable timestamp (i.e., the timestamp used by the last checkpoint).
For example, replaying a dropDatabase during roll-forward gets confused by collections that were created after the dropDatabase executed but before the final checkpoint completed.
MongoDB currently tracks the effective checkpoint timestamp by writing a document containing the stable_timestamp before calling WT_SESSION::checkpoint. This has several problems:
- there are situations (such as shutdown) where writing a document is difficult (e.g., because the Global lock is held exclusively); and
- there is a race between writing the document and the checkpoint choosing which stable_timestamp to use. We deliberately do not want to pin stable_timestamp in place for the whole checkpoint operation, even where that would be possible: freezing it would defeat I/O optimizations in WiredTiger (aka "scrubbing"), where dirty data is flushed from cache by multiple threads before the critical section of the checkpoint starts.
If instead WiredTiger stored the stable_timestamp chosen by the checkpoint as part of the metadata (specifically the metadata for WiredTiger.wt, aka the turtle file), and allowed it to be queried after a restart via WT_CONNECTION::query_timestamp with "get=durable_timestamp", then MongoDB could skip all of this work and there would be no possibility of a race. Replication could use this timestamp to roll the oplog forward from exactly the point corresponding to the checkpoint.