[SERVER-29213] Have KVWiredTigerEngine implement StorageEngine::recoverToStableTimestamp Created: 15/May/17  Updated: 30/Oct/23  Resolved: 24/Mar/18

Status: Closed
Project: Core Server
Component/s: Storage
Affects Version/s: None
Fix Version/s: 3.7.4

Type: Improvement Priority: Major - P3
Reporter: Alexander Gorrod Assignee: Daniel Gottlieb (Inactive)
Resolution: Fixed Votes: 0
Labels: rollback-functional
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
depends on SERVER-33081 Reset `KeysCollectionManager` during ... Closed
depends on SERVER-33093 catalog::openCatalog must rebuild all... Closed
depends on SERVER-33126 Replication commit point can include ... Closed
depends on WT-3906 Respect stable_timestamp in WT_CONNEC... Closed
depends on SERVER-32594 Add mechanism to delete and recreate ... Closed
depends on SERVER-33159 RTT storage recovery from unclean shu... Closed
depends on SERVER-30081 Add a WiredTigerKVEngine "recovery" m... Closed
depends on SERVER-32144 Remove test coverage for replication ... Closed
depends on SERVER-33743 Use all_committed to set lastApplied ... Closed
depends on SERVER-29211 Change to explicitly journal a subset... Closed
depends on WT-3387 Add support for a stable timestamp Closed
depends on WT-3388 Online rollbackToStableTimestamp Closed
is depended on by SERVER-32844 Turn on recoverable rollback Javascri... Closed
is depended on by SERVER-34042 Move prepare_transaction.js to the tr... Closed
Documented
is documented by DOCS-11484 Document that oplog size setting is n... Closed
Duplicate
is duplicated by SERVER-30349 Atomically turn on recovery to a time... Closed
Problem/Incident
Related
related to SERVER-33812 First initial sync oplog read batch f... Closed
related to SERVER-32206 Catalog change to declare an index as... Closed
related to SERVER-34606 Test (and possibly fix) behavior arou... Closed
related to WT-3322 Add upgrade/downgrade support for alt... Closed
is related to SERVER-33161 Postpone WiredTigerKVEngine table dro... Closed
is related to SERVER-34070 Add flag to perform replication recov... Closed
is related to SERVER-34075 powercycle_replication* must run repl... Closed
Backwards Compatibility: Minor Change
Sprint: Repl 2018-02-12, Repl 2018-02-26, Repl 2018-03-12, Repl 2018-03-26, Repl 2018-04-09
Participants:
Linked BF Score: 0

 Description   

There is a system state in MongoDB called replication rollback, where a node in a replica set needs to be able to reset its state to an earlier point in time.

Rollback is currently handled by undoing operations via the oplog, which is technically difficult. We are working on a mechanism under which the oplog only ever needs to be applied forwards. To use that mechanism for replication rollback, MongoDB needs a method for resetting a WiredTiger database to an earlier point in time and then running recovery to return the state to the desired point in time.

The goal of this ticket is to add a function to the storage engine interface that shuts down a storage engine, then re-opens it and recovers the state as it would be after a fresh restart. For WiredTiger, this means all collections have their data as of a certain point in time, and the oplog contains information that at least covers the replica-set-wide durable point.

After this Storage Engine restart method has been called it is expected that the oplog will be replayed to re-create collection data that wasn't durable before the restart.
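
The flow described above (reset data to a stable point, then replay the oplog forwards) can be illustrated with a small toy model. This is only a sketch under stated assumptions: `ToyEngine`, `OplogEntry`, `write`, `replayOplogAfter` and `read` are hypothetical names invented for illustration, and the per-key version map merely stands in for WiredTiger's timestamped history. It is not the actual `KVWiredTigerEngine` implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified types; none of these are MongoDB's real classes.
using Timestamp = std::uint64_t;

struct OplogEntry {
    Timestamp ts;
    std::string key;
    std::string value;
};

class ToyEngine {
public:
    // Log a write in the oplog and apply it to the collection data.
    void write(Timestamp ts, const std::string& key, const std::string& value) {
        assert(oplog_.empty() || ts > oplog_.back().ts);  // oplog stays ordered
        oplog_.push_back({ts, key, value});
        history_[key][ts] = value;
    }

    void setStableTimestamp(Timestamp ts) { stable_ = ts; }

    // Discard every data version newer than the stable timestamp.
    // The oplog itself is kept so that later entries can be replayed.
    Timestamp recoverToStableTimestamp() {
        for (auto it = history_.begin(); it != history_.end();) {
            auto& versions = it->second;
            versions.erase(versions.upper_bound(stable_), versions.end());
            it = versions.empty() ? history_.erase(it) : std::next(it);
        }
        return stable_;
    }

    // Re-apply, in order, every oplog entry newer than `from`.
    std::size_t replayOplogAfter(Timestamp from) {
        std::size_t applied = 0;
        for (const auto& e : oplog_) {
            if (e.ts > from) {
                history_[e.key][e.ts] = e.value;
                ++applied;
            }
        }
        return applied;
    }

    // Latest visible value for a key, or an empty string if the key is absent.
    std::string read(const std::string& key) const {
        auto it = history_.find(key);
        return it == history_.end() ? std::string() : it->second.rbegin()->second;
    }

private:
    std::map<std::string, std::map<Timestamp, std::string>> history_;
    std::vector<OplogEntry> oplog_;
    Timestamp stable_ = 0;
};
```

In this model a caller would invoke `recoverToStableTimestamp()` and pass the returned timestamp to `replayOplogAfter()`. During a real rollback the oplog would first be truncated to the common point with the sync source, which this sketch leaves out.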



 Comments   
Comment by Daniel Gottlieb (Inactive) [ 25/Mar/18 ]

With this patch, a node that was running with --replSet and is shut down and brought back up as a standalone will not have the same data. Its data will consist of what was majority committed in the replica set when the node shut down. SERVER-34070 will expose a flag that replays replication recovery in standalone mode.

Comment by Githook User [ 24/Mar/18 ]

Author: Daniel Gottlieb (dgottlieb) <daniel.gottlieb@mongodb.com>

Message: SERVER-29213: Have WiredTiger support recoverToStableTimestamp.
Branch: master
https://github.com/mongodb/mongo/commit/6ae04cd9f250fac877df94ecd4ddad33eaf5bc77

Comment by Daniel Gottlieb (Inactive) [ 22/Dec/17 ]

Note this work should investigate if any additional changes are required to preserve multikey entries in the catalog. See https://jira.mongodb.org/browse/SERVER-32206?focusedCommentId=1746446&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-1746446

Comment by Daniel Gottlieb (Inactive) [ 25/Jul/17 ]

This ticket includes calling WT's recover to stable timestamp method, running "glue layer recovery" (reconciling _mdb_catalog with the WT tables, i.e. rebuilding indexes where a drop was rolled back), and refreshing the in-memory catalog (see `mongo::repairDatabase`'s handling of recreating the DatabaseCatalogEntry inside of a `dbHolder().close/open`).
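
The "glue layer recovery" step can be pictured as a set comparison between the idents the catalog expects and the tables the storage engine actually holds. The sketch below is an assumption-laden illustration: `ReconcileResult` and `reconcileCatalogWithEngine` are hypothetical names, and treating tables unknown to the catalog as candidates for dropping is an assumption not stated in this ticket.

```cpp
#include <algorithm>
#include <iterator>
#include <set>
#include <string>
#include <vector>

// Hypothetical result type; not a MongoDB class.
struct ReconcileResult {
    std::vector<std::string> toRebuild;  // in the catalog, missing from the engine
    std::vector<std::string> toDrop;     // in the engine, unknown to the catalog (assumed policy)
};

// Compare the idents listed in the (rolled back) catalog with the tables
// present in the storage engine and report the differences in sorted order.
ReconcileResult reconcileCatalogWithEngine(const std::set<std::string>& catalogIdents,
                                           const std::set<std::string>& engineIdents) {
    ReconcileResult result;
    std::set_difference(catalogIdents.begin(), catalogIdents.end(),
                        engineIdents.begin(), engineIdents.end(),
                        std::back_inserter(result.toRebuild));
    std::set_difference(engineIdents.begin(), engineIdents.end(),
                        catalogIdents.begin(), catalogIdents.end(),
                        std::back_inserter(result.toDrop));
    return result;
}
```

An index whose drop was rolled back would show up in `toRebuild`: the catalog lists it again, but its table no longer exists in the engine.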
