[DOCS-12315] Update restore procedure Created: 03/Jan/19  Updated: 30/Oct/23  Resolved: 11/Jan/19

Status: Closed
Project: Documentation
Component/s: manual
Affects Version/s: None
Fix Version/s: Server_Docs_20231030

Type: Task Priority: Critical - P2
Reporter: Randolph Tan Assignee: Ravind Kumar (Inactive)
Resolution: Done Votes: 0
Labels: docs-sharding
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
is duplicated by DOCS-12347 Missing step to remove minOpTimeRecov... Closed
Related
is related to DOCS-9517 Docs for SERVER-24465: Remove recover... Closed
Participants:
Days since reply: 4 years, 43 weeks, 5 days ago
Epic Link: DOCSP-1769
Story Points: 0.5

 Description   

Based on https://docs.mongodb.com/manual/tutorial/restore-sharded-cluster/.

Because the restore step drops the local database of the config server (in effect, its oplog), the procedure should also delete the document with _id: minOpTimeRecovery from admin.system.version on every shard.
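A minimal sketch of that removal, for illustration only. It assumes PyMongo and a client connected directly to the shard's primary rather than through mongos; the helper name remove_min_optime_recovery is hypothetical and does not come from the tutorial.

```python
def remove_min_optime_recovery(shard_client):
    """Delete the minOpTimeRecovery document from one shard.

    Assumption: `shard_client` is a pymongo.MongoClient connected
    directly to the shard's primary (not through mongos).
    Returns the number of documents deleted (0 or 1).
    """
    system_version = shard_client["admin"]["system.version"]
    result = system_version.delete_one({"_id": "minOpTimeRecovery"})
    return result.deleted_count
```

The helper would be called once per shard. A return value of 0 just means the document was not present.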

If the user ever changes the shard names after the restore, they would also need to drop config.cache.collections, config.cache.chunks.* (for v3.6 or greater), and config.cache.databases (for v4.0 or greater) on every shard.
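The version rules above can be sketched as a small selection helper. This is an illustration under stated assumptions, not the documented procedure: the function name cache_collections_to_drop is hypothetical, the version is passed as a (major, minor) tuple, and the "v3.6 or greater" qualifier is read as covering both cache.collections and cache.chunks.*.

```python
def cache_collections_to_drop(server_version, collection_names):
    """Pick the cache collections in a shard's `config` database to drop.

    server_version: (major, minor) tuple, e.g. (4, 0).
    collection_names: collection names found in the shard's `config` database.
    """
    to_drop = []
    if server_version >= (3, 6):
        # Assumption: the 3.6 qualifier covers both of these patterns.
        for name in collection_names:
            if name == "cache.collections" or name.startswith("cache.chunks."):
                to_drop.append(name)
    if server_version >= (4, 0) and "cache.databases" in collection_names:
        to_drop.append("cache.databases")
    return to_drop
```

With PyMongo, each returned name could then be passed to drop_collection on the shard's config database, again over a direct connection to each shard.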

Scope of changes

  • (Pending feedback from engineering) Add step for removing the minOpTimeRecovery document
  • Add steps for removing the various cache.* collections on shard name change

 



 Comments   
Comment by Githook User [ 12/Apr/19 ]

Author:

{'name': 'ravind', 'username': 'rkumar-mongo', 'email': 'ravind.kumar@10gen.com'}

Message: DOCS-9517: Update restore sharded cluster tutorial

DOCS-12315: Fixup for restore sharded cluster proc
Branch: v3.4
https://github.com/mongodb/docs/commit/fcf307ece064f69c54ab669835e6de98eb2fbc95

Comment by Githook User [ 11/Jan/19 ]

Author:

{'email': 'ravind.kumar@mongodb.com', 'name': 'rkumar-mongo'}

Message: DOCS-12315: Fixup for restore sharded cluster proc
Branch: v3.6
https://github.com/mongodb/docs/commit/fd6d119e640336e4753b77a75f33bf3fb3b0ba5d

Comment by Githook User [ 11/Jan/19 ]

Author:

{'email': 'ravind.kumar@mongodb.com', 'name': 'rkumar-mongo'}

Message: DOCS-12315: Fixup for restore sharded cluster proc
Branch: v4.0
https://github.com/mongodb/docs/commit/210938baa562096be34602e24cc4d6a345e1fdbe

Comment by Ravind Kumar (Inactive) [ 11/Jan/19 ]

Fix deployed to master/4.2, 4.0, and 3.6. 3.4 work will be covered in wrap up work for DOCS-9517. eric.sommer hopefully this resolves the issue in whole. 

Comment by Githook User [ 11/Jan/19 ]

Author:

{'email': 'ravind.kumar@mongodb.com', 'name': 'rkumar-mongo'}

Message: DOCS-12315: Fixup for restore sharded cluster proc
Branch: master
https://github.com/mongodb/docs/commit/e659f15dd7686a6cbdb510fd927febdd3fb091c8

Comment by Ravind Kumar (Inactive) [ 11/Jan/19 ]

Based on discussions, I'm going to just re-add the step to remove the minOpTimeRecovery document in master/4.2, 4.0, and 3.6.

3.4 will need some extra validation before we can wrap that up, but that will be lower priority given other tickets on deck right now.

 

To emphasize one point, our documented backup procedure is an initial-sync procedure, and our restore procedures rely on that assumption.

Comment by Ravind Kumar (Inactive) [ 10/Jan/19 ]

Discussing more via slack to get some consensus here.

Comment by Esha Maharishi (Inactive) [ 10/Jan/19 ]

I agree with renctan. The bottom line is, if a backup procedure just copies the config servers' data files and starts a new cluster using them, the minOpTimeRecovery document should not be deleted from the shards (and the local database should not be dropped during the restore).

Comment by Randolph Tan [ 10/Jan/19 ]

The minOpTimeRecovery document refers to the low watermark for the opTime the shard should be using when talking to the config server. If the restore procedure makes it such that all config nodes are identical, then the minOpTimeRecovery document is no longer necessary (but it shouldn't hurt to keep it around). However, if the procedure involves dropping the oplog of the config server, then the minOpTimeRecovery document is no longer valid and should be removed. I don't know whether the procedure is different across versions, so the answer to "is this generally true to remove minOpTimeRecovery regardless of version?" depends on the procedure for that version.

Comment by Ravind Kumar (Inactive) [ 10/Jan/19 ]

See DOCS-12347 for additional discussion/information on the issue at hand.

 

esha.maharishi renctan is this generally true to remove minOpTimeRecovery regardless of version? HELP-5524 seemed to indicate it is necessary, but based on discussions in HELP-8560 it seems like based on our recovery proc (dropping local) it may not matter?

Comment by Esha Maharishi (Inactive) [ 04/Jan/19 ]

I updated HELP-5524 with this info as well.

Comment by Esha Maharishi (Inactive) [ 04/Jan/19 ]

ravind.kumar, I think what renctan suggested is right. The minOpTimeRecovery document can be deleted, but only because all three types of CSRS restore involve restoring to single-node replica sets and then adding more nodes to them (the added nodes will sync from the first nodes).

It would not have been correct if, instead, a backup was taken of each CSRS node in the original cluster and restored to a corresponding number of nodes in the new cluster, which is what I had assumed was happening.

Comment by Ravind Kumar (Inactive) [ 04/Jan/19 ]

This conflicts with information provided in HELP-5524 where it was recommended users not delete the minOpTimeRecovery document.

Adding instructions for nuking the various cache collections is simple enough I think.

esha.maharishi can you comment here?

Generated at Thu Feb 08 08:04:56 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.