[DOCS-12315] Update restore procedure Created: 03/Jan/19 Updated: 30/Oct/23 Resolved: 11/Jan/19 |
|
| Status: | Closed |
| Project: | Documentation |
| Component/s: | manual |
| Affects Version/s: | None |
| Fix Version/s: | Server_Docs_20231030 |
| Type: | Task | Priority: | Critical - P2 |
| Reporter: | Randolph Tan | Assignee: | Ravind Kumar (Inactive) |
| Resolution: | Done | Votes: | 0 |
| Labels: | docs-sharding | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
| Epic Link: | DOCSP-1769 |
| Story Points: | 0.5 |
| Description |
Based on https://docs.mongodb.com/manual/tutorial/restore-sharded-cluster/. Since the procedure includes dropping the local database of the config server (in effect, the oplog), it should also delete the _id: minOpTimeRecovery document in admin.system.version on every shard. If the user ever decides to change the shard names after the restore, they would also need to drop config.cache.collections and config.cache.chunks.* (for v3.6 or greater) and drop config.cache.databases (for v4.0 or greater) on every shard. Scope of changes
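A minimal sketch of the shard-side cleanup the description calls for, written as mongosh commands against each shard's primary. The collection and document names come from the description above; the commands require a live deployment, so treat this as illustrative rather than the authoritative procedure from the tutorial:

```javascript
// On each shard's primary: remove the recovery document that pins the
// shard to an optime in the old config server oplog (which was dropped
// along with the config server's local database).
db.getSiblingDB("admin")
  .getCollection("system.version")
  .deleteOne({ _id: "minOpTimeRecovery" });

// Only if the shard names will change after the restore: drop the
// cached routing metadata so it is rebuilt from the restored config
// servers on the next refresh.
const cfg = db.getSiblingDB("config");
cfg.getCollection("cache.collections").drop();        // v3.6 or greater
cfg.getCollectionNames()
  .filter((name) => name.startsWith("cache.chunks."))
  .forEach((name) => cfg.getCollection(name).drop()); // v3.6 or greater
cfg.getCollection("cache.databases").drop();          // v4.0 or greater
```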
|
| Comments |
| Comment by Githook User [ 12/Apr/19 ] |
|
Author: ravind (rkumar-mongo) <ravind.kumar@10gen.com>
|
| Comment by Githook User [ 11/Jan/19 ] |
|
Author: rkumar-mongo <ravind.kumar@mongodb.com> |
| Comment by Githook User [ 11/Jan/19 ] |
|
Author: rkumar-mongo <ravind.kumar@mongodb.com> |
| Comment by Ravind Kumar (Inactive) [ 11/Jan/19 ] |
|
Fix deployed to master/4.2, 4.0, and 3.6. 3.4 work will be covered in wrap-up work for |
| Comment by Githook User [ 11/Jan/19 ] |
|
Author: rkumar-mongo <ravind.kumar@mongodb.com> |
| Comment by Ravind Kumar (Inactive) [ 11/Jan/19 ] |
|
Based on discussions, I'm going to re-add the step to remove the minOpTimeRecovery document in master/4.2, 4.0, and 3.6. 3.4 will need some extra validation before we can wrap it up, but that will be lower priority given other tickets on deck right now.
To emphasize one point: our documented backup procedure is an initial-sync procedure, and our restore procedures work on that assumption. |
| Comment by Ravind Kumar (Inactive) [ 10/Jan/19 ] |
|
Discussing more via slack to get some consensus here. |
| Comment by Esha Maharishi (Inactive) [ 10/Jan/19 ] |
|
I agree with renctan. The bottom line is, if a backup procedure just copies the config servers' data files and starts a new cluster using them, the minOpTimeRecovery document should not be deleted from the shards (and the local database should not be dropped during the restore). |
| Comment by Randolph Tan [ 10/Jan/19 ] |
|
The minOpTimeRecovery document records the low watermark for the opTime the shard should use when talking to the config server. If the restore procedure makes all config nodes identical, then minOpTimeRecovery is no longer necessary (though it shouldn't hurt to keep it around). However, if the procedure involves dropping the config server's oplog, then minOpTimeRecovery is no longer valid and should be removed. I don't know whether the procedure differs across versions, so the answer to "is it generally true that minOpTimeRecovery should be removed, regardless of version?" depends on the procedure for that version. |
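For context, a hypothetical way to inspect the document Randolph describes, via mongosh on a shard member. The commands need a live shard, and the field layout shown in the comment is illustrative and varies by version:

```javascript
// Inspect the shard's recovery document:
db.getSiblingDB("admin")
  .getCollection("system.version")
  .findOne({ _id: "minOpTimeRecovery" });
// Illustrative shape of the result (fields vary by version):
// { _id: "minOpTimeRecovery",
//   minOpTime: { ts: Timestamp(...), t: NumberLong(...) },
//   ... }
```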
| Comment by Ravind Kumar (Inactive) [ 10/Jan/19 ] |
|
See
esha.maharishi, renctan: is it generally true that minOpTimeRecovery should be removed, regardless of version? HELP-5524 seemed to indicate it is necessary, but based on discussions in HELP-8560, it seems that with our recovery procedure (dropping local) it may not matter? |
| Comment by Esha Maharishi (Inactive) [ 04/Jan/19 ] |
|
I updated HELP-5524 with this info as well. |
| Comment by Esha Maharishi (Inactive) [ 04/Jan/19 ] |
|
ravind.kumar, I think what renctan suggested is right. The minOpTimeRecovery document can be deleted only because all three types of CSRS restore involve restoring to a single-node replica set and then adding more nodes to it (the added nodes will sync from the first node). It would not have been correct if, instead, a backup were taken of each CSRS node in the original cluster and restored to a corresponding number of nodes in the new cluster, which is what I had assumed was happening. |
| Comment by Ravind Kumar (Inactive) [ 04/Jan/19 ] |
|
This conflicts with information provided in HELP-5524, where it was recommended that users not delete the minOpTimeRecovery document. Adding instructions for nuking the various cache collections is simple enough, I think. esha.maharishi, can you comment here? |