[DOCS-16470] Ops Manager steps for manual backup restore are incorrect Created: 01/Nov/23  Updated: 06/Feb/24

Status: Ready for Work
Project: Documentation
Component/s: Ops Manager
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Ali Mir Assignee: Lander Kerbey
Resolution: Unresolved Votes: 0
Labels: Bug
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Participants:
Days since reply: 1 week, 1 day ago

 Description   

Hello! I recently worked on HELP-47199, which involved a non-point-in-time manual snapshot restore that followed the Ops Manager documentation listed here. Ultimately, we concluded that the documented manual restore procedure has diverged from what the automation agent actually does.

Specifically, the "FCV 4.2 or later" section under "Manual Restore" has the following discrepancies:

  • Step 14: Add a New Replica Set Configuration
    • Here, we tell the user (or at least imply) to insert a new replica set config with just one member, whose host is defined with the `ephemeralPortNewReplicaSet`. Instead, we should insert a replica set config that lists all replica set members, with each host defined with the final port the user will use for their replica set.
  • Step 18: Restart the Instance to Recover the Oplog
    • It's noted that we should start the node up as a replica set. Instead, we should start the node up as a standalone again, but with the additional oplog replay parameters set. These include:
      •  --setParameter recoverFromOplogAsStandalone=true
         --setParameter takeUnstableCheckpointOnShutdown=true
         --setParameter startupRecoveryForRestore=true
        

    • This will still recover the oplog.
  • Steps 23, 24, 28: We can get rid of these steps for the following reasons:
    • Step 23 and 24:
      • We don't need to remove the existing replica set config, since the one we inserted in step 14 is now the correct final one, with all members and the correct ports. As a result, we can skip both the stopping and insertion steps.
    • Step 28
      • As mentioned above, since we already have all replica set members in the new config, we don't need to set up the replica set configuration.
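
For illustration, here is a rough Python sketch of the two proposed changes: the Step 14 config that lists every member on its final port, and the Step 18 standalone startup with the oplog replay parameters. The helper names, replica set name, hostnames, dbpath, and port are all hypothetical. The config document shows only `_id`, `version`, and `members`, so the exact field set should be taken from the documented procedure. Only the three `--setParameter` values come from this ticket.

```python
from typing import Any, Dict, List

# Oplog replay parameters listed under Step 18 of this ticket.
OPLOG_REPLAY_PARAMETERS = [
    "recoverFromOplogAsStandalone=true",
    "takeUnstableCheckpointOnShutdown=true",
    "startupRecoveryForRestore=true",
]


def build_replset_config(name: str, hosts: List[str]) -> Dict[str, Any]:
    """Build a replica set config listing ALL members on their final host:port.

    Hypothetical helper: the field set here is a minimal assumption, not the
    full document from the Ops Manager procedure.
    """
    return {
        "_id": name,
        "version": 1,
        "members": [
            {"_id": index, "host": host} for index, host in enumerate(hosts)
        ],
    }


def build_standalone_recovery_command(dbpath: str, port: int) -> List[str]:
    """Build a mongod argv that starts the node as a standalone.

    No --replSet option is passed; the oplog replay parameters are added
    as individual --setParameter options instead.
    """
    command = ["mongod", "--dbpath", dbpath, "--port", str(port)]
    for parameter in OPLOG_REPLAY_PARAMETERS:
        command += ["--setParameter", parameter]
    return command


if __name__ == "__main__":
    # Placeholder values for illustration only.
    config = build_replset_config(
        "rs0",
        ["host1.example.net:27017", "host2.example.net:27017", "host3.example.net:27017"],
    )
    print(config)
    print(" ".join(build_standalone_recovery_command("/data/restore", 27017)))
```

The point of building the config once with every member is that the document inserted in Step 14 is already the final one, which is why steps 23, 24, and 28 become unnecessary.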

I'm happy to review these changes, and I would also like a second look from someone on the Cloud Backup team to confirm.



 Comments   
Comment by Ankur Solanki [ 30/Jan/24 ]

Hi Team/ali.mir@mongodb.com,

Could you please let us know when we can expect the manual restore procedure to be updated?

Regards,
Ankur

Comment by Akshat Joshi [ 06/Dec/23 ]

Hi lander.kerbey@mongodb.com

Could you please let us know when the corrected procedure will be live?

Best, 

Akshat

Comment by Ali Mir [ 02/Nov/23 ]

We've tested the above approach for a manual restore, and it works. The last comment in HELP-47199 has the detailed steps.

Comment by Ali Mir [ 01/Nov/23 ]

As a heads up, I'm just confirming that a manual restore succeeds with this approach. I'll confirm once it does and we can go ahead with the DOCS fix.

Generated at Thu Feb 08 08:15:27 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.