Details
- Type: Bug
- Resolution: Done
- Priority: Major - P3
Description
The preferred method for backing up a sharded cluster is as follows:
- Stop the balancer using sh.stopBalancer()
- Shut down one of the config servers
- Perform a backup on all of the shards (filesystem backup preferred)
- Back up the config server that was shut down (filesystem backup preferred)
- Restart the config server that was shut down
- Restart the balancer using sh.startBalancer()
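The ordering of the steps above can be sketched in mongo shell JavaScript. This is only an illustration: `sh` is the shell's sharding helper, while `backUpShardedCluster` and every function on `steps` are hypothetical operator-supplied callbacks (shutting down a config server and taking filesystem backups happen outside the shell), not MongoDB APIs.

```javascript
// Sketch only: `sh` is the mongo shell sharding helper; every function on
// `steps` is a hypothetical callback supplied by the operator.
function backUpShardedCluster(sh, steps) {
  sh.stopBalancer();                 // per this ticket, waits until the balancer has fully stopped
  try {
    steps.shutDownConfigServer();    // take one config server offline
    try {
      steps.backUpShards();          // filesystem backup of every shard
      steps.backUpConfigServer();    // filesystem backup of the stopped config server
    } finally {
      steps.restartConfigServer();   // bring the config server back even if a backup failed
    }
  } finally {
    sh.startBalancer();              // always re-enable balancing
  }
}
```

The nested try/finally blocks are a design choice of this sketch, so that a failed backup does not leave the config server down or the balancer disabled.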
This procedure is not documented in the manual. In addition, the procedure that the manual does document has the following errors:
1) It says to stop the balancer using direct operations on the 'config.settings' collection. This is incorrect: the correct method is to use sh.stopBalancer().
2) If you stop the balancer using direct operations, there is a delay between the time you issue the command and the time the balancer actually stops. If you don't wait for the balancer to stop fully, the backup will be inconsistent. sh.stopBalancer() avoids this because it waits for the balancer to stop fully before returning.
3) Locking the shards is not necessary if you are doing filesystem backups using snapshots; it is only necessary if you are using OS-level file copies.
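To illustrate point 2: if the balancer was disabled by some means that does not wait, a backup script would have to poll until the current balancing round has finished. A minimal sketch, assuming `sh.isBalancerRunning()` returns a boolean (as in the shell versions contemporary with this ticket) and that `sleepMs` is the shell's `sleep` function or an equivalent. `waitForBalancerToStop` is a hypothetical helper name, not part of the shell.

```javascript
// Sketch only: polls until the balancer reports that it is no longer running.
// Assumes sh.isBalancerRunning() returns true/false.
function waitForBalancerToStop(sh, sleepMs, timeoutMs, intervalMs) {
  var waited = 0;
  while (sh.isBalancerRunning()) {
    if (waited >= timeoutMs) {
      throw new Error("balancer still running after " + timeoutMs + " ms");
    }
    sleepMs(intervalMs);             // pause before checking again
    waited += intervalMs;
  }
  return waited;                     // total time spent waiting, in milliseconds
}
```

In the mongo shell this could be called as `waitForBalancerToStop(sh, sleep, 60000, 1000)`. With `sh.stopBalancer()` no such loop should be needed, which is the reason the ticket recommends it.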
I suggest replacing the current procedure in the manual with the above procedure, and then suggesting stopping the balancer as an alternative method.
Issue Links
- related to SERVER-3456 Running fsync+lock on config server during backup results in blocked inserts (Closed)