[SERVER-19796] Cannot stop and backup hidden secondary without risk of inconsistent data. Created: 06/Aug/15 Updated: 06/Aug/15 Resolved: 06/Aug/15 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Admin, Replication |
| Affects Version/s: | 2.4.14 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Dave Muysson | Assignee: | Ramon Fernandez Marina |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: |
Ubuntu 12.04 |
||
| Operating System: | Linux |
| Steps To Reproduce: | Using a replicated MongoD environment with one Primary, one Secondary, and one Hidden Secondary:
(restore process) This works about 95% of the time. In the other 5% of cases, you cannot get past step 4 of the restore because the data that was backed up was still behind the primary. The shutdown command did not wait for the hidden secondary to catch up to the primary before shutting it down. |
| Participants: |
| Description |
|
Our offsite backup process operates by stopping the MongoD processes on our Hidden Secondary hosts, then capturing their data by cloning their database folder and copying it offsite. The restore process works by stopping the MongoD process, removing the existing data within the database folder, then cloning the offsite data back into the database folder.
On several occasions now, the restored MongoD process has never left the 'startup2' state because it believes there is still data to be replicated. We have followed the documented recommendation for stopping a replica host by issuing "db.adminCommand( {shutdown : 1})" to the MongoD process instead of stopping the process through upstart.
The documentation for the shutdown command states that it will not run unless "a" secondary has caught up with the primary. If the wording matches the logic, I suspect that the shutdown command proceeds with stopping the Hidden Secondary MongoD host even when it is behind, because "a" secondary is up to date (i.e. the non-hidden secondary within the replica set). |
| Comments |
| Comment by Ramon Fernandez Marina [ 06/Aug/15 ] |
|
dave.muysson@360pi.com, the documentation states the following about the shutdown command:
When the command is run on a secondary there's no wait period until that secondary is caught up. To follow the backup procedure described above you may need to wait until the hidden secondary is caught up. There are also other backup procedures you may want to consider in case they better fit your needs.
Please note that the SERVER project is for reporting bugs or feature suggestions for the MongoDB server, and unfortunately we're not able to provide support here. For MongoDB-related support discussion please post on the mongodb-user group or Stack Overflow with the mongodb tag, where your question will reach a larger audience. A question like this involving more discussion would be best posted on the mongodb-user group. See also our Technical Support page for additional support resources.
Regards, |
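
The advice above to wait until the hidden secondary is caught up before shutting it down could be sketched roughly as follows. This is an illustrative sketch, not a procedure from the ticket or the MongoDB documentation: the helper name `secondsBehindPrimary` and the host name in the usage note are hypothetical. It assumes the status document has the shape returned by `rs.status()` (a `members` array whose entries carry `name`, `state`, and `optimeDate`, with `state` 1 meaning PRIMARY).

```javascript
// Hypothetical helper: given a replSetGetStatus-style document, return how many
// seconds the named member trails the primary, or null if either is missing.
function secondsBehindPrimary(status, memberName) {
  var primary = null;
  var member = null;
  status.members.forEach(function (m) {
    if (m.state === 1) { primary = m; }          // state 1 = PRIMARY
    if (m.name === memberName) { member = m; }   // the member we intend to stop
  });
  if (!primary || !member) { return null; }
  // optimeDate is the wall-clock time of the last operation each member applied.
  return (primary.optimeDate.getTime() - member.optimeDate.getTime()) / 1000;
}
```

In the mongo shell this might be used as `var lag = secondsBehindPrimary(rs.status(), "hidden1.example.net:27017");`, issuing `db.adminCommand({ shutdown: 1 })` only once `lag` is 0. Two caveats apply: `optimeDate` has one-second resolution, and a member's view of the primary comes from periodic heartbeats, so a lag of 0 means only that the member appeared caught up when sampled. Pausing writes first, where that is possible, makes the check more meaningful.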