[SERVER-32562] Importing local database into standalone server causes update to replset.minvalid

| Created: | 05/Jan/18 | Updated: | 19/Jan/18 | Resolved: | 19/Jan/18 |
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor - P4 |
| Reporter: | Arnie Listhaus | Assignee: | Spencer Brody (Inactive) |
| Resolution: | Won't Fix | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Attachments: |
| Operating System: | ALL |
| Sprint: | Repl 2018-01-29 |
| Participants: | |
| Case: | (copied to CRM) |
| Description |
TL;DR

Background

I wanted to move a large database (1TB) to a new replica set without any significant application downtime. Only one of my applications targeted that specific 1TB database; the total size of all of the databases in the replica set was 6TB.

The Initial Plan

The plan was to add two secondary nodes to the existing replica set, allow the initial sync to populate the new nodes, and then remove the new nodes from the replica set to create a new one with all of the data intact. Removing the new nodes and spinning up a new replica set (with an arbiter as the third node) could be done in minutes, versus the many hours a backup/restore would take. Once the new replica set was up, the application that targeted the 1TB database would be switched to the new replica set. At that point, I would drop the 1TB database in the original replica set and drop the 5TB of other databases in the new replica set. My requirement was to keep at least two available data-bearing nodes in each replica set so as not to lose high availability. A rough sketch of this flow is shown below.

The Steps
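For concreteness, roughly what that plan looks like from the shell. Every hostname, port, and the new set name here is an assumption for illustration, not the exact commands that were run:

```
# 1. Add the two new nodes to the existing set and let initial sync
#    populate them (run against the current primary):
mongo --host oldPrimary:27017 --eval 'rs.add("newNode1:27017"); rs.add("newNode2:27017")'

# 2. Once both new nodes are caught up, remove them from the original set:
mongo --host oldPrimary:27017 --eval 'rs.remove("newNode1:27017"); rs.remove("newNode2:27017")'

# 3. Restart each removed node as a standalone (no --replSet option) and
#    clear the old replica set metadata:
mongo --host newNode1:27017 --eval 'db.getSiblingDB("local").dropDatabase()'
mongo --host newNode2:27017 --eval 'db.getSiblingDB("local").dropDatabase()'

# 4. Restart both nodes with --replSet newSet, then initiate the new set
#    with an arbiter as its third member:
mongo --host newNode1:27017 --eval 'rs.initiate({
  _id: "newSet",
  members: [
    {_id: 0, host: "newNode1:27017"},
    {_id: 1, host: "newNode2:27017"},
    {_id: 2, host: "arbiter1:27017", arbiterOnly: true}
  ]
})'
```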
The Issues
Workarounds
Summary

The steps above allowed me to achieve my goals. The fact that the server modified data in the local database when it was imported into a standalone server was unexpected and should be fixed, albeit at a low priority.

Note: a script to reproduce this issue is attached.
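For reference, a minimal sketch of a reproduction along these lines; the ports and paths are assumptions, and the attached script remains the authoritative version:

```
#!/bin/bash
# Dump the local database from a replica set member:
mongodump --port 27017 --db local --out /tmp/localdump

# replset.minvalid on the standalone before the restore (likely absent/null):
mongo --port 27018 --quiet --eval 'printjson(db.getSiblingDB("local")["replset.minvalid"].findOne())'

# Restore the dumped local database into the standalone:
mongorestore --port 27018 --db local /tmp/localdump/local

# replset.minvalid after the restore; if the bug reproduces, this differs
# from the document that was actually in the dump:
mongo --port 27018 --quiet --eval 'printjson(db.getSiblingDB("local")["replset.minvalid"].findOne())'
```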
| Comments |
| Comment by Spencer Brody (Inactive) [ 19/Jan/18 ] |

arnie.listhaus, agreed that this is weird behavior; however, this also isn't a supported way to start a new replica set, so I'm reluctant to spend any more time on this issue currently. Another way to do what you're trying to do, I believe, would be to change step 3 from dropping the local database to dropping all collections in the local database except the oplog (although this also isn't technically a supported procedure, so no promises that it won't break in the future). I haven't actually tested that procedure, though, so it's possible it may not work. If it doesn't, another idea (also untested) would be to seed the oplog on the second node with the most recent oplog entry from the other node, rather than trying to dump/restore the entire local database.
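A rough, equally untested sketch of the oplog-seeding idea, using the shell's connect() helper to read from the node that kept its oplog; the hostnames and the oplog size are made-up values:

```
# Run against the second node (standalone, with a fresh local database):
mongo --host nodeB:27017 --eval '
  // Pull the most recent oplog entry from the node that kept its oplog.
  var src = connect("nodeA:27017/local");
  var last = src.oplog.rs.find().sort({$natural: -1}).limit(1).next();

  // The oplog must exist as a capped collection before it can be seeded.
  var dst = db.getSiblingDB("local");
  dst.createCollection("oplog.rs", {capped: true, size: 10 * 1024 * 1024 * 1024});
  dst.oplog.rs.insert(last);
'
```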
| Comment by Judah Schvimer [ 10/Jan/18 ] |

I've done a bit of debugging on this, and by adding some extra log lines around the restore call I can see that minvalid certainly is getting set, but I don't see anything explicitly setting it. It's definitely strange.
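One external way to see the same thing, assuming a dump layout like the /tmp/localdump used in the sketch above, is to compare the minvalid document in the dump against what the standalone holds after the restore:

```
# What the dump itself contains for minvalid (bsondump ships with the tools):
bsondump /tmp/localdump/local/replset.minvalid.bson

# What the standalone holds after mongorestore; when the bug reproduces,
# the two documents do not match:
mongo --port 27018 --quiet --eval 'printjson(db.getSiblingDB("local")["replset.minvalid"].findOne())'
```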
| Comment by Vick Mena (Inactive) [ 08/Jan/18 ] |

An alternate workflow would be to use mongo-connector to mirror only the database in question, allowing the application to transition seamlessly from one replica set to another with zero downtime. A better approach would be for mongomirror to support non-Atlas targets and to improve its mirroring capabilities.
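A hedged sketch of what such a mongo-connector invocation might look like; the hosts and namespaces are placeholders, and the exact flags should be checked against the mongo-connector documentation:

```
# Mirror only the collections of the 1TB database from the old set to the
# new one; -n limits replication to the listed db.collection namespaces.
mongo-connector -m oldnode1:27017 \
                -t mongodb://newnode1:27017 \
                -d mongo_doc_manager \
                -n bigdb.coll1,bigdb.coll2
```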