[SERVER-10833] Initial Sync Performance Created: 20/Sep/13  Updated: 03/Jan/14  Resolved: 04/Nov/13

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: 2.2.1, 2.4.6
Fix Version/s: None

Type: Question Priority: Major - P3
Reporter: Winand Appelhoff Assignee: Unassigned
Resolution: Won't Fix Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

OpenSuse x64


Participants:

 Description   

After multiple hard disk failures on the secondary member of a replica set, we are desperately trying to bring a new member up and running without taking down the whole system and manually copying the files from the primary.

After mongo completed copying the data, the sync process has been more or less stuck on creating the indices.
This step has already taken almost a week and is still not finished.

Any error or loss of connection to the primary during the sync restarts the whole process from the beginning, deleting all previously synced data and indices.

log excerpt:

2013-09-20T11:22:07.010835+02:00 rack4-5 mongod.30000[25590]: Fri Sep 20 11:22:07.010 [rsSync] #011#011Index: (2/3) BTree Bottom Up Progress: 3141451300/3764787709#01183%
2013-09-20T11:22:17.005409+02:00 rack4-5 mongod.30000[25590]: Fri Sep 20 11:22:17.004 [rsSync] #011#011Index: (2/3) BTree Bottom Up Progress: 3141577600/3764787709#01183%
2013-09-20T11:22:27.010042+02:00 rack4-5 mongod.30000[25590]: Fri Sep 20 11:22:27.009 [rsSync] #011#011Index: (2/3) BTree Bottom Up Progress: 3141705300/3764787709#01183%
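
For a rough sense of scale, the counters in progress lines like these can be turned into a throughput and a remaining-time estimate. This is a minimal sketch, not a mongod feature: the regular expression and the function names are hypothetical, it assumes the build rate stays constant, it assumes all samples fall on the same day, and it covers only the index phase being logged (here "(2/3)").

```python
import re
from datetime import datetime

# Hypothetical pattern: captures the mongod-local time and the done/total counters
# from "[rsSync] ... Progress: done/total" lines as they appear in the excerpt.
LINE_RE = re.compile(
    r"(\d{2}:\d{2}:\d{2}\.\d{3}) \[rsSync\].*Progress: (\d+)/(\d+)"
)


def parse_progress(line):
    """Return (time, done, total) for one progress line, or None if it does not match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    t = datetime.strptime(m.group(1), "%H:%M:%S.%f")
    return t, int(m.group(2)), int(m.group(3))


def estimate_remaining_seconds(lines):
    """Estimate seconds left in the current phase from the first and last samples.

    Assumes a constant rate and that no sample crosses midnight.
    """
    samples = [p for p in map(parse_progress, lines) if p is not None]
    if len(samples) < 2:
        raise ValueError("need at least two progress lines")
    (t0, done0, _), (t1, done1, total) = samples[0], samples[-1]
    elapsed = (t1 - t0).total_seconds()
    if elapsed <= 0 or done1 <= done0:
        raise ValueError("no measurable progress between samples")
    rate = (done1 - done0) / elapsed  # counter units per second
    return (total - done1) / rate
```

Fed the three lines above, the counter advances by 254000 in about 20 seconds, which at a constant rate leaves on the order of 13 to 14 hours for this phase of this index. Three samples are a very small basis for an estimate, so treat the figure as indicative only.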

The data size is currently 2.4 TB and is growing at a rate of about 20 GB per day.

Is there any way to speed up this process?
Is there any way to have some sort of incremental initial sync? In case of an error, having to restart the whole process from the beginning is utterly insane.
Why do indices have to be rebuilt from scratch instead of just being copied from the primary anyway?

Currently MongoDB is basically unusable for our purpose, because in the event of a hardware failure we are unable to get a replacement back online in a reasonable amount of time without shutting down the whole database.



 Comments   
Comment by Scott Hernandez (Inactive) [ 20/Sep/13 ]

It is best to ask these kinds of questions on the mongodb-user list (http://groups.google.com/forum/#!forum/mongodb-user). This project is meant for bugs and feature requests.

Indexes are built as quickly as possible and there are no user-tunable options, aside from having more memory or faster storage.

Replication works at the operation (logical) level and not at the block or file level, so each member must create its own storage files – this model allows different members to run different versions and storage formats.

It is best to use a filesystem backup or copy of the dbpath files to bring up a new member, to reduce the time and work done, as described in the docs here: http://docs.mongodb.org/manual/tutorial/resync-replica-set-member/#replica-set-resync-by-copying

If you have a recent backup which you can use to start the new secondary from, I would suggest using it. As long as the backup was made within the current oplog window, the new member will be able to replicate and catch up.
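
The oplog-window condition can be written down as a simple comparison. This is a minimal sketch under stated assumptions: the function name and the safety margin are hypothetical, and the "oplog first event time" is assumed to be read by hand from the output of `db.printReplicationInfo()` on the member the restored node will sync from.

```python
from datetime import datetime, timedelta


def backup_within_oplog_window(backup_time, oplog_first_event, margin=timedelta(0)):
    """Return True if the backup is no older than the oldest oplog entry plus a margin.

    backup_time:       when the backup or file copy was taken
    oplog_first_event: oldest entry still in the sync source's oplog
    margin:            hypothetical headroom for the time needed to restore and start
                       the member, during which the oplog keeps rolling forward
    """
    return backup_time >= oplog_first_event + margin


# Illustrative values only, not taken from the ticket.
oplog_first = datetime(2013, 9, 18, 6, 0)
print(backup_within_oplog_window(datetime(2013, 9, 19, 2, 0), oplog_first))
print(backup_within_oplog_window(datetime(2013, 9, 17, 2, 0), oplog_first))
```

A backup that fails this check cannot catch up from the oplog alone and would require a newer copy or a full initial sync.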
