Major - P3
We are experiencing performance issues on the master while mongodump backups run on the secondaries.
Our applications perform all reads and writes against the master. The slaves are used only as replica set members and for backups.
Our backups are performed at 8am / 8pm every day and can take a considerable amount of time to run on the slaves.
Backups are run using the mongodump command as follows for each DB on the cluster.
'/usr/bin/mongodump --host=localhost --gzip --db=210596-game-live --excludeCollection=requestsByUserState --out=/mnt/backup --quiet'
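For reference, the per-database run can be sketched as a small wrapper script. This is a minimal sketch, not our exact backup script: the `dump_db` function, the `DRY_RUN` switch and the one-entry database list are illustrative assumptions, and it assumes the same flags are used for every DB. Only the mongodump flags themselves are taken from the command above.

```shell
#!/bin/sh
# Sketch only: dump_db, DRY_RUN and the database list below are
# hypothetical; the mongodump flags are copied from the ticket.
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the command, anything else = run it

dump_db() {
    db="$1"
    # Build the command as positional parameters so quoting is preserved.
    set -- /usr/bin/mongodump --host=localhost --gzip --db="$db" \
        --excludeCollection=requestsByUserState --out=/mnt/backup --quiet
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Assumed list of databases; only this one name appears in the ticket.
for db in 210596-game-live; do
    dump_db "$db"
done
```

With `DRY_RUN` left at its default the script only prints the command it would run; setting `DRY_RUN=0` executes mongodump against the local secondary.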
Stopping the mongodump process from running on the secondaries resolves the master's performance issues.
So far we have seen this issue on two of our mongo clusters.
Please see details for one of the clusters we have seen this issue with.
To reduce cost, we usually run a master that is larger than the slaves, as follows.
gsp-aeu001-mo04 (M) r3.2xlarge
gsp-aeu001-mo05 (S) r3.xlarge
gsp-aeu001-mo06 (S) r3.xlarge
At a high level, we can see that at 8am/8pm the number of commands, queries, updates, etc. drops off for a period of time while the backups run. This in turn causes the Java applications connecting to the master to experience thread queuing.
At the same time as the commands fluctuate, the performance of all three servers looks as follows.
It appears that the performance of the secondaries backing up causes a degradation in performance on the master.
We have gathered the "/var/lib/mongodb/diagnostic.data/" data for each of the servers. Do you have a non-public place where we can upload this data, please?
Increasing the size of the Mongo secondaries to match the master has improved performance; however, we would like to understand why backups on the secondaries cause performance issues on the master.