Priority: Major - P3
Affects Version/s: None
Fix Version/s: None
We have been experiencing performance problems on our delayed replica servers since we upgraded them to 2.2.2.
We have one large database sharded into two partitions, each containing one master, one replica, and one delayed replica. Each of these mongo instances runs on a dedicated server. The delayed replica has a "slaveDelay" value of "345600" (4 days) and is configured with "buildIndexes: false". This setup was working fine with our old 2.0.0 installation.
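To make the delayed-member settings concrete, here is a minimal sketch of what such a member document might look like, written as a plain Python dict. The "_id" and "host" values are hypothetical placeholders, not our real configuration; only "slaveDelay" and "buildIndexes" come from the description above. "priority: 0" is included because MongoDB requires it for delayed members and for members that do not build indexes. The snippet also converts the delay to days as a sanity check.

```python
# Sketch of a delayed replica set member document (MongoDB 2.x field names).
SECONDS_PER_DAY = 24 * 60 * 60

slave_delay_seconds = 345600  # value quoted in the report

delayed_member = {
    "_id": 2,                              # hypothetical member id
    "host": "delayed1.example.net:27017",  # hypothetical host name
    "priority": 0,                         # required for delayed / index-less members
    "slaveDelay": slave_delay_seconds,
    "buildIndexes": False,
}

# Convert the configured delay from seconds to days.
delay_days = delayed_member["slaveDelay"] / SECONDS_PER_DAY
print(f"slaveDelay = {slave_delay_seconds} s = {delay_days:g} days")
```

With the value quoted above, the delay works out to 4 days.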
Last week we upgraded all servers to 2.2.2, following the upgrade instructions. When we finished, we noticed that the load on both delayed servers had increased (from 0.x to 6-7) because the mongo processes were performing disk operations. To fix it, we performed a full resync (we deleted all the data in the dbpath and restarted the service).
That seemed to solve the problem, but yesterday, just as one of the servers reached the "slaveDelay" lag, its load increased again to 5-6. I saw that the affected server has 2 "repl writer worker" threads while the other one (which has not yet reached the slaveDelay limit) has only one, but I don't know whether this matters. I expect that when the second server reaches the limit (sometime today or tomorrow) it will show the same problem. I searched the forums but found nothing on how to solve it.
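For anyone trying to reproduce this, one way to tell whether a delayed member has reached its configured delay is to compare the "optimeDate" of that member with the primary's in the output of replSetGetStatus. The following is only a sketch under assumptions: the status document is a hand-written sample shaped like replSetGetStatus output, and the host names, timestamps, and the helper name replication_lag_seconds are all hypothetical.

```python
from datetime import datetime, timedelta


def replication_lag_seconds(status, member_name):
    """Return how far member_name trails the primary, in seconds."""
    members = status["members"]
    primary = next(m for m in members if m["stateStr"] == "PRIMARY")
    member = next(m for m in members if m["name"] == member_name)
    return (primary["optimeDate"] - member["optimeDate"]).total_seconds()


# Hand-written sample shaped like replSetGetStatus output (hypothetical values).
now = datetime(2013, 1, 10, 12, 0, 0)
sample_status = {
    "members": [
        {"name": "master1.example.net:27017", "stateStr": "PRIMARY",
         "optimeDate": now},
        {"name": "delayed1.example.net:27017", "stateStr": "SECONDARY",
         "optimeDate": now - timedelta(seconds=345600)},
    ]
}

SLAVE_DELAY = 345600  # configured slaveDelay, in seconds

lag = replication_lag_seconds(sample_status, "delayed1.example.net:27017")
reached_delay = lag >= SLAVE_DELAY  # True once the member starts applying ops
print(lag, reached_delay)
```

Against a live replica set, the status document would come from running replSetGetStatus on one of the members instead of the sample dict.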
Can anyone tell me how to proceed?
I forgot to mention: I just switched one of the delayed servers to "--nojournal" to see whether it makes a difference.
So far, the "--nojournal" option has made no difference.