[SERVER-10474] How does replication work and what are the performance bottlenecks? Created: 09/Aug/13 Updated: 10/Dec/14 Resolved: 19/Aug/13 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | 2.2.3 |
| Fix Version/s: | None |
| Type: | Question | Priority: | Major - P3 |
| Reporter: | Johnny Boy | Assignee: | Stennie Steneker (Inactive) |
| Resolution: | Done | Votes: | 0 |
| Labels: | performance, replicaset | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: |
Ubuntu LTS |
||
| Participants: |
| Description |
|
We've been having issues with replication lag. ), let it sync up and then do the same for the other slave. Only then were both able to catch up with the oplog. Why does restarting MongoDB help a secondary start catching up when it is 1800+ seconds behind? Setting maintenance mode did not help. What are the bottlenecks of replication? Does the maximum capacity to replicate one database depend on how much one CPU core can handle? How do we monitor this limitation? |
| Comments |
| Comment by Stennie Steneker (Inactive) [ 19/Aug/13 ] |
|
Hi Johnny, As we haven't heard from you in a while I'm going to close this issue. If you have further support questions, the community forums like mongodb-users and StackOverflow are a better starting point. Thanks, |
| Comment by Stennie Steneker (Inactive) [ 13/Aug/13 ] |
|
Hi Johnny, You should not need to set maintenance mode or restart the secondary in order to have replication continue successfully. Are you by any chance running any scripts or commands that kill long-running operations on the server? Can you open a ticket in the SUPPORT (Community Private) project and attach your mongod logs? Please reference As for more details on replication mechanics, I would suggest reviewing the Replication documentation in the manual as well as:
Regards, |
| Comment by Johnny Boy [ 12/Aug/13 ] |
|
Hello! Thank you for the summary. I will use the user group next time. Yes, those are the hosts in the MMS group. The pattern is that when we deploy or hit peak traffic, when the most writes are being performed, the replication lag keeps growing and the secondaries have a hard time catching up without manual intervention. Is there anything in particular I should look for in the logs? As for the technical details of how replication works, is there any detailed documentation I can read? Thank you! |
| Comment by Stennie Steneker (Inactive) [ 12/Aug/13 ] |
|
Hi Johnny, The SERVER project is for reporting bugs or feature suggestions for the MongoDB server. For MongoDB-related support discussion you should post on the mongodb-users group (http://groups.google.com/group/mongodb-user) or Stack Overflow. A question like this, which involves more discussion, would be best posted on the mongodb-users group. In regards to replication lag, you would need to be more specific about what you are seeing. For example, is this sustained replication lag or just apparent short jumps in replication lag? Assuming you are referring to hosts in the MMS group linked to your user account, it looks like you have had only one brief bump in replication lag over the past week rather than a sustained problem. Without seeing the full logs it is hard to know what else was happening at the time, but I would suspect that a resource issue such as networking or I/O contention could have affected your replication. If you would like to attach your logs here for review we may be able to provide more insight. Note, though, that issues and attachments in the SERVER project are publicly visible. Regards, |
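As a rough illustration of the distinction above between sustained lag and a brief bump, the sketch below computes per-secondary lag from a document shaped like the output of the `replSetGetStatus` command (the `members`, `stateStr`, `name` and `optimeDate` fields) and checks whether a series of lag samples stays above a threshold. The host names, the sample data, the helper names, and the threshold and sample-count defaults are all assumptions made for illustration; they are not taken from this ticket. With a live deployment, the status document could be fetched with PyMongo via `client.admin.command("replSetGetStatus")`.

```python
from datetime import datetime, timedelta


def replication_lag_seconds(status):
    """Return {member name: lag in seconds} for every SECONDARY.

    `status` is a dict shaped like replSetGetStatus output. Lag is the
    difference between the primary's optimeDate and the member's optimeDate.
    """
    members = status["members"]
    primary = next((m for m in members if m["stateStr"] == "PRIMARY"), None)
    if primary is None:
        raise ValueError("no PRIMARY in status document")
    return {
        m["name"]: (primary["optimeDate"] - m["optimeDate"]).total_seconds()
        for m in members
        if m["stateStr"] == "SECONDARY"
    }


def is_sustained(lag_samples, threshold_s=60.0, min_consecutive=5):
    """True if lag exceeded threshold_s for min_consecutive samples in a row.

    Both defaults are arbitrary illustrative values, not recommendations.
    """
    run = 0
    for lag in lag_samples:
        run = run + 1 if lag > threshold_s else 0
        if run >= min_consecutive:
            return True
    return False


# Hand-written sample data with hypothetical host names.
now = datetime(2013, 8, 9, 12, 0, 0)
sample_status = {
    "members": [
        {"name": "db1:27017", "stateStr": "PRIMARY", "optimeDate": now},
        {"name": "db2:27017", "stateStr": "SECONDARY",
         "optimeDate": now - timedelta(seconds=1800)},
        {"name": "db3:27017", "stateStr": "SECONDARY",
         "optimeDate": now - timedelta(seconds=2)},
    ]
}

lags = replication_lag_seconds(sample_status)
print(lags)
print(is_sustained([5, 120, 3, 2, 1]))       # one brief bump
print(is_sustained([70, 80, 90, 100, 110]))  # lag stays high
```

Sampling the lag at a fixed interval and feeding the series to a check like `is_sustained` is one simple way to tell a short spike during a deploy from a secondary that is genuinely falling behind.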