[SERVER-7810] Replication lag when updates handled by database. Created: 30/Nov/12 Updated: 11/Jul/16 Resolved: 22/Jan/13 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | 2.2.2 |
| Fix Version/s: | None |
| Type: | Question | Priority: | Major - P3 |
| Reporter: | Azat Khuzhin | Assignee: | David Hows |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: |
version: 2.2.2-rc1 |
||
| Attachments: |
|
| Participants: |
| Description |
|
I have the following configuration: All insert/update requests go to server1/application server. There were about 20 concurrent updates. (The collection that handles the updates has 819 366 177 documents, and it is capped. I don't change the object size in updates.) But there is the following interesting behavior:
Why can't the SECONDARY handle operations like the PRIMARY in this case? |
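For reference, the replication lag discussed in this ticket can be estimated from the reply of the `replSetGetStatus` command by comparing each secondary's `optimeDate` with the primary's. The following is only a minimal sketch under stated assumptions: the sample reply is hypothetical (the `server2` host name, the timestamps, and the 5-minute threshold are made up for illustration), and it works on a plain dict rather than a live connection.

```python
from datetime import datetime, timedelta


def replication_lag(status):
    """Return {member name: lag as timedelta} for each SECONDARY.

    `status` is a dict shaped like a replSetGetStatus reply: a "members"
    list whose entries carry "name", "stateStr" and "optimeDate".
    """
    members = status["members"]
    primary = next(m for m in members if m["stateStr"] == "PRIMARY")
    return {
        m["name"]: primary["optimeDate"] - m["optimeDate"]
        for m in members
        if m["stateStr"] == "SECONDARY"
    }


# Hypothetical sample reply; host names and timestamps are made up.
sample_status = {
    "members": [
        {"name": "server1:27017", "stateStr": "PRIMARY",
         "optimeDate": datetime(2012, 11, 30, 12, 0, 0)},
        {"name": "server2:27017", "stateStr": "SECONDARY",
         "optimeDate": datetime(2012, 11, 30, 10, 30, 0)},
    ]
}

lags = replication_lag(sample_status)

# Hypothetical alert threshold, chosen only for the example.
threshold = timedelta(minutes=5)
lagging = [name for name, lag in lags.items() if lag > threshold]
```

Sampling this value periodically gives the kind of lag graph requested later in the thread.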
| Comments |
| Comment by Azat Khuzhin [ 13/Jan/13 ] | |||||||||||||||||||||||||||||||||||
|
Hi David, It seems not. Thanks. | |||||||||||||||||||||||||||||||||||
| Comment by David Hows [ 28/Dec/12 ] | |||||||||||||||||||||||||||||||||||
|
Hi Azat, Is there anything further we can do with this issue? Or can we close it? Cheers, David | |||||||||||||||||||||||||||||||||||
| Comment by Azat Khuzhin [ 17/Dec/12 ] | |||||||||||||||||||||||||||||||||||
|
> The memory-footprints.png output would have been the graph I would use for this, but it doesn't contain any data about cache or buffered memory (as free does). Additionally the memory footprints numbers are extraordinarily low (1M of virtual memory max) and I wanted to confirm these with top. Sorry, the units on the graph are incorrect. For cached/buffers/free, see the new attachments. Yes, I also think that IO is the main problem. I thought about MMS, but for now the internal monitoring system is enough. Thanks for the help. | |||||||||||||||||||||||||||||||||||
| Comment by David Hows [ 17/Dec/12 ] | |||||||||||||||||||||||||||||||||||
|
Hi Azat, The memory-footprints.png output would have been the graph I would use for this, but it doesn't contain any data about cache or buffered memory (as free does). Additionally, the memory footprint numbers are extraordinarily low (1M of virtual memory max) and I wanted to confirm these with top. The page fault numbers are also extremely high, averaging 5000 per minute according to your graph, and the background flush times are also higher than I would expect. This leads me to believe you have potential memory usage and IO problems in your environment. Have you considered installing MMS? It's free and available at mms.10gen.com Cheers, David | |||||||||||||||||||||||||||||||||||
| Comment by Azat Khuzhin [ 14/Dec/12 ] | |||||||||||||||||||||||||||||||||||
|
Hi David, I've already fixed this by using "writeConcern" with "w=2", so it is no longer reproducible. But I found information about page faults and background flush times. See the attachment. A sample document:
Update:
May be useful:
And regarding top and free, could you explain what you need from these commands? | |||||||||||||||||||||||||||||||||||
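Why "w=2" makes the lag disappear can be illustrated with a toy model. With "w=2" each writer blocks until a secondary has acknowledged its write, so the clients cannot run further ahead of the secondary than the number of concurrent writers. The sketch below is not how mongod is implemented; the rates and tick count are hypothetical, and only the figure of 20 concurrent writers comes from this ticket.

```python
def simulate(ticks, offered_rate, secondary_rate, wait_for_secondary, writers=20):
    """Return the secondary's backlog (ops accepted but not yet applied).

    Toy model: in every tick the clients try to write `offered_rate` ops and
    the secondary applies at most `secondary_rate` ops.
    """
    backlog = 0
    for _ in range(ticks):
        accepted = offered_rate
        if wait_for_secondary:
            # Each writer with an unacknowledged op is blocked, so only the
            # remaining writers can issue a new op in this tick.
            accepted = min(offered_rate, max(0, writers - backlog))
        backlog += accepted
        backlog -= min(backlog, secondary_rate)
    return backlog


# Hypothetical rates: clients offer 10 ops per tick, the secondary applies 6.
unbounded = simulate(1000, 10, 6, wait_for_secondary=False)
bounded = simulate(1000, 10, 6, wait_for_secondary=True)
```

In this model the backlog keeps growing with run time when writers do not wait, and stays at or below the number of writers when they do. The cost is that write throughput drops to what the secondary can sustain, so the underlying IO limits still apply.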
| Comment by David Hows [ 11/Dec/12 ] | |||||||||||||||||||||||||||||||||||
|
Hi Azat, From what I have been able to gather from your graphs, you may have issues with your working set or updates. Are you able to get information about mongod background flush times and page faults? Would it be possible for you to start using MMS for 24-48 hours and share your URL? Can you attach a sample document so I can see your schema, and can you explain what kind of updates you do to your documents? If possible, I would also like to see data about your mongod instances' memory usage. Could you attach the output of:
Cheers, David | |||||||||||||||||||||||||||||||||||
| Comment by Azat Khuzhin [ 10/Dec/12 ] | |||||||||||||||||||||||||||||||||||
|
David, I attached graphs. If you have any questions about it, feel free to ask. | |||||||||||||||||||||||||||||||||||
| Comment by David Hows [ 07/Dec/12 ] | |||||||||||||||||||||||||||||||||||
|
Hi Azat, Would you be able to attach graphs of the following from your stats along with the disk and CPU statistics?
Can you also show the two points of data that show the hours of replication lag? Cheers, David | |||||||||||||||||||||||||||||||||||
| Comment by Azat Khuzhin [ 04/Dec/12 ] | |||||||||||||||||||||||||||||||||||
|
The following MongoDB performance stats are available:
There are also other metrics such as memory, disk, network and others. | |||||||||||||||||||||||||||||||||||
| Comment by David Hows [ 04/Dec/12 ] | |||||||||||||||||||||||||||||||||||
|
Hi Azat, Does your internal monitoring solution collect any of the mongodb performance stats from your instances? If so, which ones do you have available? We would like to see some of these indicators within your system to determine what is happening when the replication lag occurs. Cheers, David | |||||||||||||||||||||||||||||||||||
| Comment by Azat Khuzhin [ 02/Dec/12 ] | |||||||||||||||||||||||||||||||||||
|
No, we have an internal monitoring system. Thanks. | |||||||||||||||||||||||||||||||||||
| Comment by Eliot Horowitz (Inactive) [ 01/Dec/12 ] | |||||||||||||||||||||||||||||||||||
|
Is the replica set in MMS? |