[SERVER-9308] Locking and Slow Query issues after upgrading from Mongo 2.0.7 to 2.2.3 Created: 09/Apr/13 Updated: 11/Jul/16 Resolved: 30/Apr/13 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Querying |
| Affects Version/s: | 2.2.3 |
| Fix Version/s: | None |
| Type: | Question | Priority: | Minor - P4 |
| Reporter: | Eric Hernandez | Assignee: | Hannes Magnusson |
| Resolution: | Done | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Environment: | debian squeeze 6.0.3 |
| Attachments: | |
| Participants: | |
| Description |
|
After upgrading from Mongo 2.0.7 to Mongo 2.2.3 we started seeing a 6,500% increase in slow-query log entries, and locking on the entire instance went from under 5% to over 20%.

We suspect the root cause is that Mongo 2.2.3 collects and reports more statistics than 2.0.7, so locking issues that were already happening under 2.0.7, but going unreported, have now become visible. The majority of the reported slow queries are updates to documents by _id. We also have a legacy process that reads the oplog in order to extract changes and push them into MySQL. The constant processing of the oplog takes read locks on the oplog, which in turn intermittently causes updates in the database to wait on those locks.

Does this theory sound plausible, i.e. that we always had this issue and it has only become visible as a result of the upgrade?

Sanitized version of a typical slow-log entry:

    Tue Apr 2 19:59:58 [conn313] update db.Collection query: { _id: "sanitized" } update: { $set: { sanitized_data: BinData, sanitized_time: new Date(1364957998272) } } nscanned:0 nupdated:1 fastmodinsert:1 keyUpdates:0 locks(micros) w:390138 390ms
|
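For reference, an update of the shape shown in that slow-log line could be produced from the PHP driver roughly as follows. This is only an illustrative sketch: the host, database, collection, and field values are placeholders, and the 1.3.x driver classes are assumed; none of it is taken from the actual application.

```php
<?php
// Hypothetical reproduction of the logged operation: an update by _id that
// $sets a binary blob and a timestamp; fastmodinsert:1 in the log indicates
// the update was an upsert that resulted in an insert.
$mongo      = new MongoClient('mongodb://db.example.com');       // placeholder host
$collection = $mongo->selectDB('db')->selectCollection('Collection');

$collection->update(
    array('_id' => 'sanitized'),                                  // match by _id only
    array('$set' => array(
        'sanitized_data' => new MongoBinData('payload', 0),       // BinData (0 = generic subtype)
        'sanitized_time' => new MongoDate(),                      // maps to new Date(...) in the log
    )),
    array('upsert' => true)
);
```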
| Comments |
| Comment by Hannes Magnusson [ 30/Apr/13 ] |
|
I don't seem to be able to pinpoint the actual fix we made for this in the driver, but I encourage you to update to at least the 1.3.x series. |
| Comment by Hannes Magnusson [ 22/Apr/13 ] |
|
Right, that definitely fits the profile of the bug: under certain conditions, after a while (either after multiple requests, or in a long-lived CLI process), the WriteConcern value would get corrupted. It is fixed in the 1.3 series, along with several other important fixes, and I strongly encourage you to upgrade. |
| Comment by Eric Hernandez [ 22/Apr/13 ] |
|
It is a single long-lived process, which I described in detail in an earlier post on this ticket (April 10). |
| Comment by Hannes Magnusson [ 22/Apr/13 ] |
|
Oh, wait, it was executed via the CLI. |
| Comment by Hannes Magnusson [ 22/Apr/13 ] |
|
Are you running long-lived webserver processes (MaxRequestsPerChild set to 0 or a very high number)? The 1.2 series was abandoned a long time ago, and all users are strongly encouraged to upgrade to the 1.3.x driver. I suspect it was just a coincidence that you started experiencing this after upgrading mongod; the bug has been there for a while, but in a very subtle place. |
| Comment by Eric Hernandez [ 22/Apr/13 ] |
|
We were using:
Application Server:
Mongo Server:
|
| Comment by Hannes Magnusson [ 22/Apr/13 ] |
|
Hey, which PHP driver version are you using? I believe you may be hitting an old driver bug where WriteConcern settings sometimes turned into gigantic numbers, causing operations to time out. I would recommend you upgrade to the latest release (currently 1.3.7). |
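For anyone following along, the driver version actually loaded by a given PHP environment can be checked directly; a small sketch, with nothing specific to the reporter's setup:

```php
<?php
// Print the loaded mongo extension version. The MongoClient class only
// exists from driver 1.3.0 onwards, so its presence is a quick second check.
echo 'mongo extension version: ' . phpversion('mongo') . PHP_EOL;
echo 'MongoClient available (1.3+): ' . (class_exists('MongoClient') ? 'yes' : 'no') . PHP_EOL;
```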
| Comment by Scott Hernandez (Inactive) [ 17/Apr/13 ] |
|
Eric, we are going to try to reproduce this behavior. Can you tell us more about the environment you are running this PHP script in? What PHP version? What version of the driver/library? How is the script run (terminal, cron, nginx/apache/etc.)? If it is running on a Linux host, what kernel/distro/etc.? Is the server running "debian squeeze 6.0.3" (as listed in the environment section of the issue), or something else? |
| Comment by Eric Hernandez [ 12/Apr/13 ] |
|
Scott, here is the (sanitized) oplog extractor loop:

    <?php
    // Setup Mongo connection
    // ...
    while (true) {
        // [an if/else originally stood here; the first branch did not survive
        //  the paste - the surviving else branch simply re-queries the whole oplog]
        $cursor = $opLogCollection->find();

        $cursor->sort(array('$natural' => 1))->tailable();

        if ($cursor->dead()) {
            break;
        }

        foreach ($cursor as $entry) {
            // insert into mysql
            // ...
            $lastTimestamp = $entry['ts'];
        }

        if ($lastTimestamp) {
            // update timestamp file
        }

        sleep(2);
    }
|
| Comment by Scott Hernandez (Inactive) [ 11/Apr/13 ] |
|
Those docs were written starting with 2.2, so no. I can summarize a bit, though: the locking changes in 2.2 introduced database-level (more granular) locks. Before that there was a single process-wide reader-writer lock (mutex); even these descriptions are a simplification of the interactions, particularly with the oplog living in the local db. In addition to the locking changes, the whole (internal) replication process was re-factored in 2.2.

The drop in lock%, increase in network output, and reduction in connections don't seem consistent with just adding the await_data flag to the query. If you can provide a sample of your script that reproduces the problem, we can take a look at how it performs in each environment; aside from that, it is hard to know what server-side changes could have caused this. I'm glad you were able to get things working. |
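To make the 2.0 vs. 2.2 difference concrete, the per-database lock counters introduced in 2.2 can be inspected with the serverStatus command; 2.0 only reports the process-wide globalLock section. A rough sketch using the PHP driver (the connection string is a placeholder; "local" and "fe" are the databases mentioned in this thread):

```php
<?php
// Sketch: dump the process-wide lock counters (present in 2.0 and 2.2) and
// the per-database "locks" section that serverStatus added in 2.2.
// Assumes a 1.3.x PHP driver.
$mongo  = new MongoClient('mongodb://primary.example.com');   // placeholder
$status = $mongo->selectDB('admin')->command(array('serverStatus' => 1));

print_r($status['globalLock']);   // totalTime / lockTime, reported by both versions

foreach (array('local', 'fe') as $dbName) {
    if (isset($status['locks'][$dbName])) {
        echo $dbName . ":\n";
        print_r($status['locks'][$dbName]);   // timeLockedMicros / timeAcquiringMicros (2.2+)
    }
}
```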
| Comment by Eric Hernandez [ 11/Apr/13 ] |
|
Scott, our problem is technically fixed, but we still don't know why it started happening after we upgraded to 2.2.3. Do you have the 2.0 version of this document? Or can you explain the differences between 2.0 and 2.2 with respect to writes and locks in replica sets? Thank you for your help troubleshooting this issue; you can close this ticket, as we are back in action at full speed. |
| Comment by Scott Hernandez (Inactive) [ 11/Apr/13 ] |
|
Eric, without more information there is not much I can do here beyond guessing from the little we have, which I have already done. Please let us know if there is any more information you can share with us; otherwise we would like to close this out over the next few days. Creating a new issue, or reopening this one, is always an option as well. |
| Comment by Eric Hernandez [ 10/Apr/13 ] |
|
FYI: oplog processing turned back on at 1:22 PM PDT. |
| Comment by Eric Hernandez [ 10/Apr/13 ] |
|
Yes, the CPU dropped. I have also attached a Cacti graph of load average for db2.
|
| Comment by Scott Hernandez (Inactive) [ 10/Apr/13 ] |
|
I saw the lock% drop on the "fe" db, but it did not change much on the local db (where the oplog is). Also, the query/getmore op-counters didn't drop, which I would expect to see from stopping your oplog tailing code as you described it. This doesn't look like oplog-related activity (at least not directly from the tailing queries). Your oplog tailing code might shed some more light on it as well. Did mongod CPU drop as well when you stopped the oplog tail code? Hopefully the mongotop numbers will provide more insight into which collections are active in the two different states. |
| Comment by Scott Hernandez (Inactive) [ 10/Apr/13 ] |
|
Also, it would be very useful to have those "mongotop" stats from both cases (when you are running the oplog tail, and without). |
| Comment by Scott Hernandez (Inactive) [ 10/Apr/13 ] |
|
It sounds like your oplog tailing code is not using the correct pattern or flags, as the server's own replication code does. The general pattern is to start from the last oplog ts value so you only fetch new oplog entries, rather than scanning the whole thing over and over again. Can you please post the code so I can suggest some improvements? I will take a look at the half-hour window when your oplog code isn't running (well, after it has passed, of course). |
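A sketch of the "resume from the last ts" pattern described above, as it might look with the legacy PHP driver (1.3.x assumed). The host, the oplog collection name (oplog.rs on a replica set), and the way the last timestamp is persisted are placeholders rather than the reporter's actual code:

```php
<?php
// Tail the oplog starting from the last timestamp seen, instead of
// re-scanning the whole capped collection on every pass.
$mongo = new MongoClient('mongodb://primary.example.com');   // placeholder
$oplog = $mongo->selectDB('local')->selectCollection('oplog.rs');

$lastTs = new MongoTimestamp(0, 0);   // in practice: load the last processed ts from disk

while (true) {
    $cursor = $oplog
        ->find(array('ts' => array('$gt' => $lastTs)))   // only entries newer than the last one seen
        ->tailable(true)                                  // keep the cursor open at the end of the capped collection
        ->awaitData(true);                                // let the server block briefly instead of busy-polling

    while (!$cursor->dead()) {
        if (!$cursor->hasNext()) {
            continue;                                     // awaitData already waited server-side
        }
        $entry  = $cursor->getNext();
        // ... apply $entry to MySQL here ...
        $lastTs = $entry['ts'];                           // remember (and periodically persist) the newest ts
    }

    sleep(1);   // cursor died (fell off the end of the capped collection); re-query from $lastTs
}
```

The relevant differences from a plain find() loop are the ts filter, which avoids re-reading the whole oplog, and the tailable/awaitData flags, which let the server hold the cursor open instead of the client polling.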
| Comment by Eric Hernandez [ 10/Apr/13 ] |
|
Scott, I would like to discuss the issue further before providing any mongod logs, because they are quite revealing and would need to be sanitized before being pushed to either private or public spaces. I have included the top and free -m output, and will work on the munin setup in our dev environment this week.

First I would like to reference the documentation: "In replication, when MongoDB writes to a collection on the primary, MongoDB also writes to the primary’s oplog, which is a special collection in the local database. Therefore, MongoDB must lock both the collection’s database and the local database. The mongod must lock both databases at the same time to keep the data consistent and ensure that write operations, even with replication, are “all-or-nothing” operations."

We are using a tailable cursor to process the oplog. When the cursor runs out of entries in the oplog, the process goes to sleep for 2 seconds; when it wakes up it re-scans the entire oplog. The oplog size is configured to 1 GB and we are currently using 965 MB. Our current traffic levels are low enough that the process goes to sleep every 2 seconds and wakes up, causing a re-scan of the entire oplog each time.

The constant rescanning of the oplog is behaving differently on 2.0.7 than on 2.2.3: when we turn off the oplog extractor process, all issues immediately go away. I would also like to re-iterate that we didn't have any of these issues on 2.0.7 and we made no changes other than upgrading Mongo. Arguably there is room for improvement in the oplog extractor process, but we want to understand the current root cause before re-factoring our legacy processes.

To demonstrate that the oplog processing is causing the lock% to go up and the subsequent slow queries, we will stop our oplog processing today, 04/10/13, between 12:00 PM and 1:00 PM PDT. Please check the primary's lock graph in MMS and you should see a significant drop.

(Hat tip to BrianF, our Lead Software Architect, who has helped diagnose this issue.)
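As an aside, the oplog usage figures quoted above can be pulled programmatically with the collStats command; a sketch (the host is a placeholder, and the collection is oplog.rs on a replica set):

```php
<?php
// Report oplog entry count and data size from PHP, roughly equivalent to
// looking at db.oplog.rs.stats() in the mongo shell.
$mongo = new MongoClient('mongodb://primary.example.com');   // placeholder
$stats = $mongo->selectDB('local')->command(array('collStats' => 'oplog.rs'));

printf("oplog entries: %d, data size: %.0f MB, storage size: %.0f MB\n",
    $stats['count'],
    $stats['size'] / (1024 * 1024),
    $stats['storageSize'] / (1024 * 1024));
```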
|
| Comment by Scott Hernandez (Inactive) [ 10/Apr/13 ] |
|
Everything is public in the SERVER project but we can move this to the Community Private project where things will be private between the submitter and us (10gen). |
| Comment by Eric Hernandez [ 09/Apr/13 ] |
|
Yes, that is the primary in question. Would the mongotop output and logs be public or private after I upload them? I might be able to install munin-node, but I would need to test it in our dev environment first; I will work on getting it going there this week. |
| Comment by Scott Hernandez (Inactive) [ 09/Apr/13 ] |
|
It is probably not a reporting change, nor the locking you are describing; there are many changes that went into 2.2.x. Please run mongotop for a few seconds and upload the output. Also, the output of "top" and "free -m" would be helpful. Is this the primary in question? https://mms.10gen.com/host/detail/25a7ad14d3fce290b5e2e71f8c1280cb If this is the node in MMS, it would be very helpful if you could install munin-node to collect hardware stats for MMS. Logs would be very useful too, but let's start with metrics and numbers to see what we can discern. |