[SERVER-26735] Slow response to queries and no way to investigate it Created: 22/Oct/16 Updated: 05/Mar/17 Resolved: 19/Nov/16 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Querying, Sharding |
| Affects Version/s: | 3.2.10 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor - P4 |
| Reporter: | Tudor Aursulesei | Assignee: | Unassigned |
| Resolution: | Incomplete | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Operating System: | ALL |
| Participants: |
| Description |
|
I have an 8 member replica set, with 3 replica sets for each shard, and i'm running a query which affects each shard equally. The query targets an update for almost all elements in a collection, but it has to be a full COLLSCAN. Most of the shards finish the query very fast (<5 second), but randomly, one or two of the shards take a very long time to execute, even as much as 1-2 minutes. Whenever this issue appears, it takes place on a different member of the shard as the last time. I've run db.currentOP() on the shard replica set master, and the only query running is the update. I've also tried mongotop, mongostat and even setProfilingLevel, but to no avail; i'm unable to find what is slowing me down. I've noticed between running this command twice, that the number of yields grows very fast:
I have very good hardware, more than enough free RAM. I've noticed that the primary of the slow shards tend to have much less free RAM than the others, so i decided to restart all the mongod processes in the cluster, and i had no problems for 2-3 days. After that, the issue appears again. How can i investigate this issue further? What is causing the operation to yield? I've been restarting the servers for almost 10 days now. Thanks |
| Comments |
| Comment by Tudor Aursulesei [ 05/Mar/17 ] |
|
This was caused by using ext4. I switched to XFS and the problem got away. |
| Comment by Kelsey Schubert [ 19/Nov/16 ] |
|
Hi thestick613, We haven’t heard back from you for some time, so I’m going to mark this ticket as resolved. If this is still an issue for you, please provide additional information and we will reopen the ticket. Regards, |
| Comment by Ramon Fernandez Marina [ 25/Oct/16 ] |
|
thestick613, can you please upload the mongod.log and the contents of the diagnostic.data directory for one of the affected nodes during a period of slow queries? Thanks, |