- Type: Bug
- Resolution: Done
- Priority: Major - P3
- Affects Version/s: 3.2.10, 3.2.12
- Component/s: Performance
After two days of uptime, some of my shards start responding to queries more and more slowly. Even if I stop all the workers loading data into the database and wait for all operations to finish, the mongod instances keep responding very slowly once the workers are restarted. Queries that used to take 0.1 seconds now take 40-50 seconds or more. The operations that seem to trigger this behaviour are bulk updates to a collection ($pull a "job") and inserts into another collection in a different database. On those machines the WiredTiger cache fills up, and the mongod instance eats up all available RAM and CPU. Running
sync && echo 3 > /proc/sys/vm/drop_caches
doesn't help.
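For reference, a rough sketch of the kind of write pattern involved, assuming pymongo; the database, collection, and field names are illustrative placeholders, not the real ones:

    # Hypothetical sketch of the workload described above.
    # Database/collection/field names and the driver are assumptions.
    from pymongo import MongoClient, UpdateOne

    client = MongoClient("mongodb://localhost:27017")  # assumed mongos/shard address

    jobs = client["workdb"]["jobs"]          # collection receiving bulk $pull updates
    results = client["otherdb"]["results"]   # collection in a different database receiving inserts

    def worker_iteration(finished_job_ids, new_docs):
        # Bulk update: $pull finished jobs out of the queue documents.
        ops = [
            UpdateOne({"queue": job_id}, {"$pull": {"queue": job_id}})
            for job_id in finished_job_ids
        ]
        if ops:
            jobs.bulk_write(ops, ordered=False)

        # Insert results into a collection in another database.
        if new_docs:
            results.insert_many(new_docs, ordered=False)

Many workers run this loop concurrently against the same collections.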
I am pretty sure this is a bug, because after I restart all the mongod instances I have no problems whatsoever for 2-3 days. I have very fast storage, so I don't mind reloading the hot data. How can I investigate this problem? What metrics should I monitor? I've tried writing a dummy stress-test script to run against a 3.2 instance to see if I can trigger the bug and compare with a 3.4 instance, but I haven't succeeded yet.
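A minimal sketch of how the relevant counters could be sampled during such a test, assuming pymongo and the standard serverStatus counter names for the WiredTiger cache and the global lock queue:

    # Periodically sample serverStatus counters related to the symptoms above:
    # WiredTiger cache fill and queued readers/writers.
    import time
    from pymongo import MongoClient

    client = MongoClient("mongodb://localhost:27017")  # assumed shard address

    def sample(interval_seconds=5):
        while True:
            status = client.admin.command("serverStatus")
            cache = status["wiredTiger"]["cache"]
            queues = status["globalLock"]["currentQueue"]
            print(
                "cache bytes:", cache["bytes currently in the cache"],
                "/", cache["maximum bytes configured"],
                "dirty:", cache["tracked dirty bytes in the cache"],
                "queued readers/writers:", queues["readers"], "/", queues["writers"],
            )
            time.sleep(interval_seconds)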
Running too many bulk inserts and updates against the same collection seems to be the culprit. The operations start waiting on one another and the yields pile up. That is when RAM and CPU usage spike and performance degrades until the restart.
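A minimal sketch of how this pile-up can be spotted while it is happening, assuming pymongo and the currentOp command; the namespace is a placeholder:

    # Look for long-running writes on the same namespace that are waiting
    # and accumulating yields, as described above.
    from pymongo import MongoClient

    client = MongoClient("mongodb://localhost:27017")  # assumed shard address

    def show_waiting_writes(namespace="workdb.jobs"):  # illustrative namespace
        inprog = client.admin.command("currentOp")["inprog"]
        for op in inprog:
            if op.get("ns") == namespace and op.get("op") in ("update", "insert"):
                print(
                    op.get("opid"),
                    "secs_running:", op.get("secs_running"),
                    "numYields:", op.get("numYields"),
                    "waitingForLock:", op.get("waitingForLock"),
                )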