[SERVER-22941] mongoDB hangup after one or two days' high load mongoimport Created: 03/Mar/16  Updated: 08/Mar/17  Resolved: 17/Feb/17

Status: Closed
Project: Core Server
Component/s: Index Maintenance
Affects Version/s: 3.2.0, 3.2.3
Fix Version/s: None

Type: Bug Priority: Critical - P2
Reporter: ??? Assignee: Mark Agarunov
Resolution: Incomplete Votes: 0
Labels: WTplaybook
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

redhat
as6u3


Attachments: Text File currentop.txt    
Operating System: Linux
Participants:

 Description   

My server configuration:
E5 with 24 cores, 512 GB of memory (300 GB for WiredTiger), a 1.5 TB SSD, Red Hat Linux (as6u3).
My mongod instance configuration:
WiredTiger, journaling enabled, commit interval 10 s, WiredTiger cache 300 GB, 4-12 eviction threads.

I run at most 12 mongoimport processes at the same time against a single mongod instance to load data. I have about 50 collections, and each collection has about 10 indexes.
Usually, after one or two days, MongoDB hangs. Before it hangs, it stores almost 1 TB of data per day, as reported by show dbs.
Using db.currentOp(), I always find one INSERT operation that is hung. It may have been active for thousands of seconds or more. It locks the DB it is inserting into and blocks every operation queued behind it. db.killOp() does not work either. It looks as if an index were being created in that DB. show dbs does not respond, and inserts or queries against that DB do not respond either, but other DBs are fine. mongostat shows that mongod is still running and that dirty% is decreasing.
So, my questions:
Is this problem related to indexes?
How does MongoDB maintain indexes?
Can only one thread maintain indexes per DB at a time?
How can I avoid this problem?
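
For readers triaging a similar hang, the db.currentOp() check described above can be narrowed to the one operation that is actually running. The following is a minimal sketch, not something taken from this ticket: findLongRunningOps is a hypothetical helper, the sample data is made up, and it assumes the result document exposes an inprog array whose entries carry opid, active, op, ns, secs_running and waitingForLock.

```javascript
// Hypothetical helper (not a MongoDB API): given the document returned by
// db.currentOp(), list active operations that have been running for at least
// minSecs seconds and are not themselves waiting for a lock.
function findLongRunningOps(currentOpResult, minSecs) {
  var ops = currentOpResult.inprog || [];
  return ops
    .filter(function (op) {
      return op.active === true &&
        op.waitingForLock !== true &&
        typeof op.secs_running === "number" &&
        op.secs_running >= minSecs;
    })
    // Longest-running operation first.
    .sort(function (a, b) { return b.secs_running - a.secs_running; })
    .map(function (op) {
      return { opid: op.opid, op: op.op, ns: op.ns, secs_running: op.secs_running };
    });
}

// Made-up sample shaped like db.currentOp() output; values are illustrative only.
var sampleCurrentOp = {
  inprog: [
    { opid: 101, active: true, op: "insert", ns: "db1.coll_a", secs_running: 4200, waitingForLock: false },
    { opid: 102, active: true, op: "insert", ns: "db1.coll_b", secs_running: 3900, waitingForLock: true },
    { opid: 103, active: true, op: "query", ns: "db2.coll_c", secs_running: 1, waitingForLock: false }
  ]
};

var suspects = findLongRunningOps(sampleCurrentOp, 1000);
```

In the mongo shell the same function could be called as findLongRunningOps(db.currentOp(), 1000). The 1000-second threshold is an arbitrary choice for illustration.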



 Comments   
Comment by Mark Agarunov [ 17/Jan/17 ]

Hello zzz,

My apologies for the delay in response. Unfortunately, we were unable to determine the root cause of the behavior described. Please note that the most recent release of MongoDB, 3.4.1, comes with many improvements which may help with the issue you've observed. If this is still a problem for you, would you please upgrade to 3.4.1 and let us know if it resolves the issue?

Thanks,
Mark

Comment by ??? [ 08/Apr/16 ]

Hi Thomas,
I have tried sending a shutdown signal to MongoDB, but it does not seem to work. The command [kill pid] may take hours, and MongoDB is still running. Only [kill -9] works; it takes just a few seconds. After that, I can start MongoDB again within a few seconds too.
I ran db.currentOp() while observing this issue. There are hundreds of ops, most of them waiting for locks. Only one op (maybe a few more; I do not remember exactly) is running and not waiting for a lock.
I will send these logs to you as soon as I get them.
Besides that, my MongoDB instance sometimes receives very long documents, about 100-1000 KB each. Could such long documents cause this problem?
Thank you,
钟天舒
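
The observation above (hundreds of ops waiting for locks, one running) can be summarized per database before uploading the output. This is a rough sketch under assumptions: countLockWaitersByDb is a hypothetical helper, the sample data is invented, and it assumes each inprog entry has a waitingForLock flag and an ns string of the form "database.collection".

```javascript
// Hypothetical helper (not a MongoDB API): count operations that are waiting
// for a lock, grouped by the database part of their namespace.
function countLockWaitersByDb(currentOpResult) {
  var counts = Object.create(null); // no inherited keys to collide with db names
  (currentOpResult.inprog || []).forEach(function (op) {
    if (op.waitingForLock !== true) { return; }
    var ns = op.ns || "";
    var dot = ns.indexOf(".");
    var dbName = dot === -1 ? ns : ns.substring(0, dot);
    if (dbName === "") { dbName = "(none)"; }
    counts[dbName] = (counts[dbName] || 0) + 1;
  });
  return counts;
}

// Made-up sample shaped like db.currentOp() output; values are illustrative only.
var sampleOps = {
  inprog: [
    { opid: 1, op: "insert", ns: "db1.coll_a", waitingForLock: false },
    { opid: 2, op: "insert", ns: "db1.coll_b", waitingForLock: true },
    { opid: 3, op: "query", ns: "db1.coll_a", waitingForLock: true },
    { opid: 4, op: "insert", ns: "db2.coll_c", waitingForLock: false }
  ]
};

var waiters = countLockWaitersByDb(sampleOps);
```

If nearly all waiters fall under a single database name, that would be consistent with the report that only one DB is blocked while the others keep working.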

Comment by Kelsey Schubert [ 07/Apr/16 ]

Hi zzz,

We are still investigating the diagnostic data and logs that you have provided. Can you please upload the output of db.currentOp() when you observe this issue? Please also upload the logs after the affected MongoDB receives a shutdown signal.

Thank you,
Thomas

Comment by ??? [ 23/Mar/16 ]

I have uploaded diagnostic.data and the full logs through that portal.
By the way, I first ran into this problem right after upgrading MongoDB to version 3.2.

Comment by Ramon Fernandez Marina [ 10/Mar/16 ]

Thanks for the update. There was a similar behavior in SERVER-22062, but we checked that such behavior only affected 3.0 and you're using 3.2.3.

In addition to the diagnostic.data content mentioned above, if you had a script and dataset that reproduces the behavior you describe and you could upload it that would help a lot with the investigation.

Thanks,
Ramón.

Comment by ??? [ 10/Mar/16 ]

I have now created three DBs in my MongoDB instance. I switch to a different DB to load data every 5 minutes, so each DB loads data for only 5 minutes at a time. That means each DB works for 5 minutes and sleeps for 10 minutes in every 15-minute cycle. This has been running since last Tuesday, and the problem has not shown up. I did not reduce the write pressure. Under the old plan, MongoDB could work for only 2 or 3 days before it hung.
I will reproduce the problem next week. Because my MongoDB is deployed on an intranet, it may take time to transfer the logs to the internet.
Thanks.
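
The rotation described in this comment (three DBs, a 5-minute load slot each, 15 minutes per cycle) can be sketched as a small scheduling function. This is only an illustration of the arithmetic, not the reporter's actual loader: activeDb and the database names are hypothetical.

```javascript
// Hypothetical helper: pick which database should receive writes, given the
// minutes elapsed since loading started and a fixed slot length per database.
function activeDb(elapsedMinutes, dbNames, slotMinutes) {
  var slot = Math.floor(elapsedMinutes / slotMinutes);
  return dbNames[slot % dbNames.length];
}

// Hypothetical database names; the ticket does not give the real ones.
var dbs = ["load_a", "load_b", "load_c"];

var atStart = activeDb(0, dbs, 5);      // first slot
var atMinute7 = activeDb(7, dbs, 5);    // second slot
var atMinute15 = activeDb(15, dbs, 5);  // cycle wraps back to the first database
```

With three names and a 5-minute slot, each database is written to for 5 minutes and then idle for 10, which matches the schedule described above.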

Comment by Ramon Fernandez Marina [ 03/Mar/16 ]

Can you please upload the contents of the diagnostic.data directory inside your dbpath and the full logs for this server from startup until you experience the hang?

You can upload this data through this portal privately and securely. Please let us know when the upload of the requested information is complete so we can investigate further.

Thanks,
Ramón.

Generated at Thu Feb 08 04:01:53 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.