[SERVER-39712] Mongodb Too many open files, WT_PANIC: WiredTiger library panic and data corruption ***aborting after fassert() failure Created: 21/Feb/19 Updated: 06/May/19 Resolved: 06/May/19 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Performance, Replication, Sharding, WiredTiger |
| Affects Version/s: | 4.0.6 |
| Fix Version/s: | None |
| Type: | Question | Priority: | Major - P3 |
| Reporter: | Salamuddin Pranayan | Assignee: | Danny Hatcher (Inactive) |
| Resolution: | Incomplete | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Attachments: |
|
| Participants: |
| Description |
|
Hello, our database already uses sharding and replication, and we often see errors of this kind (details attached below). Errors such as `Mongodb Too many open files, WT_PANIC: WiredTiger library panic and data corruption ***aborting after fassert() failure` show up often. What should we do to solve this? Thanks in advance. |
| Comments |
| Comment by Danny Hatcher (Inactive) [ 25/Apr/19 ] | |
|
salamflamo are you still experiencing this issue? | |
| Comment by Danny Hatcher (Inactive) [ 13/Mar/19 ] | |
|
Hello, I'm sorry; we're still trying to identify what could be causing you to hit your open file limit. Could you run the following against your mongod process?
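The command itself is missing from this export. As a rough sketch only, and not necessarily what was requested here, one way to see how close a process is to its open file limit on Linux is to count the entries under `/proc/<pid>/fd`. The function names and the `proc_root` parameter below are hypothetical and exist only for illustration.

```python
import os


def count_open_fds(pid, proc_root="/proc"):
    """Count entries in <proc_root>/<pid>/fd, one per open file descriptor."""
    return len(os.listdir(os.path.join(proc_root, str(pid), "fd")))


def fd_usage(open_fds, soft_limit):
    """Fraction of the soft "open files" limit currently in use."""
    return open_fds / soft_limit
```

Reading another process's `fd` directory usually requires running as root or as the same user as mongod. A ratio close to 1.0 against the 64000 soft limit would be consistent with the "Too many open files" errors in this ticket.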
Thank you, Danny | |
| Comment by Salamuddin Pranayan [ 09/Mar/19 ] | |
|
Hello, I uploaded a new file named "mongod.log.2019-03-08T18-23-02" above. Please check it; you will find an error message like the one below. I applied all the settings recommended on the MongoDB website and also repaired the database, but I still get this error:
2019-03-09T01:22:50.649+0700 E STORAGE [conn439] WiredTiger error (24) [1552069370:648313][10992:0x7fd71e260700], file:index-30471-1406128765714834102.wt, WT_SESSION.open_cursor: __posix_open_file, 715: /var/lib/mongo/index-30471-1406128765714834102.wt: handle-open: open: Too many open files Raw: [1552069370:648313][10992:0x7fd71e260700], file:index-30471-1406128765714834102.wt, WT_SESSION.open_cursor: __posix_open_file, 715: /var/lib/mongo/index-30471-1406128765714834102.wt: handle-open: open: Too many open files ***aborting after fassert() failure
I don't know what to do next. Could you still help me fix this? Thank you for your help. | |
| Comment by Danny Hatcher (Inactive) [ 08/Mar/19 ] | |
|
Hello, Are you talking about the following lines?
When WiredTiger finds a partial metadata set, it prints that informational message, skips that table, and keeps going, so the cursor is allowed to continue its walk and complete. Consequently, these log lines do not indicate an issue that would prevent MongoDB from starting. In fact, we see that MongoDB successfully started and began accepting connections. Unfortunately, the node is unable to connect to shard0013b:27017, which causes the other errors in the logs. From what I can see in the most recent logs you uploaded, your server should be up and functioning correctly. If you have other reasons for thinking there is corruption, please let me know. Thank you, Danny | |
| Comment by Salamuddin Pranayan [ 08/Mar/19 ] | |
|
Hi Danny, I have uploaded the files you requested. Please bear with me. Is there a way to fix a corrupt collection? Some index-xxxx-xxxxxxx.wt files in my database are corrupted; I already ran a repair, but they are still corrupt. Please help me. Thank you very much. | |
| Comment by Danny Hatcher (Inactive) [ 06/Mar/19 ] | |
|
Hello, It looks like you are still experiencing "too many open files":
Please provide the following to our Secure Upload Portal: Thank you, Danny | |
| Comment by Salamuddin Pranayan [ 28/Feb/19 ] | |
|
Hello, Thank you for helping me, but I still have the same problem. I think this is because our database is corrupt. Sometimes I get an error like the one below: 2019-02-28T17:09:37.007+0700 E - [ftdc] Assertion: Location13538: couldn't open [/proc/16102/stat] Unknown error src/mongo/util/processinfo_linux.cpp 81 ***aborting after fassert() failure
and sometimes I get an error because of a WiredTiger panic. I then tried to fix the problem with "mongod --dbpath /var/lib/mongo --repair", but it still ended like this: 2019-02-28T17:09:37.097+0700 E STORAGE [conn364] This may be due to data corruption. Please read the documentation for starting MongoDB with --repair here: http://dochub.mongodb.org/core/repair ***aborting after fassert() failure
So for now I still have that problem. What should I do? Please bear with me. Thank you. | |
| Comment by Danny Hatcher (Inactive) [ 28/Feb/19 ] | |
|
Hello, Yes, please use the recommendations in our docs. Per the man page for sysctl, you can use sysctl -w variable=value to change the settings. When you change them, do you still see your issue? Thank you, Danny | |
| Comment by Salamuddin Pranayan [ 28/Feb/19 ] | |
|
Hello, The default settings are shown below.
So do I need to apply the settings from https://docs.mongodb.com/manual/administration/production-checklist-operations/ ?
| |
| Comment by Danny Hatcher (Inactive) [ 27/Feb/19 ] | |
|
Hello, I apologize, I should have asked this when I initially asked about ulimits. What is your kernel.pid_max value? You can find it via sysctl -a. Please run that command and if the values do not match the recommendations below from our Operations Checklist, please change them.
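The recommended values referred to above are not reproduced in this export, so the sketch below takes them as input rather than hard-coding them. It assumes a Linux host where kernel parameters are exposed under `/proc/sys` (the same values `sysctl -a` prints); the function names and the `root` parameter are hypothetical.

```python
from pathlib import Path


def read_sysctl(name, root="/proc/sys"):
    """Read an integer kernel parameter, e.g. "kernel.pid_max"."""
    # "kernel.pid_max" maps to <root>/kernel/pid_max
    path = Path(root).joinpath(*name.split("."))
    return int(path.read_text().split()[0])


def below_recommended(current, recommended):
    """Return {name: (current, recommended)} for settings under the recommendation."""
    return {
        name: (current[name], wanted)
        for name, wanted in recommended.items()
        if name in current and current[name] < wanted
    }
```

A caller would build `current` from `read_sysctl("kernel.pid_max")` and similar calls, and fill `recommended` with the numbers from the Operations Checklist. Any setting reported by `below_recommended` is a candidate for `sysctl -w variable=value`.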
Thank you, Danny | |
| Comment by Salamuddin Pranayan [ 27/Feb/19 ] | |
|
Hello, Thank you. I just uploaded 10 files, from the diagnostic data and the logs. The log file I uploaded is from 26 Feb. Thank you | |
| Comment by Danny Hatcher (Inactive) [ 26/Feb/19 ] | |
|
Hello, You can try increasing that limit but it would be rare for that to be necessary. In order for us to investigate further, please upload your mongod logs and "diagnostic.data" folder (located under your $dbpath) to our Secure Upload Portal. Only MongoDB engineers will be able to see any files you upload there and the contents will be deleted after some time. Thank you, Danny | |
| Comment by Salamuddin Pranayan [ 26/Feb/19 ] | |
|
Hello, Here is what I see when I check (columns: Limit, Soft Limit, Hard Limit, Units). For open files it is just 64000; should I increase the limit? Thank you | |
| Comment by Danny Hatcher (Inactive) [ 25/Feb/19 ] | |
|
Hello, Can you please provide the output of the following command when substituting in your mongod pid? This will tell us the ulimits currently being used by the process itself.
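The command itself is missing from this export. On Linux, the limits applied to a running process are exposed in `/proc/<pid>/limits`, whose header row (Limit, Soft Limit, Hard Limit, Units) matches the reply to this comment. A minimal sketch of reading that file, assuming a Linux host and using hypothetical function names:

```python
import re
from pathlib import Path


def parse_limits(text):
    """Parse the text of /proc/<pid>/limits into {name: (soft, hard, units)}."""
    limits = {}
    for line in text.splitlines()[1:]:  # skip the header row
        # Columns are separated by runs of two or more spaces.
        fields = re.split(r"\s{2,}", line.strip())
        if len(fields) < 3:
            continue
        name, soft, hard = fields[:3]
        units = fields[3] if len(fields) > 3 else ""  # some rows have no units
        limits[name] = (soft, hard, units)
    return limits


def read_limits(pid):
    """Read and parse the limits of a running process."""
    return parse_limits(Path(f"/proc/{pid}/limits").read_text())
```

Looking up `"Max open files"` in the result of `read_limits(pid)` gives the soft and hard limits the process is actually running with, which can differ from what `ulimit` reports in an interactive shell.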
Thank you, Danny | |
| Comment by Salamuddin Pranayan [ 21/Feb/19 ] | |
|
Hi, we use `ulimit -f unlimited -t unlimited -v unlimited -l unlimited -n 64000 -m unlimited -u 64000`, following the guide on this page: https://docs.mongodb.com/manual/reference/ulimit/index.html#recommended-ulimit-settings Thank you | |
| Comment by Danny Hatcher (Inactive) [ 21/Feb/19 ] | |
|
Hello, You are most likely seeing these messages because your server does not match our recommended ulimit settings. If you follow our recommendations on that page, most notably -n (open files): 64000, do you still see these errors? Thank you, Danny |