[SERVER-50880] Mongod Server Failed with signal 6 Created: 11/Sep/20  Updated: 27/Oct/23  Resolved: 20/Jan/21

Status: Closed
Project: Core Server
Component/s: WiredTiger
Affects Version/s: 4.4.0
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Anandesh Sharma Assignee: Dmitry Agranat
Resolution: Community Answered Votes: 0
Labels: KP44, SWCW
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File error.log    
Issue Links:
Related
related to SERVER-50971 Invariant failure, WT_NOTFOUND: item ... Closed
related to SERVER-52530 Mongo v.4.4.1 crash - UnknownError -3... Closed
is related to SERVER-51386 Mongo 4.4.1 Crashes often Closed
Operating System: ALL
Participants:

 Description   

Let me explain the scenario first.

I had installed mongo 3.6 and it was running fine, but then I decided to upgrade, so I took a backup using the mongo tools. I removed and purged all the mongo packages and reinstalled the latest version. I restored the backup and made a few changes to the unit file and the conf file.

[Unit] 
Description=MongoDB Database Server 
Documentation=https://docs.mongodb.org/manual 
After=network.target 
 
[Service] 
User=mongodb 
Group=mongodb 
EnvironmentFile=-/etc/default/mongod 
ExecStart=/usr/bin/mongod --config /etc/mongod.conf 
PIDFile=/run/mongodb/mongod.pid 
Restart=always 
RestartSec=1 
# file size 
LimitFSIZE=infinity 
# cpu time 
LimitCPU=infinity 
# virtual memory size 
LimitAS=infinity 
# open files 
LimitNOFILE=64000 
# processes/threads 
LimitNPROC=64000 
# locked memory 
LimitMEMLOCK=65536 
# total threads (user+kernel) 
TasksMax=infinity 
TasksAccounting=false 
 
# Recommended limits for mongod as specified in 
# https://docs.mongodb.com/manual/reference/ulimit/#recommended-ulimit-settings 
 
[Install] 
WantedBy=multi-user.target

I was running two mongo instances and copied everything for each one separately (I'm sure I did this correctly), the same way it was set up when mongo 3.6 was running.

Now, sometimes my mongod instance is killed:

Sep 11 09:28:05 mlnode systemd[1]: mongod.service: Main process exited, code=killed, status=6/ABRT
Sep 11 09:28:05 mlnode systemd[1]: mongod.service: Failed with result 'signal'.

I've looked into the mongo logs as well, but I was unable to understand the exact reason. I've also looked at other Jira threads, which say this may be an issue with MEMLOCK limits. I've corrected that, but the problem still persists.

{"t":{"$date":"2020-09-11T09:25:58.708+05:30"},"s":"F",  "c":"-",        "id":23083,   "ctx":"conn264761","msg":"Invariant failure","attr":{"expr":"ret","error":"UnknownError: -31803: WT_NOTFOUND: item not found","file":"src/mongo/db/storage/wiredtiger/wiredtiger_record_store.cpp","line":1598}}

{"t":{"$date":"2020-09-11T09:25:58.708+05:30"},"s":"F",  "c":"-",        "id":23084,   "ctx":"conn264761","msg":"\n\n***aborting after invariant() failure\n\n"}

{"t":{"$date":"2020-09-11T09:25:58.708+05:30"},"s":"F",  "c":"CONTROL",  "id":4757800, "ctx":"conn264761","msg":"Writing fatal message","attr":{"message":"Got signal: 6 (Aborted).\n"}}
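
The log lines above are one JSON document per line, with the severity in the "s" field ("F" on all three entries shown). As a rough sketch of how such entries could be pulled out of a larger log, the Python below filters lines by that field and extracts the invariant details. The function names are hypothetical, the second sample line is made up for illustration, and the field layout is assumed from the three entries quoted in this report only.

```python
import json

# First sample line is taken from this report; the second is a made-up
# informational entry (hypothetical) so the filter has something to skip.
SAMPLE_LINES = [
    r'{"t":{"$date":"2020-09-11T09:25:58.708+05:30"},"s":"F","c":"-","id":23083,"ctx":"conn264761","msg":"Invariant failure","attr":{"expr":"ret","error":"UnknownError: -31803: WT_NOTFOUND: item not found","file":"src/mongo/db/storage/wiredtiger/wiredtiger_record_store.cpp","line":1598}}',
    r'{"t":{"$date":"2020-09-11T09:25:00.000+05:30"},"s":"I","c":"NETWORK","id":0,"ctx":"listener","msg":"hypothetical informational line"}',
]


def fatal_entries(lines):
    """Yield parsed log entries whose severity field "s" equals "F"."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            # Skip anything that is not a complete JSON document,
            # for example a line that was hard-wrapped when pasted.
            continue
        if entry.get("s") == "F":
            yield entry


def summarize(entry):
    """Reduce one entry to the fields that matter for an invariant failure."""
    attr = entry.get("attr", {})
    return {
        "time": entry.get("t", {}).get("$date"),
        "id": entry.get("id"),
        "ctx": entry.get("ctx"),
        "msg": entry.get("msg", "").strip(),
        "error": attr.get("error"),
        "file": attr.get("file"),
        "line": attr.get("line"),
    }


if __name__ == "__main__":
    for entry in fatal_entries(SAMPLE_LINES):
        print(summarize(entry))
```

To run this against a real log, pass an open file object instead of SAMPLE_LINES; the log path depends on the systemLog settings in the conf file, which are not shown in this report.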

I want urgent help please! If anyone wants additional information then I'm ready to provide ASAP.

Thanks



 Comments   
Comment by Dmitry Agranat [ 20/Jan/21 ]

Hi anandeshsharma@gmail.com, I will go ahead and close this case as the issue was fixed in 4.4.3. Please reopen this case if you still experience the same issue after upgrading to 4.4.3.

Regards,
Dima

Comment by Dmitry Agranat [ 07/Jan/21 ]

Hi anandeshsharma@gmail.com,

We've done some work in 4.4.3 to try to fix this issue. Would it be possible for you to try 4.4.3 and provide us with feedback?

Thanks,
Dima

Comment by Dmitry Agranat [ 12/Oct/20 ]

Hi anandeshsharma@gmail.com,

Were you able to set the locked-in-memory size to unlimited? Also, the team has built custom binaries of the MongoDB Community build for Ubuntu 20.04 x86_64 with additional logging. The downloads are here (public link).

This should not have any impact on the application, and will only print additional information in the event the same bug causes the server to crash.

Thanks,
Dima

Comment by Dmitry Agranat [ 05/Oct/20 ]

Hi anandeshsharma@gmail.com, have you managed to set the locked-in-memory size to unlimited? If yes, did the reported issue occur again?

Comment by Dmitry Agranat [ 24/Sep/20 ]

Hi anandeshsharma@gmail.com, yes, you will need to set your locked-in-memory size to unlimited.

Can you also compress and upload to the same uploader link an archive of all of /var/log/messages*, /var/log/syslog* and /var/log/dmesg*?

Thanks,
Dima

Comment by Anandesh Sharma [ 18/Sep/20 ]

Yeah, I've uploaded them as mongod.tar.gz and diagnostic.tar.gz.

UNIX ulimits (hard) 

  • I set locked-in-memory to unlimited but it remains 65536

    -t: cpu time (seconds)              unlimited 
    -f: file size (blocks)              unlimited 
    -d: data seg size (kbytes)          unlimited 
    -s: stack size (kbytes)             unlimited 
    -c: core file size (blocks)         unlimited 
    -m: resident set size (kbytes)      unlimited 
    -u: processes                       125299 
    -n: file descriptors                1048576 
    -l: locked-in-memory size (kbytes)  65536 
    -v: address space (kbytes)          unlimited 
    -x: file locks                      unlimited 
    -i: pending signals                 125299 
    -q: bytes in POSIX msg queues       819200 
    -e: max nice                        0 
    -r: max rt priority                 0 
    -N 15:                              unlimited

    These are the limits of the mongod process:

    Limit                     Soft Limit           Hard Limit           Units      
    Max cpu time              unlimited            unlimited            seconds    
    Max file size             unlimited            unlimited            bytes      
    Max data size             unlimited            unlimited            bytes      
    Max stack size            8388608              unlimited            bytes      
    Max core file size        0                    unlimited            bytes      
    Max resident set          unlimited            unlimited            bytes      
    Max processes             64000                64000                processes  
    Max open files            64000                64000                files      
    Max locked memory         65536                65536                bytes      
    Max address space         unlimited            unlimited            bytes      
    Max file locks            unlimited            unlimited            locks      
    Max pending signals       125299               125299               signals    
    Max msgqueue size         819200               819200               bytes      
    Max nice priority         0                    0                     
    Max realtime priority     0                    0                     
    Max realtime timeout      unlimited            unlimited            us    
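
The process limits table above reports "Max locked memory" as 65536 bytes, which matches the LimitMEMLOCK=65536 line in the unit file, while the shell's ulimit output describes the shell session rather than the service. A minimal Python sketch for checking that value programmatically follows. It assumes the table is in the layout shown above (columns separated by runs of spaces), the function names are hypothetical, and the sample text is a subset of the rows posted in this comment.

```python
import re

# Subset of the process limits table posted above.
SAMPLE_LIMITS = """\
Limit                     Soft Limit           Hard Limit           Units
Max open files            64000                64000                files
Max locked memory         65536                65536                bytes
Max nice priority         0                    0
Max realtime timeout      unlimited            unlimited            us
"""


def parse_limits(text):
    """Parse a limits table into {name: (soft, hard, units)}."""
    limits = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("Limit"):
            continue  # skip blank lines and the header row
        # Limit names contain single spaces, so split on 2+ spaces only.
        parts = re.split(r"\s{2,}", line)
        if len(parts) < 3:
            continue
        units = parts[3] if len(parts) > 3 else ""
        limits[parts[0]] = (parts[1], parts[2], units)
    return limits


def memlock_is_unlimited(limits):
    """Return True only if both soft and hard locked-memory limits are unlimited."""
    soft, hard, _units = limits["Max locked memory"]
    return soft == "unlimited" and hard == "unlimited"


if __name__ == "__main__":
    parsed = parse_limits(SAMPLE_LIMITS)
    print(parsed["Max locked memory"])
    print(memlock_is_unlimited(parsed))
```

On Linux, the same text can be read from /proc/<pid>/limits for the running mongod process. The unit file already uses infinity for other limits such as LimitFSIZE, so the locked-memory setting would presumably need the same value before this check reports unlimited.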

     

Comment by Dmitry Agranat [ 13/Sep/20 ]

Hi anandeshsharma@gmail.com,

Would you please archive (tar or zip) the full mongod.log files and the $dbpath/diagnostic.data directory (the contents are described here) and upload them to this support uploader location?

Files uploaded to this portal are visible only to MongoDB employees and are routinely deleted after some time.

Also, could you please post the MEMLOCK limit after you have corrected it?

Thanks,
Dima

Generated at Thu Feb 08 05:23:54 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.