[SERVER-29044] MongoDB 3.4 is shutting down without any error log Created: 03/May/17  Updated: 21/Jun/17  Resolved: 16/May/17

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: 3.4.2
Fix Version/s: None

Type: Question Priority: Critical - P2
Reporter: Enrique Abott Assignee: Kelsey Schubert
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: Zip Archive mongod.node4_3.4.zip     Text File rs.conf().txt    
Participants:

 Description   

When an instance that has shut down is restarted, other nodes shut down for no apparent reason. Nodes that are starting are not able to find a node to sync from. Also, when a node is syncing from a node that shuts down, the syncing node shuts itself down as well.



 Comments   
Comment by Kelsey Schubert [ 16/May/17 ]

Hi eabott@synapsemail.com,

It appears that multiple mongod processes are running on the same host with the default WiredTiger cache size, which assumes a single mongod process per host. As a result, they are likely being killed by the OOM killer.
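
To illustrate why the defaults can overcommit memory, here is a rough sketch of the arithmetic. It assumes the default documented for MongoDB 3.4 (the larger of 50% of (RAM - 1 GB) or 256 MB per mongod); the function names, the 64 GB host, and the "half of RAM split evenly" policy are hypothetical and only for illustration.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def default_wiredtiger_cache_bytes(ram_bytes):
    """Default cache per mongod, assuming the MongoDB 3.4 rule:
    the larger of 50% of (RAM - 1 GB) or 256 MB."""
    return max((ram_bytes - GIB) // 2, 256 * MIB)


def total_default_cache_bytes(ram_bytes, mongod_count):
    """Combined cache if every mongod on the host keeps the default."""
    return default_wiredtiger_cache_bytes(ram_bytes) * mongod_count


def per_process_cache_gb(ram_bytes, mongod_count, ram_fraction=0.5):
    """Hypothetical policy: give all mongod processes together a fixed
    fraction of RAM and split it evenly. Returns a value in GB."""
    return round(ram_bytes * ram_fraction / mongod_count / GIB, 2)


# Example host (assumed numbers): 64 GB of RAM running 3 mongod processes.
ram = 64 * GIB
overcommitted = total_default_cache_bytes(ram, 3) > ram
suggested_gb = per_process_cache_gb(ram, 3)
```

On such a host the three default caches alone would add up to more than physical RAM, before counting connections, index builds, or the filesystem cache. A value like suggested_gb could then be set per process through storage.wiredTiger.engineConfig.cacheSizeGB (or --wiredTigerCacheSizeGB); the right figure depends on the workload.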

Please note that the SERVER project is for reporting bugs or feature suggestions for the MongoDB server. For MongoDB-related support discussion, please post on the mongodb-user group or Stack Overflow with the mongodb tag. A question like this, which involves more discussion, would be best posted on the mongodb-user group.

See also our Technical Support page for additional support resources.

Kind regards,
Thomas

Comment by Enrique Abott [ 16/May/17 ]

Our production environment is very unstable because nodes are shutting down intermittently. Please advise.

Comment by Enrique Abott [ 11/May/17 ]

The intermittent shutdowns are happening again today.

Comment by Enrique Abott [ 05/May/17 ]

diagnostic.data uploaded

Comment by Kelsey Schubert [ 05/May/17 ]

Hi eabott@synapsemail.com,

In your dbpath, there should be a directory called diagnostic.data. Inside, there should be files with names like metrics.2017-01-17T23-40-47Z-00000. I'd like to examine these metrics files. Would you please upload them for each node?

If they are too large to attach to the ticket, I've created an upload portal for you to use.
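
One possible way to produce the archive is sketched below. It only assumes that diagnostic.data sits directly under the dbpath, as described above; the function name and the example paths in the usage note are hypothetical, and any tar or zip tool would work equally well.

```python
import tarfile
from pathlib import Path


def archive_diagnostic_data(dbpath, output):
    """Pack <dbpath>/diagnostic.data into a gzip-compressed tar file
    and return the names of the archived entries."""
    source = Path(dbpath) / "diagnostic.data"
    if not source.is_dir():
        raise FileNotFoundError(f"{source} is not a directory")
    # Store entries under "diagnostic.data/..." rather than the full path.
    with tarfile.open(output, "w:gz") as tar:
        tar.add(str(source), arcname="diagnostic.data")
    # Re-open the archive to confirm what was written.
    with tarfile.open(output, "r:gz") as tar:
        return tar.getnames()
```

For example, archive_diagnostic_data("/data/node4", "node4-diagnostic.tar.gz") would be run once per node, with the dbpath and output name adjusted for each (both paths here are placeholders).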

Thank you,
Thomas

Comment by Enrique Abott [ 05/May/17 ]

I attached the documents that you asked for, but I am confused by the "archive diagnostic.data" request for each node. Could you please elaborate on how to produce this?

Comment by Kelsey Schubert [ 03/May/17 ]

Hi eabott@synapsemail.com,

Thank you for reporting this behavior.

To help us investigate this issue, please provide the following:

  • output of rs.conf()
  • complete log files from each node
  • an archive of the diagnostic.data directory from each node

Additionally, please identify when the issue occurred.

Thank you,
Thomas
