[SERVER-26627] Config server crash after shard removal Created: 14/Oct/16  Updated: 15/Oct/16  Resolved: 14/Oct/16

Status: Closed
Project: Core Server
Component/s: Sharding, Stability, WiredTiger
Affects Version/s: 3.0.8
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: dabest1 Assignee: Unassigned
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Operating System: ALL
Steps To Reproduce:

Not sure if this is easily reproducible:
Remove a shard from a 2-shard cluster.

Participants:

 Description   

mongod crashed on the 1st config server after a shard was removed from the cluster.

MongoDB version: 3.0.8.

Log file on config server shows:
2016-10-13T22:13:38.506+0000 I CONTROL [signalProcessingThread] got signal 15 (Terminated), will terminate after current cmd ends
2016-10-13T22:13:38.506+0000 I CONTROL [signalProcessingThread] now exiting
2016-10-13T22:13:38.506+0000 I NETWORK [signalProcessingThread] shutdown: going to close listening sockets...
2016-10-13T22:13:38.506+0000 I NETWORK [signalProcessingThread] closing listening socket: 6
2016-10-13T22:13:38.506+0000 I NETWORK [signalProcessingThread] closing listening socket: 7
2016-10-13T22:13:38.506+0000 I NETWORK [signalProcessingThread] removing socket file: /tmp/mongodb-27017.sock
2016-10-13T22:13:38.506+0000 I NETWORK [signalProcessingThread] shutdown: going to flush diaglog...
2016-10-13T22:13:38.506+0000 I NETWORK [signalProcessingThread] shutdown: going to close sockets...
2016-10-13T22:13:38.514+0000 I STORAGE [signalProcessingThread] WiredTigerKVEngine shutting down
2016-10-13T22:13:38.532+0000 I STORAGE [conn12551] got request after shutdown()
2016-10-13T22:13:38.533+0000 I STORAGE [conn12550] got request after shutdown()
2016-10-13T22:13:38.617+0000 I STORAGE [signalProcessingThread] shutdown: removing fs lock...
2016-10-13T22:13:38.617+0000 I CONTROL [signalProcessingThread] dbexit: rc: 0
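
When triaging an excerpt like the one above, it can help to pull out the signal that triggered the shutdown and the final exit code. The following is a minimal Python sketch, not a MongoDB tool: the function name `summarize_shutdown` is hypothetical, and the two patterns are taken only from the wording of the lines in this ticket, so other mongod versions may phrase these messages differently.

```python
import re

# Patterns based solely on the log lines quoted in this ticket (assumption:
# other mongod versions may word these messages differently).
SIGNAL_RE = re.compile(r"got signal (\d+) \((\w+)\)")
EXIT_RE = re.compile(r"dbexit:\s+rc:\s*(-?\d+)")


def summarize_shutdown(lines):
    """Return (signal_number, signal_name, exit_code); None for anything not found."""
    sig_num = sig_name = exit_code = None
    for line in lines:
        m = SIGNAL_RE.search(line)
        if m:
            sig_num, sig_name = int(m.group(1)), m.group(2)
        m = EXIT_RE.search(line)
        if m:
            exit_code = int(m.group(1))
    return sig_num, sig_name, exit_code


# Two lines copied from the excerpt above.
sample = [
    "2016-10-13T22:13:38.506+0000 I CONTROL [signalProcessingThread] "
    "got signal 15 (Terminated), will terminate after current cmd ends",
    "2016-10-13T22:13:38.617+0000 I CONTROL [signalProcessingThread] dbexit: rc: 0",
]
print(summarize_shutdown(sample))
```

A signal 15 followed by `dbexit: rc: 0` suggests an externally requested shutdown rather than a crash inside mongod.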

Restarting the service failed with permission errors. It seems the ownership of a few files was changed from mongod:mongod to root:root during or before the crash. Could this be a bug in mongod? I don't think anyone changed the ownership.

These are the files with wrong ownership:
-rw-r--r-- 1 root root 5701900 Oct 13 22:13 mongod.log
-rw-r--r-- 1 root root 913 Oct 13 22:13 WiredTiger.turtle
-rw-r--r-- 1 root root 64598272 Oct 13 22:13 WiredTigerLog.0000000452
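
A listing like this can also be produced programmatically before attempting a restart. The sketch below is a hedged illustration in Python, assuming a POSIX system; the helper name `files_not_owned_by` is hypothetical, and the directory to scan and the expected uid are inputs the operator must supply, since the ticket does not state the data or log paths.

```python
import os


def files_not_owned_by(root, expected_uid):
    """Return sorted paths under root whose owner uid differs from expected_uid."""
    mismatched = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            # lstat reports on the entry itself and does not follow symlinks.
            if os.lstat(path).st_uid != expected_uid:
                mismatched.append(path)
    return sorted(mismatched)
```

Calling it with the data or log directory and the numeric uid of the mongod user would list entries such as the three files above.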

This type of crash has happened before, where the 1st config server crashed after shard removal, but I do not recall whether the ownership was also broken.



 Comments   
Comment by dabest1 [ 15/Oct/16 ]

Thank you. I have identified an automated process that shut down the config server during the downsizing process. However, I think you should still keep this bug open, as the ownership of the files was incorrectly set during the shutdown.

Comment by Kelsey Schubert [ 14/Oct/16 ]

Hi dabest1,

Thanks for your report. It appears that the mongod process received a SIGTERM and shut down cleanly. I would suggest investigating who or what is sending the SIGTERM.

Please note that the SERVER project is for reporting bugs or feature suggestions for the MongoDB server. For MongoDB-related support discussion, please post on the mongodb-users group or Stack Overflow with the mongodb tag. A question like this, which involves more discussion, would be best posted on the mongodb-users group.

Kind regards,
Thomas

Generated at Thu Feb 08 04:12:42 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.