[SERVER-13790]  Auto-repair operation after an Unexpected Shutdown Created: 30/Apr/14  Updated: 10/Dec/14  Resolved: 01/May/14

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 2.0.5
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Gabriel Badescu Assignee: Mark Benvenuto
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: Text File mongod_part.log    
Issue Links:
Duplicate
duplicates SERVER-13338 NT Service does not return failure co... Closed
Operating System: ALL
Steps To Reproduce:

Sometimes, after a power failure, the MongoDB service fails to start when Windows boots. The MongoDB service is set to the 'Automatic' startup type. The service recovery options are: first and second failure = 'restart service', subsequent failures = 'take no action'.
The MongoDB service remains in the 'starting' state, and each start attempt writes lines to the log like those in the attached file 'mongod_part.log'.

Participants:

 Description   

Hello,

I read the article http://docs.mongodb.org/manual/tutorial/recover-data-following-unexpected-shutdown/ which describes the steps to start the mongo service again.

Could you please implement an auto-repair mechanism so that the user doesn't have to intervene manually?

We are requesting this because we have customers who reported this problem, and the mongod.log file reaches a huge size (tens of gigabytes written to the logs because the service tries to start in a loop without success). It is not pleasant to discover that your hard disk is running out of space.

We are using the 32-bit version of mongo 2.0.5.



 Comments   
Comment by Thomas Rueckstiess [ 01/May/14 ]

Hi Gabriel,

Looking at your log file it seems that the mongod process can't start because of the existing lock (mongod.lock). I understand that auto recovery from failures would be nice, but in this case we can't automatically delete the lock file because we can't tell if it is in place because another mongod process is already running or because it wasn't correctly deleted due to a hard shutdown or crash. In such a case, you really want manual intervention to fix the issue.
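To make the ambiguity concrete, here is a minimal Python sketch that only reports the state of the lock file. It assumes the behaviour described in the linked recovery tutorial, namely that a non-empty mongod.lock in the data directory is what blocks startup; the function name `lock_file_state` and the returned labels are hypothetical and not part of MongoDB. Note that it deliberately does not delete anything: a "non-empty" result cannot tell a running mongod apart from a stale lock left by a crash.

```python
import os


def lock_file_state(dbpath):
    """Classify mongod.lock in dbpath as 'missing', 'empty' or 'non-empty'.

    Assumption (from the recovery tutorial): a non-empty lock file means
    either a mongod is currently running or a previous one shut down
    uncleanly. This function cannot distinguish those two cases.
    """
    lock_path = os.path.join(dbpath, "mongod.lock")
    if not os.path.exists(lock_path):
        return "missing"
    if os.path.getsize(lock_path) == 0:
        return "empty"
    return "non-empty"
```

An operator could run such a check before deciding whether to follow the manual recovery steps; the decision itself still needs a human who knows whether another mongod is using the data directory.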

However, the real issue here is that the Windows service control manager does not get the correct return code from mongod and keeps restarting the process. That's why it doesn't stop after the second failure, and why your log file fills up over time. This is a bug and is being tracked in SERVER-13338.

As a workaround in the meantime, you would have to disable the automatic restart feature or monitor the size of the log files for such issues.
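The log-size monitoring mentioned above could be sketched roughly as follows in Python. This is only an illustration under assumptions: the helper name `log_size_exceeded`, the log path and the 1 GiB threshold are all hypothetical and would need to be adapted to the actual installation. It could be run periodically, for example from the Windows Task Scheduler.

```python
import os


def log_size_exceeded(log_path, max_bytes):
    """Return True if the file at log_path is larger than max_bytes."""
    try:
        return os.path.getsize(log_path) > max_bytes
    except OSError:
        # A missing or unreadable log file is treated as "not exceeded".
        return False


if __name__ == "__main__":
    # Hypothetical path and threshold; adjust to the actual deployment.
    LOG_PATH = r"C:\mongodb\log\mongod.log"
    MAX_BYTES = 1024 ** 3  # 1 GiB
    if log_size_exceeded(LOG_PATH, MAX_BYTES):
        print("WARNING: %s is larger than %d bytes" % (LOG_PATH, MAX_BYTES))
```

The script only reports the condition; what to do next (stopping the service, alerting an administrator) is left to the operator.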

I'll close this ticket as a duplicate of SERVER-13338 now. Feel free to watch the other ticket for updates.

Kind Regards,
Thomas
