[SERVER-1680] Error during service startup causes server to start over and over and over
Created: 25/Aug/10
Updated: 12/Jul/16
Resolved: 15/Mar/11

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 1.6.1
Fix Version/s: 1.9.0

Type: Bug
Priority: Trivial - P5
Reporter: Collin Sauve
Assignee: Robert Stam
Resolution: Done
Votes: 1
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Windows Vista 64bit


Operating System: Windows
Participants:

 Description   

Due to an oversight on my part, the path that I specified for --dbpath did not exist. This caused the service to start and stop several hundred times within 10 minutes. After 10 minutes I had a 3MB log file.

Very trivial problem, but this is not the behaviour I would expect.
Expected behaviour would be either:
1. log the error, exit the process (recommended)
2. wait until error condition is corrected.
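
A minimal sketch of option 1, for illustration only. This is not MongoDB's code: `validate_dbpath` and `main` are hypothetical names, and the first command-line argument stands in for --dbpath. The idea is to check the directory once, log a single error, and return a non-zero exit code rather than fail in a way that invites a restart.

```python
import logging
import os
from typing import Sequence

log = logging.getLogger("startup")


def validate_dbpath(dbpath: str) -> bool:
    """Return True if dbpath is an existing directory, otherwise log one error."""
    if os.path.isdir(dbpath):
        return True
    log.error("dbpath %r does not exist; exiting", dbpath)
    return False


def main(argv: Sequence[str]) -> int:
    """Hypothetical entry point; argv[1] stands in for the --dbpath value."""
    dbpath = argv[1] if len(argv) > 1 else ""
    if not validate_dbpath(dbpath):
        return 1  # non-zero exit code: report the failure once and stop
    return 0
```

A caller would pass the result to `sys.exit(main(sys.argv))`, so the process ends after a single log line instead of producing a growing log file.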



 Comments   
Comment by auto [ 16/Mar/11 ]

Author: rstam <robert@10gen.com>

Message: Fixed SERVER-2719, SERVER-1680 and SERVER-1434 (all had the same underlying cause).
https://github.com/mongodb/mongo/commit/e4df10c2e034e9cb25e0f8de13b5c5e9b00283a9

Comment by auto [ 15/Mar/11 ]

Author: rstam <robert@10gen.com>

Message: Fixed SERVER-2719, SERVER-1680 and SERVER-1434 (all had the same underlying cause).
https://github.com/mongodb/mongo/commit/6ebf0c03153113927cf28dd5151bdd0cc9a2eead

Comment by Jason R. Coombs [ 24/Feb/11 ]

On further consideration, leaving the lockfile on an unclean shutdown is the standard way for MongoDB to behave. Adding --dur with 1.8 addresses the restart on unclean shutdown issue.

Still, if MongoDB is designed to fail on startup (and not automatically recover from the failure), it should not configure the service to rapidly restart on failure.

Comment by Jason R. Coombs [ 24/Feb/11 ]

I also encountered this problem, but with a properly-configured service.

With 1.8.0rc0, if the lockfile is present (perhaps due to an unclean shutdown), MongoDB will attempt to start, find the lock file, and exit.

Because the service is configured to immediately restart, Windows will once again invoke the service. This behavior causes the system to go into a tight loop, consuming 100% CPU and filling the event log with hundreds of messages per second.

The problem is not as much MongoDB as it is the way it installs itself as a service. It configures the Recovery options to "Restart the Service" on all failures with a 0 minute delay.

So, there are a couple of factors at play here.

First, MongoDB should probably be more polite about restarting on failure. A 1 minute delay on startup failure is probably adequate. In fact, it's probably not desirable to have MongoDB restart on "Subsequent failures". If the service is repeatedly failing, user intervention is probably required.
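
The policy suggested above can be modelled roughly as follows. This is only an illustration of the proposal, not the Windows service API or MongoDB's installer code: `restart_delay` and the constant are hypothetical names, and the 60-second value is the "1 minute delay" from this comment.

```python
from typing import Optional

FIRST_FAILURE_DELAY_SECONDS = 60.0  # assumed value: the 1 minute delay proposed above


def restart_delay(failure_count: int) -> Optional[float]:
    """Return how long to wait before restarting, or None to stop restarting.

    failure_count is 1 for the first failure, 2 for the second, and so on.
    """
    if failure_count < 1:
        raise ValueError("failure_count starts at 1")
    if failure_count == 1:
        return FIRST_FAILURE_DELAY_SECONDS
    # Subsequent failures: take no action and leave it to the user.
    return None
```

With a policy like this, a service that fails repeatedly is restarted at most once and then left stopped, rather than looping with a 0 minute delay.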

Second, MongoDB should be using a locking mechanism that can detect if there's actually another database actively locking the file system. Since this bug is about the service startup, I'll log another ticket about the lock file.
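
One way to get that property is an operating-system file lock, which is released when the owning process exits, so a leftover lock file from an unclean shutdown does not block startup by itself. The sketch below is an assumption about how such a check could look, not how MongoDB implements its lock file; `try_acquire_lock` is a hypothetical helper that uses `msvcrt.locking` on Windows and `fcntl.flock` elsewhere.

```python
import os
from typing import Optional


def try_acquire_lock(path: str) -> Optional[int]:
    """Return a file descriptor holding an exclusive lock, or None if it is held.

    The lock lasts until the descriptor is closed or the process exits.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        if os.name == "nt":
            import msvcrt
            # Non-blocking lock on the first byte of the file.
            msvcrt.locking(fd, msvcrt.LK_NBLCK, 1)
        else:
            import fcntl
            # Non-blocking exclusive lock on the whole file.
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        # Another open handle already holds the lock.
        os.close(fd)
        return None
    return fd
```

The mere presence of the file no longer matters here: only a live holder of the lock causes `None` to be returned.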

I would really like to see this ticket tagged for 1.8.

Comment by Adrian Hills [ 10/Jan/11 ]

I hit the same problem after forgetting to specify the dbpath argument. It is an easy mistake to make, and it would be nice if the server handled it: my log file grew to 8MB within a minute or so and the service could not be stopped. In the end, I had to rename the log directory, which killed the service.

Comment by Eliot Horowitz (Inactive) [ 25/Aug/10 ]

Not sure if this is even plausible to do, but it would be nice if it worked.

Generated at Thu Feb 08 02:57:43 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.