[SERVER-38086] MongoDB cannot start after repair Created: 12/Nov/18 Updated: 05/Dec/18 Resolved: 05/Dec/18 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Pavel Zeger [X] | Assignee: | Kelsey Schubert |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Attachments: |
|
| Operating System: | ALL |
| Sprint: | Storage NYC 2018-11-19 |
| Participants: |
| Description |
|
Hello. I installed a MongoDB 4.0.4 standalone server in order to perform a migration from Windows to CentOS. After an unexpected shutdown of the server (a virtualization issue), I cannot start the MongoDB server. I performed the repair procedure as described in the documentation: mongod --repair --directoryperdb --dbpath /opt/mongodb/data and then reassigned permissions to mongod with chown. I added a repair log here. Even so, MongoDB cannot start and writes the following output to the log: 2018-11-12T09:39:47.043+0000 I CONTROL [main] ***** SERVER RESTARTED ***** , processManagement: { fork: true, pidFilePath: "/var/run/mongodb/mongod.pid", timeZoneInfo: "/usr/share/zoneinfo" }, storage: { dbPath: "/opt/mongodb/data", directoryPerDB: true, engine: "wiredTiger", journal: { enabled: true } }, systemLog: { destination: "file", logAppend: true, path: "/var/log/mongodb/mongod.log" } } Need your help ASAP. Regards, Pavel |
| Comments |
| Comment by Pavel Zeger [X] [ 26/Nov/18 ] | |||||||||||||||||
|
Thanks. Unfortunately we encountered additional issues with the XFS filesystem and are investigating this problem right now. If it turns out to be related to MongoDB and our application, I'll open a new thread here. Meanwhile we are using the previous machine for our archiving purposes. Regards, Pavel | |||||||||||||||||
| Comment by Kelsey Schubert [ 20/Nov/18 ] | |||||||||||||||||
|
Hi PavelZeger, Thanks for these logs. When WiredTiger finds a partial metadata set, it prints that informational message, skips that table, and keeps going, so the cursor walk is allowed to continue and complete. Consequently, these log lines do not indicate an issue that would prevent MongoDB from starting. In fact, we see that MongoDB successfully started and began accepting connections. If you're continuing to have problems with the startup service killing mongod, would you please provide the output of "systemctl status mongod.service" and "journalctl -xe"? Kind regards, | |||||||||||||||||
| Comment by Pavel Zeger [X] [ 18/Nov/18 ] | |||||||||||||||||
|
Please pay attention to the following rows in the log about metadata in a collection:
We also have the same issue with the services in our staging environments: after an unexpected shutdown and a subsequent repair, we cannot start the services.
| |||||||||||||||||
| Comment by Pavel Zeger [X] [ 18/Nov/18 ] | |||||||||||||||||
|
I started mongod with another option:
Here is the log:
| |||||||||||||||||
| Comment by Pavel Zeger [X] [ 18/Nov/18 ] | |||||||||||||||||
|
I also tried this option:
The result is:
| |||||||||||||||||
| Comment by Pavel Zeger [X] [ 18/Nov/18 ] | |||||||||||||||||
|
I did it:
Mongod still didn't start. I also ran daemon-reload. mongo.lock is empty. The data directory is still as it was after the successful repair. | |||||||||||||||||
| Comment by Kelsey Schubert [ 15/Nov/18 ] | |||||||||||||||||
|
Hi PavelZeger, Thanks for the additional information. By default, systemd sets TimeoutStartSec to 90 seconds. In this case, mongod is taking longer than 90 seconds to start, so systemd sends it a SIGTERM. To allow mongod to complete its startup operations, please include the following setting in /etc/systemd/system/multi-user.target.wants/mongod.service under the [Service] section:
Kind regards, | |||||||||||||||||
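As a rough sketch of the timeout change discussed in the comment above, the helper below writes a systemd drop-in file instead of editing the unit file directly. This is a swapped-in approach, not the exact edit described in the comment, and the setting value from the original comment is not reproduced here. The function name write_timeout_dropin, the file name timeout.conf, the drop-in directory, and the 600-second value are all illustrative assumptions.

```shell
#!/bin/sh
# Sketch only: write a [Service] drop-in that raises TimeoutStartSec.
# Arguments (both hypothetical, chosen for illustration):
#   $1  drop-in directory, e.g. /etc/systemd/system/mongod.service.d
#   $2  start timeout in seconds, e.g. 600
write_timeout_dropin() {
    dropin_dir="$1"
    seconds="$2"
    # Create the drop-in directory if it does not exist yet.
    mkdir -p "$dropin_dir" || return 1
    # systemd merges *.conf files in this directory into the unit.
    printf '[Service]\nTimeoutStartSec=%s\n' "$seconds" > "$dropin_dir/timeout.conf"
}

# Possible usage on the affected host (run as root):
#   write_timeout_dropin /etc/systemd/system/mongod.service.d 600
#   systemctl daemon-reload
#   systemctl start mongod
```

A drop-in keeps the packaged unit file untouched, so the override is less likely to be lost when the package is upgraded; daemon-reload is still needed before systemd picks it up.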
| Comment by Pavel Zeger [X] [ 15/Nov/18 ] | |||||||||||||||||
|
I added a part of mongod.log when I started the service with a verbosity level of 5. Hope it will help to investigate the issue: [root@uk1lv8818 ~]# time systemctl start mongod real 1m30.045s
| |||||||||||||||||
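The 1m30 wall-clock time above lines up with the 90-second systemd start timeout mentioned elsewhere in this thread. One way to confirm which start timeout systemd actually applies to the unit is to read the TimeoutStartUSec property from `systemctl show`. The sketch below assumes the unit is named mongod.service; the helper name parse_start_timeout is hypothetical.

```shell
#!/bin/sh
# Sketch only: extract the value of TimeoutStartUSec from
# `systemctl show` output supplied on standard input.
parse_start_timeout() {
    sed -n 's/^TimeoutStartUSec=//p'
}

# Possible usage on a systemd host (unit name is an assumption):
#   systemctl show mongod.service -p TimeoutStartUSec | parse_start_timeout
```

If the printed value corresponds to 90 seconds, the service is still running with the default start timeout, and a start that takes longer than that will be terminated by systemd.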
| Comment by Pavel Zeger [X] [ 15/Nov/18 ] | |||||||||||||||||
|
How can you close the issue if your platform cannot start at all??! The SIGTERM wasn't sent by any user - mongod writes it each time it cannot start! Can you solve it or not? We also want to migrate away from MongoDB due to the lack of stability of your platform. The mongod service cannot start, and it doesn't matter how long I wait for it (I waited a whole day and won't wait any longer). If this is your solution, I'll suggest to all my clients that they simply migrate away from MongoDB, because you cannot offer a stable data platform. | |||||||||||||||||
| Comment by Kelsey Schubert [ 14/Nov/18 ] | |||||||||||||||||
|
I suspect that mongod is still progressing as part of its standard start-up routine. I would suggest allowing mongod more time to start up. If you would like more informational logging, you may start mongod with logging set at verbosity 2, -vv, which should provide more granular updates about the actions that it is taking during this time. Please let us know if you continue to encounter issues after taking the steps outlined above, and provide the more verbose logs so we can investigate. | |||||||||||||||||
| Comment by Kelsey Schubert [ 14/Nov/18 ] | |||||||||||||||||
|
Hi PavelZeger, Thanks for reporting this issue; we're investigating. It looks like the process is being killed by an external SIGTERM. I assume you're issuing this because mongod isn't making progress starting up, but I am wondering how long you waited before issuing this command. Thanks,