Type: Bug
Resolution: Won't Fix
Priority: Major - P3
Fix Version/s: None
Affects Version/s: None
Component/s: MMAPv1, Storage
Labels: None
Assigned Teams: Storage Execution
Operating System: ALL
Case:
Linked BF Score: 0
Confidence Status: None
Work Order: 3
CAR Domain/s: None
Aha! Reference: None
Tracking Level: None
Risk Status: None
Exec Notes: None
Goal Name(s): None
Goal Link: None

If MongoD encounters an error during shutdown after the journal files have been deleted but before mongod.lock has been cleared, it will subsequently refuse to start. If the same error is encountered before the journal is deleted, MongoD will subsequently start correctly.

This leads to the bizarre operational condition that, when journaling is enabled, MongoD is more likely to start after a SIGKILL (kill -9) than after a SIGTERM (kill -15).

MongoD halts at startup if it finds a non-empty mongod.lock file but cannot locate any journal files. During shutdown, MongoD deletes the journal files after flushing data to disk and before clearing mongod.lock. However, certain other tasks are carried out in between these two operations, and clearing the content of mongod.lock is the last thing done. As a result, the content of mongod.lock is cleared too late in the shutdown sequence to be a reliable indicator of whether the journal files were applied successfully (and then deleted) or simply went missing.
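The startup rule described above can be sketched as follows. This is a minimal illustration of the reported behaviour, not MongoD's actual code: the function name `startup_allowed` is hypothetical, and the sketch assumes the journal lives in a `journal` subdirectory of the data path and that any file in it counts as a journal file.

```python
from pathlib import Path


def startup_allowed(dbpath: Path) -> bool:
    """Model of the startup check described in this ticket (assumed layout)."""
    lock = dbpath / "mongod.lock"
    journal = dbpath / "journal"  # assumption: journal files live here

    # A non-empty lock file means the previous shutdown did not finish.
    unclean = lock.exists() and lock.stat().st_size > 0

    # Assumption: any entry in the journal directory counts as a journal file.
    has_journal = journal.is_dir() and any(journal.iterdir())

    # Refuse to start only when the lock is non-empty and the journal is gone.
    return not (unclean and not has_journal)
```

Under this model, a SIGKILL leaves both a non-empty lock and the journal in place, so startup proceeds, whereas a SIGTERM that fails between journal deletion and lock clearing leaves a non-empty lock with no journal, so startup is refused.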

1. The error message shown when starting mongod states that "this is likely human error or filesystem corruption." However, there's no human error or filesystem corruption.
2. The recovery procedure documented at http://dochub.mongodb.org/core/repair indicates that it is for the case where journaling is turned off. We hit this with journaling on.

Possible alternative:
Given that the journal files are idempotent, MongoD could leave a "journal is clear" signal file indicating that the journal was applied and cleared correctly (i.e., "data is stable"). The deletion of the journal files can then proceed. Should MongoD crash or halt for whatever reason after this point, either the journal files will persist or the signal file indicating the journal was already applied will persist. Either way, MongoD can unambiguously determine the stability of the data files the next time it is started.
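A rough sketch of how the proposed startup decision might look, under stated assumptions. The marker file name `journal.applied`, the `journal` subdirectory, the function name, and the returned state labels are all hypothetical and chosen only for illustration; the ticket does not specify them.

```python
from pathlib import Path

MARKER_NAME = "journal.applied"  # hypothetical name for the signal file


def classify_startup(dbpath: Path) -> str:
    """Decide what to do at startup under the proposed signal-file scheme."""
    lock = dbpath / "mongod.lock"
    journal = dbpath / "journal"  # assumption: journal files live here
    marker = dbpath / MARKER_NAME

    # Empty or missing lock: the previous shutdown completed.
    if not (lock.exists() and lock.stat().st_size > 0):
        return "clean"

    # Journal files persist: replaying them is safe because they are idempotent.
    if journal.is_dir() and any(journal.iterdir()):
        return "replay-journal"

    # Journal is gone but the marker says it was applied before deletion.
    if marker.exists():
        return "data-stable"

    # Neither journal nor marker: the journal really did go missing.
    return "refuse"
```

The key ordering assumption is that shutdown writes the marker before deleting any journal file, so a failure at any later step leaves at least one of the two on disk.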

Related to: SERVER-32091 Powercycle - remove mongod.lock file for MMAPV1 test (Closed)

Assignee: [DO NOT USE] Backlog - Storage Execution Team
Reporter: Andrew Ryder (Inactive)
Participants: [DO NOT USE] Backlog - Storage Execution Team, Andrew Ryder
Votes: 1
Watchers: 7

Created: Sep 02 2014 03:52:39 AM UTC
Updated: Dec 06 2022 05:01:58 AM UTC
Resolved: Sep 14 2018 07:56:29 PM UTC
