[SERVER-19778] Mongod recovery fails after power cycle, mmap storage, no journal files present Created: 05/Aug/15  Updated: 18/Oct/15  Resolved: 18/Oct/15

Status: Closed
Project: Core Server
Component/s: MMAPv1, Storage
Affects Version/s: 3.1.7
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Jonathan Abrahams Assignee: Unassigned
Resolution: Duplicate Votes: 0
Labels: 32powercycle, 32qa
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File powertest.sh     Text File pt-317-mmap-win.log    
Issue Links:
Depends
Duplicate
duplicates SERVER-7861 Add marker file for journaling to ens... Closed
Operating System: ALL
Steps To Reproduce:

powertest.sh -s <ip addr of server> -e mmapv1 -R -N 20 -F <root of mongo src> -l 100

Participants:

 Description   

The power cycle test for mmapv1 uncoverwed a mongod recovery failure because of missing journal files:

2015-08-05T11:24:48.645-0700 I CONTROL  [main] ***** SERVER RESTARTED *****
2015-08-05T11:24:48.880-0700 I CONTROL  [main] Trying to install Windows service 'mongod-pt-317-mmap'
2015-08-05T11:24:48.886-0700 I CONTROL  [main] Service 'mongod-pt-317-mmap' (mongod-pt-317-mmap) installed with command line 'C:\cygwin64\home\Administrator\pt-317-mmap\MongoDB\Server\3.1\bin\mongod.exe --storageEngine mmapv1 --logpath C:\\cygwin64\\home\\Administrator\\pt-317-mmap\\logs\\mongod-mmapv1-recovery.log --logappend --dbpath C:\\cygwin64\\home\\Administrator\\pt-317-mmap\\data\\mmapv1 --port 37017 --setParameter enableTestCommands=1 -replSet pc --service'
2015-08-05T11:24:48.886-0700 I CONTROL  [main] Service can be started from the command line with 'net start mongod-pt-317-mmap'
2015-08-05T11:24:48.989-0700 I CONTROL  [main] ***** SERVER RESTARTED *****
2015-08-05T11:24:49.233-0700 I CONTROL  [main] Trying to start Windows service 'MongoDB'
2015-08-05T11:24:49.234-0700 I STORAGE  [thread1] Service running
2015-08-05T11:24:49.257-0700 W -        [initandlisten] Detected unclean shutdown - C:\\cygwin64\\home\\Administrator\\pt-317-mmap\\data\\mmapv1\mongod.lock is not empty.
2015-08-05T11:24:49.291-0700 I STORAGE  [initandlisten] **************
old lock file: C:\\cygwin64\\home\\Administrator\\pt-317-mmap\\data\\mmapv1\mongod.lock.  probably means unclean shutdown,
but there are no journal files to recover.
this is likely human error or filesystem corruption.
please make sure that your journal directory is mounted.
found 8 dbs.
see: http://dochub.mongodb.org/core/repair for more information
*************
2015-08-05T11:24:49.397-0700 I STORAGE  [initandlisten] exception in initAndListen: 12596 old lock file, terminating
2015-08-05T11:24:49.397-0700 I CONTROL  [serviceStopWorker] dbexit:  rc: 49

After this failure, I deleted the mongod.lock file and reran manually and saw this error:

$ pt-317-mmap/MongoDB/Server/3.1/bin/mongod.exe --storageEngine mmapv1 --dbpath C:\\cygwin64\\home\\Administrator\\pt-317-mmap\\data\\mmapv1 --port 37017 --setParameter enableTestCommands=1 -replSet pc
2015-08-05T19:29:12.832+0100 I JOURNAL  [initandlisten] journal dir=C:\cygwin64\home\Administrator\pt-317-mmap\data\mmapv1\journal
2015-08-05T19:29:12.864+0100 I JOURNAL  [initandlisten] recover : no journal files present, no recovery needed
2015-08-05T19:29:12.926+0100 I JOURNAL  [durability] Durability thread started
2015-08-05T19:29:13.001+0100 I JOURNAL  [journal writer] Journal writer thread started
2015-08-05T19:29:13.017+0100 I CONTROL  [initandlisten] MongoDB starting : pid=84 port=37017 dbpath=C:\cygwin64\home\Administrator\pt-317-mmap\data\mmapv1 64-bit host=CAPJA-Win2012R2
2015-08-05T19:29:13.017+0100 I CONTROL  [initandlisten]
2015-08-05T19:29:13.017+0100 I CONTROL  [initandlisten] ** NOTE: This is a development version (3.1.7-pre-) of MongoDB.
2015-08-05T19:29:13.017+0100 I CONTROL  [initandlisten] **       Not recommended for production.
2015-08-05T19:29:13.017+0100 I CONTROL  [initandlisten]
2015-08-05T19:29:13.018+0100 I CONTROL  [initandlisten] targetMinOS: Windows 7/Windows Server 2008 R2
2015-08-05T19:29:13.018+0100 I CONTROL  [initandlisten] db version v3.1.7-pre-
2015-08-05T19:29:13.018+0100 I CONTROL  [initandlisten] git version: d56b18054ae884c979135c34ba698e052b66d6c2
2015-08-05T19:29:13.018+0100 I CONTROL  [initandlisten] OpenSSL version: OpenSSL 1.0.1p-fips 9 Jul 2015
2015-08-05T19:29:13.018+0100 I CONTROL  [initandlisten] allocator: tcmalloc
2015-08-05T19:29:13.018+0100 I CONTROL  [initandlisten] options: { net: { port: 37017 }, replication: { replSet: "pc" }, setParameter: { enableTestCommands: "1" }, storage: { dbPath: "C:\cygwin64\home\Administrator\pt-317-mmap\data\mmapv1", engine: "mmapv1" } }
2015-08-05T19:29:13.057+0100 I STORAGE  [FileAllocator] allocating new datafile C:\cygwin64\home\Administrator\pt-317-mmap\data\mmapv1\drop_database0.0, filling with zeroes...
2015-08-05T19:29:13.058+0100 I STORAGE  [FileAllocator] creating directory C:\cygwin64\home\Administrator\pt-317-mmap\data\mmapv1\_tmp
2015-08-05T19:29:13.069+0100 I STORAGE  [FileAllocator] done allocating datafile C:\cygwin64\home\Administrator\pt-317-mmap\data\mmapv1\drop_database0.0, size: 64MB,  took 0 secs
2015-08-05T19:29:13.106+0100 I CONTROL  [initandlisten] CreateFileMappingW for C:\cygwin64\home\Administrator\pt-317-mmap\data\mmapv1\drop_database2.ns failed with errno:1006 The volume for a file has been externally altered so that the opened file is no longer valid. (file size is 0) in MemoryMappedFile::map
2015-08-05T19:29:13.107+0100 I -        [initandlisten] Assertion failure _state.get() < 0 C:\data\mci\src\src\mongo/util/concurrency/rwlock.h 204
2015-08-05T19:29:16.938+0100 I CONTROL  [initandlisten] mongod.exe    ...\src\mongo\util\stacktrace_windows.cpp(174)                            mongo::printStackTrace+0x43
2015-08-05T19:29:16.938+0100 I CONTROL  [initandlisten] mongod.exe    ...\src\mongo\util\log.cpp(134)                                           mongo::logContext+0xac
2015-08-05T19:29:16.938+0100 I CONTROL  [initandlisten] mongod.exe    ...\src\mongo\util\assert_util.cpp(137)                                   mongo::verifyFailed+0x167
2015-08-05T19:29:16.938+0100 I CONTROL  [initandlisten] mongod.exe    ...\src\mongo\db\storage\mmap_v1\mmap_windows.cpp(158)                    mongo::MemoryMappedFile::close+0x54
2015-08-05T19:29:16.938+0100 I CONTROL  [initandlisten] mongod.exe    ...\src\mongo\db\storage\mmap_v1\mmap_windows.cpp(286)                    mongo::MemoryMappedFile::map+0x4c4
2015-08-05T19:29:16.938+0100 I CONTROL  [initandlisten] mongod.exe    ...\src\mongo\db\storage\mmap_v1\mmap.cpp(113)                            mongo::MemoryMappedFile::mapWithOptions+0xd6
2015-08-05T19:29:16.938+0100 I CONTROL  [initandlisten] mongod.exe    ...\src\mongo\db\storage\mmap_v1\durable_mapped_file.cpp(250)             mongo::DurableMappedFile::open+0x146
2015-08-05T19:29:16.938+0100 I CONTROL  [initandlisten] mongod.exe    ...\src\mongo\db\storage\mmap_v1\catalog\namespace_index.cpp(159)         mongo::NamespaceIndex::init+0xac
2015-08-05T19:29:16.938+0100 I CONTROL  [initandlisten] mongod.exe    ...\src\mongo\db\storage\mmap_v1\mmap_v1_database_catalog_entry.cpp(170)  mongo::MMAPV1DatabaseCatalogEntry::MMAPV1DatabaseCatalogEntry+0x199
2015-08-05T19:29:16.938+0100 I CONTROL  [initandlisten] mongod.exe    ...\src\mongo\db\storage\mmap_v1\mmap_v1_engine.cpp(270)                  mongo::MMAPV1Engine::getDatabaseCatalogEntry+0x175
2015-08-05T19:29:16.938+0100 I CONTROL  [initandlisten] mongod.exe    ...\src\mongo\db\catalog\database_holder.cpp(119)                         mongo::DatabaseHolder::openDb+0x2fc
2015-08-05T19:29:16.938+0100 I CONTROL  [initandlisten] mongod.exe    ...\src\mongo\db\db.cpp(329)                                              mongo::repairDatabasesAndCheckVersion+0x522
2015-08-05T19:29:16.938+0100 I CONTROL  [initandlisten] mongod.exe    ...\src\mongo\db\db.cpp(500)                                              mongo::_initAndListen+0xd35
2015-08-05T19:29:16.938+0100 I CONTROL  [initandlisten] mongod.exe    ...\src\mongo\db\db.cpp(594)                                              mongo::initAndListen+0x27
2015-08-05T19:29:16.938+0100 I CONTROL  [initandlisten] mongod.exe    ...\src\mongo\db\db.cpp(823)                                              mongoDbMain+0x216
2015-08-05T19:29:16.938+0100 I CONTROL  [initandlisten] mongod.exe    ...\src\mongo\db\db.cpp(633)                                              wmain+0x35
2015-08-05T19:29:16.938+0100 I CONTROL  [initandlisten] mongod.exe    f:\dd\vctools\crt\crtw32\dllstuff\crtexe.c(623)                           __tmainCRTStartup+0x10f
2015-08-05T19:29:16.938+0100 I CONTROL  [initandlisten] KERNEL32.DLL                                                                            BaseThreadInitThunk+0xd
2015-08-05T19:29:16.938+0100 I CONTROL  [initandlisten]
2015-08-05T19:29:16.962+0100 W STORAGE  [initandlisten] database C:\cygwin64\home\Administrator\pt-317-mmap\data\mmapv1 drop_database2 could not be opened due to DBException 0: assertion C:\data\mci\src\src\mongo/util/concurrency/rwlock.h:204
2015-08-05T19:29:16.977+0100 I STORAGE  [initandlisten] exception in initAndListen: 0 assertion C:\data\mci\src\src\mongo/util/concurrency/rwlock.h:204, terminating
2015-08-05T19:29:16.983+0100 W REPL     [initandlisten] ReplicationCoordinatorImpl::shutdown() called before startReplication() finished.  Shutting down without cleaning up the replication system
2015-08-05T19:29:16.983+0100 I CONTROL  [initandlisten] now exiting
2015-08-05T19:29:16.983+0100 I NETWORK  [initandlisten] shutdown: going to close listening sockets...
2015-08-05T19:29:16.983+0100 I NETWORK  [initandlisten] shutdown: going to flush diaglog...
2015-08-05T19:29:16.983+0100 I NETWORK  [initandlisten] shutdown: going to close sockets...
2015-08-05T19:29:16.983+0100 I STORAGE  [initandlisten] shutdown: waiting for fs preallocator...
2015-08-05T19:29:16.983+0100 I STORAGE  [initandlisten] shutdown: final commit...
2015-08-05T19:29:17.177+0100 I JOURNAL  [initandlisten] journalCleanup...
2015-08-05T19:29:17.177+0100 I JOURNAL  [initandlisten] removeJournalFiles
2015-08-05T19:29:17.178+0100 I JOURNAL  [initandlisten] Terminating durability thread ...
2015-08-05T19:29:17.188+0100 I JOURNAL  [journal writer] Journal writer thread stopped
2015-08-05T19:29:17.188+0100 I JOURNAL  [durability] Durability thread stopped
2015-08-05T19:29:17.189+0100 I STORAGE  [initandlisten] shutdown: closing all files...
2015-08-05T19:29:17.191+0100 I STORAGE  [initandlisten] closeAllFiles() finished
2015-08-05T19:29:17.191+0100 I STORAGE  [initandlisten] shutdown: removing fs lock...
2015-08-05T19:29:17.191+0100 I CONTROL  [initandlisten] dbexit:  rc: 100



 Comments   
Comment by Jonathan Abrahams [ 05/Aug/15 ]

Same problem on Linux:

2015-08-05T12:28:17.253-0400 I CONTROL  [main] ***** SERVER RESTARTED *****
2015-08-05T12:28:17.298-0400 W -        [initandlisten] Detected unclean shutdown - /home/jonathan/pt-317-mmap/data/mmapv1/mongod.lock is not empty.
2015-08-05T12:28:17.302-0400 I STORAGE  [initandlisten] **************
old lock file: /home/jonathan/pt-317-mmap/data/mmapv1/mongod.lock.  probably means unclean shutdown,
but there are no journal files to recover.
this is likely human error or filesystem corruption.
please make sure that your journal directory is mounted.
found 3 dbs.
see: http://dochub.mongodb.org/core/repair for more information
*************
2015-08-05T12:28:17.310-0400 I STORAGE  [initandlisten] exception in initAndListen: 12596 old lock file, terminating
2015-08-05T12:28:17.310-0400 I CONTROL  [initandlisten] dbexit:  rc: 100

Generated at Thu Feb 08 03:52:03 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.