[SERVER-15579] MongoD MMAP V1 engine does not recover from partially created .ns file Created: 09/Oct/14  Updated: 06/Dec/22  Resolved: 14/Sep/18

Status: Closed
Project: Core Server
Component/s: MMAPv1, Storage
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Kaloian Manassiev Assignee: Backlog - Storage Execution Team
Resolution: Won't Fix Votes: 1
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
is depended on by SERVER-16581 Init script should return only after ... Closed
Assigned Teams:
Storage Execution
Operating System: ALL
Participants:
Linked BF Score: 0

 Description   

This was reproduced through the repl7.js test, which kills mongod using signal 9 while the server is creating a new database. Upon restart, the .ns file is found to be incomplete and startup fails.

 m31001| 2014-10-09T01:31:29.577+0000 I INDEXING [replslave] allocating new ns file /data/db/repl7tests-slave/aa.ns, filling with zeroes...

Server is killed using SIGKILL at this point and upon restart, this failure occurs and causes the server to shut down:

 m31001| 2014-10-09T01:31:34.609+0000 I -        [initandlisten] MongoDB starting : pid=28630 port=31001 dbpath=/data/db/repl7tests-slave slave=1 64-bit host=ip-10-171-61-242
 m31001| 2014-10-09T01:31:34.609+0000 I -        [initandlisten] 
 m31001| 2014-10-09T01:31:34.609+0000 I -        [initandlisten] ** NOTE: This is a development version (2.7.8-pre-) of MongoDB.
 m31001| 2014-10-09T01:31:34.609+0000 I -        [initandlisten] **       Not recommended for production.
 m31001| 2014-10-09T01:31:34.609+0000 I -        [initandlisten] 
 m31001| 2014-10-09T01:31:34.610+0000 I -        [initandlisten] db version v2.7.8-pre-
 m31001| 2014-10-09T01:31:34.610+0000 I -        [initandlisten] git version: bc8c4b81164c17f95d01262e6aeb22f934e5ed8f
 m31001| 2014-10-09T01:31:34.610+0000 I -        [initandlisten] OpenSSL version: OpenSSL 1.0.1 14 Mar 2012
 m31001| 2014-10-09T01:31:34.610+0000 I -        [initandlisten] build info: Linux ip-10-231-169-199 3.2.0-36-virtual #57-Ubuntu SMP Tue Jan 8 22:04:49 UTC 2013 x86_64 BOOST_LIB_VERSION=1_49
 m31001| 2014-10-09T01:31:34.610+0000 I -        [initandlisten] allocator: tcmalloc
 m31001| 2014-10-09T01:31:34.610+0000 I -        [initandlisten] options: { net: { bindIp: "127.0.0.1", http: { enabled: false }, port: 31001 }, nopreallocj: true, replication: { oplogSizeMB: 40 }, security: { keyFile: "/data/mci/shell/src/jstests/libs/authTestsKey" }, setParameter: { authenticationMechanisms: "MONGODB-CR", enableLocalhostAuthBypass: "false", enableTestCommands: "1" }, slave: true, source: "127.0.0.1:31000", storage: { dbPath: "/data/db/repl7tests-slave", journal: { enabled: true }, preallocDataFiles: false, smallFiles: true } }
 m31001| 2014-10-09T01:31:34.611+0000 I JOURNAL  [initandlisten] journal dir=/data/db/repl7tests-slave/journal
 m31001| 2014-10-09T01:31:34.611+0000 I JOURNAL  [initandlisten] recover : no journal files present, no recovery needed
 m31001| 2014-10-09T01:31:34.637+0000 I INDEXING [initandlisten] bad .ns file: /data/db/repl7tests-slave/aa.ns
 m31001| 2014-10-09T01:31:34.638+0000 I -        [initandlisten] warning database /data/db/repl7tests-slave aa could not be opened
 m31001| 2014-10-09T01:31:34.638+0000 I -        [initandlisten] DBException 10079: bad .ns file length, cannot open database
 m31001| 2014-10-09T01:31:34.638+0000 I STORAGE  [initandlisten] exception in initAndListen: 10079 bad .ns file length, cannot open database, terminating
 m31001| 2014-10-09T01:31:34.638+0000 I COMMANDS [initandlisten] now exiting
 m31001| 2014-10-09T01:31:34.638+0000 I NETWORK  [initandlisten] shutdown: going to close listening sockets...
 m31001| 2014-10-09T01:31:34.638+0000 I NETWORK  [initandlisten] removing socket file: /tmp/mongodb-31001.sock
 m31001| 2014-10-09T01:31:34.638+0000 I NETWORK  [initandlisten] shutdown: going to flush diaglog...
 m31001| 2014-10-09T01:31:34.638+0000 I NETWORK  [initandlisten] shutdown: going to close sockets...
 m31001| 2014-10-09T01:31:34.638+0000 I STORAGE  [initandlisten] shutdown: waiting for fs preallocator...
 m31001| 2014-10-09T01:31:34.638+0000 I STORAGE  [initandlisten] shutdown: final commit...
 m31001| 2014-10-09T01:31:34.643+0000 I STORAGE  [initandlisten] shutdown: closing all files...
 m31001| 2014-10-09T01:31:34.649+0000 I STORAGE  [initandlisten] closeAllFiles() finished
 m31001| 2014-10-09T01:31:34.649+0000 I JOURNAL  [initandlisten] journalCleanup...
 m31001| 2014-10-09T01:31:34.649+0000 I JOURNAL  [initandlisten] removeJournalFiles
 m31001| 2014-10-09T01:31:34.650+0000 I STORAGE  [initandlisten] shutdown: removing fs lock...
 m31001| 2014-10-09T01:31:34.650+0000 I COMMANDS [initandlisten] dbexit:  rc: 100


Generated at Thu Feb 08 03:38:25 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.