[SERVER-20342] Corruption, or an unclean shutdown while writing the first section in a journal file Created: 10/Sep/15  Updated: 09/Jun/16  Resolved: 05/Mar/16

Status: Closed
Project: Core Server
Component/s: Storage
Affects Version/s: 2.6.5
Fix Version/s: None

Type: Question Priority: Major - P3
Reporter: Robin Pedersen Assignee: Kelsey Schubert
Resolution: Cannot Reproduce Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Participants:

 Description   

We are currently deploying MongoDB on a number of sites, running Ubuntu 12.04 precise. On one of these systems, after running MongoDB for a few weeks, we discovered that the 'mongod' service had stopped. Trying to start it gave the following log:

2015-09-01T15:22:23.447+0200 ***** SERVER RESTARTED *****
2015-09-01T15:22:23.463+0200 [initandlisten] MongoDB starting : pid=31061 port=27017 dbpath=/var/lib/mongodb 64-bit host=intelecom-edda-ferd-gondan-pcie-cb1
2015-09-01T15:22:23.463+0200 [initandlisten] db version v2.6.5
2015-09-01T15:22:23.463+0200 [initandlisten] git version: e99d4fcb4279c0279796f237aa92fe3b64560bf6
2015-09-01T15:22:23.463+0200 [initandlisten] build info: Linux build8.nj1.10gen.cc 2.6.32-431.3.1.el6.x86_64 #1 SMP Fri Jan 3 21:39:27 UTC 2014 x86_64 BOOST_LIB_VERSION=1_49
2015-09-01T15:22:23.463+0200 [initandlisten] allocator: tcmalloc
2015-09-01T15:22:23.463+0200 [initandlisten] options: { config: "/etc/mongod.conf", net: { bindIp: "127.0.0.1" }, storage: { dbPath: "/var/lib/mongodb" }, systemLog: { destination: "file", logAppend: true, path: "/var/log/mongodb/mongod.log" } }
2015-09-01T15:22:23.464+0200 [initandlisten] journal dir=/var/lib/mongodb/journal
2015-09-01T15:22:23.464+0200 [initandlisten] recover begin
2015-09-01T15:22:23.471+0200 [initandlisten] recover lsn: 2811296823
2015-09-01T15:22:23.471+0200 [initandlisten] recover /var/lib/mongodb/journal/j._12
2015-09-01T15:22:23.479+0200 [initandlisten] Journal file header invalid. This could indicate corruption, or an unclean shutdown while writing the first section in a journal file.
2015-09-01T15:22:23.481+0200 [initandlisten] recover error: abrupt end to file /var/lib/mongodb/journal/j._12, yet it isn't the last journal file
2015-09-01T15:22:23.482+0200 [initandlisten] dbexception during recovery: 13535 recover abrupt journal file end
2015-09-01T15:22:23.487+0200 [initandlisten] exception in initAndListen: 13535 recover abrupt journal file end, terminating
2015-09-01T15:22:23.487+0200 [initandlisten] dbexit:
2015-09-01T15:22:23.487+0200 [initandlisten] shutdown: going to close listening sockets...
2015-09-01T15:22:23.487+0200 [initandlisten] shutdown: going to flush diaglog...
2015-09-01T15:22:23.487+0200 [initandlisten] shutdown: going to close sockets...
2015-09-01T15:22:23.487+0200 [initandlisten] shutdown: waiting for fs preallocator...
2015-09-01T15:22:23.487+0200 [initandlisten] shutdown: lock for final commit...
2015-09-01T15:22:23.487+0200 [initandlisten] shutdown: final commit...
2015-09-01T15:22:23.487+0200 [initandlisten] shutdown: closing all files...
2015-09-01T15:22:23.487+0200 [initandlisten] closeAllFiles() finished
2015-09-01T15:22:23.487+0200 [initandlisten] shutdown: removing fs lock...
2015-09-01T15:22:23.487+0200 [initandlisten] dbexit: really exiting now

In this particular case, there was no data, so it was trivial to fix the problem by wiping the data directory and restarting the service. However, we obviously have some concerns about this.

How likely is this to happen again? How often is mongod 'writing the first section in a journal file'?
Is this something that might be fixed in a later version?
How do we restore normal operation, without losing data? Would it be a good idea to automate this with something like a 'watchdog'?
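
As a starting point for the 'watchdog' idea, the failure signature in the log above can be detected mechanically. The following is a minimal sketch, not an official MongoDB tool: the function names (`last_startup_attempt`, `diagnose`) are hypothetical, and the patterns are copied from the 2.6.5 log lines quoted in this ticket, so they are assumed to match only that wording. It only reports the condition; it deliberately does not delete journal or data files.

```python
import re

# Patterns copied from the log excerpt in this ticket (MongoDB 2.6.5 wording).
# Other versions may word these lines differently; treat them as assumptions.
RESTART_MARKER = "***** SERVER RESTARTED *****"
JOURNAL_ERROR_RE = re.compile(r"\b13535 recover abrupt journal file end\b")
JOURNAL_FILE_RE = re.compile(r"recover error: abrupt end to file (\S+),")


def last_startup_attempt(lines):
    """Return the log lines from the most recent restart marker onwards."""
    start = 0
    for index, line in enumerate(lines):
        if RESTART_MARKER in line:
            start = index
    return lines[start:]


def diagnose(log_text):
    """Check whether the latest startup attempt died during journal recovery.

    Returns a dict with:
      journal_recovery_failed: True if error 13535 appears in the last attempt
      journal_file: path named in the 'abrupt end to file' line, or None
    """
    attempt = last_startup_attempt(log_text.splitlines())
    failed = any(JOURNAL_ERROR_RE.search(line) for line in attempt)
    journal_file = None
    for line in attempt:
        match = JOURNAL_FILE_RE.search(line)
        if match:
            journal_file = match.group(1)
    return {"journal_recovery_failed": failed, "journal_file": journal_file}
```

A watchdog could call `diagnose()` on the contents of the log file configured in `/etc/mongod.conf` (here `/var/log/mongodb/mongod.log`) and raise an alert when `journal_recovery_failed` is true. What to do next, such as restoring from a backup, is left to an operator, since the right recovery step depends on whether the data can be lost.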



 Comments   
Comment by Ramon Fernandez Marina [ 05/Mar/16 ]

Apologies for the radio silence, robinp. We were not able to reproduce this on our end, and since you are not impacted by this issue, I'm going to close this ticket.

Please note that 2.6 will be end-of-life soon, so I'd recommend you plan to upgrade to a newer version. The latest stable version, MongoDB 3.2.4, is scheduled for release next week.

Regards,
Ramón.

Comment by Ramon Fernandez Marina [ 29/Oct/15 ]

robinp, I just realized there was never a public response to this ticket; my apologies for that. I understand this is not affecting your production setup and you were able to fix the problem.

We're investigating what happens if mongod is killed while writing the journal, as it seems the most likely scenario here. We'll update this ticket when we know more.

Regards,
Ramón.
