[SERVER-5380] unable to start mongod give exception in initAndListen: 13536 journal version number mismatch 0, terminatin. Created: 23/Mar/12  Updated: 15/Aug/12  Resolved: 11/Apr/12

Status: Closed
Project: Core Server
Component/s: Admin, Stability
Affects Version/s: 2.1.0
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: jitendra Assignee: siddharth.singh@10gen.com
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Linux.


Attachments: File mongod_40000.rar    
Issue Links:
Related
related to SERVER-5630 mongos unable to start give error "di... Closed
related to SERVER-5681 unable to start mongod give exception... Closed
Operating System: Linux
Participants:

 Description   

Unable to start mongod

Logs :

Thu Mar 22 17:38:12 [initandlisten] info no lsn file in journal/ directory
Thu Mar 22 17:38:12 [initandlisten] recover lsn: 0
Thu Mar 22 17:38:12 [initandlisten] recover /u01/shard3/journal/j._0
Thu Mar 22 17:38:12 [initandlisten] journal file version number mismatch. recover with old version of mongod, terminate cleanly, then upgrade.
Thu Mar 22 17:38:12 [initandlisten] User Assertion: 13536:journal version number mismatch 0
Thu Mar 22 17:38:12 [initandlisten] exception during recovery
Thu Mar 22 17:38:12 [initandlisten] exception in initAndListen: 13536 journal version number mismatch 0, terminating
Thu Mar 22 17:38:12 dbexit:
Thu Mar 22 17:38:12 [initandlisten] shutdown: going to close listening sockets...



 Comments   
Comment by siddharth.singh@10gen.com [ 13/Apr/12 ]

Hi Jitendra,

Could you please ask this and any follow-up questions through our free support channel, the mongodb-user Google group (http://groups.google.com/group/mongodb-user)? That also helps a wider audience, as they can browse the forum in case they run into a similar issue.

We primarily use JIRA as a bug tracking system as it helps us keep things cleaner.

Thanks.

Comment by jitendra [ 13/Apr/12 ]

hi Siddharth Singh,

How can I tell whether MongoDB is running with NUMA?

Thanks for reply.

Comment by siddharth.singh@10gen.com [ 12/Apr/12 ]

To start mongod after a journal corruption scenario, just remove the journal files and start mongod again.
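
A minimal sketch of that step in Python, assuming mongod is already stopped. Moving the files into a backup directory instead of deleting them is a precaution added here, not something the comment above prescribes, and the function name and `backup_dir` argument are hypothetical. Any writes that existed only in the journal will not be replayed once the files are gone.

```python
import shutil
from pathlib import Path


def set_aside_journal(dbpath, backup_dir):
    """Move every file in <dbpath>/journal into backup_dir.

    Returns the sorted names of the files that were moved. Run this only
    while mongod is stopped (assumption of this sketch).
    """
    journal_dir = Path(dbpath) / "journal"
    if not journal_dir.is_dir():
        return []  # nothing to do: no journal directory under this dbpath

    target = Path(backup_dir)
    target.mkdir(parents=True, exist_ok=True)

    moved = []
    for entry in sorted(journal_dir.iterdir()):
        if entry.is_file():
            shutil.move(str(entry), str(target / entry.name))
            moved.append(entry.name)
    return moved
```

For the dbpath in this ticket the call would look like `set_aside_journal("/u01/shard3", "/u01/shard3-journal-backup")`, where the backup location is illustrative only. After that, start mongod as usual.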

Comment by jitendra [ 12/Apr/12 ]

hi Siddharth Singh,

What is the solution when a failed assertion on journal validity is reported as "journal file header invalid. This could indicate corruption..."?

What is the recommended way to start the mongod server in that case?

We will test mongo without NUMA.

Thanks for reply.

Comment by siddharth.singh@10gen.com [ 09/Apr/12 ]

Also, I saw in the logs that you are running mongo with NUMA. We do not recommend that setup. Please see this page for more details: http://www.mongodb.org/display/DOCS/NUMA
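
One way to check for NUMA on a Linux host is sketched below. It rests on two assumptions that do not come from this ticket: that the kernel exposes one `nodeN` directory per NUMA node under `/sys/devices/system/node`, and that the mongod log mentions "NUMA" somewhere when it detects such hardware (the exact wording of that message is not assumed). Both function names are hypothetical.

```python
from pathlib import Path


def count_numa_nodes(sysfs_node_dir="/sys/devices/system/node"):
    """Count nodeN directories under sysfs; more than one suggests NUMA."""
    base = Path(sysfs_node_dir)
    if not base.is_dir():
        return 0  # sysfs layout not available on this system
    return sum(
        1
        for p in base.iterdir()
        if p.is_dir() and p.name.startswith("node") and p.name[4:].isdigit()
    )


def log_mentions_numa(log_lines):
    """Return True if any log line contains 'NUMA' (case-insensitive)."""
    return any("numa" in line.lower() for line in log_lines)
```

A result greater than 1 from `count_numa_nodes()` suggests the machine has a NUMA layout. `log_mentions_numa(open(logpath))` can then be run against the mongod log file to see whether mongod itself reported it.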

Comment by siddharth.singh@10gen.com [ 09/Apr/12 ]

Hi Jitendra,

Your journal files seem to be corrupt. Unclean restarts of your machine are a likely reason for the corruption. Before replaying a journal file, we check its version, and if the versions do not match we throw an error. In most cases such a mismatch happens when users upgrade their mongod (hence the error message you saw about recovering with the old mongod version). In your case, the versions do not match because of the journal corruption.

In our current master branch we have already improved the recovery error messages. A failed assertion on journal validity would be reported as "journal file header invalid. This could indicate corruption..."

Thanks for reporting this.
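
To find which journal file triggered the assertion across many mongod logs, a small scanner like the sketch below may help. It relies only on the two log line shapes quoted in this ticket (`recover <path>` and `User Assertion: 13536:journal version number mismatch <n>`). The function name is hypothetical, and treating the trailing number as the version read from the file is an assumption.

```python
import re

# Matches e.g. "[initandlisten] recover /u01/shard3/journal/j._0"
RECOVER_RE = re.compile(r"\[initandlisten\] recover (\S+/journal/\S+)")
# Matches e.g. "User Assertion: 13536:journal version number mismatch 0"
MISMATCH_RE = re.compile(
    r"User Assertion: 13536:journal version number mismatch (\d+)"
)


def find_journal_mismatches(log_lines):
    """Return (journal_file, reported_number) for each 13536 assertion.

    journal_file is the most recent 'recover <path>' seen before the
    assertion, or None if no such line preceded it.
    """
    results = []
    last_journal = None
    for line in log_lines:
        match = RECOVER_RE.search(line)
        if match:
            last_journal = match.group(1)
            continue
        match = MISMATCH_RE.search(line)
        if match:
            results.append((last_journal, int(match.group(1))))
    return results
```

Passing an open log file, for example `find_journal_mismatches(open(logpath))`, yields one tuple per failed recovery attempt.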

Comment by jitendra [ 05/Apr/12 ]

hi Siddharth Singh,

Please find attached mongod_40000.rar, which contains the full logs.

Kindly help me with this....

Thanks in Advance,
Jitendra Verma

Comment by jitendra [ 05/Apr/12 ]

Full logs

Comment by siddharth.singh@10gen.com [ 02/Apr/12 ]

Can you please attach the full logs to this ticket? JIRA supports file attachments (~150 MB).

Comment by jitendra [ 02/Apr/12 ]

hi
logs:

            ***** SERVER RESTARTED *****
            Tue Mar 13 17:12:27 BackgroundJob starting: DataFileSync
            Tue Mar 13 17:12:27 isInRangeTest passed
            Tue Mar 13 17:12:27 shardKeyTest passed
            Tue Mar 13 17:12:27 shardObjTest passed
            Tue Mar 13 17:12:27 versionCmpTest passed
            Tue Mar 13 17:12:27 versionArrayTest passed
            Tue Mar 13 17:12:27 [initandlisten] MongoDB starting : pid=5565 port=40000 dbpath=/u01/shard5 64-bit host=ct-node-ft-93
            Tue Mar 13 17:12:27 [initandlisten] db version v2.0.3-rc0, pdfile version 4.5
            Tue Mar 13 17:12:27 [initandlisten] git version: 643c3a25c7fa272a3ff343a7ed653f0cef17f60f
            Tue Mar 13 17:12:27 [initandlisten] build info: Linux ip-10-110-9-236 2.6.21.7-2.ec2.v1.2.fc8xen #1 SMP Fri Nov 20 17:48:28 EST 2009 x86_64 BOOST_LIB_VERSION=1_41
            Tue Mar 13 17:12:27 [initandlisten] options: { bind_ip: "-all", dbpath: "/u01/shard5", journalCommitInterval: 2, logappend: true, logpath: "/usr/local/ct/depend/mongo/logs/mongod_40000.log", port: 40000, quiet: true, shardsvr: true, smallfiles: true, vvv: true }

            Tue Mar 13 17:12:27 [initandlisten] flushing directory /u01/shard5
            Tue Mar 13 17:12:27 [initandlisten] journal dir=/u01/shard5/journal
            Tue Mar 13 17:12:27 [initandlisten] recover begin
            Tue Mar 13 17:12:27 [initandlisten] info no lsn file in journal/ directory
            Tue Mar 13 17:12:27 [initandlisten] recover lsn: 0
            Tue Mar 13 17:12:27 [initandlisten] recover /u01/shard5/journal/j._0
            Tue Mar 13 17:12:27 [initandlisten] journal file version number mismatch. recover with old version of mongod, terminate cleanly, then upgrade.
            Tue Mar 13 17:12:27 [initandlisten] User Assertion: 13536:journal version number mismatch 0
            Tue Mar 13 17:12:27 [initandlisten] exception during recovery
            Tue Mar 13 17:12:27 [initandlisten] exception in initAndListen: 13536 journal version number mismatch 0, terminating
            Tue Mar 13 17:12:27 dbexit:
            Tue Mar 13 17:12:27 [initandlisten] shutdown: going to close listening sockets...
            Tue Mar 13 17:12:27 [initandlisten] shutdown: going to flush diaglog...
            Tue Mar 13 17:12:27 [initandlisten] shutdown: going to close sockets...
            Tue Mar 13 17:12:27 [initandlisten] shutdown: waiting for fs preallocator...
            Tue Mar 13 17:12:27 [initandlisten] shutdown: lock for final commit...
            Tue Mar 13 17:12:27 [initandlisten] shutdown: final commit...
            Tue Mar 13 17:12:27 [initandlisten] shutdown: closing all files...
            Tue Mar 13 17:12:27 [initandlisten] closeAllFiles() finished
            Tue Mar 13 17:12:27 [initandlisten] shutdown: removing fs lock...
            Tue Mar 13 17:12:27 dbexit: really exiting now

m\c means machine (system).

Thanks in Advance,
Jitendra Verma

Comment by siddharth.singh@10gen.com [ 02/Apr/12 ]

Hi Jitendra,

Can you please attach the full log? Also, can you please tell me what you mean by "m\c"? I will try to reproduce this case. Meanwhile, can you try upgrading to 2.0.4 and see if that helps?

Comment by jitendra [ 02/Apr/12 ]

I am using MongoDB version 2.0.3-rc0, and I am running it with the -vvv option.
I can upload the complete log file if you want, or a snapshot of the logs from when the error occurred.
One thing I am sure of is that we have not changed the version of MongoDB.
I also verified from the logs that the MongoDB version was the same before and after the error.

This error is very frequent in my environment. Kindly help me with this.

Thanks in Advance,
Jitendra Verma

Comment by jitendra [ 02/Apr/12 ]

setup details:

All mongods and config servers run with journaling enabled. The server details are as follows.

total mongod server : 8
total mongod (configserver) server:3
total mongos server :1

I only rebooted the m\c multiple times. Some mongods did not start and gave this error:
"journal file version number mismatch.
recover with old version of mongod,
terminate cleanly, then upgrade."

Comment by siddharth.singh@10gen.com [ 30/Mar/12 ]

Hi Jitendra,

Are all the mongods the same version? How are you running them? It would be helpful if you could give us more details about your setup and the exact steps you took before you saw this error.

Comment by jitendra [ 30/Mar/12 ]

I did not change the version of mongod. I only restarted multiple times.

Comment by Eliot Horowitz (Inactive) [ 30/Mar/12 ]

You need to start with the version of mongod you were running before.

Comment by jitendra [ 30/Mar/12 ]

No, it is not fixed. I only rebooted the m\c multiple times, then tried to run multiple mongods.

Some mongods did not start and gave this error:
"journal file version number mismatch.
recover with old version of mongod,
terminate cleanly, then upgrade."

How can I resolve this problem?

Comment by Eliot Horowitz (Inactive) [ 30/Mar/12 ]

Sounds like you have it fixed.

Comment by jitendra [ 28/Mar/12 ]

Sorry, there was a mistake in my previous reply.
I did not upgrade the mongod version. I only rebooted the m\c multiple times, then tried to run multiple mongods.
Some mongods did not start, giving: "journal file version number mismatch. recover with old version of mongod, terminate cleanly, then upgrade."

Comment by jitendra [ 28/Mar/12 ]

I did not upgrade the mongod version. I only rebooted the m\c multiple times. Some mongods started giving: "journal file version number mismatch. recover with old version of mongod, terminate cleanly, then upgrade."

Comment by Eliot Horowitz (Inactive) [ 23/Mar/12 ]

Note:

Thu Mar 22 17:38:12 [initandlisten] journal file version number mismatch. recover with old version of mongod, terminate cleanly, then upgrade.

What version were you running with before?

Generated at Thu Feb 08 03:08:42 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.