[SERVER-4869] mongod is not started give error Got signal: 7 (Bus error). Created: 03/Feb/12  Updated: 30/Mar/12  Resolved: 06/Feb/12

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 2.0.2
Fix Version/s: None

Type: Bug Priority: Critical - P2
Reporter: jitendra Assignee: Mathias Stearn
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Linux Debian


Operating System: Linux
Participants:

 Description   

10 mongod dshards ,one config server, one mongos are running single m\c (OS is Linux Debian).

m\c restart 3 time , some of mongod was not started.
logs

          • SERVER RESTARTED *****

Fri Feb 3 23:45:20 BackgroundJob starting: DataFileSync
Fri Feb 3 23:45:20 isInRangeTest passed
Fri Feb 3 23:45:20 shardKeyTest passed
Fri Feb 3 23:45:20 shardObjTest passed
Fri Feb 3 23:45:20 versionCmpTest passed
Fri Feb 3 23:45:20 versionArrayTest passed
Fri Feb 3 23:45:20 [initandlisten] MongoDB starting : pid=2878 port=45000 dbpath=/sata1/shard6 64-bit host=debian
Fri Feb 3 23:45:20 [initandlisten] db version v2.0.2, pdfile version 4.5
Fri Feb 3 23:45:20 [initandlisten] git version: 514b122d308928517f5841888ceaa4246a7f18e3
Fri Feb 3 23:45:20 [initandlisten] build info: Linux bs-linux64.10gen.cc 2.6.21.7-2.ec2.v1.2.fc8xen #1 SMP Fri Nov 20 17:48:28 EST 2009 x86_64 BOOST_LIB_VERSION=1_41
Fri Feb 3 23:45:20 [initandlisten] options:

{ bind_ip: "-all", dbpath: "/sata1/shard6", journalCommitInterval: 2, logappend: true, logpath: "/usr/local/ct/depend/mongo/logs/mongod_45000.log", port: 45000, quiet: true, shardsvr: true, vvv: true }

Fri Feb 3 23:45:20 [initandlisten] flushing directory /sata1/shard6
Fri Feb 3 23:45:34 [initandlisten] journal dir=/sata1/shard6/journal
Fri Feb 3 23:45:34 [initandlisten] recover begin
Fri Feb 3 23:45:35 [initandlisten] recover lsn: 1167565
Fri Feb 3 23:45:35 [initandlisten] recover /sata1/shard6/journal/j._0
Fri Feb 3 23:45:35 [initandlisten] recover skipping application of section seq:928940 < lsn:1167565
Fri Feb 3 23:45:39 [initandlisten] recover skipping application of section seq:988595 < lsn:1167565
Fri Feb 3 23:45:39 [initandlisten] recover skipping application of section seq:1048255 < lsn:1167565
Fri Feb 3 23:45:40 [initandlisten] recover skipping application of section seq:1107905 < lsn:1167565
Fri Feb 3 23:46:34 Invalid access at address: 0x7fdd4f465000

Fri Feb 3 23:46:34 Got signal: 7 (Bus error).

Fri Feb 3 23:46:34 [DataFileSync] flushing mmap took 14129ms for 4 files
Fri Feb 3 23:46:36 Backtrace:
0xa8d669 0xa8dc40 0x7fddb30daf60 0x7fddb263ea00 0x76c2fc 0x76cbbd 0x76cfaa 0x76d751 0x76dac9 0x76e621 0x76e8f5 0x76f6ed 0x76f80b 0x75df7f 0xa91509 0xa91e7d 0xa9bbaa 0x7fddb25ddc4d 0x500f09
/usr/local/ct/depend/mongo/bin/mongod(_ZN5mongo10abruptQuitEi+0x399) [0xa8d669]
/usr/local/ct/depend/mongo/bin/mongod(_ZN5mongo24abruptQuitWithAddrSignalEiP7siginfoPv+0x220) [0xa8dc40]
/lib/libpthread.so.0(+0xef60) [0x7fddb30daf60]
/lib/libc.so.6(memcpy+0x2b0) [0x7fddb263ea00]
/usr/local/ct/depend/mongo/bin/mongod(_ZN5mongo3dur11RecoveryJob5writeERKNS0_18ParsedJournalEntryE+0x1fc) [0x76c2fc]
/usr/local/ct/depend/mongo/bin/mongod(_ZN5mongo3dur11RecoveryJob10applyEntryERKNS0_18ParsedJournalEntryEbb+0x8d) [0x76cbbd]
/usr/local/ct/depend/mongo/bin/mongod(_ZN5mongo3dur11RecoveryJob12applyEntriesERKSt6vectorINS0_18ParsedJournalEntryESaIS3_EE+0x6a) [0x76cfaa]
/usr/local/ct/depend/mongo/bin/mongod(_ZN5mongo3dur11RecoveryJob14processSectionEPKNS0_11JSectHeaderEPKvjPKNS0_11JSectFooterE+0x6f1) [0x76d751]
/usr/local/ct/depend/mongo/bin/mongod(_ZN5mongo3dur11RecoveryJob17processFileBufferEPKvj+0xf9) [0x76dac9]
/usr/local/ct/depend/mongo/bin/mongod(_ZN5mongo3dur11RecoveryJob11processFileEN5boost10filesystem10basic_pathISsNS3_11path_traitsEEE+0xd1) [0x76e621]
/usr/local/ct/depend/mongo/bin/mongod(_ZN5mongo3dur11RecoveryJob2goERSt6vectorIN5boost10filesystem10basic_pathISsNS4_11path_traitsEEESaIS7_EE+0xc5) [0x76e8f5]
/usr/local/ct/depend/mongo/bin/mongod(_ZN5mongo3dur8_recoverEv+0x18d) [0x76f6ed]
/usr/local/ct/depend/mongo/bin/mongod(_ZN5mongo3dur7recoverEv+0x3b) [0x76f80b]
/usr/local/ct/depend/mongo/bin/mongod(_ZN5mongo3dur7startupEv+0x4f) [0x75df7f]
/usr/local/ct/depend/mongo/bin/mongod(_ZN5mongo14_initAndListenEi+0x3c9) [0xa91509]
/usr/local/ct/depend/mongo/bin/mongod(_ZN5mongo13initAndListenEi+0x1d) [0xa91e7d]
/usr/local/ct/depend/mongo/bin/mongod(main+0x9aaa) [0xa9bbaa]
/lib/libc.so.6(__libc_start_main+0xfd) [0x7fddb25ddc4d]
/usr/local/ct/depend/mongo/bin/mongod(__gxx_personality_v0+0x3e9) [0x500f09]

Logstream::get called in uninitialized state
Fri Feb 3 23:46:36 ERROR: Client::shutdown not called: initandlisten

please give me solution what to do in this condition.



 Comments   
Comment by jitendra [ 08/Feb/12 ]

how to handle the situation.

Comment by Eliot Horowitz (Inactive) [ 06/Feb/12 ]

Overall it looks the file system is corrupt.
Have you run fsck?

Comment by jitendra [ 06/Feb/12 ]

No i mean different mongod and different journal on same system.

Comment by Eliot Horowitz (Inactive) [ 06/Feb/12 ]

Huh?

Do you mean same journal different data?

Comment by jitendra [ 06/Feb/12 ]

m\c was same but mongods was different.

Comment by Eliot Horowitz (Inactive) [ 06/Feb/12 ]

only need to track in 1 place SERVER-4877

Comment by jitendra [ 06/Feb/12 ]

ya same server and same time.

Comment by Eliot Horowitz (Inactive) [ 06/Feb/12 ]

Is this the same server as SERVER-4877?

Comment by jitendra [ 06/Feb/12 ]

A application was inserting data into mongodb. meanwhile restart mongodb m\c by pressing reset button.

logs

Mon Feb 6 15:35:24 [conn2] runQuery called admin.$cmd

{ getlasterror: 1 }

Mon Feb 6 15:35:24 [conn2] run command admin.$cmd

{ getlasterror: 1 }

Mon Feb 6 15:35:24 [conn2] command admin.$cmd command:

{ getlasterror: 1 }

ntoreturn:1 reslen:67 0ms
Mon Feb 6 15:35:24 [conn2] insert 00060212.Database 56ms
Mon Feb 6 15:35:24 [conn2] runQuery called admin.$cmd

{ getlasterror: 1 }

Mon Feb 6 15:35:24 [conn2] run command admin.$cmd

{ getlasterror: 1 }

Mon Feb 6 15:35:24 [conn2] command admin.$cmd command:

{ getlasterror: 1 }

ntoreturn:1 reslen:67 0ms
Mon Feb 6 15:35:25 [conn2] insert 00060212.Database 24ms
Mon Feb 6 15:35:25 [conn2] runQuery called admin.$cmd

{ getlasterror: 1 }

Mon Feb 6 15:35:25 [conn2] run command admin.$cmd

{ getlasterror: 1 }

Mon Feb 6 15:35:25 [conn2] command admin.$cmd command:

{ getlasterror: 1 }

ntoreturn:1 reslen:67 0ms
Mon Feb 6 15:35:26 [conn2] insert 00060212.Database 103ms
Mon Feb 6 15:35:26 [conn2] runQuery called admin.$cmd

{ getlasterror: 1 }

Mon Feb 6 15:35:26 [conn2] run command admin.$cmd

{ getlasterror: 1 }

Mon Feb 6 15:35:26 [conn2] command admin.$cmd command:

{ getlasterror: 1 }

ntoreturn:1 reslen:67 0ms
Mon Feb 6 15:35:26 [conn2] insert 00060212.Database 110ms
Mon Feb 6 15:35:26 [conn2] runQuery called admin.$cmd

{ getlasterror: 1 }

Mon Feb 6 15:35:26 [conn2] run command admin.$cmd

{ getlasterror: 1 }

Mon Feb 6 15:35:26 [conn2] command admin.$cmd command:

{ getlasterror: 1 }

ntoreturn:1 reslen:67 0ms
Mon Feb 6 15:35:27 [conn2] insert 00060212.Database 38ms
Mon Feb 6 15:35:27 [conn2] runQuery called admin.$cmd

{ getlasterror: 1 }

Mon Feb 6 15:35:27 [conn2] run command admin.$cmd

{ getlasterror: 1 }

Mon Feb 6 15:35:27 [conn2] command admin.$cmd command:

{ getlasterror: 1 }

ntoreturn:1 reslen:67 0ms
Mon Feb 6 15:35:28 [conn2] insert 00060212.Database 46ms
Mon Feb 6 15:35:28 [conn2] runQuery called admin.$cmd

{ getlasterror: 1 }

Mon Feb 6 15:35:28 [conn2] run command admin.$cmd

{ getlasterror: 1 }

Mon Feb 6 15:35:28 [conn2] command admin.$cmd command:

{ getlasterror: 1 }

ntoreturn:1 reslen:67 0ms
Mon Feb 6 15:35:29 [conn2] insert 00060212.Database 83ms
Mon Feb 6 15:35:29 [conn2] runQuery called admin.$cmd

{ getlasterror: 1 }

Mon Feb 6 15:35:29 [conn2] run command admin.$cmd

{ getlasterror: 1 }

Mon Feb 6 15:35:29 [conn2] command admin.$cmd command:

{ getlasterror: 1 }

ntoreturn:1 reslen:67 0ms
Mon Feb 6 15:35:29 [conn2] insert 00060212.Database 43ms
Mon Feb 6 15:35:29 [conn2] runQuery called admin.$cmd

{ getlasterror: 1 }

Mon Feb 6 15:35:29 [conn2] run command admin.$cmd

{ getlasterror: 1 }

Mon Feb 6 15:35:29 [conn2] command admin.$cmd command:

{ getlasterror: 1 }

ntoreturn:1 reslen:67 0ms
Mon Feb 6 15:35:30 [conn2] insert 00060212.Database 129ms
Mon Feb 6 15:35:30 [conn2] runQuery called admin.$cmd

{ getlasterror: 1 }

Mon Feb 6 15:35:30 [conn2] run command admin.$cmd

{ getlasterror: 1 }

Mon Feb 6 15:35:30 [conn2] command admin.$cmd command:

{ getlasterror: 1 }

ntoreturn:1 reslen:67 0ms
Mon Feb 6 15:35:31 [PeriodicTask::Runner] task: WriteBackManager::cleaner took: 0ms
Mon Feb 6 15:35:31 [PeriodicTask::Runner] task: DBConnectionPool-cleaner took: 0ms
Mon Feb 6 15:35:31 [PeriodicTask::Runner] task: DBConnectionPool-cleaner took: 0ms
Mon Feb 6 15:35:31 [conn2] insert 00060212.Database 82ms
Mon Feb 6 15:35:31 [conn2] runQuery called admin.$cmd

{ getlasterror: 1 }

Mon Feb 6 15:35:31 [conn2] run command admin.$cmd

{ getlasterror: 1 }

Mon Feb 6 15:35:31 [conn2] command admin.$cmd command:

{ getlasterror: 1 }

ntoreturn:1 reslen:67 0ms

          • SERVER RESTARTED *****

Mon Feb 6 15:41:18 BackgroundJob starting: DataFileSync
Mon Feb 6 15:41:18 isInRangeTest passed
Mon Feb 6 15:41:18 shardKeyTest passed
Mon Feb 6 15:41:18 shardObjTest passed
Mon Feb 6 15:41:18 versionCmpTest passed
Mon Feb 6 15:41:18 versionArrayTest passed
Mon Feb 6 15:41:18 [initandlisten] MongoDB starting : pid=2307 port=10000 dbpath=/sata1/shard1 64-bit host=debian
Mon Feb 6 15:41:18 [initandlisten] db version v2.0.2, pdfile version 4.5

Mon Feb 6 15:41:18 BackgroundJob starting: DataFileSync
Mon Feb 6 15:41:18 isInRangeTest passed
Mon Feb 6 15:41:18 shardKeyTest passed
Mon Feb 6 15:41:18 shardObjTest passed
Mon Feb 6 15:41:18 versionCmpTest passed
Mon Feb 6 15:41:18 versionArrayTest passed
Mon Feb 6 15:41:18 [initandlisten] MongoDB starting : pid=2307 port=10000 dbpath=/sata1/shard1 64-bit host=debian
Mon Feb 6 15:41:18 [initandlisten] db version v2.0.2, pdfile version 4.5
Mon Feb 6 15:41:18 [initandlisten] git version: 514b122d308928517f5841888ceaa4246a7f18e3
Mon Feb 6 15:41:18 [initandlisten] build info: Linux bs-linux64.10gen.cc 2.6.21.7-2.ec2.v1.2.fc8xen #1 SMP Fri Nov 20 17:48:28 EST 2009 x86_64 BOOST_LIB_VERSION=1_41
Mon Feb 6 15:41:18 [initandlisten] options:

{ bind_ip: "-all", dbpath: "/sata1/shard1", journalCommitInterval: 2, logappend: true, logpath: "/usr/local/ct/depend/mongo/logs/mongod_10000.log", port: 10000, quiet: true, shardsvr: true, vvv: true }

Mon Feb 6 15:41:18 [initandlisten] flushing directory /sata1/shard1
Mon Feb 6 15:41:18 [initandlisten] journal dir=/sata1/shard1/journal
Mon Feb 6 15:41:18 [initandlisten] recover begin
Mon Feb 6 15:41:18 [initandlisten] recover lsn: 756319
Mon Feb 6 15:41:18 [initandlisten] recover /sata1/shard1/journal/j._0
Mon Feb 6 15:41:18 [initandlisten] recover skipping application of section seq:338040 < lsn:756319
Mon Feb 6 15:41:18 [initandlisten] recover skipping application of section seq:397694 < lsn:756319
Mon Feb 6 15:41:18 [initandlisten] recover skipping application of section seq:457339 < lsn:756319
Mon Feb 6 15:41:18 [initandlisten] recover skipping application of section seq:516969 < lsn:756319
Mon Feb 6 15:41:18 [initandlisten] recover skipping application of section seq:576589 < lsn:756319
Mon Feb 6 15:41:18 [initandlisten] recover skipping application of section seq:637179 < lsn:756319
Mon Feb 6 15:41:18 [initandlisten] recover skipping application of section seq:696739 < lsn:756319
Mon Feb 6 15:41:54 Invalid access at address: 0x7fd3b4bda00c

Mon Feb 6 15:41:54 Got signal: 7 (Bus error).

Mon Feb 6 15:41:54 Backtrace:
0xa8d669 0xa8dc40 0x7fd3fd4d1f60 0x76da88 0x76e621 0x76e8f5 0x76f6ed 0x76f80b 0x75df7f 0xa91509 0xa91e7d 0xa9bbaa 0x7fd3fc9d4c4d 0x500f09
/usr/local/ct/depend/mongo/bin/mongod(_ZN5mongo10abruptQuitEi+0x399) [0xa8d669]
/usr/local/ct/depend/mongo/bin/mongod(_ZN5mongo24abruptQuitWithAddrSignalEiP7siginfoPv+0x220) [0xa8dc40]
/lib/libpthread.so.0(+0xef60) [0x7fd3fd4d1f60]
/usr/local/ct/depend/mongo/bin/mongod(_ZN5mongo3dur11RecoveryJob17processFileBufferEPKvj+0xb8) [0x76da88]
/usr/local/ct/depend/mongo/bin/mongod(_ZN5mongo3dur11RecoveryJob11processFileEN5boost10filesystem10basic_pathISsNS3_11path_traitsEEE+0xd1) [0x76e621]
/usr/local/ct/depend/mongo/bin/mongod(_ZN5mongo3dur11RecoveryJob2goERSt6vectorIN5boost10filesystem10basic_pathISsNS4_11path_traitsEEESaIS7_EE+0xc5) [0x76e8f5]
/usr/local/ct/depend/mongo/bin/mongod(_ZN5mongo3dur8_recoverEv+0x18d) [0x76f6ed]
/usr/local/ct/depend/mongo/bin/mongod(_ZN5mongo3dur7recoverEv+0x3b) [0x76f80b]
/usr/local/ct/depend/mongo/bin/mongod(_ZN5mongo3dur7startupEv+0x4f) [0x75df7f]
/usr/local/ct/depend/mongo/bin/mongod(_ZN5mongo14_initAndListenEi+0x3c9) [0xa91509]
/usr/local/ct/depend/mongo/bin/mongod(_ZN5mongo13initAndListenEi+0x1d) [0xa91e7d]
/usr/local/ct/depend/mongo/bin/mongod(main+0x9aaa) [0xa9bbaa]
/lib/libc.so.6(__libc_start_main+0xfd) [0x7fd3fc9d4c4d]
/usr/local/ct/depend/mongo/bin/mongod(__gxx_personality_v0+0x3e9) [0x500f09]

Logstream::get called in uninitialized state
Mon Feb 6 15:41:54 ERROR: Client::shutdown not called: initandlisten

Comment by Eliot Horowitz (Inactive) [ 06/Feb/12 ]

Can you describe what happened to this system before?
Do you have old logs?

Comment by jitendra [ 06/Feb/12 ]

hi please repl.

Generated at Thu Feb 08 03:07:13 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.