[SERVER-31259] Mongo DB fails on restart Created: 26/Sep/17  Updated: 27/Jul/18  Resolved: 26/Sep/17

Status: Closed
Project: Core Server
Component/s: WiredTiger
Affects Version/s: 3.2.1
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Harish Assignee: Kelsey Schubert
Resolution: Done Votes: 0
Labels: envns, rns, wtc
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Operating System: Linux
Participants:

 Description   

MongoDB is failing on restart. I need urgent help, please.

The log is below:

2017-09-26T18:03:14.051+0530 I CONTROL  [main] ***** SERVER RESTARTED *****
2017-09-26T18:03:14.060+0530 I CONTROL  [initandlisten] MongoDB starting : pid=4159 port=27017 dbpath=/mnt/cdr1/mongodb 64-bit host=a13f01s03
2017-09-26T18:03:14.060+0530 I CONTROL  [initandlisten] db version v3.2.1
2017-09-26T18:03:14.061+0530 I CONTROL  [initandlisten] git version: a14d55980c2cdc565d4704a7e3ad37e4e535c1b2
2017-09-26T18:03:14.061+0530 I CONTROL  [initandlisten] OpenSSL version: OpenSSL 1.0.1f 6 Jan 2014
2017-09-26T18:03:14.061+0530 I CONTROL  [initandlisten] allocator: tcmalloc
2017-09-26T18:03:14.061+0530 I CONTROL  [initandlisten] modules: none
2017-09-26T18:03:14.061+0530 I CONTROL  [initandlisten] build environment:
2017-09-26T18:03:14.061+0530 I CONTROL  [initandlisten]     distmod: ubuntu1404
2017-09-26T18:03:14.061+0530 I CONTROL  [initandlisten]     distarch: x86_64
2017-09-26T18:03:14.061+0530 I CONTROL  [initandlisten]     target_arch: x86_64
2017-09-26T18:03:14.061+0530 I CONTROL  [initandlisten] options: { config: "/etc/mongod.conf", net: { port: 27017 }, processManagement: { fork: true }, storage: { dbPath: "/mnt/cdr1/mongodb", journal: { enabled: true } }, systemLog: { destination: "file", logAppend: true, path: "/var/log/mongodb/mongod.log" } }
2017-09-26T18:03:14.099+0530 I -        [initandlisten] Detected data files in /mnt/cdr1/mongodb created by the 'wiredTiger' storage engine, so setting the active storage engine to 'wiredTiger'.
2017-09-26T18:03:14.099+0530 W -        [initandlisten] Detected unclean shutdown - /mnt/cdr1/mongodb/mongod.lock is not empty.
2017-09-26T18:03:14.099+0530 W STORAGE  [initandlisten] Recovering data from the last clean checkpoint.
2017-09-26T18:03:14.099+0530 I STORAGE  [initandlisten] wiredtiger_open config: create,cache_size=18G,session_max=20000,eviction=(threads_max=4),config_base=false,statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000),checkpoint=(wait=60,log_size=2GB),statistics_log=(wait=0),
2017-09-26T18:04:22.889+0530 E STORAGE  [initandlisten] WiredTiger (0) [1506429262:889453][4159:0x7fa6a540dd00], file:index-12-4515115940070283174.wt, WT_CURSOR.insert: read checksum error for 8192B block at offset 1541156864: block header checksum of 3323072155 doesn't match expected checksum of 4255826968
2017-09-26T18:04:22.889+0530 E STORAGE  [initandlisten] WiredTiger (0) [1506429262:889662][4159:0x7fa6a540dd00], file:index-12-4515115940070283174.wt, WT_CURSOR.insert: index-12-4515115940070283174.wt: encountered an illegal file format or internal value
2017-09-26T18:04:22.889+0530 E STORAGE  [initandlisten] WiredTiger (-31804) [1506429262:889857][4159:0x7fa6a540dd00], file:index-12-4515115940070283174.wt, WT_CURSOR.insert: the process must exit and restart: WT_PANIC: WiredTiger library panic
2017-09-26T18:04:22.889+0530 I -        [initandlisten] Fatal Assertion 28558
2017-09-26T18:04:22.890+0530 I -        [initandlisten] 
 
***aborting after fassert() failure
 
 
2017-09-26T18:04:22.910+0530 F -        [initandlisten] Got signal: 6 (Aborted).
 
 0x12ea922 0x12e9a89 0x12ea292 0x7fa6a3d89340 0x7fa6a39eacc9 0x7fa6a39ee0d8 0x1275382 0x1074313 0x1a298cc 0x1a29a6d 0x1a29e54 0x19690c6 0x19855ba 0x198ab22 0x19a7835 0x19792a7 0x19c28ce 0x1a37dd5 0x19e00c0 0x1a38559 0x19b8ee7 0x19b2239 0x105a917 0x1057b60 0xf84ee8 0x9913c1 0x94cc49 0x7fa6a39d5ec5 0x98ef5c
----- BEGIN BACKTRACE -----
{"backtrace":[{"b":"400000","o":"EEA922"},{"b":"400000","o":"EE9A89"},{"b":"400000","o":"EEA292"},{"b":"7FA6A3D79000","o":"10340"},{"b":"7FA6A39B4000","o":"36CC9"},{"b":"7FA6A39B4000","o":"3A0D8"},{"b":"400000","o":"E75382"},{"b":"400000","o":"C74313"},{"b":"400000","o":"16298CC"},{"b":"400000","o":"1629A6D"},{"b":"400000","o":"1629E54"},{"b":"400000","o":"15690C6"},{"b":"400000","o":"15855BA"},{"b":"400000","o":"158AB22"},{"b":"400000","o":"15A7835"},{"b":"400000","o":"15792A7"},{"b":"400000","o":"15C28CE"},{"b":"400000","o":"1637DD5"},{"b":"400000","o":"15E00C0"},{"b":"400000","o":"1638559"},{"b":"400000","o":"15B8EE7"},{"b":"400000","o":"15B2239"},{"b":"400000","o":"C5A917"},{"b":"400000","o":"C57B60"},{"b":"400000","o":"B84EE8"},{"b":"400000","o":"5913C1"},{"b":"400000","o":"54CC49"},{"b":"7FA6A39B4000","o":"21EC5"},{"b":"400000","o":"58EF5C"}],"processInfo":{ "mongodbVersion" : "3.2.1", "gitVersion" : "a14d55980c2cdc565d4704a7e3ad37e4e535c1b2", "compiledModules" : [], "uname" : { "sysname" : "Linux", "release" : "3.13.0-35-generic", "version" : "#62-Ubuntu SMP Fri Aug 15 01:58:42 UTC 2014", "machine" : "x86_64" }, "somap" : [ { "elfType" : 2, "b" : "400000", "buildId" : "D76764D44DE9B088362776AF243199FDCF5756E8" }, { "b" : "7FFFB0ABF000", "elfType" : 3, "buildId" : "B25EDEA74063E2308FE1BF3608006A9E3D860BA9" }, { "b" : "7FA6A4F9D000", "path" : "/lib/x86_64-linux-gnu/libssl.so.1.0.0", "elfType" : 3, "buildId" : "CA0C4DCB4A63C439D8467DCBDBBDDF66004DEC9C" }, { "b" : "7FA6A4BC3000", "path" : "/lib/x86_64-linux-gnu/libcrypto.so.1.0.0", "elfType" : 3, "buildId" : "230EBE6145B6681D0CB7E4C9021F0D899C02E0C4" }, { "b" : "7FA6A49BB000", "path" : "/lib/x86_64-linux-gnu/librt.so.1", "elfType" : 3, "buildId" : "92FCF41EFE012D6186E31A59AD05BDBB487769AB" }, { "b" : "7FA6A47B7000", "path" : "/lib/x86_64-linux-gnu/libdl.so.2", "elfType" : 3, "buildId" : "C1AE4CB7195D337A77A3C689051DABAA3980CA0C" }, { "b" : "7FA6A44B3000", "path" : "/usr/lib/x86_64-linux-gnu/libstdc++.so.6", 
"elfType" : 3, "buildId" : "4BF6F7ADD8244AD86008E6BF40D90F8873892197" }, { "b" : "7FA6A41AD000", "path" : "/lib/x86_64-linux-gnu/libm.so.6", "elfType" : 3, "buildId" : "1D76B71E905CB867B27CEF230FCB20F01A3178F5" }, { "b" : "7FA6A3F97000", "path" : "/lib/x86_64-linux-gnu/libgcc_s.so.1", "elfType" : 3, "buildId" : "8D0AA71411580EE6C08809695C3984769F25725B" }, { "b" : "7FA6A3D79000", "path" : "/lib/x86_64-linux-gnu/libpthread.so.0", "elfType" : 3, "buildId" : "9318E8AF0BFBE444731BB0461202EF57F7C39542" }, { "b" : "7FA6A39B4000", "path" : "/lib/x86_64-linux-gnu/libc.so.6", "elfType" : 3, "buildId" : "30C94DC66A1FE95180C3D68D2B89E576D5AE213C" }, { "b" : "7FA6A51FB000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, "buildId" : "9F00581AB3C73E3AEA35995A0C50D24D59A01D47" } ] }}
 mongod(_ZN5mongo15printStackTraceERSo+0x32) [0x12ea922]
 mongod(+0xEE9A89) [0x12e9a89]
 mongod(+0xEEA292) [0x12ea292]
 libpthread.so.0(+0x10340) [0x7fa6a3d89340]
 libc.so.6(gsignal+0x39) [0x7fa6a39eacc9]
 libc.so.6(abort+0x148) [0x7fa6a39ee0d8]
 mongod(_ZN5mongo13fassertFailedEi+0x82) [0x1275382]
 mongod(+0xC74313) [0x1074313]
 mongod(__wt_eventv+0x40C) [0x1a298cc]
 mongod(__wt_err+0x8D) [0x1a29a6d]
 mongod(__wt_panic+0x24) [0x1a29e54]
 mongod(__wt_bm_read+0x76) [0x19690c6]
 mongod(__wt_bt_read+0x1EA) [0x19855ba]
 mongod(__wt_page_in_func+0x192) [0x198ab22]
 mongod(__wt_row_search+0x8F5) [0x19a7835]
 mongod(__wt_btcur_insert+0x467) [0x19792a7]
 mongod(+0x15C28CE) [0x19c28ce]
 mongod(+0x1637DD5) [0x1a37dd5]
 mongod(__wt_log_scan+0xA10) [0x19e00c0]
 mongod(__wt_txn_recover+0x459) [0x1a38559]
 mongod(__wt_connection_workers+0x37) [0x19b8ee7]
 mongod(wiredtiger_open+0x15F9) [0x19b2239]
 mongod(_ZN5mongo18WiredTigerKVEngineC2ERKSsS2_S2_mbbb+0x557) [0x105a917]
 mongod(+0xC57B60) [0x1057b60]
 mongod(_ZN5mongo20ServiceContextMongoD29initializeGlobalStorageEngineEv+0x588) [0xf84ee8]
 mongod(_ZN5mongo13initAndListenEi+0x321) [0x9913c1]
 mongod(main+0x149) [0x94cc49]
 libc.so.6(__libc_start_main+0xF5) [0x7fa6a39d5ec5]
 mongod(+0x58EF5C) [0x98ef5c]
-----  END BACKTRACE  -----
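
For triage, the key fields of the WiredTiger checksum error above can be extracted from the log mechanically. The following Python is a minimal sketch and was not part of the original report: the regular expression and the function name `parse_checksum_error` are assumptions, written against the single error line shown in this log, and may need adjusting for other server versions.

```python
import re
from typing import Optional

# Pattern written against the one checksum error line in this ticket's log.
# It is an assumption that other occurrences follow the same wording.
CHECKSUM_ERROR = re.compile(
    r"file:(?P<file>\S+?\.wt), .*?read checksum error for (?P<size>\d+)B block "
    r"at offset (?P<offset>\d+): block header checksum of (?P<actual>\d+) "
    r"doesn't match expected checksum of (?P<expected>\d+)"
)

# The error line from the log above, with the timestamp prefix shortened.
SAMPLE_LINE = (
    "E STORAGE  [initandlisten] WiredTiger (0) "
    "[1506429262:889453][4159:0x7fa6a540dd00], "
    "file:index-12-4515115940070283174.wt, WT_CURSOR.insert: "
    "read checksum error for 8192B block at offset 1541156864: "
    "block header checksum of 3323072155 doesn't match expected "
    "checksum of 4255826968"
)


def parse_checksum_error(line: str) -> Optional[dict]:
    """Return the file, block size, offset and checksums, or None if no match."""
    match = CHECKSUM_ERROR.search(line)
    if match is None:
        return None
    return {
        "file": match.group("file"),
        "block_size": int(match.group("size")),
        "offset": int(match.group("offset")),
        "header_checksum": int(match.group("actual")),
        "expected_checksum": int(match.group("expected")),
    }
```

Calling `parse_checksum_error(SAMPLE_LINE)` yields the file name `index-12-4515115940070283174.wt`. The `index-` prefix suggests the damaged block belongs to an index file rather than a collection file, which is worth knowing before choosing a recovery path.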



 Comments   
Comment by Harish [ 26/Sep/17 ]

Dear Kelsey,
1. The node is not part of a replica set.
2. There are no backups.
3. I tried repair, but it did not work.

Please help me at least bring the node up: mongos cannot connect to the other shards either while this node is down. Even if the corrupt data is lost, I would like to bring the node up as soon as possible so that the rest of the data can be recovered. Please help.
Thanks,
Harish

Comment by Kelsey Schubert [ 26/Sep/17 ]

Hi toharishs,

This error indicates that some data files have become corrupt. Unfortunately, in cases like this it is very difficult to determine the root cause of the corruption without a reproduction. This type of issue is often the result of faulty behavior in a layer below mongod (e.g. bad disks or memory). I recommend the following steps to resolve the issue:

  1. If this node is part of a replica set, perform a clean resync.
  2. If there are recent backups, restore from a backup.
  3. Start mongod with --repair.
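
As an illustration of step 3 only, the following Python is a minimal sketch, not an official procedure. It assumes mongod is stopped, that `mongod` is on the PATH, and that the script runs as the user that owns the data files. The function names `build_repair_command` and `backup_then_repair` and the backup directory are hypothetical. Copying the data directory first is an added precaution, since repair modifies the data files, and repair is not guaranteed to succeed on a corrupt file.

```python
import shutil
import subprocess
from typing import List


def build_repair_command(dbpath: str, mongod: str = "mongod") -> List[str]:
    """Build the argument list for running mongod in repair mode on dbpath."""
    return [mongod, "--repair", "--dbpath", dbpath]


def backup_then_repair(dbpath: str, backup_dir: str) -> int:
    """Copy the data directory aside, then run the repair.

    backup_dir is a hypothetical location and must not already exist,
    otherwise shutil.copytree raises FileExistsError.
    Returns the exit code of the mongod process.
    """
    shutil.copytree(dbpath, backup_dir)
    completed = subprocess.run(build_repair_command(dbpath))
    return completed.returncode
```

With the dbpath from the log in this ticket, `build_repair_command("/mnt/cdr1/mongodb")` produces the arguments for `mongod --repair --dbpath /mnt/cdr1/mongodb`. A non-zero return code from `backup_then_repair` means the repair did not complete, in which case the untouched copy in the backup directory is still available.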

Kind regards,
Kelsey

Generated at Thu Feb 08 04:26:28 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.