[SERVER-20160] After a crash, mongod can't start up Created: 27/Aug/15  Updated: 26/Sep/15  Resolved: 26/Sep/15

Status: Closed
Project: Core Server
Component/s: Admin, WiredTiger
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Critical - P2
Reporter: shixiong Assignee: Ramon Fernandez Marina
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: WiredTiger, WiredTiger.basecfg, WiredTiger.turtle, WiredTiger.wt, _mdb_catalog.wt, new_files.tgz, sizeStorer.wt, storage.bson
Operating System: ALL
Participants:

 Description   

Help! I tried to fix this issue with the approach from SERVER-18448, but it did not work.
MongoDB version: 3.0.4
Command: mongod -shardsvr -port 27017 -dbpath=/shard/shard160/ --storageEngine wiredTiger --repair

2015-08-27T20:18:03.686+0800 I STORAGE  [initandlisten] wiredtiger_open config: create,cache_size=3G,session_max=20000,eviction=(threads_max=4),statistics=(fast),file_manager=(close_idle_time=100000),checkpoint=(wait=60,log_size=2GB),statistics_log=(wait=0),
2015-08-27T20:18:03.859+0800 E STORAGE  [initandlisten] WiredTiger (2) [1440677883:858985][2229:0x7f17b0fc2b60]: /shard/shard160//journal/WiredTigerLog.0000000263: No such file or directory
2015-08-27T20:18:03.859+0800 E STORAGE  [initandlisten] WiredTiger (0) [1440677883:859579][2229:0x7f17b0fc2b60], file:WiredTiger.wt, cursor.next: read checksum error [24576B @ 77824, 146658712 != 325107312]
2015-08-27T20:18:03.859+0800 E STORAGE  [initandlisten] WiredTiger (0) [1440677883:859605][2229:0x7f17b0fc2b60], file:WiredTiger.wt, cursor.next: WiredTiger.wt: encountered an illegal file format or internal value
2015-08-27T20:18:03.859+0800 E STORAGE  [initandlisten] WiredTiger (-31804) [1440677883:859618][2229:0x7f17b0fc2b60], file:WiredTiger.wt, cursor.next: the process must exit and restart: WT_PANIC: WiredTiger library panic
2015-08-27T20:18:03.859+0800 I -        [initandlisten] Fatal Assertion 28558
2015-08-27T20:18:03.871+0800 I CONTROL  [initandlisten] 
 0xf605f9 0xf09361 0xeece61 0xd893ea 0x1395d89 0x1395f45 0x13963e4 0x12e9df2 0x13034fc 0x1307df5 0x13050db 0x13190d8 0x12edfc6 0x1337609 0x13a1ead 0x13a2348 0x132fc21 0x1329773 0xd73050 0xd70e28 0xa6f30d 0x7e3bf2 0x7e8b19 0x7f17afc71d5d 0x7e19b9
----- BEGIN BACKTRACE -----
{"backtrace":[{"b":"400000","o":"B605F9"},{"b":"400000","o":"B09361"},{"b":"400000","o":"AECE61"},{"b":"400000","o":"9893EA"},{"b":"400000","o":"F95D89"},{"b":"400000","o":"F95F45"},{"b":"400000","o":"F963E4"},{"b":"400000","o":"EE9DF2"},{"b":"400000","o":"F034FC"},{"b":"400000","o":"F07DF5"},{"b":"400000","o":"F050DB"},{"b":"400000","o":"F190D8"},{"b":"400000","o":"EEDFC6"},{"b":"400000","o":"F37609"},{"b":"400000","o":"FA1EAD"},{"b":"400000","o":"FA2348"},{"b":"400000","o":"F2FC21"},{"b":"400000","o":"F29773"},{"b":"400000","o":"973050"},{"b":"400000","o":"970E28"},{"b":"400000","o":"66F30D"},{"b":"400000","o":"3E3BF2"},{"b":"400000","o":"3E8B19"},{"b":"7F17AFC53000","o":"1ED5D"},{"b":"400000","o":"3E19B9"}],"processInfo":{ "mongodbVersion" : "3.0.4", "gitVersion" : "0481c958daeb2969800511e7475dc66986fa9ed5", "uname" : { "sysname" : "Linux", "release" : "2.6.32-358.el6.x86_64", "version" : "#1 SMP Fri Feb 22 00:31:26 UTC 2013", "machine" : "x86_64" }, "somap" : [ { "elfType" : 2, "b" : "400000" }, { "b" : "7FFFC54F9000", "elfType" : 3 }, { "b" : "7F17B0B93000", "path" : "/lib64/libpthread.so.0", "elfType" : 3 }, { "b" : "7F17B098B000", "path" : "/lib64/librt.so.1", "elfType" : 3 }, { "b" : "7F17B0787000", "path" : "/lib64/libdl.so.2", "elfType" : 3 }, { "b" : "7F17B0481000", "path" : "/usr/lib64/libstdc++.so.6", "elfType" : 3 }, { "b" : "7F17B01FD000", "path" : "/lib64/libm.so.6", "elfType" : 3 }, { "b" : "7F17AFFE7000", "path" : "/lib64/libgcc_s.so.1", "elfType" : 3 }, { "b" : "7F17AFC53000", "path" : "/lib64/libc.so.6", "elfType" : 3 }, { "b" : "7F17B0DB0000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3 } ] }}
 mongod(_ZN5mongo15printStackTraceERSo+0x29) [0xf605f9]
 mongod(_ZN5mongo10logContextEPKc+0xE1) [0xf09361]
 mongod(_ZN5mongo13fassertFailedEi+0x61) [0xeece61]
 mongod(+0x9893EA) [0xd893ea]
 mongod(__wt_eventv+0x489) [0x1395d89]
 mongod(__wt_err+0x95) [0x1395f45]
 mongod(__wt_panic+0x24) [0x13963e4]
 mongod(__wt_bm_read+0x72) [0x12e9df2]
 mongod(__wt_bt_read+0x1AC) [0x13034fc]
 mongod(__wt_cache_read+0x1C5) [0x1307df5]
 mongod(__wt_page_in_func+0x40B) [0x13050db]
 mongod(__wt_tree_walk+0x2D8) [0x13190d8]
 mongod(__wt_btcur_next+0x316) [0x12edfc6]
 mongod(+0xF37609) [0x1337609]
 mongod(+0xFA1EAD) [0x13a1ead]
 mongod(__wt_txn_recover+0x3E8) [0x13a2348]
 mongod(__wt_connection_workers+0x61) [0x132fc21]
 mongod(wiredtiger_open+0x11B3) [0x1329773]
 mongod(_ZN5mongo18WiredTigerKVEngineC1ERKSsS2_bb+0x300) [0xd73050]
 mongod(+0x970E28) [0xd70e28]
 mongod(_ZN5mongo23GlobalEnvironmentMongoD22setGlobalStorageEngineERKSs+0x30D) [0xa6f30d]
 mongod(_ZN5mongo13initAndListenEi+0x422) [0x7e3bf2]
 mongod(main+0x139) [0x7e8b19]
 libc.so.6(__libc_start_main+0xFD) [0x7f17afc71d5d]
 mongod(+0x3E19B9) [0x7e19b9]
-----  END BACKTRACE  -----
2015-08-27T20:18:03.871+0800 I -        [initandlisten] 
 
***aborting after fassert() failure
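
When a log like the one above is long, it can help to pull out just the WiredTiger checksum failures to see which files are affected. The following is a minimal Python sketch, not a MongoDB or WiredTiger tool. The regular expression is inferred only from the log lines in this ticket, and the names `ChecksumError` and `parse_checksum_error` are hypothetical. The bracketed numbers are presumably the block size and file offset; the sketch does not assume which of the two checksums is the stored one and which is the computed one.

```python
import re
from typing import NamedTuple, Optional

# Pattern inferred from the log lines in this ticket; it is an assumption,
# not an official description of the WiredTiger message format.
CHECKSUM_RE = re.compile(
    r"file:(?P<file>[^,]+), (?P<op>[\w.]+): read checksum error "
    r"\[(?P<size>\d+)B @ (?P<offset>\d+), (?P<left>\d+) != (?P<right>\d+)\]"
)


class ChecksumError(NamedTuple):
    file: str            # file named in the message, e.g. WiredTiger.wt
    operation: str       # operation that hit the error, e.g. cursor.next
    block_size: int      # number before "B" (presumably bytes read)
    offset: int          # number after "@" (presumably the file offset)
    left_checksum: int   # value left of "!="
    right_checksum: int  # value right of "!="


def parse_checksum_error(line: str) -> Optional[ChecksumError]:
    """Return the parsed checksum error, or None if the line is not one."""
    m = CHECKSUM_RE.search(line)
    if m is None:
        return None
    return ChecksumError(
        m["file"],
        m["op"],
        int(m["size"]),
        int(m["offset"]),
        int(m["left"]),
        int(m["right"]),
    )
```

Running `parse_checksum_error` over every line of the mongod log and keeping the non-`None` results gives one record per failed read, which makes it easy to list the distinct files involved.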



 Comments   
Comment by Ramon Fernandez Marina [ 26/Sep/15 ]

shixiong, we haven't heard back from you for a while, so I assume you were able to bring this node back into service and I'm going to close this ticket.

Regards,
Ramón.

Comment by Ramon Fernandez Marina [ 28/Aug/15 ]

shixiong, if you remove the collection from the shell, that should remove all the files related to this collection, so there is no need for step #2 in your comment.

If the data in this collection is not important then the simplest way to make progress is indeed to remove it.
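
For readers who would script the removal rather than type it in the shell, here is a rough Python sketch of the same step. It is an illustration only: the helper name `drop_unneeded_collection` is hypothetical, and it assumes the mongod can be started normally so that a client can connect. `client` is expected to behave like a PyMongo `MongoClient` (only `client[db_name]`, `list_collection_names()` and `drop_collection()` are used). In a sharded cluster, collections are normally dropped through mongos; whether a direct connection to the shard is appropriate here is not stated in this ticket.

```python
def drop_unneeded_collection(client, db_name, coll_name):
    """Drop one collection; return True if it existed before the drop.

    `client` is assumed to be a PyMongo MongoClient or an object with the
    same interface. Nothing is touched on disk directly: the server is left
    to remove the files that belong to the collection.
    """
    db = client[db_name]
    existed = coll_name in db.list_collection_names()
    db.drop_collection(coll_name)
    return existed
```

With PyMongo installed, a call would look roughly like `drop_unneeded_collection(pymongo.MongoClient("localhost", 27017), "qiaodazhao", "resume_meta_data")`, where the host and port are placeholders for the actual deployment.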

Comment by shixiong [ 28/Aug/15 ]

Hi Ramon. I guess you need the file collection-2-6133292058608880639.wt to fix this, but it is over 15G in size. The collection qiaodazhao.resume_meta_data is not important, so can I remove this collection to skip the problem? If so, how?

1. In the mongo shell, remove this collection.
2. Log in to this server and remove the file collection-2-6133292058608880639.wt.
Would that work?

Comment by shixiong [ 27/Aug/15 ]

It works, but now another error occurs:

2015-08-27T21:12:32.738+0800 I STORAGE  [initandlisten] WiredTiger progress session.verify 510
2015-08-27T21:12:32.739+0800 I STORAGE  [initandlisten] WiredTiger progress session.verify 510
2015-08-27T21:12:32.739+0800 I STORAGE  [initandlisten] Verify succeeded on uri table:collection-8-6133292058608880639. Not salvaging.
2015-08-27T21:12:32.855+0800 I INDEX    [initandlisten] build index on: qdzresumekey.resumekey properties: { v: 1, key: { _id: 1 }, name: "_id_", ns: "qdzresumekey.resumekey" }
2015-08-27T21:12:32.855+0800 I INDEX    [initandlisten] 	 building index using bulk method
2015-08-27T21:12:33.733+0800 I INDEX    [initandlisten] build index on: qdzresumekey.resumekey properties: { v: 1, key: { _id: "hashed" }, name: "_id_hashed", ns: "qdzresumekey.resumekey" }
2015-08-27T21:12:33.733+0800 I INDEX    [initandlisten] 	 building index using bulk method
2015-08-27T21:12:49.243+0800 I STORAGE  [initandlisten] repairDatabase qiaodazhao
2015-08-27T21:12:49.243+0800 I STORAGE  [initandlisten] Repairing collection qiaodazhao.resume_meta_data
2015-08-27T21:12:50.317+0800 E STORAGE  [initandlisten] WiredTiger (0) [1440681170:317009][2378:0x7f7604adfb60], file:collection-2-6133292058608880639.wt, session.verify: read checksum error [339968B @ 15603752960, 430024179 != 3229926353]
2015-08-27T21:12:50.317+0800 E STORAGE  [initandlisten] WiredTiger (0) [1440681170:317083][2378:0x7f7604adfb60], file:collection-2-6133292058608880639.wt, session.verify: collection-2-6133292058608880639.wt: encountered an illegal file format or internal value
2015-08-27T21:12:50.317+0800 E STORAGE  [initandlisten] WiredTiger (-31804) [1440681170:317101][2378:0x7f7604adfb60], file:collection-2-6133292058608880639.wt, session.verify: the process must exit and restart: WT_PANIC: WiredTiger library panic
2015-08-27T21:12:50.317+0800 I -        [initandlisten] Fatal Assertion 28558
2015-08-27T21:12:50.787+0800 I CONTROL  [initandlisten] 
 0xf605f9 0xf09361 0xeece61 0xd893ea 0x1395d89 0x1395f45 0x13963e4 0x12e795e 0x12e7df8 0x12eab9f 0x12ead18 0x1315cfa 0x139106e 0x13912a8 0x1391736 0xd717b5 0xd71f51 0xcf5250 0xbf1268 0x7e4214 0x7e8b19 0x7f760378ed5d 0x7e19b9
----- BEGIN BACKTRACE -----
{"backtrace":[{"b":"400000","o":"B605F9"},{"b":"400000","o":"B09361"},{"b":"400000","o":"AECE61"},{"b":"400000","o":"9893EA"},{"b":"400000","o":"F95D89"},{"b":"400000","o":"F95F45"},{"b":"400000","o":"F963E4"},{"b":"400000","o":"EE795E"},{"b":"400000","o":"EE7DF8"},{"b":"400000","o":"EEAB9F"},{"b":"400000","o":"EEAD18"},{"b":"400000","o":"F15CFA"},{"b":"400000","o":"F9106E"},{"b":"400000","o":"F912A8"},{"b":"400000","o":"F91736"},{"b":"400000","o":"9717B5"},{"b":"400000","o":"971F51"},{"b":"400000","o":"8F5250"},{"b":"400000","o":"7F1268"},{"b":"400000","o":"3E4214"},{"b":"400000","o":"3E8B19"},{"b":"7F7603770000","o":"1ED5D"},{"b":"400000","o":"3E19B9"}],"processInfo":{ "mongodbVersion" : "3.0.4", "gitVersion" : "0481c958daeb2969800511e7475dc66986fa9ed5", "uname" : { "sysname" : "Linux", "release" : "2.6.32-358.el6.x86_64", "version" : "#1 SMP Fri Feb 22 00:31:26 UTC 2013", "machine" : "x86_64" }, "somap" : [ { "elfType" : 2, "b" : "400000" }, { "b" : "7FFFA4EFF000", "elfType" : 3 }, { "b" : "7F76046B0000", "path" : "/lib64/libpthread.so.0", "elfType" : 3 }, { "b" : "7F76044A8000", "path" : "/lib64/librt.so.1", "elfType" : 3 }, { "b" : "7F76042A4000", "path" : "/lib64/libdl.so.2", "elfType" : 3 }, { "b" : "7F7603F9E000", "path" : "/usr/lib64/libstdc++.so.6", "elfType" : 3 }, { "b" : "7F7603D1A000", "path" : "/lib64/libm.so.6", "elfType" : 3 }, { "b" : "7F7603B04000", "path" : "/lib64/libgcc_s.so.1", "elfType" : 3 }, { "b" : "7F7603770000", "path" : "/lib64/libc.so.6", "elfType" : 3 }, { "b" : "7F76048CD000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3 } ] }}
 mongod(_ZN5mongo15printStackTraceERSo+0x29) [0xf605f9]
 mongod(_ZN5mongo10logContextEPKc+0xE1) [0xf09361]
 mongod(_ZN5mongo13fassertFailedEi+0x61) [0xeece61]
 mongod(+0x9893EA) [0xd893ea]
 mongod(__wt_eventv+0x489) [0x1395d89]
 mongod(__wt_err+0x95) [0x1395f45]
 mongod(__wt_panic+0x24) [0x13963e4]
 mongod(__wt_block_extlist_read+0x6E) [0x12e795e]
 mongod(__wt_block_extlist_read_avail+0x28) [0x12e7df8]
 mongod(+0xEEAB9F) [0x12eab9f]
 mongod(__wt_block_verify_start+0x108) [0x12ead18]
 mongod(__wt_verify+0x4AA) [0x1315cfa]
 mongod(__wt_schema_worker+0x35E) [0x139106e]
 mongod(__wt_schema_worker+0x598) [0x13912a8]
 mongod(+0xF91736) [0x1391736]
 mongod(_ZN5mongo18WiredTigerKVEngine16_salvageIfNeededEPKc+0x45) [0xd717b5]
 mongod(_ZN5mongo18WiredTigerKVEngine11repairIdentEPNS_16OperationContextERKNS_10StringDataE+0x51) [0xd71f51]
 mongod(_ZN5mongo15KVStorageEngine17repairRecordStoreEPNS_16OperationContextERKSs+0xA0) [0xcf5250]
 mongod(_ZN5mongo14repairDatabaseEPNS_16OperationContextEPNS_13StorageEngineERKSsbb+0x2A8) [0xbf1268]
 mongod(_ZN5mongo13initAndListenEi+0xA44) [0x7e4214]
 mongod(main+0x139) [0x7e8b19]
 libc.so.6(__libc_start_main+0xFD) [0x7f760378ed5d]
 mongod(+0x3E19B9) [0x7e19b9]
-----  END BACKTRACE  -----
2015-08-27T21:12:50.787+0800 I -        [initandlisten] 
 
***aborting after fassert() failure
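
The WiredTiger messages in this log carry their own bracketed header, for example `[1440681170:317009][2378:0x7f7604adfb60]`. Assuming the first pair is epoch seconds and microseconds and the second pair is the process id and thread id (an interpretation that matches the mongod timestamps in this ticket, but is not confirmed here), the header can be decoded to line the messages up with other logs. This is a small Python sketch; `decode_wt_header` is a hypothetical name.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumed layout: [epoch_seconds:microseconds][pid:thread_id]
WT_HEADER_RE = re.compile(r"\[(\d+):(\d+)\]\[(\d+):(0x[0-9a-fA-F]+)\]")


def decode_wt_header(line):
    """Return (UTC datetime, pid, thread id string), or None if no header."""
    m = WT_HEADER_RE.search(line)
    if m is None:
        return None
    seconds, micros, pid, thread = m.groups()
    when = datetime.fromtimestamp(int(seconds), tz=timezone.utc)
    when += timedelta(microseconds=int(micros))
    return when, int(pid), thread
```

For the header quoted above, the decoded time is 13:12:50 UTC on 27 August 2015, which agrees with the `21:12:50.317+0800` prefix that mongod printed on the same line.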
 

Comment by Ramon Fernandez Marina [ 27/Aug/15 ]

Hi shixiong, sorry you've run into this issue. I've tried to repair your files and the result is in the new_files.tgz archive. Can you please extract its contents into your dbpath and try again?

Thanks,
Ramón.
