[SERVER-56285] MongoDB does not start after sudden shutdown, WiredTiger.wt file corrupted Created: 22/Apr/21  Updated: 28/Apr/21  Resolved: 26/Apr/21

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 3.6.23
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Ehsan Nazarian Assignee: Edwin Zhou
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File WiredTiger.turtle     File WiredTiger.wt     File WiredTigerLAS.wt     File _mdb_catalog.wt     File sizeStorer.wt    
Operating System: ALL
Participants:

 Description   

After a sudden power failure and shutdown, our MongoDB v3.6.1 does not start and it seems WiredTiger.wt file is corrupted. 

here is the log:

Error formatting macro: code: java.lang.StackOverflowError

2021-02-23T08:46:36.573+0330 I CONTROL [main] ***** SERVER RESTARTED ***** 2021-02-23T08:46:36.633+0330 I CONTROL [initandlisten] MongoDB starting : pid=16712 port=27017 dbpath=/opt/mongoData/data 64-bit host=db-server 2021-02-23T08:46:36.633+0330 I CONTROL [initandlisten] db version v3.6.1 2021-02-23T08:46:36.633+0330 I CONTROL [initandlisten] git version: 025d4f4fe61efd1fb6f0005be20cb45a004093d1 2021-02-23T08:46:36.633+0330 I CONTROL [initandlisten] OpenSSL version: OpenSSL 1.0.2g 1 Mar 2016 2021-02-23T08:46:36.633+0330 I CONTROL [initandlisten] allocator: tcmalloc 2021-02-23T08:46:36.633+0330 I CONTROL [initandlisten] modules: none 2021-02-23T08:46:36.633+0330 I CONTROL [initandlisten] build environment: 2021-02-23T08:46:36.633+0330 I CONTROL [initandlisten] distmod: ubuntu1604 2021-02-23T08:46:36.633+0330 I CONTROL [initandlisten] distarch: x86_64 2021-02-23T08:46:36.633+0330 I CONTROL [initandlisten] target_arch: x86_64 2021-02-23T08:46:36.633+0330 I CONTROL [initandlisten] options: { config: "/opt/mongoData/mongod.conf", net:

{ bindIp: "0.0.0.0", maxIncomingConnections: 5000, port: 27017 }

, processManagement:

{ fork: true }

, storage: { dbPath: "/opt/mongoData/data", journal:

{ enabled: true }

}, systemLog:

{ destination: "file", logAppend: true, path: "/opt/mongoData/log/mongod.log" }

} 2021-02-23T08:46:36.633+0330 W - [initandlisten] Detected unclean shutdown - /opt/mongoData/data/mongod.lock is not empty. 2021-02-23T08:46:36.639+0330 I - [initandlisten] Detected data files in /opt/mongoData/data created by the 'wiredTiger' storage engine, so setting the active storage engine to 'wiredTiger'. 2021-02-23T08:46:36.639+0330 W STORAGE [initandlisten] Recovering data from the last clean checkpoint. 2021-02-23T08:46:36.639+0330 I STORAGE [initandlisten] 2021-02-23T08:46:36.639+0330 I STORAGE [initandlisten] ** WARNING: Using the XFS filesystem is strongly recommended with the WiredTiger storage engine 2021-02-23T08:46:36.639+0330 I STORAGE [initandlisten] ** See http://dochub.mongodb.org/core/prodnotes-filesystem 2021-02-23T08:46:36.640+0330 I STORAGE [initandlisten] wiredtiger_open config: create,cache_size=15566M,session_max=20000,eviction=(threads_min=4,threads_max=4),config_base=false,statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000),statistics_log=(wait=0),verbose=(recovery_progress), 2021-02-23T08:46:36.784+0330 E STORAGE [initandlisten] WiredTiger error (0) [1614057396:784810][16712:0x7f76ec6c09c0], file:WiredTiger.wt, WT_CURSOR.insert: read checksum error for 28672B block at offset 135168: block header checksum of 342409453 doesn't match expected checksum of 3396487557 2021-02-23T08:46:36.784+0330 E STORAGE [initandlisten] WiredTiger error (0) [1614057396:784939][16712:0x7f76ec6c09c0], file:WiredTiger.wt, WT_CURSOR.insert: WiredTiger.wt: encountered an illegal file format or internal value: (__wt_block_read_off, 227) 2021-02-23T08:46:36.784+0330 E STORAGE [initandlisten] WiredTiger error (-31804) [1614057396:784958][16712:0x7f76ec6c09c0], file:WiredTiger.wt, WT_CURSOR.insert: the process must exit and restart: WT_PANIC: WiredTiger library panic 2021-02-23T08:46:36.784+0330 F - [initandlisten] Fatal Assertion 28558 at src/mongo/db/storage/wiredtiger/wiredtiger_util.cpp 361 2021-02-23T08:46:36.785+0330 F - [initandlisten]***aborting after fassert() failure 2021-02-23T08:46:36.852+0330 F - [initandlisten] Got signal: 6 (Aborted). 0x5579c45b61f1 0x5579c45b5409 0x5579c45b58ed 0x7f76eb0b1390 0x7f76ead0b428 0x7f76ead0d02a 0x5579c2d4eeff 0x5579c2e1b43e 0x5579c2cec277 0x5579c2cec49c 0x5579c2cec7c5 0x5579c2f174b5 0x5579c2e9b93b 0x5579c2ea2ee2 0x5579c2ecc5a4 0x5579c2f27481 0x5579c2ee036d 0x5579c2e8d6fa 0x5579c2efa8c0 0x5579c2e8e5d4 0x5579c2e2ced7 0x5579c2e29dec 0x5579c2dff7f5 0x5579c2de3f28 0x5579c2fd1170 0x5579c2ce8080 0x5579c2dc2fdb 0x5579c2d50bc9 0x7f76eacf6830 0x5579c2db2a19 ----- BEGIN BACKTRACE ----- {"backtrace":[

{"b":"5579C23C4000","o":"21F21F1","s":"_ZN5mongo15printStackTraceERSo"}

,

{"b":"5579C23C4000","o":"21F1409"}

,

{"b":"5579C23C4000","o":"21F18ED"}

,

{"b":"7F76EB0A0000","o":"11390"}

,

{"b":"7F76EACD6000","o":"35428","s":"gsignal"}

,

{"b":"7F76EACD6000","o":"3702A","s":"abort"}

,

{"b":"5579C23C4000","o":"98AEFF","s":"_ZN5mongo32fassertFailedNoTraceWithLocationEiPKcj"}

,

{"b":"5579C23C4000","o":"A5743E"}

,

{"b":"5579C23C4000","o":"928277","s":"__wt_eventv"}

,

{"b":"5579C23C4000","o":"92849C","s":"__wt_err"}

,

{"b":"5579C23C4000","o":"9287C5","s":"__wt_panic"}

,

{"b":"5579C23C4000","o":"B534B5","s":"__wt_bm_read"}

,

{"b":"5579C23C4000","o":"AD793B","s":"__wt_bt_read"}

,

{"b":"5579C23C4000","o":"ADEEE2","s":"__wt_page_in_func"}

,

{"b":"5579C23C4000","o":"B085A4","s":"__wt_row_search"}

,

{"b":"5579C23C4000","o":"B63481","s":"__wt_btcur_insert"}

,

{"b":"5579C23C4000","o":"B1C36D"}

,

{"b":"5579C23C4000","o":"AC96FA"}

,

{"b":"5579C23C4000","o":"B368C0","s":"__wt_log_scan"}

,

{"b":"5579C23C4000","o":"ACA5D4","s":"__wt_txn_recover"}

,

{"b":"5579C23C4000","o":"A68ED7","s":"__wt_connection_workers"}

,

{"b":"5579C23C4000","o":"A65DEC","s":"wiredtiger_open"}

,

{"b":"5579C23C4000","o":"A3B7F5","s":"_ZN5mongo18WiredTigerKVEngineC1ERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEES8_PNS_11ClockSourceES8_mbbbb"}

,

{"b":"5579C23C4000","o":"A1FF28"}

,

{"b":"5579C23C4000","o":"C0D170","s":"_ZN5mongo20ServiceContextMongoD29initializeGlobalStorageEngineEv"}

,

{"b":"5579C23C4000","o":"924080"}

,

{"b":"5579C23C4000","o":"9FEFDB","s":"_ZN5mongo11mongoDbMainEiPPcS1_"}

,

{"b":"5579C23C4000","o":"98CBC9","s":"main"}

,

{"b":"7F76EACD6000","o":"20830","s":"__libc_start_main"}

,

{"b":"5579C23C4000","o":"9EEA19","s":"_start"}

],"processInfo":{ "mongodbVersion" : "3.6.1", "gitVersion" : "025d4f4fe61efd1fb6f0005be20cb45a004093d1", "compiledModules" : [], "uname" :

{ "sysname" : "Linux", "release" : "4.4.0-131-generic", "version" : "#157-Ubuntu SMP Thu Jul 12 15:51:36 UTC 2018", "machine" : "x86_64" }

, "somap" : [

{ "b" : "5579C23C4000", "elfType" : 3, "buildId" : "87C7D2729B4125CF6CBE2FD954E1BBD0441BBA53" }

,

{ "b" : "7FFE6F7EA000", "elfType" : 3, "buildId" : "B9971A47FDB350363AE2C13A267B2AA114363622" }

,

{ "b" : "7F76EC296000", "path" : "/lib/x86_64-linux-gnu/libresolv.so.2", "elfType" : 3, "buildId" : "6EF73266978476EF9F2FD2CF31E57F4597CB74F8" }

,

{ "b" : "7F76EC02D000", "path" : "/lib/x86_64-linux-gnu/libssl.so.1.0.0", "elfType" : 3, "buildId" : "473092A9AF373FB0CAB555F9A003BC67F47756B6" }

,

{ "b" : "7F76EBBE8000", "path" : "/lib/x86_64-linux-gnu/libcrypto.so.1.0.0", "elfType" : 3, "buildId" : "8942CA58A3B910E883CC31E04A23DBD09729B4B0" }

,

{ "b" : "7F76EB9E0000", "path" : "/lib/x86_64-linux-gnu/librt.so.1", "elfType" : 3, "buildId" : "89C34D7A182387D76D5CDA1F7718F5D58824DFB3" }

,

{ "b" : "7F76EB7DC000", "path" : "/lib/x86_64-linux-gnu/libdl.so.2", "elfType" : 3, "buildId" : "8CC8D0D119B142D839800BFF71FB71E73AEA7BD4" }

,

{ "b" : "7F76EB4D3000", "path" : "/lib/x86_64-linux-gnu/libm.so.6", "elfType" : 3, "buildId" : "DFB85DE42DAFFD09640C8FE377D572DE3E168920" }

,

{ "b" : "7F76EB2BD000", "path" : "/lib/x86_64-linux-gnu/libgcc_s.so.1", "elfType" : 3, "buildId" : "68220AE2C65D65C1B6AAA12FA6765A6EC2F5F434" }

,

{ "b" : "7F76EB0A0000", "path" : "/lib/x86_64-linux-gnu/libpthread.so.0", "elfType" : 3, "buildId" : "CE17E023542265FC11D9BC8F534BB4F070493D30" }

,

{ "b" : "7F76EACD6000", "path" : "/lib/x86_64-linux-gnu/libc.so.6", "elfType" : 3, "buildId" : "B5381A457906D279073822A5CEB24C4BFEF94DDB" }

,

{ "b" : "7F76EC4B1000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, "buildId" : "5D7B6259552275A3C17BD4C3FD05F5A6BF40CAA5" }

] }} mongod(ZN5mongo15printStackTraceERSo+0x41) [0x5579c45b61f1] mongod(+0x21F1409) [0x5579c45b5409] mongod(+0x21F18ED) [0x5579c45b58ed] libpthread.so.0(+0x11390) [0x7f76eb0b1390] libc.so.6(gsignal+0x38) [0x7f76ead0b428] libc.so.6(abort+0x16A) [0x7f76ead0d02a] mongod(_ZN5mongo32fassertFailedNoTraceWithLocationEiPKcj+0x0) [0x5579c2d4eeff] mongod(+0xA5743E) [0x5579c2e1b43e] mongod(wt_eventv+0x3D7) [0x5579c2cec277] mongod(wt_err+0x9D) [0x5579c2cec49c] mongod(wt_panic+0x33) [0x5579c2cec7c5] mongod(wt_bm_read+0x135) [0x5579c2f174b5] mongod(wt_bt_read+0x1FB) [0x5579c2e9b93b] mongod(wt_page_in_func+0xA62) [0x5579c2ea2ee2] mongod(wt_row_search+0x7B4) [0x5579c2ecc5a4] mongod(wt_btcur_insert+0xD71) [0x5579c2f27481] mongod(+0xB1C36D) [0x5579c2ee036d] mongod(+0xAC96FA) [0x5579c2e8d6fa] mongod(wt_log_scan+0xCE0) [0x5579c2efa8c0] mongod(wt_txn_recover+0x3D4) [0x5579c2e8e5d4] mongod(wt_connection_workers+0x37) [0x5579c2e2ced7] mongod(wiredtiger_open+0x19DC) [0x5579c2e29dec] mongod(_ZN5mongo18WiredTigerKVEngineC1ERKNSt7cxx1112basic_stringIcSt11char_traitsIcESaIcEEES8_PNS_11ClockSourceES8_mbbbb+0x845) [0x5579c2dff7f5] mongod(+0xA1FF28) [0x5579c2de3f28] mongod(_ZN5mongo20ServiceContextMongoD29initializeGlobalStorageEngineEv+0x650) [0x5579c2fd1170] mongod(+0x924080) [0x5579c2ce8080] mongod(_ZN5mongo11mongoDbMainEiPPcS1+0x86B) [0x5579c2dc2fdb] mongod(main+0x9) [0x5579c2d50bc9] libc.so.6(__libc_start_main+0xF0) [0x7f76eacf6830] mongod(_start+0x29) [0x5579c2db2a19] ----- END BACKTRACE -----

 

I tried --repair option, but no success. I also copied DB data files to another path and tried to run MongoDB v4.0.24 with --repair option on this data, but it was not successful.

 

Related files are attached.

Please help me to recover my data.

Thanks

 

 



 Comments   
Comment by Edwin Zhou [ 26/Apr/21 ]

Hi ehsannazarian1@gmail.com,

We're thrilled to hear you've been able to resolve this issue by running --repair using a later version of MongoDB! Since MongoDB 4.0.3, we've improved --repair to perform the same WiredTiger metadata repair that was performed in the tickets that you linked in your comment above.

To avoid a problem like this in the future, it is our strong recommendation to:

Since you were able to successfully perform this repair, I will now close this ticket.

Best,
Edwin

Comment by Ehsan Nazarian [ 24/Apr/21 ]

Fortunately, the problem is resolved! When I ran mongod v4.2 with --repair option for the first time, repair operation was unsuccessful with following lines in the last part of the log:

 

2021-04-24T06:21:27.080+0430 E  STORAGE  [initandlisten] WiredTiger error (-31802) [1619229087:80010][76419:0x7f378c898b40], file:collection-0--5191484369938535967.wt, txn-recover: __wt_block_read_off, 285: collection-0--5191484369938535967.wt: fatal read error: WT_ERROR: non-specific WiredTiger error Raw: [1619229087:80010][76419:0x7f378c898b40], file:collection-0--5191484369938535967.wt, txn-recover: __wt_block_read_off, 285: collection-0--5191484369938535967.wt: fatal read error: WT_ERROR: non-specific WiredTiger error
2021-04-24T06:21:27.080+0430 E  STORAGE  [initandlisten] WiredTiger error (-31804) [1619229087:80030][76419:0x7f378c898b40], file:collection-0--5191484369938535967.wt, txn-recover: __wt_panic, 489: the process must exit and restart: WT_PANIC: WiredTiger library panic Raw: [1619229087:80030][76419:0x7f378c898b40], file:collection-0--5191484369938535967.wt, txn-recover: __wt_panic, 489: the process must exit and restart: WT_PANIC: WiredTiger library panic
2021-04-24T06:21:27.080+0430 E  STORAGE  [initandlisten] WiredTiger error (-31804) [1619229087:80094][76419:0x7f378c898b40], file:collection-0--5191484369938535967.wt, txn-recover: __txn_op_apply, 288: operation apply failed during recovery: operation type 4 at LSN 10173/43453184: WT_PANIC: WiredTiger library panic Raw: [1619229087:80094][76419:0x7f378c898b40], file:collection-0--5191484369938535967.wt, txn-recover: __txn_op_apply, 288: operation apply failed during recovery: operation type 4 at LSN 10173/43453184: WT_PANIC: WiredTiger library panic
2021-04-24T06:21:27.080+0430 E  STORAGE  [initandlisten] WiredTiger error (-31804) [1619229087:80131][76419:0x7f378c898b40], file:collection-0--5191484369938535967.wt, txn-recover: __wt_txn_recover, 710: Recovery failed: WT_PANIC: WiredTiger library panic Raw: [1619229087:80131][76419:0x7f378c898b40], file:collection-0--5191484369938535967.wt, txn-recover: __wt_txn_recover, 710: Recovery failed: WT_PANIC: WiredTiger library panic
2021-04-24T06:21:27.080+0430 E  STORAGE  [initandlisten] WiredTiger error (0) [1619229087:80832][76419:0x7f378c898b40], connection: __wt_cache_destroy, 346: cache server: exiting with 16 pages in memory and 10 pages evicted Raw: [1619229087:80832][76419:0x7f378c898b40], connection: __wt_cache_destroy, 346: cache server: exiting with 16 pages in memory and 10 pages evicted
2021-04-24T06:21:27.080+0430 E  STORAGE  [initandlisten] WiredTiger error (0) [1619229087:80922][76419:0x7f378c898b40], connection: __wt_cache_destroy, 349: cache server: exiting with 111865 image bytes in memory Raw: [1619229087:80922][76419:0x7f378c898b40], connection: __wt_cache_destroy, 349: cache server: exiting with 111865 image bytes in memory
2021-04-24T06:21:27.080+0430 E  STORAGE  [initandlisten] WiredTiger error (0) [1619229087:80946][76419:0x7f378c898b40], connection: __wt_cache_destroy, 352: cache server: exiting with 121609 bytes in memory Raw: [1619229087:80946][76419:0x7f378c898b40], connection: __wt_cache_destroy, 352: cache server: exiting with 121609 bytes in memory
2021-04-24T06:21:27.080+0430 E  STORAGE  [initandlisten] WiredTiger error (0) [1619229087:80960][76419:0x7f378c898b40], connection: __wt_cache_destroy, 358: cache server: exiting with 105544 bytes dirty and 4 pages dirty Raw: [1619229087:80960][76419:0x7f378c898b40], connection: __wt_cache_destroy, 358: cache server: exiting with 105544 bytes dirty and 4 pages dirty
2021-04-24T06:21:27.083+0430 F  STORAGE  [initandlisten] Failed to salvage WiredTiger metadata: -31809: WT_TRY_SALVAGE: database corruption detected
2021-04-24T06:21:27.083+0430 F  -        [initandlisten] Fatal Assertion 50947 at src/mongo/db/storage/wiredtiger/wiredtiger_kv_engine.cpp 940
2021-04-24T06:21:27.083+0430 F  -        [initandlisten] \n\n***aborting after fassert() failure\n\n

 

But when I tried to remove collection-0–5191484369938535967.wt file and ran mongod --repair again, this time, all other data was repaired successfully

Comment by Ehsan Nazarian [ 23/Apr/21 ]

Some similar issues:

https://jira.mongodb.org/browse/SERVER-36837

https://jira.mongodb.org/browse/SERVER-36246

https://jira.mongodb.org/browse/SERVER-36179

 

Generated at Thu Feb 08 05:38:52 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.