[SERVER-31931] Corrupt database after unexpected shutdown Created: 13/Nov/17  Updated: 07/Dec/17  Resolved: 14/Nov/17

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 3.2.14
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: NidgetGod Assignee: Mark Agarunov
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File WiredTiger.turtle     File WiredTiger.wt     File repair-SERVER-31931.tar.gz    
Operating System: ALL
Participants:

 Description   

MongoDB shell version: 3.2.14

Using wt

./wt -v -h /data/db -C "extensions=[./ext/compressors/snappy/.libs/libwiredtiger_snappy.so]" -R salvage collection-0--3259849781052165579.wt

display

[1510566574:512938][11240:0x7faa18f83740], file:WiredTiger.wt, connection: read checksum error for 4096B block at offset 102400: block header checksum of 634778719 doesn't match expected checksum of 2709631337
[1510566574:512956][11240:0x7faa18f83740], file:WiredTiger.wt, connection: WiredTiger.wt: encountered an illegal file format or internal value
[1510566574:512964][11240:0x7faa18f83740], file:WiredTiger.wt, connection: the process must exit and restart: WT_PANIC: WiredTiger library panic
lt-wt: WT_PANIC: WiredTiger library panic

And mongod --repair --dbpath=/data/db --storageEngine=wiredTiger display

2017-11-13T10:08:05.474+0000 I CONTROL  [initandlisten] MongoDB starting : pid=9 port=27017 dbpath=/data/db 64-bit host=bce41bc1835d
2017-11-13T10:08:05.474+0000 I CONTROL  [initandlisten] db version v3.2.14
2017-11-13T10:08:05.474+0000 I CONTROL  [initandlisten] git version: 92f6668a768ebf294bd4f494c50f48459198e6a3
2017-11-13T10:08:05.474+0000 I CONTROL  [initandlisten] OpenSSL version: OpenSSL 1.0.1t  3 May 2016
2017-11-13T10:08:05.474+0000 I CONTROL  [initandlisten] allocator: tcmalloc
2017-11-13T10:08:05.474+0000 I CONTROL  [initandlisten] modules: none
2017-11-13T10:08:05.474+0000 I CONTROL  [initandlisten] build environment:
2017-11-13T10:08:05.474+0000 I CONTROL  [initandlisten]     distmod: debian81
2017-11-13T10:08:05.474+0000 I CONTROL  [initandlisten]     distarch: x86_64
2017-11-13T10:08:05.474+0000 I CONTROL  [initandlisten]     target_arch: x86_64
2017-11-13T10:08:05.474+0000 I CONTROL  [initandlisten] options: { repair: true, storage: { dbPath: "/data/db", engine: "wiredTiger" } }
2017-11-13T10:08:05.478+0000 I STORAGE  [initandlisten] Detected WT journal files.  Running recovery from last checkpoint.
2017-11-13T10:08:05.478+0000 I STORAGE  [initandlisten] journal to nojournal transition config: create,cache_size=6G,session_max=20000,eviction=(threads_min=4,threads_max=4),config_base=false,statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000),checkpoint=(wait=60,log_size=2GB),statistics_log=(wait=0),
2017-11-13T10:08:05.482+0000 E STORAGE  [initandlisten] WiredTiger (0) [1510567685:482403][9:0x7f7ae3e9ecc0], file:WiredTiger.wt, connection: read checksum error for 4096B block at offset 102400: block header checksum of 634778719 doesn't match expected checksum of 2709631337
2017-11-13T10:08:05.482+0000 E STORAGE  [initandlisten] WiredTiger (0) [1510567685:482455][9:0x7f7ae3e9ecc0], file:WiredTiger.wt, connection: WiredTiger.wt: encountered an illegal file format or internal value
2017-11-13T10:08:05.482+0000 E STORAGE  [initandlisten] WiredTiger (-31804) [1510567685:482465][9:0x7f7ae3e9ecc0], file:WiredTiger.wt, connection: the process must exit and restart: WT_PANIC: WiredTiger library panic
2017-11-13T10:08:05.482+0000 I -        [initandlisten] Fatal Assertion 28558
2017-11-13T10:08:05.482+0000 I -        [initandlisten] 
 
***aborting after fassert() failure
 
 
2017-11-13T10:08:05.502+0000 F -        [initandlisten] Got signal: 6 (Aborted).
 
 0x134d5b2 0x134c6e9 0x134cef2 0x7f7ae2afa890 0x7f7ae2775067 0x7f7ae2776448 0x12cdf12 0x10c4213 0x975dd9 0x975fd6 0x97619a 0x19e233c 0x19e28a3 0x19df139 0x19e3b39 0x19fcd5c 0x1a37053 0x1abcf7d 0x1abd499 0x1abd6bb 0x1a44936 0x1ab9ead 0x1a7e4ef 0x1a7e5ee 0x1a338aa 0x10ab2b1 0x10a7603 0xfcf73e 0x9c0490 0x976aad 0x7f7ae2761b45 0x9bbe67
----- BEGIN BACKTRACE -----
{"backtrace":[{"b":"400000","o":"F4D5B2","s":"_ZN5mongo15printStackTraceERSo"},{"b":"400000","o":"F4C6E9"},{"b":"400000","o":"F4CEF2"},{"b":"7F7AE2AEB000","o":"F890"},{"b":"7F7AE2740000","o":"35067","s":"gsignal"},{"b":"7F7AE2740000","o":"36448","s":"abort"},{"b":"400000","o":"ECDF12","s":"_ZN5mongo13fassertFailedEi"},{"b":"400000","o":"CC4213"},{"b":"400000","o":"575DD9","s":"__wt_eventv"},{"b":"400000","o":"575FD6","s":"__wt_err"},{"b":"400000","o":"57619A","s":"__wt_panic"},{"b":"400000","o":"15E233C","s":"__wt_block_extlist_read"},{"b":"400000","o":"15E28A3","s":"__wt_block_extlist_read_avail"},{"b":"400000","o":"15DF139","s":"__wt_block_checkpoint_load"},{"b":"400000","o":"15E3B39"},{"b":"400000","o":"15FCD5C","s":"__wt_btree_open"},{"b":"400000","o":"1637053","s":"__wt_conn_btree_open"},{"b":"400000","o":"16BCF7D","s":"__wt_session_get_btree"},{"b":"400000","o":"16BD499","s":"__wt_session_get_btree"},{"b":"400000","o":"16BD6BB","s":"__wt_session_get_btree_ckpt"},{"b":"400000","o":"1644936","s":"__wt_curfile_open"},{"b":"400000","o":"16B9EAD"},{"b":"400000","o":"167E4EF","s":"__wt_metadata_cursor_open"},{"b":"400000","o":"167E5EE","s":"__wt_metadata_cursor"},{"b":"400000","o":"16338AA","s":"wiredtiger_open"},{"b":"400000","o":"CAB2B1","s":"_ZN5mongo18WiredTigerKVEngineC2ERKSsS2_S2_mbbb"},{"b":"400000","o":"CA7603"},{"b":"400000","o":"BCF73E","s":"_ZN5mongo20ServiceContextMongoD29initializeGlobalStorageEngineEv"},{"b":"400000","o":"5C0490"},{"b":"400000","o":"576AAD","s":"main"},{"b":"7F7AE2740000","o":"21B45","s":"__libc_start_main"},{"b":"400000","o":"5BBE67"}],"processInfo":{ "mongodbVersion" : "3.2.14", "gitVersion" : "92f6668a768ebf294bd4f494c50f48459198e6a3", "compiledModules" : [], "uname" : { "sysname" : "Linux", "release" : "4.13.11-1-ARCH", "version" : "#1 SMP PREEMPT Thu Nov 2 10:25:56 CET 2017", "machine" : "x86_64" }, "somap" : [ { "elfType" : 2, "b" : "400000", "buildId" : "056AF96295E28A047B8792D4BA955B26F5D681B2" }, { "b" : "7FFEAC3AB000", "path" : "linux-vdso.so.1", "elfType" : 3, "buildId" : "5AC4C2244FB8E38AAE813AD1676F4EA039254BF0" }, { "b" : "7F7AE3A27000", "path" : "/usr/lib/x86_64-linux-gnu/libssl.so.1.0.0", "elfType" : 3, "buildId" : "21115992A1F885E1ACE88AADA60F126AD9759D03" }, { "b" : "7F7AE362B000", "path" : "/usr/lib/x86_64-linux-gnu/libcrypto.so.1.0.0", "elfType" : 3, "buildId" : "32E9A5B9EED626E93DEEB00A49033F78652DB9A3" }, { "b" : "7F7AE3423000", "path" : "/lib/x86_64-linux-gnu/librt.so.1", "elfType" : 3, "buildId" : "A63C95FB33CCA970E141D2E13774B997C1CF0565" }, { "b" : "7F7AE321F000", "path" : "/lib/x86_64-linux-gnu/libdl.so.2", "elfType" : 3, "buildId" : "D70B531D672A34D71DB42EB32B68E63F2DCC5B6A" }, { "b" : "7F7AE2F1E000", "path" : "/lib/x86_64-linux-gnu/libm.so.6", "elfType" : 3, "buildId" : "152C93BA3E8590F7ED0BCDDF868600D55EC4DD6F" }, { "b" : "7F7AE2D08000", "path" : "/lib/x86_64-linux-gnu/libgcc_s.so.1", "elfType" : 3, "buildId" : "D5FB04F64B3DAEA6D6B68B5E8B9D4D2BC1A6E1FC" }, { "b" : "7F7AE2AEB000", "path" : "/lib/x86_64-linux-gnu/libpthread.so.0", "elfType" : 3, "buildId" : "9DA9387A60FFC196AEDB9526275552AFEF499C44" }, { "b" : "7F7AE2740000", "path" : "/lib/x86_64-linux-gnu/libc.so.6", "elfType" : 3, "buildId" : "48C48BC6ABB794461B8A558DD76B29876A0551F0" }, { "b" : "7F7AE3C88000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, "buildId" : "1D98D41FBB1EABA7EC05D0FD7624B85D6F51C03C" } ] }}
 mongod(_ZN5mongo15printStackTraceERSo+0x32) [0x134d5b2]
 mongod(+0xF4C6E9) [0x134c6e9]
 mongod(+0xF4CEF2) [0x134cef2]
 libpthread.so.0(+0xF890) [0x7f7ae2afa890]
 libc.so.6(gsignal+0x37) [0x7f7ae2775067]
 libc.so.6(abort+0x148) [0x7f7ae2776448]
 mongod(_ZN5mongo13fassertFailedEi+0x82) [0x12cdf12]
 mongod(+0xCC4213) [0x10c4213]
 mongod(__wt_eventv+0x440) [0x975dd9]
 mongod(__wt_err+0x8D) [0x975fd6]
 mongod(__wt_panic+0x24) [0x97619a]
 mongod(__wt_block_extlist_read+0x6C) [0x19e233c]
 mongod(__wt_block_extlist_read_avail+0x33) [0x19e28a3]
 mongod(__wt_block_checkpoint_load+0x359) [0x19df139]
 mongod(+0x15E3B39) [0x19e3b39]
 mongod(__wt_btree_open+0xBDC) [0x19fcd5c]
 mongod(__wt_conn_btree_open+0x153) [0x1a37053]
 mongod(__wt_session_get_btree+0xDD) [0x1abcf7d]
 mongod(__wt_session_get_btree+0x5F9) [0x1abd499]
 mongod(__wt_session_get_btree_ckpt+0xAB) [0x1abd6bb]
 mongod(__wt_curfile_open+0x86) [0x1a44936]
 mongod(+0x16B9EAD) [0x1ab9ead]
 mongod(__wt_metadata_cursor_open+0x5F) [0x1a7e4ef]
 mongod(__wt_metadata_cursor+0x7E) [0x1a7e5ee]
 mongod(wiredtiger_open+0x191A) [0x1a338aa]
 mongod(_ZN5mongo18WiredTigerKVEngineC2ERKSsS2_S2_mbbb+0x681) [0x10ab2b1]
 mongod(+0xCA7603) [0x10a7603]
 mongod(_ZN5mongo20ServiceContextMongoD29initializeGlobalStorageEngineEv+0x59E) [0xfcf73e]
 mongod(+0x5C0490) [0x9c0490]
 mongod(main+0x15D) [0x976aad]
 libc.so.6(__libc_start_main+0xF5) [0x7f7ae2761b45]
 mongod(+0x5BBE67) [0x9bbe67]
-----  END BACKTRACE  -----



 Comments   
Comment by Mark Agarunov [ 14/Nov/17 ]

Hello nidgetgod,

Thanks for your response. I'm glad to hear that this fixed the issue and everything is working again. To prevent this type of problem in the future, we recommend implementing regular backups and/or replication to mitigate any issues related to unreliable storage layers or server failures.

Thanks,
Mark

Comment by NidgetGod [ 14/Nov/17 ]

Hi Mark,

It work. Thanks.

And is any solution we can repair by ourselves?

by N

Comment by Mark Agarunov [ 13/Nov/17 ]

Hello nidgetgod,

Thank you for the report. I've attached a repair attempt of the files you've provided. Would you please extract these files and replace them in your $dbpath and let us know if it resolves the issue? If you are still seeing errors after replacing these files, please provide the complete logs from mongod so that we can further investigate. Additionally, if this issue persists, please provide the following information:

  1. What kind of underlying storage mechanism are you using? Are the storage devices attached locally or over the network? Are the disks SSDs or HDDs? What kind of RAID and/or volume management system are you using?
  2. Would you please check the integrity of your disks?
  3. Has the database always been running this version of MongoDB? If not please describe the upgrade/downgrade cycles the database has been through.
  4. Have you manipulated (copied or moved) the underlying database files? If so, was mongod running?
  5. Have you ever restored this instance from backups?
  6. What method do you use to create backups?
  7. When was the underlying filesystem last checked and is it currently marked clean?

Thanks,
Mark

Generated at Thu Feb 08 04:28:38 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.