[SERVER-28077] mongodb fassert v3.4.1 when encountering invalid data format Created: 23/Feb/17  Updated: 27/Oct/23  Resolved: 28/Feb/17

Status: Closed
Project: Core Server
Component/s: WiredTiger
Affects Version/s: 3.4.1
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Melody [X] Assignee: Unassigned
Resolution: Gone away Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: HTML File mgo_log    
Backwards Compatibility: Fully Compatible
Operating System: ALL
Steps To Reproduce:

Reproduce easily just by running an insert benchmark.

Participants:

 Description   

I encountered an annoying issue when performing inserts with MongoDB v3.4.1.
Every time I start the test with a clean dbpath and let it run for 10-20 minutes, it
crashes with a checksum error. I do not shut down the machine or mongod manually.

Options: ./mongod --dbpath /home/df --logpath mgo_test.log --logappend --oplogSize 50000 --replSet data --storageEngine wiredTiger --wiredTigerCacheSizeGB 12 --directoryperdb --fork

./mongo "rs.initiate()"
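The benchmark itself is not attached; any sustained insert loop generates comparable write pressure, for example from the mongo shell (the database, collection, and document shape below are arbitrary placeholders, not the actual workload):

# Stand-in insert workload; names and document shape are arbitrary.
./mongo --eval 'var pad = new Array(1025).join("x");
  for (var i = 0; i < 10000000; i++) {
    db.getSiblingDB("bench").docs.insert({ seq: i, payload: pad, ts: new Date() });
  }'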



 Comments   
Comment by Michael Cahill (Inactive) [ 28/Feb/17 ]

Melody, I'm glad the issue is resolved. I will close this issue.

Comment by Melody [X] [ 28/Feb/17 ]

Michael Cahill,

First, I want to say thanks for correcting my log formatting to make it more readable.

After a careful check, it turns out that the crash was rooted in bad memory chips. I replaced them with new ones and the problem has been solved. Thank you for your help and patience.
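For anyone who hits similar symptoms, one quick way to exercise suspect RAM from a running system is memtester, assuming it is installed (the size and pass count here are arbitrary):

# Lock and exercise 4 GB of RAM for 3 passes (requires root); a boot-time memtest86+ run is more thorough.
memtester 4096M 3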

Comment by Michael Cahill (Inactive) [ 28/Feb/17 ]

Melody, you will need to sort out the block device and filesystem issues before it makes sense to test using MongoDB or any other database. We assume that the filesystem provides POSIX semantics, which includes reading back the same data that we write.

Have you considered testing the block device as part of a RAID mirror with one or more block devices that are known to be reliable? That may help isolate where the data is being corrupted.
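A sketch of that setup with mdadm, assuming a spare disk that is known to be reliable (/dev/sdb below is a placeholder):

# Mirror the suspect SSD against a known-good device, then run the workload on the mirror.
mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/dfa /dev/sdb
mkfs.xfs /dev/md0
mount /dev/md0 /home/df
# After a failure, have md compare the two halves and report any mismatched sectors:
echo check > /sys/block/md0/md/sync_action
cat /sys/block/md0/md/mismatch_cnt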

Comment by Melody [X] [ 27/Feb/17 ]

Running the same test as above using XFS also crashed, with a different vmcore-dmesg.txt:

[ 5016.532894] ------------[ cut here ]------------
[ 5016.532917] kernel BUG at fs/buffer.c:3358!
[ 5016.532932] invalid opcode: 0000 [#1] SMP
[ 5016.532949] Modules linked in: shannon(OE) tcp_lp bnep bluetooth fuse xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun ip6t_rpfilter ip6t_REJECT ipt_REJECT xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw iptable_filter intel_powerclamp coretemp intel_rapl kvm crc32_pclmul ghash_clmulni_intel aesni_intel snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic snd_hda_intel lrw snd_hda_codec gf128mul glue_helper snd_hda_core snd_hwdep ablk_helper snd_seq snd_seq_device cryptd snd_pcm snd_timer snd mxm_wmi eeepc_wmi
[ 5016.533247]  asus_wmi sparse_keymap rfkill mei_me mei soundcore pcspkr acpi_pad sg wmi tpm_infineon shpchp nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic ahci libahci i915 crct10dif_pclmul crct10dif_common crc32c_intel serio_raw i2c_algo_bit drm_kms_helper libata e1000e drm ptp pps_core i2c_core video dm_mirror dm_region_hash dm_log dm_mod
[ 5016.533403] CPU: 3 PID: 10015 Comm: abrt-server Tainted: G           OE  ------------   3.10.0-327.el7.x86_64 #1
[ 5016.533436] Hardware name: ASUS All Series/Z97-A, BIOS 2801 11/11/2015
[ 5016.533458] task: ffff8806d05e2e00 ti: ffff8806d40bc000 task.ti: ffff8806d40bc000
[ 5016.533482] RIP: 0010:[<ffffffff81212461>]  [<ffffffff81212461>] free_buffer_head+0x51/0x60
[ 5016.533514] RSP: 0018:ffff8806d40bfb58  EFLAGS: 00010287
[ 5016.533531] RAX: ffff8804ef69f8d0 RBX: ffff8804ef69f888 RCX: 0000000000000000
[ 5016.533554] RDX: 0000000000000000 RSI: 0000000000001000 RDI: ffff8804ef69f888
[ 5016.533577] RBP: ffff8806d40bfb88 R08: 1010000000000000 R09: 04ef69f888080000
[ 5016.533601] R10: faf2974352762202 R11: ffffea0013bda7c0 R12: 0000000000000001
[ 5016.533624] R13: ffff88050f716558 R14: ffff8804ef69f888 R15: 0000000000001000
[ 5016.533647] FS:  00007f07fb0ac880(0000) GS:ffff88083fb80000(0000) knlGS:0000000000000000
[ 5016.533673] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 5016.533692] CR2: 00007f07fcb09180 CR3: 00000006cd946000 CR4: 00000000001407e0
[ 5016.533714] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 5016.533737] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 5016.533760] Stack:
[ 5016.533767]  ffffffff812125bc ffff8804ef69f888 000000000ba19861 ffff8804ef69f888
[ 5016.533796]  ffffea001388c400 ffff88050f716378 ffff8806d40bfbc0 ffffffffa03114e6
[ 5016.533823]  0000000000000000 000000000ba19861 ffff8804ef69f888 0000000000000000
[ 5016.533852] Call Trace:
[ 5016.533862]  [<ffffffff812125bc>] ? try_to_free_buffers+0x8c/0xe0
[ 5016.533904]  [<ffffffffa03114e6>] xfs_vm_releasepage+0x56/0x100 [xfs]
[ 5016.533926]  [<ffffffff81168f22>] try_to_release_page+0x32/0x50
[ 5016.533947]  [<ffffffff812135fb>] block_invalidatepage_range+0xfb/0x120
[ 5016.533981]  [<ffffffffa030f189>] xfs_vm_invalidatepage+0x39/0x90 [xfs]
[ 5016.534004]  [<ffffffff81178b70>] do_invalidatepage+0x40/0x60
[ 5016.534024]  [<ffffffff81178c32>] truncate_inode_page+0x72/0x80
[ 5016.534044]  [<ffffffff81178e76>] truncate_inode_pages_range+0x1f6/0x740
[ 5016.534068]  [<ffffffff8117943e>] truncate_inode_pages_final+0x5e/0x90
[ 5016.534106]  [<ffffffffa03300f9>] xfs_fs_evict_inode+0x29/0xb0 [xfs]
[ 5016.534130]  [<ffffffff811fa097>] evict+0xa7/0x170
[ 5016.534147]  [<ffffffff811fa8d5>] iput+0xf5/0x180
[ 5016.534164]  [<ffffffff811ef36e>] do_unlinkat+0x1ae/0x2b0
[ 5016.534184]  [<ffffffff811f221c>] ? vfs_readdir+0x8c/0xe0
[ 5016.534203]  [<ffffffff811f033b>] SyS_unlinkat+0x1b/0x40
[ 5016.534223]  [<ffffffff81645909>] system_call_fastpath+0x16/0x1b
[ 5016.534243] Code: 40 fd 00 00 b8 01 00 00 00 65 0f c1 04 25 44 fd 00 00 3d ff 0f 00 00 7f 09 5d c3 0f 1f 80 00 00 00 00 e8 93 fa ff ff 5d 66 90 c3 <0f> 0b 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55
[ 5016.534369] RIP  [<ffffffff81212461>] free_buffer_head+0x51/0x60
[ 5016.534391]  RSP <ffff8806d40bfb58>

And the MongoDB log is shown below:

2017-02-27T13:12:39.186+0800 E STORAGE  [thread2] WiredTiger error (-31802) [1488172359:186294][9796:0x7f6588004700], file:t/index-15-5796781820101354619.wt, WT_SESSION.checkpoint: file contains a corrupted WiredTigerCheckpoint.102.alloc extent list, range 24666112-24666751 past end-of-file: WT_ERROR: non-specific WiredTiger error
2017-02-27T13:12:39.186+0800 E STORAGE  [thread2] WiredTiger error (-31804) [1488172359:186344][9796:0x7f6588004700], file:t/index-15-5796781820101354619.wt, WT_SESSION.checkpoint: the process must exit and restart: WT_PANIC: WiredTiger library panic
2017-02-27T13:12:39.186+0800 I -        [thread2] Fatal Assertion 28558 at src/mongo/db/storage/wiredtiger/wiredtiger_util.cpp 361
2017-02-27T13:12:39.186+0800 I -        [thread2] 
 
***aborting after fassert() failure
 
----- BEGIN BACKTRACE -----
{"backtrace":[{"b":"7F658C92F000","o":"1544151","s":"_ZN5mongo15printStackTraceERSo"},{"b":"7F658C92F000","o":"1543249"},{"b":"7F658C92F000","o":"154372D"},{"b":"7F658BBCD000","o":"F370"},{"b":"7F658B80C000","o":"351D7","s":"gsignal"},{"b":"7F658B80C000","o":"368C8","s":"abort"},{"b":"7F658C92F000","o":"7FD903","s":"_ZN5mongo32fassertFailedNoTraceWithLocationEiPKcj"},{"b":"7F658C92F000","o":"1279996"},{"b":"7F658C92F000","o":"807582"},{"b":"7F658C92F000","o":"807676","s":"__wt_err"},{"b":"7F658C92F000","o":"8078CE","s":"__wt_panic"},{"b":"7F658C92F000","o":"1E4387A"},{"b":"7F658C92F000","o":"1E3F7FA"},{"b":"7F658C92F000","o":"1E409A3"},{"b":"7F658C92F000","o":"1E64A6B"},{"b":"7F658C92F000","o":"1F041C4"},{"b":"7F658C92F000","o":"1F0613A"},{"b":"7F658C92F000","o":"1E77CC5"},{"b":"7F658C92F000","o":"1F32888"},{"b":"7F658C92F000","o":"1F34E16"},{"b":"7F658C92F000","o":"1F3534B"},{"b":"7F658C92F000","o":"1F221C5"},{"b":"7F658C92F000","o":"1E97A6D"},{"b":"7F658BBCD000","o":"7DC5"},{"b":"7F658B80C000","o":"F773D","s":"clone"}],"processInfo":{ "mongodbVersion" : "3.4.2", "gitVersion" : "3f76e40c105fc223b3e5aac3e20dcd026b83b38b", "compiledModules" : [], "uname" : { "sysname" : "Linux", "release" : "3.10.0-327.el7.x86_64", "version" : "#1 SMP Thu Nov 19 22:10:57 UTC 2015", "machine" : "x86_64" }, "somap" : [ { "b" : "7F658C92F000", "elfType" : 3, "buildId" : "5FA11DB6AB2E67269A3BA34DE6305590E354BC84" }, { "b" : "7FFEE8CFC000", "elfType" : 3, "buildId" : "0F488C85309E4AB293D3B758748A3037B01F9885" }, { "b" : "7F658C505000", "path" : "/lib64/librt.so.1", "elfType" : 3, "buildId" : "82E77ADE22BC9FFF8D3458BD37331E7EDF174C28" }, { "b" : "7F658C301000", "path" : "/lib64/libdl.so.2", "elfType" : 3, "buildId" : "C5F560504E1AF52E29679C3B52FF11121015D6BB" }, { "b" : "7F658BFFF000", "path" : "/lib64/libm.so.6", "elfType" : 3, "buildId" : "721C7CC9488EFA25F83B48AF713AB27DBE48EF3E" }, { "b" : "7F658BDE9000", "path" : "/lib64/libgcc_s.so.1", "elfType" : 3, "buildId" : "408B46E291B2D4C9612E27C0509D165D7E186D40" }, { "b" : "7F658BBCD000", "path" : "/lib64/libpthread.so.0", "elfType" : 3, "buildId" : "C3DEB1FA27CD0C1C3CC575B944ABACBA0698B0F2" }, { "b" : "7F658B80C000", "path" : "/lib64/libc.so.6", "elfType" : 3, "buildId" : "8B2C421716985B927AA0CAF2A05D0B1F452367F7" }, { "b" : "7F658C70D000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, "buildId" : "8F3E366E2DB73C330A3791DEAE31AE9579099B44" } ] }}
 mongod(_ZN5mongo15printStackTraceERSo+0x41) [0x7f658de73151]
 mongod(+0x1543249) [0x7f658de72249]
 mongod(+0x154372D) [0x7f658de7272d]
 libpthread.so.0(+0xF370) [0x7f658bbdc370]
 libc.so.6(gsignal+0x37) [0x7f658b8411d7]
 libc.so.6(abort+0x148) [0x7f658b8428c8]
 mongod(_ZN5mongo32fassertFailedNoTraceWithLocationEiPKcj+0x0) [0x7f658d12c903]
 mongod(+0x1279996) [0x7f658dba8996]
 mongod(+0x807582) [0x7f658d136582]
 mongod(__wt_err+0x9D) [0x7f658d136676]
 mongod(__wt_panic+0x24) [0x7f658d1368ce]
 mongod(+0x1E4387A) [0x7f658e77287a]
 mongod(+0x1E3F7FA) [0x7f658e76e7fa]
 mongod(+0x1E409A3) [0x7f658e76f9a3]
 mongod(+0x1E64A6B) [0x7f658e793a6b]
 mongod(+0x1F041C4) [0x7f658e8331c4]
 mongod(+0x1F0613A) [0x7f658e83513a]
 mongod(+0x1E77CC5) [0x7f658e7a6cc5]
 mongod(+0x1F32888) [0x7f658e861888]
 mongod(+0x1F34E16) [0x7f658e863e16]
 mongod(+0x1F3534B) [0x7f658e86434b]
 mongod(+0x1F221C5) [0x7f658e8511c5]
 mongod(+0x1E97A6D) [0x7f658e7c6a6d]
 libpthread.so.0(+0x7DC5) [0x7f658bbd4dc5]
 libc.so.6(clone+0x6D) [0x7f658b90373d]
-----  END BACKTRACE  -----

Is there anything wrong with my machine configuration? Or does MongoDB require a specific configuration?

Comment by Melody [X] [ 27/Feb/17 ]

Michael Cahill,

First I need to explain that /dev/dfa is a block device; it is an SSD product from Shannon.

I tried the same test using ext4 on another PC and encountered a different issue. The machine crashed and rebooted during the run.

vmcore-dmesg:

[ 1455.590346] general protection fault: 0000 [#1] SMP
[ 1455.590364] Modules linked in: ext4 mbcache jbd2 shannon(OE) xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw iptable_filter intel_powerclamp coretemp intel_rapl snd_hda_codec_realtek iosf_mbi snd_hda_codec_generic snd_hda_codec_hdmi kvm snd_hda_intel snd_hda_codec irqbypass snd_hda_core snd_hwdep crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper snd_seq snd_seq_device ablk_helper
[ 1455.590568]  eeepc_wmi asus_wmi sparse_keymap rfkill snd_pcm cryptd mxm_wmi snd_timer snd pcspkr sg soundcore acpi_pad tpm_infineon mei_me mei wmi shpchp nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic i915 ahci libahci crct10dif_pclmul crct10dif_common crc32c_intel serio_raw i2c_algo_bit libata drm_kms_helper syscopyarea sysfillrect e1000e sysimgblt fb_sys_fops drm ptp pps_core i2c_core video fjes dm_mirror dm_region_hash dm_log dm_mod
[ 1455.590704] CPU: 0 PID: 13178 Comm: kworker/u8:1 Tainted: G           OE  ------------   3.10.0-514.6.2.el7.x86_64 #1
[ 1455.590728] Hardware name: ASUS All Series/Z97-A, BIOS 2801 11/11/2015
[ 1455.590747] Workqueue: writeback bdi_writeback_workfn (flush-253:0)
[ 1455.590763] task: ffff88071d37edd0 ti: ffff880668bfc000 task.ti: ffff880668bfc000
[ 1455.590780] RIP: 0010:[<ffffffff81183622>]  [<ffffffff81183622>] find_get_pages+0x62/0x170
[ 1455.590802] RSP: 0018:ffff880668bff7e0  EFLAGS: 00010246
[ 1455.590815] RAX: ffff8805172df8e0 RBX: ffff8807f42964d0 RCX: ffffea00107f895c
[ 1455.590831] RDX: ffff8805172df850 RSI: 0000000000000002 RDI: 0000000000000000
[ 1455.590847] RBP: ffff880668bff830 R08: dfffea00107f8980 R09: 0000000000000002
[ 1455.590863] R10: 0000000000000000 R11: 0000000000000220 R12: 0000000000086ee2
[ 1455.590879] R13: 000000000000000b R14: 000000000000000e R15: ffff880668bff898
[ 1455.590895] FS:  0000000000000000(0000) GS:ffff88083fa00000(0000) knlGS:0000000000000000
[ 1455.590914] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1455.590927] CR2: 000000c420145000 CR3: 0000000813228000 CR4: 00000000001407f0
[ 1455.590944] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1455.590960] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 1455.590976] Stack:
[ 1455.590981]  ffff880668bff7e0 0000000000086eed 0000000000086f00 ffff880668bff7f0
[ 1455.591001]  000000002b80805f ffff880668bff888 ffff8807f4296378 0000000000086ee2
[ 1455.591020]  ffff880668bffc60 0000000000086ed4 ffff880668bff848 ffffffff8118e3fe
[ 1455.591040] Call Trace:
[ 1455.591048]  [<ffffffff8118e3fe>] pagevec_lookup+0x1e/0x30
[ 1455.591077]  [<ffffffffa0351c6b>] xfs_cluster_write+0xab/0x1b0 [xfs]
[ 1455.591104]  [<ffffffffa0351fd6>] xfs_vm_writepage+0x266/0x5d0 [xfs]
[ 1455.591120]  [<ffffffff8118b503>] __writepage+0x13/0x50
[ 1455.591133]  [<ffffffff8118c021>] write_cache_pages+0x251/0x4d0
[ 1455.591147]  [<ffffffff8118b4f0>] ? global_dirtyable_memory+0x70/0x70
[ 1455.591162]  [<ffffffff8118c2ed>] generic_writepages+0x4d/0x80
[ 1455.591186]  [<ffffffffa0351063>] xfs_vm_writepages+0x53/0x90 [xfs]
[ 1455.591201]  [<ffffffff8118d39e>] do_writepages+0x1e/0x40
[ 1455.591216]  [<ffffffff81228880>] __writeback_single_inode+0x40/0x210
[ 1455.591231]  [<ffffffff8122956e>] writeback_sb_inodes+0x25e/0x420
[ 1455.591246]  [<ffffffff812297cf>] __writeback_inodes_wb+0x9f/0xd0
[ 1455.591261]  [<ffffffff8122a013>] wb_writeback+0x263/0x2f0
[ 1455.591274]  [<ffffffff8122bf0c>] bdi_writeback_workfn+0x1cc/0x460
[ 1455.591290]  [<ffffffff810a805b>] process_one_work+0x17b/0x470
[ 1455.591303]  [<ffffffff810a8e96>] worker_thread+0x126/0x410
[ 1455.591316]  [<ffffffff810a8d70>] ? rescuer_thread+0x460/0x460
[ 1455.591331]  [<ffffffff810b064f>] kthread+0xcf/0xe0
[ 1455.591343]  [<ffffffff810b0580>] ? kthread_create_on_node+0x140/0x140
[ 1455.591359]  [<ffffffff81696818>] ret_from_fork+0x58/0x90
[ 1455.591373]  [<ffffffff810b0580>] ? kthread_create_on_node+0x140/0x140
[ 1455.591387] Code: 8d 5f 08 45 31 ed 48 c7 45 b8 00 00 00 00 4c 89 65 c0 31 d2 eb 74 0f 1f 00 4c 8b 02 4d 85 c0 74 48 41 f6 c0 03 0f 85 9f 00 00 00 <45> 8b 48 1c 45 85 c9 74 e5 41 8d 71 01 49 8d 48 1c 44 89 c8 f0
[ 1455.591472] RIP  [<ffffffff81183622>] find_get_pages+0x62/0x170
[ 1455.591487]  RSP <ffff880668bff7e0>

Comment by Michael Cahill (Inactive) [ 24/Feb/17 ]

Melody, what type of device is /dev/dfa? What mount flags are used? Are any errors logged in dmesg output during the run?

This type of error indicates that MongoDB wrote a block with some data, then later when it read back the same block, it got different data. In other words, the data was corrupted in between writing to the filesystem and reading back in again. That usually indicates bugs in the filesystem or block storage.
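One way to exercise that same write-then-read-back path directly, with MongoDB out of the picture, is a destructive pattern test on the raw device (this erases everything on it; /dev/dfa stands for the suspect disk):

# Destructive write/read-back test across the whole device.
badblocks -wsv /dev/dfa
# Or a direct-I/O fio pass that writes checksummed blocks and verifies them on read-back:
fio --name=verify --filename=/dev/dfa --direct=1 --rw=write --bs=8k --size=10g --verify=crc32c --do_verify=1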

Comment by Melody [X] [ 24/Feb/17 ]

Ramón,

I ran the test again using XFS instead of ext4, and also got the crash.
I checked the XFS filesystem with xfs_repair and the output is listed
below. Does it mean that the filesystem has no consistency problem?

xfs_repair -n /dev/dfa
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity...
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify link counts...
No modify flag set, skipping filesystem flush and exiting.

Looking forward to your response.

Haiyan

Comment by Melody [X] [ 24/Feb/17 ]

The filesystem used is ext4. I have already tried running with the XFS filesystem, and it gave the same result. I did not use fsck; what should I expect fsck to do?

Comment by Ramon Fernandez Marina [ 23/Feb/17 ]

Thanks for opening a ticket, Melody. In the log I see the following:

2017-02-21T18:03:31.788+0800 E STORAGE  [conn9] WiredTiger error (0) [1487671411:788188][19407:0x7ff351866700], file:t/index-15--6289328272869131128.wt, WT_CURSOR.insert: read checksum error for 8192B block at offset 1000689664: calculated block checksum of 1970664321 doesn't match expected checksum of 1165870198
2017-02-21T18:03:31.788+0800 E STORAGE  [conn9] WiredTiger error (0) [1487671411:788241][19407:0x7ff351866700], file:t/index-15--6289328272869131128.wt, WT_CURSOR.insert: t/index-15--6289328272869131128.wt: encountered an illegal file format or internal value
2017-02-21T18:03:31.788+0800 E STORAGE  [conn9] WiredTiger error (-31804) [1487671411:788254][19407:0x7ff351866700], file:t/index-15--6289328272869131128.wt, WT_CURSOR.insert: the process must exit and restart: WT_PANIC: WiredTiger library panic

This can be easily caused by a faulty storage layer, so the first order of business is to check the integrity of your disks. Also, the logs show you're not using XFS – what filesystem are you using? Have you run fsck on this filesystem?
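For ext4, an offline read-only check looks like the following (unmount the filesystem first; the device name matches the one used elsewhere in this ticket):

# -f forces a full check, -n answers "no" to every prompt so nothing is modified.
e2fsck -fn /dev/dfa
# If the device exposes SMART data, its error counters are also worth checking:
smartctl -a /dev/dfa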

Thanks,
Ramón.
