- Type: Bug
- Resolution: Done
- Priority: Major - P3
- Affects Version/s: 3.2.20
- Component/s: None
- Environment: Linux
Server Triage
Today I found that our production mongod process (MongoDB version 3.2.20, running on Linux) had disappeared. We checked the system log /var/log/messages:
Aug 10 11:50:54 ts2-logdb kernel: INFO: task mongod:1977 blocked for more than 120 seconds.
Aug 10 11:50:54 ts2-logdb kernel: Not tainted 2.6.32-431.el6.x86_64 #1
Aug 10 11:50:54 ts2-logdb kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug 10 11:50:54 ts2-logdb kernel: mongod D 0000000000000007 0 1977 1 0x00000080
Aug 10 11:50:54 ts2-logdb kernel: ffff88083777bc98 0000000000000086 ffffffffa00eb8e0 000000000000008e
Aug 10 11:50:54 ts2-logdb kernel: ffff88083777bc68 ffffffff810af370 ffff88083777bc28 ffffffff812825b9
Aug 10 11:50:54 ts2-logdb kernel: ffff88083a2105f8 ffff88083777bfd8 000000000000fbc8 ffff88083a2105f8
Aug 10 11:50:54 ts2-logdb kernel: Call Trace:
Aug 10 11:50:54 ts2-logdb kernel: [<ffffffff810af370>] ? exit_robust_list+0x90/0x160
Aug 10 11:50:54 ts2-logdb kernel: [<ffffffff812825b9>] ? cpumask_next_and+0x29/0x50
Aug 10 11:50:54 ts2-logdb kernel: [<ffffffff81076ad5>] exit_mm+0x95/0x180
Aug 10 11:50:54 ts2-logdb kernel: [<ffffffff81076f1f>] do_exit+0x15f/0x870
Aug 10 11:50:54 ts2-logdb kernel: [<ffffffff81077688>] do_group_exit+0x58/0xd0
Aug 10 11:50:54 ts2-logdb kernel: [<ffffffff8108d046>] get_signal_to_deliver+0x1f6/0x460
Aug 10 11:50:54 ts2-logdb kernel: [<ffffffff8100a265>] do_signal+0x75/0x800
Aug 10 11:50:54 ts2-logdb kernel: [<ffffffff810a0637>] ? hrtimer_nanosleep+0xe7/0x180
Aug 10 11:50:54 ts2-logdb kernel: [<ffffffff8109f460>] ? hrtimer_wakeup+0x0/0x30
Aug 10 11:50:54 ts2-logdb kernel: [<ffffffff8100aa80>] do_notify_resume+0x90/0xc0
Aug 10 11:50:54 ts2-logdb kernel: [<ffffffff8100b341>] int_signal+0x12/0x17
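The "D" after "mongod" in the trace above is the kernel's uninterruptible-sleep state, which is what the hung-task message is reporting. A minimal sketch, assuming a Linux /proc filesystem, for listing processes currently in that state. The helper names `parse_stat_line` and `find_blocked` are made up for illustration, and only the main thread's state is read from /proc/<pid>/stat, so a blocked worker thread would not show up here:

```python
import os


def parse_stat_line(line):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split around the first '(' and the last ')'.
    left = line.index("(")
    right = line.rindex(")")
    pid = int(line[:left].strip())
    comm = line[left + 1:right]
    state = line[right + 1:].split()[0]
    return pid, comm, state


def find_blocked(proc_root="/proc"):
    """List (pid, comm) for processes whose state is 'D'."""
    blocked = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as 'meminfo'
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat_line(f.read())
        except (OSError, ValueError):
            continue  # process exited mid-scan or the line was unreadable
        if state == "D":
            blocked.append((pid, comm))
    return blocked


if __name__ == "__main__":
    for pid, comm in find_blocked():
        print(pid, comm)
```

If the old mongod (pid 1977 in the trace) still appears in this list, it has not finished exiting yet.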
I looked at the MongoDB log but found no output from today before the crash. It only records our failed attempts to restart. The crash occurred around 11:50, and 'mongod.log' only recorded:
2021-08-10T14:15:42.382+0800 I CONTROL [initandlisten] MongoDB starting: pid=10120 port=32003 dbpath=/data/mongodb/data/ 64-bit host=ts2-logdb
2021-08-10T14:15:42.383+0800 I CONTROL [initandlisten] db version v3.2.20
2021-08-10T14:15:42.383+0800 I CONTROL [initandlisten] git version: a7a144f40b70bfe290906eb33ff2714933544af8
2021-08-10T14:15:42.383+0800 I CONTROL [initandlisten] OpenSSL version: OpenSSL 1.0.1e-fips 11 Feb 2013
2021-08-10T14:15:42.383+0800 I CONTROL [initandlisten] allocator: tcmalloc
2021-08-10T14:15:42.383+0800 I CONTROL [initandlisten] modules: none
2021-08-10T14:15:42.383+0800 I CONTROL [initandlisten] build environment:
2021-08-10T14:15:42.383+0800 I CONTROL [initandlisten] distmod: rhel62
2021-08-10T14:15:42.383+0800 I CONTROL [initandlisten] distarch: x86_64
2021-08-10T14:15:42.383+0800 I CONTROL [initandlisten] target_arch: x86_64
2021-08-10T14:15:42.383+0800 I CONTROL [initandlisten] options: {config: "/etc/mongod.conf", net:
, processManagement: {fork: true, pidFilePath: "/data/mongodb/log/mongod.pid"}, storage: {dbPath: "/data/mongodb/data/", journal:
{enabled: true}}, systemLog: {destination: "file", logAppend: true, logRotate: "rename", path: "/data/mongodb/log/mongod.log"}}
2021-08-10T14:15:42.457+0800 I - [initandlisten] Detected data files in /data/mongodb/data/ created by the 'wiredTiger' storage engine, so setting the active storage engine to 'wiredTiger'.
2021-08-10T14:15:42.460+0800 I STORAGE [initandlisten] wiredtiger_open config: create,cache_size=18G,session_max=20000,eviction=(threads_min=4,threads_max=4),config_base=false,statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000),checkpoint=(wait=60,log_size=2GB),statistics_log=(wait=0),verbose=(recovery_progress),
2021-08-10T14:15:42.466+0800 E STORAGE [initandlisten] WiredTiger (11) [1628576142:466438][10120:0x7fb44212bd40], wiredtiger_open: /data/mongodb/data//WiredTiger.lock: handle-lock: fcntl: Resource temporarily unavailable
2021-08-10T14:15:42.466+0800 E STORAGE [initandlisten] WiredTiger (16) [1628576142:466505][10120:0x7fb44212bd40], wiredtiger_open: WiredTiger database is already being managed by another process: Device or resource busy
2021-08-10T14:15:42.469+0800 I - [initandlisten] Assertion: 28595:16: Device or resource busy
2021-08-10T14:15:42.503+0800 I STORAGE [initandlisten] exception in initAndListen: 28595 16: Device or resource busy, terminating
2021-08-10T14:15:42.503+0800 I CONTROL [initandlisten] dbexit: rc: 100
2021-08-10T14:16:11.758+0800 I CONTROL [main] ***** SERVER RESTARTED *****
2021-08-10T14:16:11.769+0800 I CONTROL [initandlisten] MongoDB starting: pid=10128 port=32003 dbpath=/data/mongodb/data/ 64-bit host=ts2-logdb
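The "handle-lock: fcntl: Resource temporarily unavailable" error above (errno 11, EAGAIN) indicates that some other process still holds an fcntl lock on WiredTiger.lock. A minimal sketch of how one could probe for that from outside mongod, assuming Linux and assuming a whole-file exclusive lock request conflicts with whatever range the holder locked. The function name `is_locked_by_other_process` is made up for illustration, and the probe should only be run while no new mongod is being started, since it briefly takes the lock itself when the file is free:

```python
import errno
import fcntl


def is_locked_by_other_process(path):
    """True if another process holds a conflicting fcntl lock on `path`."""
    # Open read-write: an exclusive fcntl lock needs a writable descriptor.
    with open(path, "r+b") as f:
        try:
            # Non-blocking request, so a held lock fails immediately
            # instead of waiting for the holder to release it.
            fcntl.lockf(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError as exc:
            if exc.errno in (errno.EAGAIN, errno.EACCES):
                return True
            raise
        # We obtained the lock, so nobody else held it; release it again.
        fcntl.lockf(f, fcntl.LOCK_UN)
        return False


if __name__ == "__main__":
    # Path taken from the log above.
    print(is_locked_by_other_process("/data/mongodb/data/WiredTiger.lock"))
```

A result of True while no mongod shows up in a normal process listing would be consistent with the old process (pid 1977) still being stuck in its exit path and not yet having released the lock.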
I tried --repair, but it ran for more than two hours without completing, so I canceled it.
I've attached my WiredTiger.wt and WiredTiger.turtle files for your review.
Any ideas? Thanks so much!