Core Server: SERVER-27942

mongodb lost databases after crash

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical - P2
    • Resolution: Duplicate
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Storage, WiredTiger
    • Labels:
      None
    • Operating System:
      ALL
    • Steps To Reproduce:
      Platform: CentOS 7 x64
      Version: 3.2.7
      Dataset: 7000+ databases, 30 GB of data


      Description

      We hit an issue. The server crashed, as the log below shows. When it restarted, all the databases were gone, although the database directories and the collection and index files are still present in the db directory.
      My questions are: 1. How can we fix or avoid the crash? 2. Can I get all the databases back? Thanks.

      2017-02-06T23:42:02.266+0800 E STORAGE  [fsyncLockWorker] WiredTiger (22) [1486395722:266895][46275:0x7fabc504d700], file:WS.451483428930375.45360/index-3172--2347007522644439145.wt, WT_SESSION.open_cursor: live.avail: existing range 12288-20480 overlaps with merge range 16384-28672: Invalid argument
      2017-02-06T23:42:02.266+0800 E STORAGE  [fsyncLockWorker] WiredTiger (-31804) [1486395722:266975][46275:0x7fabc504d700], file:WS.451483428930375.45360/index-3172--2347007522644439145.wt, WT_SESSION.open_cursor: the process must exit and restart: WT_PANIC: WiredTiger library panic
      2017-02-06T23:42:02.266+0800 I -        [fsyncLockWorker] Fatal Assertion 28558
      2017-02-06T23:42:02.282+0800 I -        [fsyncLockWorker]
       
      ***aborting after fassert() failure
       
       
      2017-02-06T23:42:02.355+0800 I -        [WTJournalFlusher] Fatal Assertion 28559
      2017-02-06T23:42:02.355+0800 I -        [WTJournalFlusher]
       
      ***aborting after fassert() failure
       
       
      2017-02-06T23:42:02.378+0800 F -        [fsyncLockWorker] Got signal: 6 (Aborted).
       
       0x131a0d2 0x1319229 0x1319a32 0x7fabd9607100 0x7fabd926b5f7 0x7fabd926cce8 0x12a39d2 0x109e1a3 0x1a7af5c 0x1a7b41d 0x1a7b804 0x19ac3cd 0x19ad554 0x19ad7a3 0x19aa735 0x19ae7e9 0x19cb25c 0x1a010e0 0x1a79c87 0x1a073f9 0x1a0776a 0x1a4054e 0x1a07fcc 0x1a08366 0x1a76ecd 0x1a77259 0x1083895 0xfeb73e 0xb48474 0xb49e44 0x12a8330 0x1b34290 0x7fabd95ffdc5 0x7fabd932cced
      ----- BEGIN BACKTRACE -----
      {"backtrace":[{"b":"400000","o":"F1A0D2","s":"_ZN5mongo15printStackTraceERSo"},{"b":"400000","o":"F19229"},{"b":"400000","o":"F19A32"},{"b":"7FABD95F8000","o":"F100"},{"b":"7FABD9236000","o":"355F7","s":"gsignal"},{"b":"7FABD9236000","o":"36CE8","s":"abort"},{"b":"400000","o":"EA39D2","s":"_ZN5mongo13fassertFailedEi"},{"b":"400000","o":"C9E1A3"},{"b":"400000","o":"167AF5C","s":"__wt_eventv"},{"b":"400000","o":"167B41D","s":"__wt_err"},{"b":"400000","o":"167B804","s":"__wt_panic"},{"b":"400000","o":"15AC3CD"},{"b":"400000","o":"15AD554","s":"__wt_block_extlist_read"},{"b":"400000","o":"15AD7A3","s":"__wt_block_extlist_read_avail"},{"b":"400000","o":"15AA735","s":"__wt_block_checkpoint_load"},{"b":"400000","o":"15AE7E9"},{"b":"400000","o":"15CB25C","s":"__wt_btree_open"},{"b":"400000","o":"16010E0","s":"__wt_conn_btree_open"},{"b":"400000","o":"1679C87","s":"__wt_session_get_btree"},{"b":"400000","o":"16073F9"},{"b":"400000","o":"160776A"},{"b":"400000","o":"164054E","s":"__wt_meta_apply_all"},{"b":"400000","o":"1607FCC"},{"b":"400000","o":"1608366","s":"__wt_curbackup_open"},{"b":"400000","o":"1676ECD"},{"b":"400000","o":"1677259"},{"b":"400000","o":"C83895","s":"_ZN5mongo18WiredTigerKVEngine11beginBackupEPNS_16OperationContextE"},{"b":"400000","o":"BEB73E","s":"_ZN5mongo15KVStorageEngine11beginBackupEPNS_16OperationContextE"},{"b":"400000","o":"748474","s":"_ZN5mongo15FSyncLockThread10doRealWorkEv"},{"b":"400000","o":"749E44","s":"_ZN5mongo15FSyncLockThread3runEv"},{"b":"400000","o":"EA8330","s":"_ZN5mongo13BackgroundJob7jobBodyEv"},{"b":"400000","o":"1734290","s":"execute_native_thread_routine"},{"b":"7FABD95F8000","o":"7DC5"},{"b":"7FABD9236000","o":"F6CED","s":"clone"}],"processInfo":{ "mongodbVersion" : "3.2.7", "gitVersion" : "4249c1d2b5999ebbf1fdf3bc0e0e3b3ff5c0aaf2", "compiledModules" : [], "uname" : { "sysname" : "Linux", "release" : "3.10.0-327.13.1.el7.x86_64", "version" : "#1 SMP Thu Mar 31 16:04:38 UTC 2016", "machine" : "x86_64" }, "somap" : [ { 
"elfType" : 2, "b" : "400000", "buildId" : "05C2980D41C615E7C1AB7B5330630B8AB5F5B9D0" }, { "b" : "7FFEC0D98000", "elfType" : 3, "buildId" : "DB61D786C7F8127F5D8753887080EE3FA07D215C" }, { "b" : "7FABDA520000", "path" : "/lib64/libssl.so.10", "elfType" : 3, "buildId" : "478D01A08B923A251D755BB421F3EBAF9F2982C1" }, { "b" : "7FABDA138000", "path" : "/lib64/libcrypto.so.10", "elfType" : 3, "buildId" : "42AAFD25E9B5F4CE2EFE6309491445B1A92A575D" }, { "b" : "7FABD9F30000", "path" : "/lib64/librt.so.1", "elfType" : 3, "buildId" : "CB0D2C9F29DBD13C47E7D2EEFB94B35835698CCA" }, { "b" : "7FABD9D2C000", "path" : "/lib64/libdl.so.2", "elfType" : 3, "buildId" : "091060A163E7EDA25572F3B1BAF2E8F80209C00E" }, { "b" : "7FABD9A2A000", "path" : "/lib64/libm.so.6", "elfType" : 3, "buildId" : "F9DF294FB70243549DCB643F1322BB20E70E9FE8" }, { "b" : "7FABD9814000", "path" : "/lib64/libgcc_s.so.1", "elfType" : 3, "buildId" : "6AA1DCC4DE7F1836344949857FC2017278631FFD" }, { "b" : "7FABD95F8000", "path" : "/lib64/libpthread.so.0", "elfType" : 3, "buildId" : "723F0AC75EF88E778940AE8A8BC30141D85B116A" }, { "b" : "7FABD9236000", "path" : "/lib64/libc.so.6", "elfType" : 3, "buildId" : "088D48A9AB5A512D9F75BA3D66B6CF77EB6588F9" }, { "b" : "7FABDA78D000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, "buildId" : "09E1BB4D034C7263810A41100647068858A7ECB6" }, { "b" : "7FABD8FEA000", "path" : "/lib64/libgssapi_krb5.so.2", "elfType" : 3, "buildId" : "D46A230FFF4A7B808B3CFC213D31FCAC542FB504" }, { "b" : "7FABD8D05000", "path" : "/lib64/libkrb5.so.3", "elfType" : 3, "buildId" : "6D6136A0E795420B05854DEF13A10C226FE9CCB2" }, { "b" : "7FABD8B01000", "path" : "/lib64/libcom_err.so.2", "elfType" : 3, "buildId" : "3A1166709F88740C49E060731832E3FAD2DFB66B" }, { "b" : "7FABD88CF000", "path" : "/lib64/libk5crypto.so.3", "elfType" : 3, "buildId" : "AA97A848DD7C9E57B06EC913E10D420AEBBCE027" }, { "b" : "7FABD86B9000", "path" : "/lib64/libz.so.1", "elfType" : 3, "buildId" : 
"1982C8CDAE90F898D1AD26DC07E807333B4789D0" }, { "b" : "7FABD84AA000", "path" : "/lib64/libkrb5support.so.0", "elfType" : 3, "buildId" : "AEF6C3D3C5152F339942041519A106FC055DAF71" }, { "b" : "7FABD82A6000", "path" : "/lib64/libkeyutils.so.1", "elfType" : 3, "buildId" : "2E01D5AC08C1280D013AAB96B292AC58BC30A263" }, { "b" : "7FABD808C000", "path" : "/lib64/libresolv.so.2", "elfType" : 3, "buildId" : "D02DC134F38F06F3885231FD2486D5EF4796E5F9" }, { "b" : "7FABD7E67000", "path" : "/lib64/libselinux.so.1", "elfType" : 3, "buildId" : "82FF6B18E1E42825CC2D060F969479AD4AF2F62C" }, { "b" : "7FABD7C06000", "path" : "/lib64/libpcre.so.1", "elfType" : 3, "buildId" : "AE64AA461A26E01F60408013D361749D56DD0AE1" }, { "b" : "7FABD79E1000", "path" : "/lib64/liblzma.so.5", "elfType" : 3, "buildId" : "98131C9354279ABD39FD80D4BE5B3EC5678BD9E0" } ] }}
       mongod(_ZN5mongo15printStackTraceERSo+0x32) [0x131a0d2]
       mongod(+0xF19229) [0x1319229]
       mongod(+0xF19A32) [0x1319a32]
       libpthread.so.0(+0xF100) [0x7fabd9607100]
       libc.so.6(gsignal+0x37) [0x7fabd926b5f7]
       libc.so.6(abort+0x148) [0x7fabd926cce8]
       mongod(_ZN5mongo13fassertFailedEi+0x82) [0x12a39d2]
       mongod(+0xC9E1A3) [0x109e1a3]
       mongod(__wt_eventv+0x42C) [0x1a7af5c]
       mongod(__wt_err+0x8D) [0x1a7b41d]
       mongod(__wt_panic+0x24) [0x1a7b804]
       mongod(+0x15AC3CD) [0x19ac3cd]
       mongod(__wt_block_extlist_read+0x394) [0x19ad554]
       mongod(__wt_block_extlist_read_avail+0x33) [0x19ad7a3]
       mongod(__wt_block_checkpoint_load+0x3C5) [0x19aa735]
       mongod(+0x15AE7E9) [0x19ae7e9]
       mongod(__wt_btree_open+0xC7C) [0x19cb25c]
       mongod(__wt_conn_btree_open+0x140) [0x1a010e0]
       mongod(__wt_session_get_btree+0xE7) [0x1a79c87]
       mongod(+0x16073F9) [0x1a073f9]
       mongod(+0x160776A) [0x1a0776a]
       mongod(__wt_meta_apply_all+0xBE) [0x1a4054e]
       mongod(+0x1607FCC) [0x1a07fcc]
       mongod(__wt_curbackup_open+0x296) [0x1a08366]
       mongod(+0x1676ECD) [0x1a76ecd]
       mongod(+0x1677259) [0x1a77259]
       mongod(_ZN5mongo18WiredTigerKVEngine11beginBackupEPNS_16OperationContextE+0x75) [0x1083895]
       mongod(_ZN5mongo15KVStorageEngine11beginBackupEPNS_16OperationContextE+0x6E) [0xfeb73e]
       mongod(_ZN5mongo15FSyncLockThread10doRealWorkEv+0x124) [0xb48474]
       mongod(_ZN5mongo15FSyncLockThread3runEv+0x24) [0xb49e44]
       mongod(_ZN5mongo13BackgroundJob7jobBodyEv+0x160) [0x12a8330]
       mongod(execute_native_thread_routine+0x20) [0x1b34290]
       libpthread.so.0(+0x7DC5) [0x7fabd95ffdc5]
       libc.so.6(clone+0x6D) [0x7fabd932cced]
      -----  END BACKTRACE  -----
      2017-02-06T23:45:08.131+0800 I CONTROL  [main] ***** SERVER RESTARTED *****
      2017-02-06T23:45:08.213+0800 I CONTROL  [initandlisten] MongoDB starting : pid=127129 port=27017 dbpath=/data/mongodb/db 64-bit host=swift_proxy
      2017-02-06T23:45:08.213+0800 I CONTROL  [initandlisten] db version v3.2.7
      2017-02-06T23:45:08.213+0800 I CONTROL  [initandlisten] git version: 4249c1d2b5999ebbf1fdf3bc0e0e3b3ff5c0aaf2
      2017-02-06T23:45:08.213+0800 I CONTROL  [initandlisten] OpenSSL version: OpenSSL 1.0.1e-fips 11 Feb 2013
      2017-02-06T23:45:08.213+0800 I CONTROL  [initandlisten] allocator: tcmalloc
      2017-02-06T23:45:08.213+0800 I CONTROL  [initandlisten] modules: none
      2017-02-06T23:45:08.213+0800 I CONTROL  [initandlisten] build environment:
      2017-02-06T23:45:08.213+0800 I CONTROL  [initandlisten]     distmod: rhel70
      2017-02-06T23:45:08.213+0800 I CONTROL  [initandlisten]     distarch: x86_64
      2017-02-06T23:45:08.213+0800 I CONTROL  [initandlisten]     target_arch: x86_64
      2017-02-06T23:45:08.213+0800 I CONTROL  [initandlisten] options: { config: "/data/mongodb/conf/mongod.conf", cpu: true, net: { bindIp: "172.16.10.2", maxIncomingConnections: 10000, port: 27017 }, processManagement: { fork: true, pidFilePath: "/data/mongodb/pid/mongod.pid" }, storage: { dbPath: "/data/mongodb/db", directoryPerDB: true, mmapv1: { nsSize: 16, smallFiles: true } }, systemLog: { destination: "file", logAppend: true, path: "/data/mongodb/log/mongod.log" }, upgrade: false }
      2017-02-06T23:45:08.243+0800 I -        [initandlisten] Detected data files in /data/mongodb/db created by the 'wiredTiger' storage engine, so setting the active storage engine to 'wiredTiger'.
      2017-02-06T23:45:08.243+0800 W -        [initandlisten] Detected unclean shutdown - /data/mongodb/db/mongod.lock is not empty.
      2017-02-06T23:45:08.243+0800 W STORAGE  [initandlisten] Recovering data from the last clean checkpoint.
      2017-02-06T23:45:08.243+0800 I STORAGE  [initandlisten] wiredtiger_open config: create,cache_size=18G,session_max=20000,eviction=(threads_max=4),config_base=false,statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000),checkpoint=(wait=60,log_size=2GB),statistics_log=(wait=0),
      2017-02-06T23:45:08.266+0800 I STORAGE  [initandlisten] WiredTiger Both WiredTiger.turtle and WiredTiger.backup exist. Recreating metadata from backup.
      2017-02-06T23:45:19.937+0800 E STORAGE  [initandlisten] WiredTiger (17) [1486395919:937105][127129:0x7f0a74630dc0], WT_SESSION.create: /data/mongodb/db/sizeStorer.wt: handle-open: open: File exists
      2017-02-06T23:45:19.937+0800 I STORAGE  [initandlisten] WiredTiger unexpected file sizeStorer.wt found, renamed to sizeStorer.wt.5
      2017-02-06T23:45:19.998+0800 E STORAGE  [initandlisten] WiredTiger (17) [1486395919:998445][127129:0x7f0a74630dc0], WT_SESSION.create: /data/mongodb/db/_mdb_catalog.wt: handle-open: open: File exists
      2017-02-06T23:45:19.998+0800 I STORAGE  [initandlisten] WiredTiger unexpected file _mdb_catalog.wt found, renamed to _mdb_catalog.wt.5
      2017-02-06T23:46:31.713+0800 W STORAGE  [initandlisten] Detected configuration for non-active storage engine mmapv1 when current storage engine is wiredTiger
      2017-02-06T23:46:31.713+0800 I CONTROL  [initandlisten] ** WARNING: You are running this process as the root user, which is not recommended.
      2017-02-06T23:46:31.713+0800 I CONTROL  [initandlisten]
      2017-02-06T23:46:31.713+0800 I CONTROL  [initandlisten]
      2017-02-06T23:46:31.713+0800 I CONTROL  [initandlisten] ** WARNING: /sys/kernel/mm/transparent_hugepage/enabled is 'always'.
      2017-02-06T23:46:31.713+0800 I CONTROL  [initandlisten] **        We suggest setting it to 'never'
      2017-02-06T23:46:31.713+0800 I CONTROL  [initandlisten]
      2017-02-06T23:46:31.713+0800 I CONTROL  [initandlisten] ** WARNING: /sys/kernel/mm/transparent_hugepage/defrag is 'always'.
      2017-02-06T23:46:31.713+0800 I CONTROL  [initandlisten] **        We suggest setting it to 'never'
      2017-02-06T23:46:31.714+0800 I CONTROL  [initandlisten]
      2017-02-06T23:46:31.714+0800 I CONTROL  [initandlisten] ** WARNING: soft rlimits too low. rlimits set to 65535 processes, 600000 files. Number of processes should be at least 300000 : 0.5 times number of files.
      2017-02-06T23:46:31.714+0800 I CONTROL  [initandlisten]
      2017-02-06T23:46:31.714+0800 I NETWORK  [HostnameCanonicalizationWorker] Starting hostname canonicalization worker
      2017-02-06T23:46:31.714+0800 I FTDC     [initandlisten] Initializing full-time diagnostic data capture with directory '/data/mongodb/db/diagnostic.data'
      2017-02-06T23:46:31.961+0800 I NETWORK  [initandlisten] waiting for connections on port 27017
      2017-02-06T23:46:32.074+0800 I FTDC     [ftdc] Unclean full-time diagnostic data capture shutdown detected, found interim file, some metrics may have been lost. OK
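For triage, the key entries in the log above can be pulled out mechanically. The following is a minimal sketch, not an official tool: the line layout in `LINE_RE` (timestamp, severity letter, component, bracketed context, message) is inferred from the lines shown here, the helper names `parse_line` and `summarize` are hypothetical, and treating the extent ranges as half-open byte offsets is an assumption.

```python
import re

# Layout inferred from the log above: timestamp, severity, component, [context], message.
LINE_RE = re.compile(
    r"^(?P<ts>\S+)\s+(?P<sev>[DIWEF])\s+(?P<comp>\S+)\s+\[(?P<ctx>[^\]]+)\]\s?(?P<msg>.*)$"
)
# Matches the extent-list overlap message reported by WiredTiger in this log.
RANGE_RE = re.compile(
    r"existing range (\d+)-(\d+) overlaps with merge range (\d+)-(\d+)"
)
FASSERT_RE = re.compile(r"Fatal Assertion (\d+)")


def parse_line(line):
    """Split one log line into its fields, or return None if it does not match."""
    m = LINE_RE.match(line.strip())
    return m.groupdict() if m else None


def summarize(lines):
    """Collect E/F severity records, fatal assertion codes, and overlap reports."""
    errors, fasserts, overlaps = [], [], []
    for line in lines:
        rec = parse_line(line)
        if rec is None:
            continue  # backtrace frames and blank lines are skipped
        if rec["sev"] in ("E", "F"):
            errors.append(rec)
        fa = FASSERT_RE.search(rec["msg"])
        if fa:
            fasserts.append(int(fa.group(1)))
        ov = RANGE_RE.search(rec["msg"])
        if ov:
            a0, a1, b0, b1 = map(int, ov.groups())
            # Assumption: ranges are half-open [start, end) byte offsets.
            overlaps.append((a0, a1, b0, b1, max(a0, b0) < min(a1, b1)))
    return errors, fasserts, overlaps


sample = [
    "2017-02-06T23:42:02.266+0800 E STORAGE  [fsyncLockWorker] WiredTiger (22) [1486395722:266895][46275:0x7fabc504d700], file:WS.451483428930375.45360/index-3172--2347007522644439145.wt, WT_SESSION.open_cursor: live.avail: existing range 12288-20480 overlaps with merge range 16384-28672: Invalid argument",
    "2017-02-06T23:42:02.266+0800 I -        [fsyncLockWorker] Fatal Assertion 28558",
]
errors, fasserts, overlaps = summarize(sample)
print(len(errors), fasserts, overlaps)
```

Running the same functions over the full mongod log would list every error-level entry together with the thread context that raised it, which makes it easier to see that the panic originated in the fsyncLockWorker thread while opening a backup cursor.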
      

        Attachments

        1. mongod_log.txt
          12 kB
        2. mongod.conf
          0.2 kB
