Core Server / SERVER-17086

Can't start MongoDB w/ WiredTiger Engine because of Checksum Error

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Cannot Reproduce
    • Affects Version/s: 2.8.0-rc5, 3.0.0-rc6
    • Fix Version/s: None
    • Component/s: WiredTiger
    • Labels:
      None
    • Operating System:
      ALL
    • Steps To Reproduce:
      1. Run concurrent inserts against the database.
      2. Wait until the machine runs out of RAM; MongoDB should crash.
      3. Try to start MongoDB again.
      4. The checksum error from the title occurs.

      Description

      Hi,

      I inserted a large amount of data with a concurrent program into a MongoDB database (2.8.0-rc5) running the WiredTiger storage engine.

      I'm guessing the machine ran out of its 12 GB of RAM and the MongoDB process was killed. Before I post the log file, I want to say that I searched Jira for this issue and found several bug reports on it:

      • SERVER-16210 (Solution: Remove any non-MongoDB file from the dbpath folder. Didn't work for me)
      • SERVER-16214 (No Solution)
      • SERVER-16596 (No Solution)
      • SERVER-16172 (Solution: An update should have fixed it. Didn't work for me; I just tried MongoDB-3.0-latest and the same error persists)
      • SERVER-16173 (Solution: An update should have fixed it. Didn't work for me)
      • SERVER-16804 (No Solution)

      So my issue seems to be the checksum error. Is there any way for me to change the checksum number? I tried to open the .wt files in the dbpath with a hex editor, but because of the huge size of the file (~228 GiB) I don't even know where to begin searching. The hex addresses in the error log below don't help either.
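
      For locating the affected bytes, here is a minimal Python sketch. It assumes that in the log line "read checksum error [4096B @ 32768, 546414969 != 1946601003]" the first number is the block size in bytes and the second is the byte offset into sizeStorer.wt; that reading is inferred from the message format and is not confirmed here. The log does not say which of the two checksum values was stored and which was computed, so the sketch only labels them by position. The names parse_checksum_error and read_block are hypothetical helpers, not part of MongoDB or WiredTiger.

```python
import re
from typing import NamedTuple, Optional


class ChecksumError(NamedTuple):
    file_name: str
    size: int          # assumed: block size in bytes
    offset: int        # assumed: byte offset of the block within the file
    left_value: int    # checksum printed before "!="
    right_value: int   # checksum printed after "!="


# Matches the message format seen in the log of this report.
PATTERN = re.compile(
    r"file:(?P<file>\S+): read checksum error "
    r"\[(?P<size>\d+)B @ (?P<offset>\d+), (?P<left>\d+) != (?P<right>\d+)\]"
)


def parse_checksum_error(line: str) -> Optional[ChecksumError]:
    """Extract file name, size, offset and both checksum values from a log line."""
    match = PATTERN.search(line)
    if match is None:
        return None
    return ChecksumError(
        file_name=match["file"],
        size=int(match["size"]),
        offset=int(match["offset"]),
        left_value=int(match["left"]),
        right_value=int(match["right"]),
    )


def read_block(path: str, offset: int, size: int) -> bytes:
    """Read `size` bytes starting at `offset`, for inspection only (read-only)."""
    with open(path, "rb") as handle:
        handle.seek(offset)
        return handle.read(size)


LOG_LINE = (
    "2015-01-28T11:16:41.728+0100 E STORAGE  [initandlisten] WiredTiger (0) "
    "[1422440201:728246][866:0x7fde3ffd2b80], file:sizeStorer.wt: "
    "read checksum error [4096B @ 32768, 546414969 != 1946601003]"
)

error = parse_checksum_error(LOG_LINE)
if error is not None:
    # The offset as hex is what a hex editor would expect as a jump target.
    print(error.file_name, error.size, hex(error.offset))
```

      Under that assumption, the hex editor target is offset 0x8000 in sizeStorer.wt, and read_block("/some/path/sizeStorer.wt", error.offset, error.size) would return the block for inspection. This only reads the file; it does not repair anything.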

      I'd appreciate any help, comments, or hints.

      Here's the log output from trying to start mongod with the WiredTiger storage engine. The exact command is:

      ./mongod --dbpath /some/path --port 8003 --storageEngine wiredTiger
      

      I changed the hostname and pathname in the log below manually.

      2015-01-28T11:16:40.912+0100 I CONTROL  [initandlisten] MongoDB starting : pid=866 port=8003 dbpath=/some/path 64-bit host=someHost
      2015-01-28T11:16:40.913+0100 I CONTROL  [initandlisten] ** WARNING: You are running this process as the root user, which is not recommended.
      2015-01-28T11:16:40.913+0100 I CONTROL  [initandlisten] 
      2015-01-28T11:16:40.913+0100 I CONTROL  [initandlisten] 
      2015-01-28T11:16:40.913+0100 I CONTROL  [initandlisten] ** WARNING: /sys/kernel/mm/transparent_hugepage/enabled is 'always'.
      2015-01-28T11:16:40.913+0100 I CONTROL  [initandlisten] **        We suggest setting it to 'never'
      2015-01-28T11:16:40.913+0100 I CONTROL  [initandlisten] 
      2015-01-28T11:16:40.913+0100 I CONTROL  [initandlisten] db version v2.8.0-rc5
      2015-01-28T11:16:40.913+0100 I CONTROL  [initandlisten] git version: 74b351de21c84438b12a83b28e155f5e69e3c1eb
      2015-01-28T11:16:40.913+0100 I CONTROL  [initandlisten] build info: Linux build19.nj1.10gen.cc 2.6.32-431.3.1.el6.x86_64 #1 SMP Fri Jan 3 21:39:27 UTC 2014 x86_64 BOOST_LIB_VERSION=1_49
      2015-01-28T11:16:40.913+0100 I CONTROL  [initandlisten] allocator: tcmalloc
      2015-01-28T11:16:40.913+0100 I CONTROL  [initandlisten] options: { net: { port: 8003 }, storage: { dbPath: "/some/path", engine: "wiredTiger" } }
      2015-01-28T11:16:40.913+0100 I STORAGE  [initandlisten] wiredtiger_open config: create,cache_size=3G,session_max=20000,eviction=(threads_max=4),statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),checkpoint=(wait=60,log_size=2GB),statistics_log=(wait=0),
      2015-01-28T11:16:41.728+0100 E STORAGE  [initandlisten] WiredTiger (0) [1422440201:728246][866:0x7fde3ffd2b80], file:sizeStorer.wt: read checksum error [4096B @ 32768, 546414969 != 1946601003]
      2015-01-28T11:16:41.728+0100 E STORAGE  [initandlisten] WiredTiger (0) [1422440201:728309][866:0x7fde3ffd2b80], file:sizeStorer.wt: sizeStorer.wt: encountered an illegal file format or internal value
      2015-01-28T11:16:41.728+0100 E STORAGE  [initandlisten] WiredTiger (-31804) [1422440201:728322][866:0x7fde3ffd2b80], file:sizeStorer.wt: the process must exit and restart: WT_PANIC: WiredTiger library panic
      2015-01-28T11:16:41.728+0100 I -        [initandlisten] Fatal Assertion 28558
      2015-01-28T11:16:41.745+0100 I CONTROL  [initandlisten] 
       0xf25749 0xecf571 0xeb46e1 0xd406f6 0x13501f0 0x13504b5 0x1350921 0x12abb92 0x12c457c 0x12c24b6 0x12c33cf 0x12e93db 0x134f868 0x134f97a 0x12f58c1 0x134e40a 0x135b3af 0x135b729 0x130c7d0 0x135c264 0x12ec211 0x12e6e8c 0xd421c8 0xd3fd28 0xa5fdcd 0x7df7af 0x7e4344 0x7fde3ebf2040 0x7dd589
      ----- BEGIN BACKTRACE -----
      {"backtrace":[{"b":"400000","o":"B25749"},{"b":"400000","o":"ACF571"},{"b":"400000","o":"AB46E1"},{"b":"400000","o":"9406F6"},{"b":"400000","o":"F501F0"},{"b":"400000","o":"F504B5"},{"b":"400000","o":"F50921"},{"b":"400000","o":"EABB92"},{"b":"400000","o":"EC457C"},{"b":"400000","o":"EC24B6"},{"b":"400000","o":"EC33CF"},{"b":"400000","o":"EE93DB"},{"b":"400000","o":"F4F868"},{"b":"400000","o":"F4F97A"},{"b":"400000","o":"EF58C1"},{"b":"400000","o":"F4E40A"},{"b":"400000","o":"F5B3AF"},{"b":"400000","o":"F5B729"},{"b":"400000","o":"F0C7D0"},{"b":"400000","o":"F5C264"},{"b":"400000","o":"EEC211"},{"b":"400000","o":"EE6E8C"},{"b":"400000","o":"9421C8"},{"b":"400000","o":"93FD28"},{"b":"400000","o":"65FDCD"},{"b":"400000","o":"3DF7AF"},{"b":"400000","o":"3E4344"},{"b":"7FDE3EBD2000","o":"20040"},{"b":"400000","o":"3DD589"}],"processInfo":{ "mongodbVersion" : "2.8.0-rc5", "gitVersion" : "74b351de21c84438b12a83b28e155f5e69e3c1eb", "uname" : { "sysname" : "Linux", "release" : "3.18.4-1-ARCH", "version" : "#1 SMP PREEMPT Tue Jan 27 20:45:02 CET 2015", "machine" : "x86_64" }, "somap" : [ { "elfType" : 2, "b" : "400000" }, { "b" : "7FFF9F704000", "path" : "linux-vdso.so.1", "elfType" : 3 }, { "b" : "7FDE3FBAB000", "path" : "/usr/lib/libpthread.so.0", "elfType" : 3 }, { "b" : "7FDE3F9A3000", "path" : "/usr/lib/librt.so.1", "elfType" : 3 }, { "b" : "7FDE3F79F000", "path" : "/usr/lib/libdl.so.2", "elfType" : 3 }, { "b" : "7FDE3F490000", "path" : "/usr/lib/libstdc++.so.6", "elfType" : 3 }, { "b" : "7FDE3F18B000", "path" : "/usr/lib/libm.so.6", "elfType" : 3 }, { "b" : "7FDE3EF75000", "path" : "/usr/lib/libgcc_s.so.1", "elfType" : 3 }, { "b" : "7FDE3EBD2000", "path" : "/usr/lib/libc.so.6", "elfType" : 3 }, { "b" : "7FDE3FDC7000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3 } ] }}
       mongod(_ZN5mongo15printStackTraceERSo+0x29) [0xf25749]
       mongod(_ZN5mongo10logContextEPKc+0xE1) [0xecf571]
       mongod(_ZN5mongo13fassertFailedEi+0x61) [0xeb46e1]
       mongod(+0x9406F6) [0xd406f6]
       mongod(+0xF501F0) [0x13501f0]
       mongod(__wt_err+0x95) [0x13504b5]
       mongod(__wt_panic+0x21) [0x1350921]
       mongod(__wt_bm_read+0x72) [0x12abb92]
       mongod(__wt_bt_read+0x1AC) [0x12c457c]
       mongod(__wt_btree_tree_open+0x56) [0x12c24b6]
       mongod(__wt_btree_open+0xD9F) [0x12c33cf]
       mongod(__wt_conn_btree_get+0x19B) [0x12e93db]
       mongod(__wt_session_get_btree+0x2D8) [0x134f868]
       mongod(__wt_session_get_btree_ckpt+0xCA) [0x134f97a]
       mongod(__wt_curfile_open+0xE1) [0x12f58c1]
       mongod(__wt_open_cursor+0x26A) [0x134e40a]
       mongod(+0xF5B3AF) [0x135b3af]
       mongod(+0xF5B729) [0x135b729]
       mongod(__wt_log_scan+0x780) [0x130c7d0]
       mongod(__wt_txn_recover+0x424) [0x135c264]
       mongod(__wt_connection_workers+0x61) [0x12ec211]
       mongod(wiredtiger_open+0x118C) [0x12e6e8c]
       mongod(_ZN5mongo18WiredTigerKVEngineC1ERKSsS2_bb+0x308) [0xd421c8]
       mongod(+0x93FD28) [0xd3fd28]
       mongod(_ZN5mongo23GlobalEnvironmentMongoD22setGlobalStorageEngineERKSs+0x30D) [0xa5fdcd]
       mongod(_ZN5mongo13initAndListenEi+0x6EF) [0x7df7af]
       mongod(main+0x134) [0x7e4344]
       libc.so.6(__libc_start_main+0xF0) [0x7fde3ebf2040]
       mongod(+0x3DD589) [0x7dd589]
      -----  END BACKTRACE  -----
      2015-01-28T11:16:41.745+0100 I -        [initandlisten] 
       
      ***aborting after fassert() failure
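
      On the hex addresses in the backtrace: a small sketch of how the JSON backtrace entries appear to relate to the frame addresses listed above them. It assumes "b" is a load base and "o" an offset, both in hex; that is inferred from the numbers in this log matching up, not from documentation. frame_address is a hypothetical helper.

```python
def frame_address(base_hex: str, offset_hex: str) -> int:
    """Add a hex base ("b") and a hex offset ("o") from one backtrace entry."""
    return int(base_hex, 16) + int(offset_hex, 16)


# Three entries copied from the JSON backtrace in the log above.
frames = [
    {"b": "400000", "o": "B25749"},        # mongod frame
    {"b": "400000", "o": "ACF571"},        # mongod frame
    {"b": "7FDE3EBD2000", "o": "20040"},   # libc.so.6 frame
]

addresses = [hex(frame_address(f["b"], f["o"])) for f in frames]
print(addresses)
```

      The results correspond to 0xf25749 and 0xecf571 (the printStackTrace and logContext frames) and 0x7fde3ebf2040 (__libc_start_main) in the symbolized list, so these addresses point into the mongod binary and shared libraries in memory, not into the .wt data files.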
      

        Attachments

        1. sizeStorer.wt
          36 kB
        2. WiredTiger.turtle
          0.8 kB
        3. WiredTiger.wt
          52 kB
