Core Server / SERVER-17551

mongod fatal assertion after "hazard pointer table full" message

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Fixed
    • Affects Version/s: 3.0.0
    • Fix Version/s: 3.0.2, 3.1.1
    • Component/s: WiredTiger
    • Labels:
      None
    • Backwards Compatibility:
      Fully Compatible
    • Operating System:
      ALL

      Description

      Issue Status as of Apr 02, 2015

      ISSUE SUMMARY
      MongoDB running with the WiredTiger storage engine may hit a default limit and terminate with the following error message:

      eviction-server: session 0x2b1a2c0: hazard pointer table full
      

      The default limit for the number of hazard pointers is 1000, which may be hit if the database has a large number of collections and indexes.
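
To check whether a crash matches this issue, the mongod log can be scanned for the message quoted above. The following is a minimal sketch, not an official tool: the function name `find_hazard_errors` is hypothetical, and it assumes each line is a full mongod log line whose first whitespace-separated field is the timestamp.

```python
import re

# Message text as quoted in this ticket; the session address varies per crash.
HAZARD_RE = re.compile(r"session (0x[0-9a-fA-F]+): hazard pointer table full")
# Optional "file:<name>.wt" field that precedes the message in the log lines.
FILE_RE = re.compile(r"file:(\S+?\.wt)")


def find_hazard_errors(lines):
    """Return a (timestamp, session, file) tuple for each matching log line."""
    hits = []
    for line in lines:
        match = HAZARD_RE.search(line)
        if match is None:
            continue
        fields = line.split()
        # Assumption: the first field of a mongod log line is the timestamp.
        timestamp = fields[0] if fields else ""
        file_match = FILE_RE.search(line)
        hits.append((timestamp, match.group(1),
                     file_match.group(1) if file_match else None))
    return hits


sample = [
    "2015-03-10T19:59:46.272+0000 E STORAGE  WiredTiger (0) "
    "[1426017586:270981][3822:0x7f38f7a89700], "
    "file:index-3720--4215350639811228100.wt, "
    "eviction-server: session 0x2ffe2c0: hazard pointer table full",
    "2015-03-10T19:59:46.274+0000 I -        Fatal Assertion 28558",
]
for hit in find_hazard_errors(sample):
    print(hit)
```

The same function accepts an open file object, since iterating a file yields its lines.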

      USER IMPACT
      The mongod process terminates and needs to be restarted.

      WORKAROUNDS
      Users affected by this issue can increase the default limit on the size of the hazard pointer table (for example, to 10000) with the following command line option:

      --wiredTigerEngineConfigString="hazard_max=10000"
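
When mongod is launched from a script rather than a shell, the same option can be passed as a single argument, in which case the shell quotes are not needed. This is a sketch under stated assumptions: the helper name `mongod_args` and the data path `/data/db` are hypothetical, and only the `hazard_max` setting comes from this ticket.

```python
def mongod_args(dbpath, hazard_max=10000):
    """Build an argument list for mongod with a raised hazard pointer limit."""
    if hazard_max <= 0:
        raise ValueError("hazard_max must be positive")
    return [
        "mongod",
        "--storageEngine", "wiredTiger",
        "--dbpath", dbpath,  # hypothetical path, adjust for the deployment
        # One argv element, so no shell quoting is required around the value.
        "--wiredTigerEngineConfigString=hazard_max=%d" % hazard_max,
    ]


args = mongod_args("/data/db")
print(" ".join(args))
# To start the server from Python: subprocess.run(args, check=True)
```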
      

      AFFECTED VERSIONS
      MongoDB stable releases 3.0.0 and 3.0.1 are affected by this issue.

      FIX VERSION
      The fix is included in the 3.0.2 production release.

      Original description

      We're in the process of migrating our test environment to MongoDB 3.0. We spun up a new instance with 3.0, added it to the replica set and let it sync.
      However, a few hours into the process it crashed with the following error:

      2015-03-10T19:59:46.037+0000 I -        [repl writer worker 11]   Index Build: 196300/590825 33%
      2015-03-10T19:59:46.272+0000 E STORAGE  WiredTiger (0) [1426017586:270981][3822:0x7f38f7a89700], file:index-3720--4215350639811228100.wt, eviction-server: session 0x2ffe2c0: hazard pointer table full
      2015-03-10T19:59:46.274+0000 E STORAGE  WiredTiger (12) [1426017586:272380][3822:0x7f38f7a89700], eviction-server: cache eviction server error: Cannot allocate memory
      2015-03-10T19:59:46.274+0000 E STORAGE  WiredTiger (-31804) [1426017586:274241][3822:0x7f38f7a89700], eviction-server: the process must exit and restart: WT_PANIC: WiredTiger library panic
      2015-03-10T19:59:46.274+0000 I -        Fatal Assertion 28558
      2015-03-10T19:59:46.274+0000 I -        [repl writer worker 11] Fatal Assertion 28559
      2015-03-10T19:59:46.288+0000 I NETWORK  [initandlisten] connection accepted from X.Y.Z.X:58398 #1900 (5 connections now open)
      2015-03-10T19:59:46.308+0000 I NETWORK  [conn1897] end connection X.Y.Z.Y:58395 (4 connections now open)
      2015-03-10T19:59:46.329+0000 I CONTROL
       0xf5b229 0xefba41 0xee0de1 0xd71e16 0x1382a50 0x1382d15 0x13831b1 0x133a99f 0x7f38f9abfe9a 0x7f38f85962ed
      ----- BEGIN BACKTRACE -----
      {"backtrace":[{"b":"400000","o":"B5B229"},{"b":"400000","o":"AFBA41"},{"b":"400000","o":"AE0DE1"},{"b":"400000","o":"971E16"},{"b":"400000","o":"F82A50"},{"b":"400000","o":"F82D15"},{"b":"400000","o":"F831B1"},{"b":"400000","o":"F3A99F"},{"b":"7F38F9AB8000","o":"7E9A"},{"b":"7F38F84A2000","o":"F42ED"}],"processInfo":{ "mongodbVersion" : "3.0.0", "gitVersion" : "a841fd6394365954886924a35076691b4d149168", "uname" : { "sysname" : "Linux", "release" : "3.2.0-75-virtual", "version" : "#110-Ubuntu SMP Tue Dec 16 19:24:01 UTC 2014", "machine" : "x86_64" }, "somap" : [ { "elfType" : 2, "b" : "400000", "buildId" : "5CFC121279316F2BC7FBB39F8268284EDF47E708" }, { "b" : "7FFF9A6FF000", "elfType" : 3, "buildId" : "975615E6DDB0F811039107F3F95DB4884097BE36" }, { "b" : "7F38F9AB8000", "path" : "/lib/x86_64-linux-gnu/libpthread.so.0", "elfType" : 3, "buildId" : "C340AF9DEE97C17C730F7D03693286C5194A46B8" }, { "b" : "7F38F985A000", "path" : "/lib/x86_64-linux-gnu/libssl.so.1.0.0", "elfType" : 3, "buildId" : "E62302CCF011B0D1277B07E2B6170B678480F996" }, { "b" : "7F38F947F000", "path" : "/lib/x86_64-linux-gnu/libcrypto.so.1.0.0", "elfType" : 3, "buildId" : "DBF958ED97EEE7EAD42EC0F24EDB2E39ACFA2B83" }, { "b" : "7F38F9277000", "path" : "/lib/x86_64-linux-gnu/librt.so.1", "elfType" : 3, "buildId" : "352C5B373A50E6C4AB881A5DB6F5766FDF81EEE0" }, { "b" : "7F38F9073000", "path" : "/lib/x86_64-linux-gnu/libdl.so.2", "elfType" : 3, "buildId" : "D181AF551DBBC43E9D55913D532635FDE18E7C4E" }, { "b" : "7F38F8D73000", "path" : "/usr/lib/x86_64-linux-gnu/libstdc++.so.6", "elfType" : 3, "buildId" : "B534DA725D06A04267EB2FEB92B9CC14C838B57B" }, { "b" : "7F38F8A77000", "path" : "/lib/x86_64-linux-gnu/libm.so.6", "elfType" : 3, "buildId" : "817AA99B3DD02501F8BC04A3E9A9358A08F20D7D" }, { "b" : "7F38F8861000", "path" : "/lib/x86_64-linux-gnu/libgcc_s.so.1", "elfType" : 3, "buildId" : "ECF322A96E26633C5D10F18215170DD4395AF82C" }, { "b" : "7F38F84A2000", "path" : "/lib/x86_64-linux-gnu/libc.so.6", 
"elfType" : 3, "buildId" : "DDD4987072B438ECB112023E07E432844CF89076" }, { "b" : "7F38F9CD5000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, "buildId" : "E25AD1A11CCF57E734116B8EC9C69F643DCA9F18" }, { "b" : "7F38F828B000", "path" : "/lib/x86_64-linux-gnu/libz.so.1", "elfType" : 3, "buildId" : "F695ECFCF3918D5D34989398A14B7ECDD9F46CD0" } ] }}
       mongod(_ZN5mongo15printStackTraceERSo+0x29) [0xf5b229]
       mongod(_ZN5mongo10logContextEPKc+0xE1) [0xefba41]
       mongod(_ZN5mongo13fassertFailedEi+0x61) [0xee0de1]
       mongod(+0x971E16) [0xd71e16]
       mongod(+0xF82A50) [0x1382a50]
       mongod(__wt_err+0x95) [0x1382d15]
       mongod(__wt_panic+0x21) [0x13831b1]
       mongod(+0xF3A99F) [0x133a99f]
       libpthread.so.0(+0x7E9A) [0x7f38f9abfe9a]
       libc.so.6(clone+0x6D) [0x7f38f85962ed]
      -----  END BACKTRACE  -----
      2015-03-10T19:59:46.329+0000 I -
       
      ***aborting after fassert() failure
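
For readers matching the JSON backtrace to the address list printed above it: in this log, each frame's `b` (base) and `o` (offset) fields appear to sum to the corresponding absolute address. The sketch below illustrates that with two frames copied from the backtrace; the function name `absolute_addresses` is hypothetical.

```python
import json


def absolute_addresses(backtrace_json):
    """Return base + offset for each frame, formatted like the log's address list."""
    doc = json.loads(backtrace_json)
    return [hex(int(frame["b"], 16) + int(frame["o"], 16))
            for frame in doc["backtrace"]]


# First and ninth frames of the backtrace in the log above.
sample = ('{"backtrace":[{"b":"400000","o":"B5B229"},'
          '{"b":"7F38F9AB8000","o":"7E9A"}]}')
print(absolute_addresses(sample))
# -> ['0xf5b229', '0x7f38f9abfe9a']
```

These values match the first and ninth entries of the raw address list, `0xf5b229` and `0x7f38f9abfe9a`.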
      

      The database was running on an AWS instance with 7.5GB RAM, using the default WiredTiger config. We attempted again, on a larger instance (16GB RAM), but it still fails:

      2015-03-11T13:18:44.503+0000 E STORAGE  WiredTiger (0) [1426079924:501160][892:0x7f9799de0700], file:index-2117-6998163636744330624.wt, eviction-server: session 0x2b1a2c0: hazard pointer table full
      2015-03-11T13:18:44.504+0000 E STORAGE  WiredTiger (12) [1426079924:503483][892:0x7f9799de0700], eviction-server: cache eviction server error: Cannot allocate memory
      2015-03-11T13:18:44.504+0000 E STORAGE  WiredTiger (-31804) [1426079924:504371][892:0x7f9799de0700], eviction-server: the process must exit and restart: WT_PANIC: WiredTiger library panic
      2015-03-11T13:18:44.505+0000 I -        Fatal Assertion 28558
      2015-03-11T13:18:44.506+0000 E STORAGE  WiredTiger (-31804) [1426079924:506395][892:0x7f9797ddc700], session.checkpoint: metadata unroll update file:collection-1552-6998163636744330624.wt to allocation_size=4KB,app_metadata=(formatVersion=1),block_allocation=best,block_compressor=snappy,cache_resident=0,checkpoint=(WiredTigerCheckpoint.2=(addr="01c1e581e40bf68bf8c1e681e44df8a873c1e781e4c3e687b3808080e3226fc0e3217fc0",order=2,time=1426079809,size=2203648,write_gen=113)),checkpoint_lsn=(12784,6111872),checksum=uncompressed,collator=,columns=,dictionary=0,format=btree,huffman_key=,huffman_value=,id=14826,internal_item_max=0,internal_key_max=0,internal_key_truncate=,internal_page_max=4KB,key_format=q,key_gap=10,leaf_item_max=0,leaf_key_max=0,leaf_page_max=32KB,leaf_value_max=1MB,memory_page_max=10m,os_cache_dirty_max=0,os_cache_max=0,prefix_compression=0,prefix_compression_min=4,split_deepen_min_child=0,split_deepen_per_child=0,split_pct=90,value_format=u,version=(major=1,minor=1): WT_PANIC: WiredTiger library panic
      2015-03-11T13:18:44.506+0000 I -        Fatal Assertion 28558
      2015-03-11T13:18:44.506+0000 I -        [repl writer worker 9] Fatal Assertion 28559
      2015-03-11T13:18:44.506+0000 I -        [repl writer worker 10] Fatal Assertion 28559
      2015-03-11T13:18:44.506+0000 I -        [repl writer worker 1] Fatal Assertion 28559
      2015-03-11T13:18:44.506+0000 I -        [repl writer worker 11] Fatal Assertion 28559
      2015-03-11T13:18:44.506+0000 I -        [repl writer worker 15] Fatal Assertion 28559
      2015-03-11T13:18:44.506+0000 I -        [repl writer worker 3] Fatal Assertion 28559
      2015-03-11T13:18:44.506+0000 I -        [repl writer worker 6] Fatal Assertion 28559
      2015-03-11T13:18:44.506+0000 I -        [repl writer worker 4] Fatal Assertion 28559
      2015-03-11T13:18:44.506+0000 I -        [repl writer worker 12] Fatal Assertion 28559
      2015-03-11T13:18:44.506+0000 I -        [repl writer worker 2] Fatal Assertion 28559
      2015-03-11T13:18:44.506+0000 I -        [repl writer worker 0] Fatal Assertion 28559
      2015-03-11T13:18:44.506+0000 I -        [repl writer worker 13] Fatal Assertion 28559
      2015-03-11T13:18:44.506+0000 I -        [repl writer worker 14] Fatal Assertion 28559
      2015-03-11T13:18:44.506+0000 I -        [repl writer worker 5] Fatal Assertion 28559
      2015-03-11T13:18:44.506+0000 I -        [repl writer worker 8] Fatal Assertion 28559
      2015-03-11T13:18:44.507+0000 I -        [repl writer worker 7] Fatal Assertion 28559
      2015-03-11T13:18:44.591+0000 I CONTROL  [repl writer worker 0]
       0xf5b229 0xefba41 0xee0de1 0xd88fd9 0xd7d6b4 0xd7d70f 0xd7d775 0xd6c725 0xa9c8c5 0xa260c0 0xa522bc 0xbe50c4 0xbe5474 0xbe5ffd 0xb37531 0xc4e655 0xca4dfe 0xca78f8 0xef2a5b 0xfa7ce4 0x7f979be16e9a 0x7f979a8ed2ed
      ----- BEGIN BACKTRACE -----
      {"backtrace":[{"b":"400000","o":"B5B229"},{"b":"400000","o":"AFBA41"},{"b":"400000","o":"AE0DE1"},{"b":"400000","o":"988FD9"},{"b":"400000","o":"97D6B4"},{"b":"400000","o":"97D70F"},{"b":"400000","o":"97D775"},{"b":"400000","o":"96C725"},{"b":"400000","o":"69C8C5"},{"b":"400000","o":"6260C0"},{"b":"400000","o":"6522BC"},{"b":"400000","o":"7E50C4"},{"b":"400000","o":"7E5474"},{"b":"400000","o":"7E5FFD"},{"b":"400000","o":"737531"},{"b":"400000","o":"84E655"},{"b":"400000","o":"8A4DFE"},{"b":"400000","o":"8A78F8"},{"b":"400000","o":"AF2A5B"},{"b":"400000","o":"BA7CE4"},{"b":"7F979BE0F000","o":"7E9A"},{"b":"7F979A7F9000","o":"F42ED"}],"processInfo":{ "mongodbVersion" : "3.0.0", "gitVersion" : "a841fd6394365954886924a35076691b4d149168", "uname" : { "sysname" : "Linux", "release" : "3.2.0-75-virtual", "version" : "#110-Ubuntu SMP Tue Dec 16 19:24:01 UTC 2014", "machine" : "x86_64" }, "somap" : [ { "elfType" : 2, "b" : "400000", "buildId" : "5CFC121279316F2BC7FBB39F8268284EDF47E708" }, { "b" : "7FFFDEC64000", "elfType" : 3, "buildId" : "975615E6DDB0F811039107F3F95DB4884097BE36" }, { "b" : "7F979BE0F000", "path" : "/lib/x86_64-linux-gnu/libpthread.so.0", "elfType" : 3, "buildId" : "C340AF9DEE97C17C730F7D03693286C5194A46B8" }, { "b" : "7F979BBB1000", "path" : "/lib/x86_64-linux-gnu/libssl.so.1.0.0", "elfType" : 3, "buildId" : "E62302CCF011B0D1277B07E2B6170B678480F996" }, { "b" : "7F979B7D6000", "path" : "/lib/x86_64-linux-gnu/libcrypto.so.1.0.0", "elfType" : 3, "buildId" : "DBF958ED97EEE7EAD42EC0F24EDB2E39ACFA2B83" }, { "b" : "7F979B5CE000", "path" : "/lib/x86_64-linux-gnu/librt.so.1", "elfType" : 3, "buildId" : "352C5B373A50E6C4AB881A5DB6F5766FDF81EEE0" }, { "b" : "7F979B3CA000", "path" : "/lib/x86_64-linux-gnu/libdl.so.2", "elfType" : 3, "buildId" : "D181AF551DBBC43E9D55913D532635FDE18E7C4E" }, { "b" : "7F979B0CA000", "path" : "/usr/lib/x86_64-linux-gnu/libstdc++.so.6", "elfType" : 3, "buildId" : "B534DA725D06A04267EB2FEB92B9CC14C838B57B" }, { "b" : "7F979ADCE000", 
"path" : "/lib/x86_64-linux-gnu/libm.so.6", "elfType" : 3, "buildId" : "817AA99B3DD02501F8BC04A3E9A9358A08F20D7D" }, { "b" : "7F979ABB8000", "path" : "/lib/x86_64-linux-gnu/libgcc_s.so.1", "elfType" : 3, "buildId" : "ECF322A96E26633C5D10F18215170DD4395AF82C" }, { "b" : "7F979A7F9000", "path" : "/lib/x86_64-linux-gnu/libc.so.6", "elfType" : 3, "buildId" : "DDD4987072B438ECB112023E07E432844CF89076" }, { "b" : "7F979C02C000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, "buildId" : "E25AD1A11CCF57E734116B8EC9C69F643DCA9F18" }, { "b" : "7F979A5E2000", "path" : "/lib/x86_64-linux-gnu/libz.so.1", "elfType" : 3, "buildId" : "F695ECFCF3918D5D34989398A14B7ECDD9F46CD0" } ] }}
       mongod(_ZN5mongo15printStackTraceERSo+0x29) [0xf5b229]
       mongod(_ZN5mongo10logContextEPKc+0xE1) [0xefba41]
       mongod(_ZN5mongo13fassertFailedEi+0x61) [0xee0de1]
       mongod(_ZN5mongo17wtRCToStatus_slowEiPKc+0x309) [0xd88fd9]
       mongod(_ZN5mongo22WiredTigerRecoveryUnit8_txnOpenEPNS_16OperationContextE+0x124) [0xd7d6b4]
       mongod(_ZN5mongo22WiredTigerRecoveryUnit10getSessionEPNS_16OperationContextE+0x1F) [0xd7d70f]
       mongod(_ZN5mongo16WiredTigerCursorC1ERKSsmbPNS_16OperationContextE+0x35) [0xd7d775]
       mongod(_ZNK5mongo21WiredTigerIndexUnique9newCursorEPNS_16OperationContextEi+0x55) [0xd6c725]
       mongod(_ZNK5mongo22BtreeBasedAccessMethod10findSingleEPNS_16OperationContextERKNS_7BSONObjE+0x25) [0xa9c8c5]
       mongod(_ZN5mongo11IDHackStage4workEPm+0xD0) [0xa260c0]
       mongod(_ZN5mongo11UpdateStage4workEPm+0x7C) [0xa522bc]
       mongod(_ZN5mongo12PlanExecutor18getNextSnapshottedEPNS_11SnapshottedINS_7BSONObjEEEPNS_8RecordIdE+0xA4) [0xbe50c4]
       mongod(_ZN5mongo12PlanExecutor7getNextEPNS_7BSONObjEPNS_8RecordIdE+0x34) [0xbe5474]
       mongod(_ZN5mongo12PlanExecutor11executePlanEv+0x3D) [0xbe5ffd]
       mongod(_ZN5mongo6updateEPNS_16OperationContextEPNS_8DatabaseERKNS_13UpdateRequestEPNS_7OpDebugE+0x111) [0xb37531]
       mongod(_ZN5mongo4repl21applyOperation_inlockEPNS_16OperationContextEPNS_8DatabaseERKNS_7BSONObjEbb+0x12B5) [0xc4e655]
       mongod(_ZN5mongo4repl8SyncTail9syncApplyEPNS_16OperationContextERKNS_7BSONObjEb+0x2EE) [0xca4dfe]
       mongod(_ZN5mongo4repl21multiInitialSyncApplyERKSt6vectorINS_7BSONObjESaIS2_EEPNS0_8SyncTailE+0x78) [0xca78f8]
       mongod(_ZN5mongo10threadpool6Worker4loopERKSs+0x2FB) [0xef2a5b]
       mongod(+0xBA7CE4) [0xfa7ce4]
       libpthread.so.0(+0x7E9A) [0x7f979be16e9a]
       libc.so.6(clone+0x6D) [0x7f979a8ed2ed]
      -----  END BACKTRACE  -----
      2015-03-11T13:18:44.591+0000 I CONTROL  [repl writer worker 6]
       0xf5b229 0xefba41 0xee0de1 0xd88fd9 0xd7d6b4 0xd7d70f 0xd7d775 0xd6c725 0xa9c8c5 0xa260c0 0xa522bc 0xbe50c4 0xbe5474 0xbe5ffd 0xb37531 0xc4e655 0xca4dfe 0xca78f8 0xef2a5b 0xfa7ce4 0x7f979be16e9a 0x7f979a8ed2ed
      ----- BEGIN BACKTRACE -----
      {"backtrace":[{"b":"400000","o":"B5B229"},{"b":"400000","o":"AFBA41"},{"b":"400000","o":"AE0DE1"},{"b":"400000","o":"988FD9"},{"b":"400000","o":"97D6B4"},{"b":"400000","o":"97D70F"},{"b":"400000","o":"97D775"},{"b":"400000","o":"96C725"},{"b":"400000","o":"69C8C5"},{"b":"400000","o":"6260C0"},{"b":"400000","o":"6522BC"},{"b":"400000","o":"7E50C4"},{"b":"400000","o":"7E5474"},{"b":"400000","o":"7E5FFD"},{"b":"400000","o":"737531"},{"b":"400000","o":"84E655"},{"b":"400000","o":"8A4DFE"},{"b":"400000","o":"8A78F8"},{"b":"400000","o":"AF2A5B"},{"b":"400000","o":"BA7CE4"},{"b":"7F979BE0F000","o":"7E9A"},{"b":"7F979A7F9000","o":"F42ED"}],"processInfo":{ "mongodbVersion" : "3.0.0", "gitVersion" : "a841fd6394365954886924a35076691b4d149168", "uname" : { "sysname" : "Linux", "release" : "3.2.0-75-virtual", "version" : "#110-Ubuntu SMP Tue Dec 16 19:24:01 UTC 2014", "machine" : "x86_64" }, "somap" : [ { "elfType" : 2, "b" : "400000", "buildId" : "5CFC121279316F2BC7FBB39F8268284EDF47E708" }, { "b" : "7FFFDEC64000", "elfType" : 3, "buildId" : "975615E6DDB0F811039107F3F95DB4884097BE36" }, { "b" : "7F979BE0F000", "path" : "/lib/x86_64-linux-gnu/libpthread.so.0", "elfType" : 3, "buildId" : "C340AF9DEE97C17C730F7D03693286C5194A46B8" }, { "b" : "7F979BBB1000", "path" : "/lib/x86_64-linux-gnu/libssl.so.1.0.0", "elfType" : 3, "buildId" : "E62302CCF011B0D1277B07E2B6170B678480F996" }, { "b" : "7F979B7D6000", "path" : "/lib/x86_64-linux-gnu/libcrypto.so.1.0.0", "elfType" : 3, "buildId" : "DBF958ED97EEE7EAD42EC0F24EDB2E39ACFA2B83" }, { "b" : "7F979B5CE000", "path" : "/lib/x86_64-linux-gnu/librt.so.1", "elfType" : 3, "buildId" : "352C5B373A50E6C4AB881A5DB6F5766FDF81EEE0" }, { "b" : "7F979B3CA000", "path" : "/lib/x86_64-linux-gnu/libdl.so.2", "elfType" : 3, "buildId" : "D181AF551DBBC43E9D55913D532635FDE18E7C4E" }, { "b" : "7F979B0CA000", "path" : "/usr/lib/x86_64-linux-gnu/libstdc++.so.6", "elfType" : 3, "buildId" : "B534DA725D06A04267EB2FEB92B9CC14C838B57B" }, { "b" : "7F979ADCE000", 
"path" : "/lib/x86_64-linux-gnu/libm.so.6", "elfType" : 3, "buildId" : "817AA99B3DD02501F8BC04A3E9A9358A08F20D7D" }, { "b" : "7F979ABB8000", "path" : "/lib/x86_64-linux-gnu/libgcc_s.so.1", "elfType" : 3, "buildId" : "ECF322A96E26633C5D10F18215170DD4395AF82C" }, { "b" : "7F979A7F9000", "path" : "/lib/x86_64-linux-gnu/libc.so.6", "elfType" : 3, "buildId" : "DDD4987072B438ECB112023E07E432844CF89076" }, { "b" : "7F979C02C000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, "buildId" : "E25AD1A11CCF57E734116B8EC9C69F643DCA9F18" }, { "b" : "7F979A5E2000", "path" : "/lib/x86_64-linux-gnu/libz.so.1", "elfType" : 3, "buildId" : "F695ECFCF3918D5D34989398A14B7ECDD9F46CD0" } ] }}
       mongod(_ZN5mongo15printStackTraceERSo+0x29) [0xf5b229]
       mongod(_ZN5mongo10logContextEPKc+0xE1) [0xefba41]
       mongod(_ZN5mongo13fassertFailedEi+0x61) [0xee0de1]
       mongod(_ZN5mongo17wtRCToStatus_slowEiPKc+0x309) [0xd88fd9]
       mongod(_ZN5mongo22WiredTigerRecoveryUnit8_txnOpenEPNS_16OperationContextE+0x124) [0xd7d6b4]
       mongod(_ZN5mongo22WiredTigerRecoveryUnit10getSessionEPNS_16OperationContextE+0x1F) [0xd7d70f]
       mongod(_ZN5mongo16WiredTigerCursorC1ERKSsmbPNS_16OperationContextE+0x35) [0xd7d775]
       mongod(_ZNK5mongo21WiredTigerIndexUnique9newCursorEPNS_16OperationContextEi+0x55) [0xd6c725]
       mongod(_ZNK5mongo22BtreeBasedAccessMethod10findSingleEPNS_16OperationContextERKNS_7BSONObjE+0x25) [0xa9c8c5]
       mongod(_ZN5mongo11IDHackStage4workEPm+0xD0) [0xa260c0]
       mongod(_ZN5mongo11UpdateStage4workEPm+0x7C) [0xa522bc]
       mongod(_ZN5mongo12PlanExecutor18getNextSnapshottedEPNS_11SnapshottedINS_7BSONObjEEEPNS_8RecordIdE+0xA4) [0xbe50c4]
       mongod(_ZN5mongo12PlanExecutor7getNextEPNS_7BSONObjEPNS_8RecordIdE+0x34) [0xbe5474]
       mongod(_ZN5mongo12PlanExecutor11executePlanEv+0x3D) [0xbe5ffd]
       mongod(_ZN5mongo6updateEPNS_16OperationContextEPNS_8DatabaseERKNS_13UpdateRequestEPNS_7OpDebugE+0x111) [0xb37531]
       mongod(_ZN5mongo4repl21applyOperation_inlockEPNS_16OperationContextEPNS_8DatabaseERKNS_7BSONObjEbb+0x12B5) [0xc4e655]
       mongod(_ZN5mongo4repl8SyncTail9syncApplyEPNS_16OperationContextERKNS_7BSONObjEb+0x2EE) [0xca4dfe]
       mongod(_ZN5mongo4repl21multiInitialSyncApplyERKSt6vectorINS_7BSONObjESaIS2_EEPNS0_8SyncTailE+0x78) [0xca78f8]
       mongod(_ZN5mongo10threadpool6Worker4loopERKSs+0x2FB) [0xef2a5b]
       mongod(+0xBA7CE4) [0xfa7ce4]
       libpthread.so.0(+0x7E9A) [0x7f979be16e9a]
       libc.so.6(clone+0x6D) [0x7f979a8ed2ed]
      -----  END BACKTRACE  -----
      2015-03-11T13:18:44.591+0000 I -        [repl writer worker 0]
       
      ***aborting after fassert() failure
       
       
      2015-03-11T13:18:44.591+0000 I -        [repl writer worker 6]
       
      ***aborting after fassert() failure
      

      The database size on the 3.0.0 machine was ~50GB at this point. The master has ~220GB of data, but it's hard to know how far it had synced as it's using Snappy compression on the new node.

          Activity

          michael.cahill Michael Cahill added a comment -

          Hi Markus Svensson, the new error is:

          2015-03-13T19:00:34.081+0300 E STORAGE  [initandlisten] WiredTiger (24) [1426262434:81069][8903:0x7fbbeb66dc40], file:collection-6005-900971221894503514.wt, session.open_cursor: /data/collection-6005-900971221894503514.wt: Too many open files
          

          This indicates that the hazard pointer workaround was effective, and that you are now hitting the system limit on the number of open files. With WiredTiger, you need to configure enough file handles for all collections and indices. See http://docs.mongodb.org/manual/reference/ulimit/ for more details.
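
A rough way to compare the open-files limit with the number of data files is sketched below. It assumes one file per collection plus one per index (consistent with the `collection-*.wt` and `index-*.wt` names in the logs), a Unix-like system where Python's `resource` module is available, and a hypothetical `reserve` of 1000 handles for connections and other files. The function names and sample index counts are illustrative only.

```python
import resource


def required_files(index_counts):
    """Estimate data files: one per collection plus one per index."""
    return sum(1 + n for n in index_counts.values())


def has_enough_handles(required, soft_limit, reserve=1000):
    """True if the soft open-files limit covers the data files plus a reserve."""
    if soft_limit == resource.RLIM_INFINITY:
        return True
    return soft_limit >= required + reserve


# Hypothetical deployment: collection name -> number of indexes (including _id).
index_counts = {"users": 3, "events": 2}
needed = required_files(index_counts)

# Limit for the current process; mongod's own limit depends on how it is started.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print("files needed:", needed, "soft limit:", soft,
      "ok:", has_enough_handles(needed, soft))
```

The check should be run as the user and in the environment that start mongod, since limits are inherited per process.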

          michael.cahill Michael Cahill added a comment -

          A fix is in review here: https://github.com/wiredtiger/wiredtiger/pull/1761

          michael.cahill Michael Cahill added a comment -

          Resolved with latest drop from WT.

          nbarnes Noah added a comment - edited

          I am on Mongo 3.0.7 (Ubuntu 14.04.3 LTS) with a sharded cluster and I just got this error:

          2016-04-06T10:49:34.232-0700 I NETWORK  [initandlisten] connection accepted from 10.84.1.41:38827 #19282 (38 connections now open)
          2016-04-06T10:49:35.348-0700 E STORAGE  [conn19282] WiredTiger (0) [1459964975:348715][1115:0x7f1e58f19700], file:collection-15--275277209615762107.wt, cursor.search: session 0x2acf440: hazard pointer table full
          2016-04-06T10:49:35.348-0700 I -        [conn19282] Invariant failure: ret resulted in status UnknownError 12: Cannot allocate memory at src/mongo/db/storage/wiredtiger/wiredtiger_record_store.cpp 326
          2016-04-06T10:49:35.403-0700 I CONTROL  [conn19282]
           0xf5c3e9 0xefb231 0xee19da 0xd7469d 0x910e00 0xa014ab 0xa106d5 0xa012bd 0xa23885 0xa1abde 0xa126b4 0xa12fa7 0xbcf722 0xbcfd8c 0xbd00bf 0xba1a7f 0xb9d526 0xab38e0 0x80e38d 0xf0f29b 0x7f1e791ca182 0x7f1e77c9147d
          ----- BEGIN BACKTRACE -----
          {"backtrace":[{"b":"400000","o":"B5C3E9"},{"b":"400000","o":"AFB231"},{"b":"400000","o":"AE19DA"},{"b":"400000","o":"97469D"},{"b":"400000","o":"510E00"},{"b":"400000","o":"6014AB"},{"b":"400000","o":"6106D5"},{"b":"400000","o":"6012BD"},{"b":"400000","o":"623885"},{"b":"400000","o":"61ABDE"},{"b":"400000","o":"6126B4"},{"b":"400000","o":"612FA7"},{"b":"400000","o":"7CF722"},{"b":"400000","o":"7CFD8C"},{"b":"400000","o":"7D00BF"},{"b":"400000","o":"7A1A7F"},{"b":"400000","o":"79D526"},{"b":"400000","o":"6B38E0"},{"b":"400000","o":"40E38D"},{"b":"400000","o":"B0F29B"},{"b":"7F1E791C2000","o":"8182"},{"b":"7F1E77B97000","o":"FA47D"}],"processInfo":{ "mongodbVersion" : "3.0.7", "gitVersion" : "6ce7cbe8c6b899552dadd907604559806aa2e9bd", "uname" : { "sysname" : "Linux", "release" : "3.16.0-30-generic", "version" : "#40~14.04.1-Ubuntu SMP Thu Jan 15 17:43:14 UTC 2015", "machine" : "x86_64" }, "somap" : [ { "elfType" : 2, "b" : "400000", "buildId" : "03104A2519911F189AABD1EBC9931E8EBE9AD44C" }, { "b" : "7FFF243F4000", "elfType" : 3, "buildId" : "C8BA9F3BA421CFBAE75F7E57F357B1B5431DE838" }, { "b" : "7F1E791C2000", "path" : "/lib/x86_64-linux-gnu/libpthread.so.0", "elfType" : 3, "buildId" : "9318E8AF0BFBE444731BB0461202EF57F7C39542" }, { "b" : "7F1E78F63000", "path" : "/lib/x86_64-linux-gnu/libssl.so.1.0.0", "elfType" : 3, "buildId" : "A20EFFEC993A8441FA17F2079F923CBD04079E19" }, { "b" : "7F1E78B88000", "path" : "/lib/x86_64-linux-gnu/libcrypto.so.1.0.0", "elfType" : 3, "buildId" : "F000D29917E9B6E94A35A8F02E5C62846E5916BC" }, { "b" : "7F1E78980000", "path" : "/lib/x86_64-linux-gnu/librt.so.1", "elfType" : 3, "buildId" : "92FCF41EFE012D6186E31A59AD05BDBB487769AB" }, { "b" : "7F1E7877C000", "path" : "/lib/x86_64-linux-gnu/libdl.so.2", "elfType" : 3, "buildId" : "C1AE4CB7195D337A77A3C689051DABAA3980CA0C" }, { "b" : "7F1E78478000", "path" : "/usr/lib/x86_64-linux-gnu/libstdc++.so.6", "elfType" : 3, "buildId" : "4BF6F7ADD8244AD86008E6BF40D90F8873892197" }, { "b" : 
"7F1E78172000", "path" : "/lib/x86_64-linux-gnu/libm.so.6", "elfType" : 3, "buildId" : "1D76B71E905CB867B27CEF230FCB20F01A3178F5" }, { "b" : "7F1E77F5C000", "path" : "/lib/x86_64-linux-gnu/libgcc_s.so.1", "elfType" : 3, "buildId" : "8D0AA71411580EE6C08809695C3984769F25725B" }, { "b" : "7F1E77B97000", "path" : "/lib/x86_64-linux-gnu/libc.so.6", "elfType" : 3, "buildId" : "30C94DC66A1FE95180C3D68D2B89E576D5AE213C" }, { "b" : "7F1E793E0000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, "buildId" : "9F00581AB3C73E3AEA35995A0C50D24D59A01D47" } ] }}
           mongod(_ZN5mongo15printStackTraceERSo+0x29) [0xf5c3e9]
           mongod(_ZN5mongo10logContextEPKc+0xE1) [0xefb231]
           mongod(_ZN5mongo17invariantOKFailedEPKcRKNS_6StatusES1_j+0xDA) [0xee19da]
           mongod(_ZNK5mongo21WiredTigerRecordStore7dataForEPNS_16OperationContextERKNS_8RecordIdE+0x12D) [0xd7469d]
           mongod(_ZNK5mongo10Collection6docForEPNS_16OperationContextERKNS_8RecordIdE+0x20) [0x910e00]
           mongod(_ZN5mongo10FetchStage4workEPm+0x2BB) [0xa014ab]
           mongod(_ZN5mongo14MergeSortStage4workEPm+0x75) [0xa106d5]
           mongod(_ZN5mongo10FetchStage4workEPm+0xCD) [0xa012bd]
           mongod(_ZN5mongo16ShardFilterStage4workEPm+0x55) [0xa23885]
           mongod(_ZN5mongo15ProjectionStage4workEPm+0x4E) [0xa1abde]
           mongod(_ZN5mongo14MultiPlanStage12workAllPlansEmPNS_15PlanYieldPolicyE+0xE4) [0xa126b4]
           mongod(_ZN5mongo14MultiPlanStage12pickBestPlanEPNS_15PlanYieldPolicyE+0xC7) [0xa12fa7]
           mongod(_ZN5mongo12PlanExecutor12pickBestPlanENS0_11YieldPolicyE+0x72) [0xbcf722]
           mongod(_ZN5mongo12PlanExecutor4makeEPNS_16OperationContextEPNS_10WorkingSetEPNS_9PlanStageEPNS_13QuerySolutionEPNS_14CanonicalQueryEPKNS_10CollectionERKSsNS0_11YieldPolicyEPPS0_+0x7C) [0xbcfd8c]
           mongod(_ZN5mongo12PlanExecutor4makeEPNS_16OperationContextEPNS_10WorkingSetEPNS_9PlanStageEPNS_13QuerySolutionEPNS_14CanonicalQueryEPKNS_10CollectionENS0_11YieldPolicyEPPS0_+0x7F) [0xbd00bf]
           mongod(_ZN5mongo11getExecutorEPNS_16OperationContextEPNS_10CollectionEPNS_14CanonicalQueryENS_12PlanExecutor11YieldPolicyEPPS6_m+0xCF) [0xba1a7f]
           mongod(_ZN5mongo8runQueryEPNS_16OperationContextERNS_7MessageERNS_12QueryMessageERKNS_15NamespaceStringERNS_5CurOpES3_+0x666) [0xb9d526]
           mongod(_ZN5mongo16assembleResponseEPNS_16OperationContextERNS_7MessageERNS_10DbResponseERKNS_11HostAndPortE+0xB10) [0xab38e0]
           mongod(_ZN5mongo16MyMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortEPNS_9LastErrorE+0xDD) [0x80e38d]
           mongod(_ZN5mongo17PortMessageServer17handleIncomingMsgEPv+0x34B) [0xf0f29b]
           libpthread.so.0(+0x8182) [0x7f1e791ca182]
           libc.so.6(clone+0x6D) [0x7f1e77c9147d]
          -----  END BACKTRACE  -----
          2016-04-06T10:49:35.403-0700 I -        [conn19282]
           
          ***aborting after invariant() failure
          

          michael.cahill Michael Cahill added a comment -

          Noah, please open a new ticket with details of your MongoDB deployment. If possible, please also try upgrading to the latest 3.0.x release (currently 3.0.11) to see whether it resolves the issue.
