Core Server / SERVER-37917

Even with unlimited NOFILE and NPROC WT crashes with UnknownError: 24: Too many open

    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • None
    • Affects Version/s: None
    • Component/s: Stability, WiredTiger
    • Labels: None
    • ALL

      Using https://docs.mongodb.com/manual/reference/ulimit/#linux-distributions-using-systemd as a baseline, our MongoDB servers suddenly started crashing with "Too many open files at src/mongo/db/storage/wiredtiger/wiredtiger_session_cache.cpp".
      Raising the limits to unlimited/infinity did not help.
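
A configured value is not always the value the process ends up with: /etc/security/limits.d is read by pam_limits for login sessions and is not applied to services started by systemd, and `systemctl show` reports the unit's setting, which may differ from what the kernel enforces. A minimal sketch for reading the effective limit and the current descriptor count from /proc follows. The function names are hypothetical, and the usage comment assumes a single process whose name is exactly `mongod`.

```shell
# Print the soft "Max open files" value from a /proc/<pid>/limits-style file.
# The path is a parameter so the function can also be run on a saved copy.
soft_nofile_limit() {
    awk '/^Max open files/ { print $4 }' "$1"
}

# Count the entries in a /proc/<pid>/fd-style directory (one per open descriptor).
open_fd_count() {
    find "$1" -mindepth 1 -maxdepth 1 | wc -l | tr -d ' '
}

# Usage against a running server (assumption: exactly one process named mongod;
# reading /proc/<pid>/fd needs root or the mongod user):
#   pid=$(pgrep -x mongod)
#   soft_nofile_limit "/proc/$pid/limits"
#   open_fd_count "/proc/$pid/fd"
```

If the value printed by `soft_nofile_limit` is lower than what the unit file requests, the limit was not applied as configured, and the unit settings are the place to look first.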

      $ stat -c "%y %n" /etc/security/limits.d/mongo.conf
      2018-11-04 01:39:16.458031873 +0100 /etc/security/limits.d/mongo.conf
      $ cat /etc/security/limits.d/mongo.conf
      mongod           hard    nproc           unlimited
      mongod           soft    nproc           unlimited
      mongod           soft    nofile          unlimited
      mongod           hard    nofile          unlimited
      mongod           hard    memlock         unlimited
      mongod           soft    memlock         unlimited
      $ stat -c "%y %n" /etc/systemd/system/mongo.service
      2018-11-03 23:11:25.526044456 +0100 /etc/systemd/system/mongo.service
      $ grep -Ei 'proc|file' /etc/systemd/system/mongo.service
      PIDFile=/var/run/mongodb/mongo.pid
      LimitNOFILE=infinity
      LimitNPROC=infinity
      $ uptime
       01:59:20 up 12 min,  1 user,  load average: 0.74, 0.63, 0.42
      $ df -h | grep mongo
      /dev/mapper/sysvg-lvmongojournal   32G  334M   32G   2% /mongojournal
      /dev/mapper/datavg-lvdata         2.8T  133G  2.6T   5% /var/lib/mongo
      [tabd@mongo13 ~]$ systemctl --no-pager show mongo.service | egrep 'NOFILE|NPROC'
      LimitNOFILE=18446744073709551615
      LimitNPROC=18446744073709551615
      
      $ uname -r && cat /etc/redhat-release
      3.10.0-957.el7.x86_64
      Red Hat Enterprise Linux Server release 7.6 (Maipo)
      $ rpm -qa | grep mongo
      mongodb-org-server-3.4.15-1.el7.x86_64
      mongodb-org-shell-3.4.15-1.el7.x86_64
      mongodb-org-tools-3.4.15-1.el7.x86_64
      mongodb-org-3.4.15-1.el7.x86_64
      python-pymongo-2.5.2-4.el7.x86_64
      mongodb-org-mongos-3.4.15-1.el7.x86_64
      Nov  4 01:55:09 mongo13 mongod.37021[15996]: [initandlisten] Placing a marker at optime Nov  3 16:50:06:195b8
      Nov  4 01:55:09 mongo13 mongod.37021[15996]: [initandlisten] Placing a marker at optime Nov  3 16:51:43:1be
      Nov  4 01:56:02 mongo13 mongod.37021[15996]: [initandlisten] WiredTiger error (24) [1541292962:666827][15996:0x7ff57968ee40], file:collection-32772-8549624332636919381.wt, WT_SESSION.open_cursor: /var/lib/mongo/data/game/prod/collection-32772-8549624332636919381.wt: handle-open: open: Too many open files
      Nov  4 01:56:02 mongo13 mongod.37021[15996]: [initandlisten] Invariant failure: ret resulted in status UnknownError: 24: Too many open files at src/mongo/db/storage/wiredtiger/wiredtiger_session_cache.cpp 113
      Nov  4 01:56:02 mongo13 mongod.37021[15996]: [initandlisten] #012#012***aborting after invariant() failure#012#012
      Nov  4 01:56:02 mongo13 mongod.37021[15996]: [initandlisten] Got signal: 6 (Aborted).#012#012 0x557a29793611 0x557a29792829 0x557a29792d0d 0x7ff57827c5d0 0x7ff577ed6207 0x7ff577ed78f8 0x557a28a323eb 0x557a2949abe0 0x557a294988dc 0x557a29494df4 0x557a2949341c 0x557a29480a2b 0x557a293c16c3 0x557a293c64b4 0x557a2947e56a 0x557a293704e7 0x557a28a1e94c 0x557a28a3e4ab 0x7ff577ec23d5 0x557a28a9d84f#012----- BEGIN BACKTRACE -----#012{"backtrace":[{"b":"557A28216000","o":"157D611","s":"_ZN5mongo15printStackTraceERSo"},{"b":"557A28216000","o":"157C829"},{"b":"557A28216000","o":"157CD0D"},{"b":"7FF57826D000","o":"F5D0"},{"b":"7FF577EA0000","o":"36207","s":"gsignal"},{"b":"7FF577EA0000","o":"378F8","s":"abort"},{"b":"557A28216000","o":"81C3EB","s":"_ZN5mongo25fassertFailedWithLocationEiPKcj"},{"b":"557A28216000","o":"1284BE0","s":"_ZN5mongo17WiredTigerSession9getCursorERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEmb"},{"b":"557A28216000","o":"12828DC","s":"_ZN5mongo16WiredTigerCursorC1ERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEmbPNS_16OperationContextE"},{"b":"557A28216000","o":"127EDF4","s":"_ZN5mongo21WiredTigerRecordStore6CursorC1EPNS_16OperationContextERKS0_b"},{"b":"557A28216000","o":"127D41C","s":"_ZN5mongo21WiredTigerRecordStoreC1EPNS_16OperationContextENS_10StringDataES3_NSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEbbllPNS_14CappedCallbackEPNS_20WiredTigerSizeStorerE"},{"b":"557A28216000","o":"126AA2B","s":"_ZN5mongo18WiredTigerKVEngine14getRecordStoreEPNS_16OperationContextENS_10StringDataES3_RKNS_17CollectionOptionsE"},{"b":"557A28216000","o":"11AB6C3","s":"_ZN5mongo22KVDatabaseCatalogEntry14initCollectionEPNS_16OperationContextERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEb"},{"b":"557A28216000","o":"11B04B4","s":"_ZN5mongo15KVStorageEngineC1EPNS_8KVEngineERKNS_22KVStorageEngineOptionsE"},{"b":"557A28216000","o":"126856A"},{"b":"557A28216000","o":"115A4E7","s":"_ZN5mongo20ServiceContextMongoD29initializeGlobalStorageEngineEv"
},{"b":"557A28216000","o":"80894C"},{"b":"557A28216000","o":"8284AB","s":"main"},{"b":"7FF577EA0000","o":"223D5","s":"__libc_start_main"},{"b":"557A28216000","o":"88784F"}],"processInfo":{ "mongodbVersion" : "3.4.15", "gitVersion" : "52e5b5fbaa3a2a5b1a217f5e647b5061817475f9", "compiledModules" : [], "uname" : { "sysname" : "Linux", "release" : "3.10.0-957.el7.x86_64", "version" : "#1 SMP Thu Oct 4 20:48:51 UTC 2018", "machine" : "x86_64" }, "somap" : [ { "b" : "557A28216000", "elfType" : 3, "buildId" : "B31EB3B086220F78FD19DFCBD52942FA9F35AC9D" }, { "b" : "7FFE220F4000", "elfType" : 3, "buildId" : "163C2DC43405427478788BAD0AFD537A7ACF7A13" }, { "b" : "7FF57920E000", "path" : "/lib64/libssl.so.10", "elfType" : 3, "buildId" : "AEF5E6F2240B55F90E9DF76CFBB8B9D9F5286583" }, { "b" : "7FF578DAD000", "path" : "/lib64/libcrypto.so.10", "elfType" : 3, "buildId" : "8BD89856B64DD5189BF075EF574EDF203F93D44A" }, { "b" : "7FF578BA5000", "path" : "/lib64/librt.so.1", "elfType" : 3, "buildId" : "EFDE2029C9A4A20BE5B8D8AE7E6551FF9B5755D2" }, { "b" : "7FF5789A1000", "path" : "/lib64/libdl.so.2", "elfType" : 3, "buildId" : "67AD3498AC7DE3EAB952A243094DF5C12A21CD7D" }, { "b" : "7FF57869F000", "path" : "/lib64/libm.so.6", "elfType" : 3, "buildId" : "918D3696BF321AA8D32950AB2AB8D0F1B21AC907" }, { "b" : "7FF578489000", "path" : "/lib64/libgcc_s.so.1", "elfType" : 3, "buildId" : "179F202998E429AA1215907F6D4C5C1BB9C90136" }, { "b" : "7FF57826D000", "path" : "/lib64/libpthread.so.0", "elfType" : 3, "buildId" : "3D9441083D079DC2977F1BD50C8068D11767232D" }, { "b" : "7FF577EA0000", "path" : "/lib64/libc.so.6", "elfType" : 3, "buildId" : "8A05388886E079A56950262D27F1894688A015D3" }, { "b" : "7FF579480000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, "buildId" : "5DA2D47925497B2F5875A7D8D1799A1227E2FDE4" }, { "b" : "7FF577C53000", "path" : "/lib64/libgssapi_krb5.so.2", "elfType" : 3, "buildId" : "B5C83BDE7ED7026835B779FA0F957FCCCD599F40" }, { "b" : "7FF57796A000", "path" : 
"/lib64/libkrb5.so.3", "elfType" : 3, "buildId" : "8B63976509135BA73A12153D6FDF7B3B9E5D2A54" }, { "b" : "7FF577766000", "path" : "/lib64/libcom_err.so.2", "elfType" : 3, "buildId" : "C77BC26CE4D420861BAEBCC075C418BD9311BB5C" }, { "b" : "7FF57754B000", "path" : "/lib64/libk5crypto.so.3", "elfType" : 3, "buildId" : "6183129B5F29CA14580E517DF94EF317761FA6C9" }, { "b" : "7FF577335000", "path" : "/lib64/libz.so.1", "elfType" : 3, "buildId" : "B9D5F73428BD6AD68C96986B57BEA3B7CEDB9745" }, { "b" : "7FF577126000", "path" : "/lib64/libkrb5support.so.0", "elfType" : 3, "buildId" : "98F619035053EF68358099CE7CF1AA528B3B229D" }, { "b" : "7FF576F22000", "path" : "/lib64/libkeyutils.so.1", "elfType" : 3, "buildId" : "8CA73C16CFEB9A8B5660015B9223B09F87041CAD" }, { "b" : "7FF576D09000", "path" : "/lib64/libresolv.so.2", "elfType" : 3, "buildId" : "4C488F6E7044BB966162C1F7081ABBA6EBB2B485" }, { "b" : "7FF576AE2000", "path" : "/lib64/libselinux.so.1", "elfType" : 3, "buildId" : "D2DD4DA3FDE1477D25BFFF80F3A25FDB541A8179" }, { "b" : "7FF576880000", "path" : "/lib64/libpcre.so.1", "elfType" : 3, "buildId" : "F5B144F9F5D9BE451C80211B34DB2CE348E039B6" } ] }}#012 mongod(_ZN5mongo15printStackTraceERSo+0x41) [0x557a29793611]#012 mongod(+0x157C829) [0x557a29792829]#012 mongod(+0x157CD0D) [0x557a29792d0d]#012 libpthread.so.0(+0xF5D0) [0x7ff57827c5d0]#012 libc.so.6(gsignal+0x37) [0x7ff577ed6207]#012 libc.so.6(abort+0x148) [0x7ff577ed78f8]#012 mongod(_ZN5mongo25fassertFailedWithLocationEiPKcj+0x0) [0x557a28a323eb]#012 mongod(_ZN5mongo17WiredTigerSession9getCursorERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEmb+0x100) [0x557a2949abe0]#012 mongod(_ZN5mongo16WiredTigerCursorC1ERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEmbPNS_16OperationContextE+0x4C) [0x557a294988dc]#012 mongod(_ZN5mongo21WiredTigerRecordStore6CursorC1EPNS_16OperationContextERKS0_b+0x64) [0x557a29494df4]#012 
mongod(_ZN5mongo21WiredTigerRecordStoreC1EPNS_16OperationContextENS_10StringDataES3_NSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEbbllPNS_14CappedCallbackEPNS_20WiredTigerSizeStorerE+0x47C) [0x557a2949341c]#012 mongod(_ZN5mongo18WiredTigerKVEngine14getRecordStoreEPNS_16OperationContextENS_10StringDataES3_RKNS_17CollectionOptionsE+0x23B) [0x557a29480a2b]#012 mongod(_ZN5mongo22KVDatabaseCatalogEntry14initCollectionEPNS_16OperationContextERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEb+0x2C3) [0x557a293c16c3]#012 mongod(_ZN5mongo15KVStorageEngineC1EPNS_8KVEngineERKNS_22KVStorageEngineOptionsE+0xAB4) [0x557a293c64b4]#012 mongod(+0x126856A) [0x557a2947e56a]#012 mongod(_ZN5mongo20ServiceContextMongoD29initializeGlobalStorageEngineEv+0x697) [0x557a293704e7]#012 mongod(+0x80894C) [0x557a28a1e94c]#012 mongod(main+0x96B) [0x557a28a3e4ab]#012 libc.so.6(__libc_start_main+0xF5) [0x7ff577ec23d5]#012 mongod(+0x88784F) [0x557a28a9d84f]#012-----  END BACKTRACE  -----
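
The backtrace shows the abort during startup, while the storage engine initialises collections (KVStorageEngine constructor, initCollection, getRecordStore, WiredTigerRecordStore constructor), so the number of descriptors needed at that point plausibly grows with the number of data files. A rough sketch for comparing the count of `.wt` files against a limit value follows. The function names are hypothetical, and the path in the usage comment is taken from the log line above and may not be the configured dbPath.

```shell
# Count WiredTiger data files (*.wt) anywhere below the given directory.
wt_file_count() {
    find "$1" -type f -name '*.wt' | wc -l | tr -d ' '
}

# Succeed when limit ($1) is at least needed ($2).
# The literal string "unlimited" is treated as sufficient.
limit_covers() {
    [ "$1" = "unlimited" ] || [ "$1" -ge "$2" ]
}

# Usage (assumption: data files live under /var/lib/mongo/data, and the
# limit value is read from /proc/<pid>/limits of the running mongod):
#   files=$(wt_file_count /var/lib/mongo/data)
#   limit_covers 64000 "$files" || echo "limit is below the number of .wt files"
```

This only gives a lower bound on what the server needs, since sockets, journal files and other descriptors come on top of the data files.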
      

            Assignee: daniel.hatcher@mongodb.com Danny Hatcher (Inactive)
            Reporter: konstruktoid Thomas Sjögren
            Votes: 0
            Watchers: 7
