Too many open files using ulimit 10000


    • Type: Bug
    • Resolution: Cannot Reproduce
    • Priority: Major - P3
    • Affects Version/s: 1.8.1
    • Component/s: Stability
    • Linux
    • 3

      While migrating servers to RAID 5, I ran rs.stepDown() on amdbm023 and put amdbm024 into full recovery mode.
      After a few seconds, amdbm023 shut itself down with a "Too many open files" error. I have ulimit set to 10000, with 30 mongos instances and 100 connections on each mongos.
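
For anyone triaging this, a rough descriptor budget may help put the numbers above in context. The sketch below is only an estimate under stated assumptions: it assumes that all 30 x 100 mongos connections land on this one mongod, and that each of them may at the same moment hold one outgoing socket to each of the three config servers (the log suggests this but does not confirm it). The helper names `estimate_fd_demand` and `nofile_limits` are made up for illustration.

```python
import resource


def estimate_fd_demand(mongos_count, conns_per_mongos, config_servers=3, other_fds=0):
    """Rough worst-case count of descriptors a mongod might need.

    Assumption (not confirmed by the report): every incoming connection
    may hold one outgoing socket per config server at the same time.
    """
    incoming = mongos_count * conns_per_mongos
    outgoing = incoming * config_servers
    return incoming + outgoing + other_fds


def nofile_limits():
    """Return (soft, hard) RLIMIT_NOFILE for the current process.

    This is the limit of the Python process, not of mongod. On Linux,
    the limits of a running mongod can be read from /proc/<pid>/limits.
    """
    return resource.getrlimit(resource.RLIMIT_NOFILE)


if __name__ == "__main__":
    soft, hard = nofile_limits()
    demand = estimate_fd_demand(mongos_count=30, conns_per_mongos=100)
    print(f"soft limit={soft} hard limit={hard} estimated worst-case demand={demand}")
    if soft != resource.RLIM_INFINITY and demand > soft:
        print("estimated demand exceeds the soft limit")
```

Under these assumptions the incoming connections alone need 3000 descriptors, and the config-server sockets could add up to 9000 more, which would exceed a limit of 10000 before data files are counted. The real figure depends on how many connections are active at once.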

      Sun Jun 5 02:41:38 [initandlisten] connection accepted from 10.102.33.76:54499 #32445
      Sun Jun 5 02:41:38 [conn32443] SyncClusterConnection connecting to [amdbm001:10001]
      Sun Jun 5 02:41:38 [conn30639] SyncClusterConnection connecting to [amdbm005:10001]
      Sun Jun 5 02:41:38 [conn32443] SyncClusterConnection connecting to [amdbm003:10001]
      Sun Jun 5 02:41:38 [conn32443] SyncClusterConnection connecting to [amdbm005:10001]
      Sun Jun 5 02:41:38 [conn30713] SyncClusterConnection connecting to [amdbm005:10001]
      Sun Jun 5 02:41:38 [conn31312] SyncClusterConnection connecting to [amdbm005:10001]
      Sun Jun 5 02:41:38 [conn30717] SyncClusterConnection connecting to [amdbm005:10001]
      Sun Jun 5 02:41:38 [conn31582] SyncClusterConnection connecting to [amdbm005:10001]
      Sun Jun 5 02:41:38 [conn31568] SyncClusterConnection connecting to [amdbm005:10001]
      Sun Jun 5 02:41:38 [conn32445] SyncClusterConnection connecting to [amdbm001:10001]
      Sun Jun 5 02:41:38 [conn32445] SyncClusterConnection connecting to [amdbm003:10001]
      Sun Jun 5 02:41:38 [conn32445] SyncClusterConnection connecting to [amdbm005:10001]
      Sun Jun 5 02:41:38 [conn30927] SyncClusterConnection connecting to [amdbm005:10001]
      Sun Jun 5 02:41:38 [conn31282] SyncClusterConnection connecting to [amdbm005:10001]
      Sun Jun 5 02:41:38 [initandlisten] connection accepted from 10.117.23.28:55828 #32446
      Sun Jun 5 02:41:38 [conn30475] SyncClusterConnection connecting to [amdbm005:10001]
      Sun Jun 5 02:41:38 [initandlisten] connection accepted from 10.117.14.181:45708 #32447
      Sun Jun 5 02:41:38 [conn31377] SyncClusterConnection connecting to [amdbm005:10001]
      Sun Jun 5 02:41:38 [conn32444] SyncClusterConnection connecting to [amdbm001:10001]
      Sun Jun 5 02:41:38 [initandlisten] connection accepted from 10.218.23.219:46445 #32448
      Sun Jun 5 02:41:38 [conn31583] SyncClusterConnection connecting to [amdbm005:10001]
      Sun Jun 5 02:41:38 [conn31564] SyncClusterConnection connecting to [amdbm005:10001]
      Sun Jun 5 02:41:38 [conn32444] SyncClusterConnection connecting to [amdbm003:10001]
      Sun Jun 5 02:41:38 [conn31564] ERROR: connect invalid socket errno:24 Too many open files
      Sun Jun 5 02:41:38 [conn31564] SyncClusterConnection connect fail to: amdbm005:10001 errmsg: couldn't connect to server amdbm005:10001
      Sun Jun 5 02:41:38 [conn32444] SyncClusterConnection connecting to [amdbm005:10001]
      Sun Jun 5 02:41:38 [conn32446] SyncClusterConnection connecting to [amdbm001:10001]
      Sun Jun 5 02:41:38 [conn31564] setShardVersion - relocking slow: 3000
      Sun Jun 5 02:41:38 [conn31564] query admin.$cmd ntoreturn:1 command: { setShardVersion: "viber.text", configdb: "amdbm001:10001,amdbm003:10001,amdbm005:10001", version: Timestamp 5000|0, serverID: ObjectId('4de7245e8fa7a726717fd7d0'), authoritative: true, shard: "set12", shardHost: "set12/amdbm024:10000,amdbm023:10000" } reslen:73 3000ms
      Sun Jun 5 02:41:38 [initandlisten] Listener: accept() returns -1 errno:24 Too many open files
      Sun Jun 5 02:41:38 [initandlisten] Listener: accept() returns -1 errno:24 Too many open files
      Sun Jun 5 02:41:38 [initandlisten] Listener: accept() returns -1 errno:24 Too many open files
      Sun Jun 5 02:41:38 [initandlisten] Listener: accept() returns -1 errno:24 Too many open files
      Sun Jun 5 02:41:38 [initandlisten] Listener: accept() returns -1 errno:24 Too many open files
      Sun Jun 5 02:41:38 [initandlisten] Listener: accept() returns -1 errno:24 Too many open files
      Sun Jun 5 02:41:38 [initandlisten] Listener: accept() returns -1 errno:24 Too many open files
      Sun Jun 5 02:41:38 [initandlisten] Listener: accept() returns -1 errno:24 Too many open files
      Sun Jun 5 02:41:38 [initandlisten] Listener: accept() returns -1 errno:24 Too many open files
      Sun Jun 5 02:41:38 [initandlisten] Listener: accept() returns -1 errno:24 Too many open files
      Sun Jun 5 02:41:38 [initandlisten] Listener: accept() returns -1 errno:24 Too many open files
      Sun Jun 5 02:41:38 [initandlisten] Listener: accept() returns -1 errno:24 Too many open files
      Sun Jun 5 02:41:38 [initandlisten] Listener: accept() returns -1 errno:24 Too many open files
      Sun Jun 5 02:41:38 [initandlisten] Listener: accept() returns -1 errno:24 Too many open files
      Sun Jun 5 02:41:38 [initandlisten] Listener: accept() returns -1 errno:24 Too many open files
      Sun Jun 5 02:41:38 [initandlisten] Listener: accept() returns -1 errno:24 Too many open files
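
To see how quickly the descriptors were consumed, the log above can be tallied by event type. The following is a minimal sketch, assuming the log keeps the line format quoted in this report; the function name `summarize` and the regular expressions are illustrative and are not part of any MongoDB tool.

```python
import re
from collections import Counter

# Patterns written against the log lines quoted in this report.
ACCEPTED = re.compile(r"\[initandlisten\] connection accepted from \S+:\d+ #\d+")
CONFIG_CONNECT = re.compile(r"SyncClusterConnection connecting to \[([^\]]+)\]")
EMFILE = re.compile(r"errno:24 Too many open files")


def summarize(lines):
    """Count accepted connections, config-server connects per target, and errno 24 errors."""
    accepted = 0
    emfile_errors = 0
    targets = Counter()
    for line in lines:
        if ACCEPTED.search(line):
            accepted += 1
        match = CONFIG_CONNECT.search(line)
        if match:
            targets[match.group(1)] += 1
        if EMFILE.search(line):
            emfile_errors += 1
    return {
        "accepted": accepted,
        "config_connects": dict(targets),
        "emfile_errors": emfile_errors,
    }


# A few lines copied from the log above.
sample = [
    "Sun Jun 5 02:41:38 [initandlisten] connection accepted from 10.102.33.76:54499 #32445",
    "Sun Jun 5 02:41:38 [conn32443] SyncClusterConnection connecting to [amdbm001:10001]",
    "Sun Jun 5 02:41:38 [conn30639] SyncClusterConnection connecting to [amdbm005:10001]",
    "Sun Jun 5 02:41:38 [conn31564] ERROR: connect invalid socket errno:24 Too many open files",
    "Sun Jun 5 02:41:38 [initandlisten] Listener: accept() returns -1 errno:24 Too many open files",
]

if __name__ == "__main__":
    print(summarize(sample))
```

Running the same function over the full mongod log file (for example `summarize(open(path))`, where `path` is wherever the log lives) would show whether the config-server connects far outnumber the accepted connections in the seconds before the failure.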

            Assignee:
            Unassigned
            Reporter:
            ofer samocha