Core Server / SERVER-3194

Too many open files using ulimit 10000

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Cannot Reproduce
    • Affects Version/s: 1.8.1
    • Fix Version/s: None
    • Component/s: Stability
    • Labels: None
    • Operating System: Linux

      Description

      While migrating servers to RAID 5, I ran rs.stepDown() on amdbm023 and put amdbm024 into full recovery mode.
      After a few seconds, amdbm023 shut itself down with a "Too many open files" error. ulimit is set to 10000, and there are 30 mongos instances with 100 connections on each mongos.
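
A rough sanity check of the numbers above, as a minimal sketch rather than a diagnosis. The figures 30, 100, and 10000 come from the report; the helper names (`fd_budget`, `parse_max_open_files`, `open_fd_count`) are hypothetical. The sketch assumes one socket per client connection and a Linux `/proc` filesystem. It may also be worth confirming that the limit seen by the running mongod matches the shell's ulimit, which `/proc/<pid>/limits` reports.

```python
import os


def fd_budget(mongos_count, conns_per_mongos, soft_limit):
    """Return (expected client sockets, descriptors left under the limit).

    Assumes exactly one descriptor per client connection, which is a
    simplification: data files and outgoing sockets also use descriptors.
    """
    expected = mongos_count * conns_per_mongos
    return expected, soft_limit - expected


def parse_max_open_files(limits_text):
    """Extract the (soft, hard) 'Max open files' values from the text of
    /proc/<pid>/limits. Values are returned as strings because the file
    may report 'unlimited'. Returns None if the row is not present."""
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            fields = line.split()
            return fields[3], fields[4]
    return None


def open_fd_count(pid="self"):
    """Count descriptors currently open for a process (Linux only).
    Returns None when /proc/<pid>/fd is not available."""
    fd_dir = f"/proc/{pid}/fd"
    return len(os.listdir(fd_dir)) if os.path.isdir(fd_dir) else None


# Figures from the report: 30 mongos, 100 connections each, ulimit 10000.
print(fd_budget(30, 100, 10000))  # (3000, 7000)
```

Under these assumptions, client connections alone account for about 3000 descriptors, so reaching the limit would require either additional descriptors per connection or a lower effective limit than the one configured in the shell.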

      Sun Jun 5 02:41:38 [initandlisten] connection accepted from 10.102.33.76:54499 #32445
      Sun Jun 5 02:41:38 [conn32443] SyncClusterConnection connecting to [amdbm001:10001]
      Sun Jun 5 02:41:38 [conn30639] SyncClusterConnection connecting to [amdbm005:10001]
      Sun Jun 5 02:41:38 [conn32443] SyncClusterConnection connecting to [amdbm003:10001]
      Sun Jun 5 02:41:38 [conn32443] SyncClusterConnection connecting to [amdbm005:10001]
      Sun Jun 5 02:41:38 [conn30713] SyncClusterConnection connecting to [amdbm005:10001]
      Sun Jun 5 02:41:38 [conn31312] SyncClusterConnection connecting to [amdbm005:10001]
      Sun Jun 5 02:41:38 [conn30717] SyncClusterConnection connecting to [amdbm005:10001]
      Sun Jun 5 02:41:38 [conn31582] SyncClusterConnection connecting to [amdbm005:10001]
      Sun Jun 5 02:41:38 [conn31568] SyncClusterConnection connecting to [amdbm005:10001]
      Sun Jun 5 02:41:38 [conn32445] SyncClusterConnection connecting to [amdbm001:10001]
      Sun Jun 5 02:41:38 [conn32445] SyncClusterConnection connecting to [amdbm003:10001]
      Sun Jun 5 02:41:38 [conn32445] SyncClusterConnection connecting to [amdbm005:10001]
      Sun Jun 5 02:41:38 [conn30927] SyncClusterConnection connecting to [amdbm005:10001]
      Sun Jun 5 02:41:38 [conn31282] SyncClusterConnection connecting to [amdbm005:10001]
      Sun Jun 5 02:41:38 [initandlisten] connection accepted from 10.117.23.28:55828 #32446
      Sun Jun 5 02:41:38 [conn30475] SyncClusterConnection connecting to [amdbm005:10001]
      Sun Jun 5 02:41:38 [initandlisten] connection accepted from 10.117.14.181:45708 #32447
      Sun Jun 5 02:41:38 [conn31377] SyncClusterConnection connecting to [amdbm005:10001]
      Sun Jun 5 02:41:38 [conn32444] SyncClusterConnection connecting to [amdbm001:10001]
      Sun Jun 5 02:41:38 [initandlisten] connection accepted from 10.218.23.219:46445 #32448
      Sun Jun 5 02:41:38 [conn31583] SyncClusterConnection connecting to [amdbm005:10001]
      Sun Jun 5 02:41:38 [conn31564] SyncClusterConnection connecting to [amdbm005:10001]
      Sun Jun 5 02:41:38 [conn32444] SyncClusterConnection connecting to [amdbm003:10001]
      Sun Jun 5 02:41:38 [conn31564] ERROR: connect invalid socket errno:24 Too many open files
      Sun Jun 5 02:41:38 [conn31564] SyncClusterConnection connect fail to: amdbm005:10001 errmsg: couldn't connect to server amdbm005:10001
      Sun Jun 5 02:41:38 [conn32444] SyncClusterConnection connecting to [amdbm005:10001]
      Sun Jun 5 02:41:38 [conn32446] SyncClusterConnection connecting to [amdbm001:10001]
      Sun Jun 5 02:41:38 [conn31564] setShardVersion - relocking slow: 3000
      Sun Jun 5 02:41:38 [conn31564] query admin.$cmd ntoreturn:1 command: { setShardVersion: "viber.text", configdb: "amdbm001:10001,amdbm003:10001,amdbm005:10001", version: Timestamp 5000|0, serverID: ObjectId('4de7245e8fa7a726717fd7d0'), authoritative: true, shard: "set12", shardHost: "set12/amdbm024:10000,amdbm023:10000" } reslen:73 3000ms
      Sun Jun 5 02:41:38 [initandlisten] Listener: accept() returns -1 errno:24 Too many open files
      Sun Jun 5 02:41:38 [initandlisten] Listener: accept() returns -1 errno:24 Too many open files
      Sun Jun 5 02:41:38 [initandlisten] Listener: accept() returns -1 errno:24 Too many open files
      Sun Jun 5 02:41:38 [initandlisten] Listener: accept() returns -1 errno:24 Too many open files
      Sun Jun 5 02:41:38 [initandlisten] Listener: accept() returns -1 errno:24 Too many open files
      Sun Jun 5 02:41:38 [initandlisten] Listener: accept() returns -1 errno:24 Too many open files
      Sun Jun 5 02:41:38 [initandlisten] Listener: accept() returns -1 errno:24 Too many open files
      Sun Jun 5 02:41:38 [initandlisten] Listener: accept() returns -1 errno:24 Too many open files
      Sun Jun 5 02:41:38 [initandlisten] Listener: accept() returns -1 errno:24 Too many open files
      Sun Jun 5 02:41:38 [initandlisten] Listener: accept() returns -1 errno:24 Too many open files
      Sun Jun 5 02:41:38 [initandlisten] Listener: accept() returns -1 errno:24 Too many open files
      Sun Jun 5 02:41:38 [initandlisten] Listener: accept() returns -1 errno:24 Too many open files
      Sun Jun 5 02:41:38 [initandlisten] Listener: accept() returns -1 errno:24 Too many open files
      Sun Jun 5 02:41:38 [initandlisten] Listener: accept() returns -1 errno:24 Too many open files
      Sun Jun 5 02:41:38 [initandlisten] Listener: accept() returns -1 errno:24 Too many open files
      Sun Jun 5 02:41:38 [initandlisten] Listener: accept() returns -1 errno:24 Too many open files
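
The excerpt above can be tallied with a short script. This is a sketch under stated assumptions, not part of the original report: the function names are hypothetical, and the regular expressions only match the two line shapes visible in this log. errno 24 is EMFILE on Linux, the per-process descriptor limit. In the excerpt, some connections (conn32445, for example) open a SyncClusterConnection to all three config servers, so `worst_case_fds` estimates usage if every client connection did the same. Whether that actually happened on amdbm023 is not established by the log.

```python
import re
from collections import Counter

# Matches any line reporting errno 24 (EMFILE).
EMFILE_RE = re.compile(r"errno:24 Too many open files")
# Captures the connection thread name and the config server it dials.
SYNC_RE = re.compile(
    r"\[(conn\d+)\] SyncClusterConnection connecting to \[([^\]]+)\]"
)


def summarize(lines):
    """Return (EMFILE line count, dials per config server, set of
    connection threads that opened a SyncClusterConnection)."""
    emfile = 0
    targets = Counter()
    conns = set()
    for line in lines:
        if EMFILE_RE.search(line):
            emfile += 1
        match = SYNC_RE.search(line)
        if match:
            conns.add(match.group(1))
            targets[match.group(2)] += 1
    return emfile, targets, conns


def worst_case_fds(client_conns, config_servers):
    """Descriptors used if every client connection also holds one socket
    to each config server (an assumption, not a measured value)."""
    return client_conns * (1 + config_servers)


sample = [
    "Sun Jun 5 02:41:38 [conn32445] SyncClusterConnection connecting to [amdbm001:10001]",
    "Sun Jun 5 02:41:38 [conn32445] SyncClusterConnection connecting to [amdbm003:10001]",
    "Sun Jun 5 02:41:38 [conn32445] SyncClusterConnection connecting to [amdbm005:10001]",
    "Sun Jun 5 02:41:38 [conn31564] ERROR: connect invalid socket errno:24 Too many open files",
    "Sun Jun 5 02:41:38 [initandlisten] Listener: accept() returns -1 errno:24 Too many open files",
]
emfile_count, per_target, conn_names = summarize(sample)
print(emfile_count, dict(per_target), sorted(conn_names))
print(worst_case_fds(30 * 100, 3))  # 12000
```

With the reported 3000 client connections and three config servers, the worst case in this model is 12000 descriptors, which would exceed a limit of 10000. That is one possible reading of the log, offered only as a hypothesis.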

            People

            Assignee: Unassigned
            Reporter: ofer samocha (ofersa)