Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-21323

Mongo Router frequently going down with a low limit for open files

    • Type: Icon: Bug Bug
    • Resolution: Done
    • Priority: Icon: Critical - P2 Critical - P2
    • None
    • Affects Version/s: 3.0.6
    • Component/s: Security
    • None
    • Fully Compatible
    • ALL
    • Security C 11/20/15

      I've a mongo cluster version: 3.0.6 with 3 config nodes, 2 data nodes and 1 router. The router frequently goes down almost 2 times a day with the below error:

      2015-11-05T20:22:57.376+0000 I ACCESS   [conn1770]  authenticate db: azuga { authenticate: 1, user: "azugaapps", nonce: "xxx", key: "xxx" }
      2015-11-05T20:22:59.135+0000 I SHARDING [LockPinger] cluster ec2-54-184-178-101.us-west-2.compute.amazonaws.com:27019,ec2-54-184-71-89.us-west-2.compute.amazonaws.com:
      27019,ec2-54-212-135-86.us-west-2.compute.amazonaws.com:27019 pinged successfully at Thu Nov  5 20:22:58 2015 by distributed lock pinger 'ec2-54-184-178-101.us-west-2.
      compute.amazonaws.com:27019,ec2-54-184-71-89.us-west-2.compute.amazonaws.com:27019,ec2-54-212-135-86.us-west-2.compute.amazonaws.com:27019/ip-10-83-19-157:27017:144671
      3627:1804289383', sleeping for 30000ms
      2015-11-05T20:23:01.074+0000 I NETWORK  [mongosMain] connection accepted from 10.217.139.52:53068 #1771 (434 connections now open)
      2015-11-05T20:23:01.074+0000 I ACCESS   [conn1771]  authenticate db: azuga { authenticate: 1, user: "azugaapps", nonce: "xxx", key: "xxx" }
      2015-11-05T20:23:01.083+0000 F -        [conn1771] Got signal: 6 (Aborted).
      
       0xa4c139 0xa4ba02 0xa4bdb6 0x7ff06b270d40 0x7ff06b270cc9 0x7ff06b2740d8 0x88ed69 0x660ced 0x6618f7 0x633707 0x65bf66 0x619d31 0x62077e 0x612d9d 0x68b598 0x960f18 0x60
      a3cd 0x60b6be 0x60b8c2 0x9284db 0x99db5f 0x8c3a06 0x8c4693 0x8cc62a 0x95e42c 0x8e2fab 0x9703cd 0x95d961 0x5c8fd8 0x9fe60b 0x7ff06c86c182 0x7ff06b33447d
      ----- BEGIN BACKTRACE -----
      {"backtrace":[{"b":"400000","o":"64C139"},{"b":"400000","o":"64BA02"},{"b":"400000","o":"64BDB6"},{"b":"7FF06B23A000","o":"36D40"},{"b":"7FF06B23A000","o":"36CC9"},{"b
      ":"7FF06B23A000","o":"3A0D8"},{"b":"400000","o":"48ED69"},{"b":"400000","o":"260CED"},{"b":"400000","o":"2618F7"},{"b":"400000","o":"233707"},{"b":"400000","o":"25BF66
      "},{"b":"400000","o":"219D31"},{"b":"400000","o":"22077E"},{"b":"400000","o":"212D9D"},{"b":"400000","o":"28B598"},{"b":"400000","o":"560F18"},{"b":"400000","o":"20A3C
      D"},{"b":"400000","o":"20B6BE"},{"b":"400000","o":"20B8C2"},{"b":"400000","o":"5284DB"},{"b":"400000","o":"59DB5F"},{"b":"400000","o":"4C3A06"},{"b":"400000","o":"4C46
      93"},{"b":"400000","o":"4CC62A"},{"b":"400000","o":"55E42C"},{"b":"400000","o":"4E2FAB"},{"b":"400000","o":"5703CD"},{"b":"400000","o":"55D961"},{"b":"400000","o":"1C8
      FD8"},{"b":"400000","o":"5FE60B"},{"b":"7FF06C864000","o":"8182"},{"b":"7FF06B23A000","o":"FA47D"}],"processInfo":{ "mongodbVersion" : "3.0.6", "gitVersion" : "1ef45a2
      3a4c5e3480ac919b28afcba3c615488f2", "uname" : { "sysname" : "Linux", "release" : "3.13.0-48-generic", "version" : "#80-Ubuntu SMP Thu Mar 12 11:16:15 UTC 2015", "machi
      ne" : "x86_64" }, "somap" : [ { "elfType" : 2, "b" : "400000", "buildId" : "D31506D475D532F90CB8C8BDBF0D2852FE47CA5A" }, { "b" : "7FFFDB6FD000", "elfType" : 3, "buildI
      d" : "341443CC258F9F27709CAA36C8CB321E4BBFFF95" }, { "b" : "7FF06C864000", "path" : "/lib/x86_64-linux-gnu/libpthread.so.0", "elfType" : 3, "buildId" : "9318E8AF0BFBE4
      44731BB0461202EF57F7C39542" }, { "b" : "7FF06C606000", "path" : "/lib/x86_64-linux-gnu/libssl.so.1.0.0", "elfType" : 3, "buildId" : "FF43D0947510134A8A494063A3C1CF3CEB
      B27791" }, { "b" : "7FF06C22B000", "path" : "/lib/x86_64-linux-gnu/libcrypto.so.1.0.0", "elfType" : 3, "buildId" : "B927879B878D90DD9FF4B15B00E7799AA8E0272F" }, { "b" 
      : "7FF06C023000", "path" : "/lib/x86_64-linux-gnu/librt.so.1", "elfType" : 3, "buildId" : "92FCF41EFE012D6186E31A59AD05BDBB487769AB" }, { "b" : "7FF06BE1F000", "path" 
      : "/lib/x86_64-linux-gnu/libdl.so.2", "elfType" : 3, "buildId" : "C1AE4CB7195D337A77A3C689051DABAA3980CA0C" }, { "b" : "7FF06BB1B000", "path" : "/usr/lib/x86_64-linux-
      gnu/libstdc++.so.6", "elfType" : 3, "buildId" : "19EFDDAB11B3BF5C71570078C59F91CF6592CE9E" }, { "b" : "7FF06B815000", "path" : "/lib/x86_64-linux-gnu/libm.so.6", "elfT
      ype" : 3, "buildId" : "1D76B71E905CB867B27CEF230FCB20F01A3178F5" }, { "b" : "7FF06B5FF000", "path" : "/lib/x86_64-linux-gnu/libgcc_s.so.1", "elfType" : 3, "buildId" : 
      "8D0AA71411580EE6C08809695C3984769F25725B" }, { "b" : "7FF06B23A000", "path" : "/lib/x86_64-linux-gnu/libc.so.6", "elfType" : 3, "buildId" : "30C94DC66A1FE95180C3D68D2
      B89E576D5AE213C" }, { "b" : "7FF06CA82000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, "buildId" : "9F00581AB3C73E3AEA35995A0C50D24D59A01D47" } ] }}
       mongos(_ZN5mongo15printStackTraceERSo+0x29) [0xa4c139]
       mongos(+0x64BA02) [0xa4ba02]
       mongos(+0x64BDB6) [0xa4bdb6]
       libc.so.6(+0x36D40) [0x7ff06b270d40]
       libc.so.6(gsignal+0x39) [0x7ff06b270cc9]
       libc.so.6(abort+0x148) [0x7ff06b2740d8]
      mongos(_ZN5mongo12SecureRandom6createEv+0x1B9) [0x88ed69]
       mongos(_ZN5mongo31SaslSCRAMSHA1ClientConversation10_firstStepEPSs+0x16D) [0x660ced]
       mongos(_ZN5mongo31SaslSCRAMSHA1ClientConversation4stepERKNS_10StringDataEPSs+0x247) [0x6618f7]
       mongos(_ZN5mongo23NativeSaslClientSession4stepERKNS_10StringDataEPSs+0x27) [0x633707]
       mongos(+0x25BF66) [0x65bf66]
       mongos(_ZN5mongo20DBClientWithCommands5_authERKNS_7BSONObjE+0x191) [0x619d31]
       mongos(_ZN5mongo18DBClientConnection5_authERKNS_7BSONObjE+0x16E) [0x62077e]
       mongos(_ZN5mongo20DBClientWithCommands4authERKNS_7BSONObjE+0x1D) [0x612d9d]
       mongos(_ZN5mongo24authenticateInternalUserEPNS_20DBClientWithCommandsE+0xA8) [0x68b598]
       mongos(_ZN5mongo22ShardingConnectionHook8onCreateEPNS_12DBClientBaseE+0x108) [0x960f18]
       mongos(_ZN5mongo16DBConnectionPool8onCreateEPNS_12DBClientBaseE+0x2D) [0x60a3cd]
       mongos(_ZN5mongo16DBConnectionPool13_finishCreateERKSsdPNS_12DBClientBaseE+0x11E) [0x60b6be]
       mongos(_ZN5mongo16DBConnectionPool3getERKNS_16ConnectionStringEd+0xD2) [0x60b8c2]
       mongos(_ZN5mongo20DBClientMultiCommand7sendAllEv+0xAB) [0x9284db]
       mongos(_ZN5mongo14BatchWriteExec12executeBatchERKNS_21BatchedCommandRequestEPNS_22BatchedCommandResponseE+0x70F) [0x99db5f]
       mongos(_ZN5mongo13ClusterWriter10shardWriteERKNS_21BatchedCommandRequestEPNS_22BatchedCommandResponseE+0x2C6) [0x8c3a06]
       mongos(_ZN5mongo13ClusterWriter5writeERKNS_21BatchedCommandRequestEPNS_22BatchedCommandResponseE+0x603) [0x8c4693]
       mongos(_ZN5mongo15ClusterWriteCmd3runEPNS_16OperationContextERKSsRNS_7BSONObjEiRSsRNS_14BSONObjBuilderEb+0x26A) [0x8cc62a]
       mongos(_ZN5mongo7Command22execCommandClientBasicEPNS_16OperationContextEPS0_RNS_11ClientBasicEiPKcRNS_7BSONObjERNS_14BSONObjBuilderEb+0x3EC) [0x95e42c]
       mongos(_ZN5mongo7Command20runAgainstRegisteredEPKcRNS_7BSONObjERNS_14BSONObjBuilderEi+0x21B) [0x8e2fab]
       mongos(_ZN5mongo8Strategy15clientCommandOpERNS_7RequestE+0x1BD) [0x9703cd]
       mongos(_ZN5mongo7Request7processEi+0x5A1) [0x95d961]
       mongos(_ZN5mongo21ShardedMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortEPNS_9LastErrorE+0x58) [0x5c8fd8]
       mongos(_ZN5mongo17PortMessageServer17handleIncomingMsgEPv+0x34B) [0x9fe60b]
       libpthread.so.0(+0x8182) [0x7ff06c86c182]
       libc.so.6(clone+0x6D) [0x7ff06b33447d]
      -----  END BACKTRACE  -----
      

      Below is the detail of db.stats() on the major DB where we have 90% of data.

      mongos> db.stats();
      {
      	"raw" : {
      		"ec2-52-12-123-167.us-west-2.compute.amazonaws.com:27017" : {
      			"db" : "azuga",
      			"collections" : 8,
      			"objects" : 362467895,
      			"avgObjSize" : 765.6894732704534,
      			"dataSize" : 277537851600,
      			"storageSize" : 286915962896,
      			"numExtents" : 184,
      			"indexes" : 10,
      			"indexSize" : 98088943680,
      			"fileSize" : 397024428032,
      			"nsSizeMB" : 16,
      			"extentFreeList" : {
      				"num" : 0,
      				"totalSize" : 0
      			},
      			"dataFileVersion" : {
      				"major" : 4,
      				"minor" : 22
      			},
      			"ok" : 1
      		},
      		"ec2-54-188-151-194.us-west-2.compute.amazonaws.com:27017" : {
      			"db" : "azuga",
      			"collections" : 3,
      			"objects" : 78466712,
      			"avgObjSize" : 1008.0000481223171,
      			"dataSize" : 79094449472,
      			"storageSize" : 82284461520,
      			"numExtents" : 61,
      			"indexes" : 5,
      			"indexSize" : 26999073360,
      			"fileSize" : 120134303744,
      			"nsSizeMB" : 16,
      			"extentFreeList" : {
      				"num" : 0,
      				"totalSize" : 0
      			},
      			"dataFileVersion" : {
      				"major" : 4,
      				"minor" : 22
      			},
      			"ok" : 1
      		}
      	},
      	"objects" : 440934607,
      	"avgObjSize" : 808.2431719200484,
      	"dataSize" : 356632301072,
      	"storageSize" : 369200424416,
      	"numExtents" : 245,
      	"indexes" : 15,
      	"indexSize" : 125088017040,
      	"fileSize" : 517158731776,
      	"extentFreeList" : {
      		"num" : 0,
      		"totalSize" : 0
      	},
      	"ok" : 1
      }
      

            Assignee:
            spencer.jackson@mongodb.com Spencer Jackson
            Reporter:
            debasishn Debasish Nayak
            Votes:
            0 Vote for this issue
            Watchers:
            6 Start watching this issue

              Created:
              Updated:
              Resolved: