Core Server / SERVER-20618

Brief network connectivity issue led to segmentation fault on multiple secondaries (mms-dev)

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Major - P3
    • None
    • Affects Version/s: 3.1.8
    • Component/s: None
    • Labels: None
    • ALL

      It is unclear whether the apparent network blip in the logs is the cause or an effect of the crash. Log snippets from the two affected secondaries follow; the full log is attached.

      mms-dev-1

      2015-09-24T12:46:51.963+0000 I NETWORK  [initandlisten] connection accepted from 54.81.124.119:57816 #4233 (66 connections now open)
      2015-09-24T12:46:52.279+0000 I ACCESS   [conn4233] Successfully authenticated as principal __system on local
      2015-09-24T12:46:52.396+0000 I COMMAND  [conn4089] command mmsdbrrd-raw-PT5S-PT2H-20150924T1200 command: profile { profile: -1 } ntoskip:0 keyUpdates:0 writeConflicts:0 numYields:0 reslen:58 locks:{ Global: { acquireCount: { r: 1, w: 1 }, acquireWaitCount: { r: 1 }, timeAcquiringMicros: { r: 2910520 } }, MMAPV1Journal: { acquireCount: { w: 1 } }, Database: { acquireCount: { W: 1 } } } protocol:op_query 1410ms
      2015-09-24T12:46:52.533+0000 I REPL     [ReplicationExecutor] replSetElect voting yea for mms-dev-0.mms-dev.56035370e4b025426aa18f77.mongo.plumbing:27017 (5)
      2015-09-24T12:46:52.989+0000 I REPL     [ReplicationExecutor] Member mms-dev-0.mms-dev.56035370e4b025426aa18f77.mongo.plumbing:27017 is now in state PRIMARY
      2015-09-24T12:46:53.741+0000 I NETWORK  [initandlisten] connection accepted from 172.31.27.115:55463 #4234 (67 connections now open)
      2015-09-24T12:46:55.345+0000 I NETWORK  [conn4069] end connection 172.31.29.128:52527 (66 connections now open)
      2015-09-24T12:46:55.346+0000 I NETWORK  [ReplExecNetThread-362] Socket recv() errno:104 Connection reset by peer 172.31.29.128:27017
      2015-09-24T12:46:55.347+0000 I NETWORK  [ReplExecNetThread-362] SocketException: remote: 172.31.29.128:27017 error: 9001 socket exception [RECV_ERROR] server [172.31.29.128:27017] 
      2015-09-24T12:46:55.347+0000 I NETWORK  [conn4232] end connection 172.31.29.128:52661 (65 connections now open)
      2015-09-24T12:46:55.366+0000 I REPL     [ReplicationExecutor] Error in heartbeat request to mms-dev-2.mms-dev.56035370e4b025426aa18f77.mongo.plumbing:27017; HostUnreachable network error while attempting to run command 'replSetHeartbeat' on host 'mms-dev-2.mms-dev.56035370e4b025426aa18f77.mongo.plumbing:27017'
      2015-09-24T12:46:55.366+0000 I REPL     [ReplicationExecutor] syncing from: mms-dev-0.mms-dev.56035370e4b025426aa18f77.mongo.plumbing:27017
      2015-09-24T12:46:55.368+0000 W NETWORK  [ReplExecNetThread-360] Failed to connect to 172.31.29.128:27017, reason: errno:111 Connection refused
      2015-09-24T12:46:55.369+0000 I REPL     [ReplicationExecutor] Error in heartbeat request to mms-dev-2.mms-dev.56035370e4b025426aa18f77.mongo.plumbing:27017; HostUnreachable couldn't connect to server mms-dev-2.mms-dev.56035370e4b025426aa18f77.mongo.plumbing:27017, connection attempt failed
      2015-09-24T12:46:55.371+0000 W NETWORK  [ReplExecNetThread-361] Failed to connect to 172.31.29.128:27017, reason: errno:111 Connection refused
      2015-09-24T12:46:55.371+0000 I REPL     [ReplicationExecutor] Error in heartbeat request to mms-dev-2.mms-dev.56035370e4b025426aa18f77.mongo.plumbing:27017; HostUnreachable couldn't connect to server mms-dev-2.mms-dev.56035370e4b025426aa18f77.mongo.plumbing:27017, connection attempt failed
      2015-09-24T12:46:55.389+0000 F -        [ReplExecNetThread-0] Invalid access at address: 0
      2015-09-24T12:46:55.389+0000 I REPL     [SyncSourceFeedback] setting syncSourceFeedback to mms-dev-0.mms-dev.56035370e4b025426aa18f77.mongo.plumbing:27017
      2015-09-24T12:46:55.416+0000 F -        [ReplExecNetThread-0] Got signal: 11 (Segmentation fault).
      
       0x1264f72 0x12640d9 0x1264458 0x7f7bb5a1a340 0x9a0388 0x9a040e 0x9a09c9 0x9a1760 0x9e4712 0x1044758 0x11feb70 0x11ff749 0x12002a0 0x7f7bb61f5a40 0x7f7bb5a12182 0x7f7bb573f47d
      ----- BEGIN BACKTRACE -----
      {"backtrace":[{"b":"400000","o":"E64F72"},{"b":"400000","o":"E640D9"},{"b":"400000","o":"E64458"},{"b":"7F7BB5A0A000","o":"10340"},{"b":"400000","o":"5A0388"},{"b":"400000","o":"5A040E"},{"b":"400000","o":"5A09C9"},{"b":"400000","o":"5A1760"},{"b":"400000","o":"5E4712"},{"b":"400000","o":"C44758"},{"b":"400000","o":"DFEB70"},{"b":"400000","o":"DFF749"},{"b":"400000","o":"E002A0"},{"b":"7F7BB6144000","o":"B1A40"},{"b":"7F7BB5A0A000","o":"8182"},{"b":"7F7BB5645000","o":"FA47D"}],"processInfo":{ "mongodbVersion" : "3.1.8", "gitVersion" : "a9c87bd28a3e9c7c28fa60df2d0861a607ce9a7f", "compiledModules" : [], "uname" : { "sysname" : "Linux", "release" : "3.13.0-61-generic", "version" : "#100-Ubuntu SMP Wed Jul 29 11:21:34 UTC 2015", "machine" : "x86_64" }, "somap" : [ { "elfType" : 2, "b" : "400000", "buildId" : "E05FD61AE55DAB585C597FD695BB3C9A38B4EBF5" }, { "b" : "7FFEDC452000", "elfType" : 3, "buildId" : "A3F2043AB5D2D7974286088B9241BE9E21C44687" }, { "b" : "7F7BB6C2F000", "path" : "/lib/x86_64-linux-gnu/libssl.so.1.0.0", "elfType" : 3, "buildId" : "A20EFFEC993A8441FA17F2079F923CBD04079E19" }, { "b" : "7F7BB6854000", "path" : "/lib/x86_64-linux-gnu/libcrypto.so.1.0.0", "elfType" : 3, "buildId" : "F000D29917E9B6E94A35A8F02E5C62846E5916BC" }, { "b" : "7F7BB664C000", "path" : "/lib/x86_64-linux-gnu/librt.so.1", "elfType" : 3, "buildId" : "92FCF41EFE012D6186E31A59AD05BDBB487769AB" }, { "b" : "7F7BB6448000", "path" : "/lib/x86_64-linux-gnu/libdl.so.2", "elfType" : 3, "buildId" : "C1AE4CB7195D337A77A3C689051DABAA3980CA0C" }, { "b" : "7F7BB6144000", "path" : "/usr/lib/x86_64-linux-gnu/libstdc++.so.6", "elfType" : 3, "buildId" : "4BF6F7ADD8244AD86008E6BF40D90F8873892197" }, { "b" : "7F7BB5E3E000", "path" : "/lib/x86_64-linux-gnu/libm.so.6", "elfType" : 3, "buildId" : "1D76B71E905CB867B27CEF230FCB20F01A3178F5" }, { "b" : "7F7BB5C28000", "path" : "/lib/x86_64-linux-gnu/libgcc_s.so.1", "elfType" : 3, "buildId" : "8D0AA71411580EE6C08809695C3984769F25725B" }, { "b" : "7F7BB5A0A000", "path" : "/lib/x86_64-linux-gnu/libpthread.so.0", "elfType" : 3, "buildId" : "9318E8AF0BFBE444731BB0461202EF57F7C39542" }, { "b" : "7F7BB5645000", "path" : "/lib/x86_64-linux-gnu/libc.so.6", "elfType" : 3, "buildId" : "30C94DC66A1FE95180C3D68D2B89E576D5AE213C" }, { "b" : "7F7BB6E8E000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, "buildId" : "9F00581AB3C73E3AEA35995A0C50D24D59A01D47" } ] }}
       mongod(_ZN5mongo15printStackTraceERSo+0x32) [0x1264f72]
       mongod(+0xE640D9) [0x12640d9]
       mongod(+0xE64458) [0x1264458]
       libpthread.so.0(+0x10340) [0x7f7bb5a1a340]
       mongod(_ZN5mongo14ConnectionPool25_destroyConnection_inlockEPSt4listINS0_14ConnectionInfoESaIS2_EESt14_List_iteratorIS2_E+0x18) [0x9a0388]
       mongod(_ZN5mongo14ConnectionPool24_cleanUpOlderThan_inlockENS_6Date_tEPSt4listINS0_14ConnectionInfoESaIS3_EE+0x5E) [0x9a040e]
       mongod(_ZN5mongo14ConnectionPool17acquireConnectionERKNS_11HostAndPortENS_6Date_tENSt6chrono8durationIlSt5ratioILl1ELl1000EEEE+0xA9) [0x9a09c9]
       mongod(_ZN5mongo14ConnectionPool13ConnectionPtrC1EPS0_RKNS_11HostAndPortENS_6Date_tENSt6chrono8durationIlSt5ratioILl1ELl1000EEEE+0x20) [0x9a1760]
       mongod(_ZN5mongo23RemoteCommandRunnerImpl10runCommandERKNS_8executor20RemoteCommandRequestE+0x92) [0x9e4712]
       mongod(_ZN5mongo8executor20NetworkInterfaceImpl14_runOneCommandEv+0x208) [0x1044758]
       mongod(_ZN5mongo10ThreadPool10_doOneTaskEPSt11unique_lockISt5mutexE+0x110) [0x11feb70]
       mongod(_ZN5mongo10ThreadPool13_consumeTasksEv+0xA9) [0x11ff749]
       mongod(_ZN5mongo10ThreadPool17_workerThreadBodyEPS0_RKSs+0x100) [0x12002a0]
       libstdc++.so.6(+0xB1A40) [0x7f7bb61f5a40]
       libpthread.so.0(+0x8182) [0x7f7bb5a12182]
       libc.so.6(clone+0x6D) [0x7f7bb573f47d]
      -----  END BACKTRACE  -----
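      Reading the backtrace JSON: each frame's "b" and "o" fields appear to be a load base and an offset into that module, and their sum reproduces the raw address list printed just before BEGIN BACKTRACE. A minimal Python sketch of that arithmetic, using the first five frames from mms-dev-1 (the helper name resolve_frame is made up for illustration and is not part of any MongoDB tooling):

```python
def resolve_frame(base_hex: str, offset_hex: str) -> int:
    """Return the absolute address of a frame: module load base plus offset."""
    return int(base_hex, 16) + int(offset_hex, 16)


# First five frames of the mms-dev-1 backtrace, copied from the JSON above.
frames = [
    {"b": "400000", "o": "E64F72"},
    {"b": "400000", "o": "E640D9"},
    {"b": "400000", "o": "E64458"},
    {"b": "7F7BB5A0A000", "o": "10340"},  # libpthread.so.0
    {"b": "400000", "o": "5A0388"},       # ConnectionPool::_destroyConnection_inlock
]

addresses = [hex(resolve_frame(f["b"], f["o"])) for f in frames]
print(addresses)
```

      The offsets, rather than the absolute addresses, are the stable part: shared-library bases differ between the two hosts, while the mongod offsets are the same on both.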
      

      mms-dev-2

      2015-09-24T12:46:51.468+0000 I NETWORK  [initandlisten] connection accepted from 54.81.124.119:58043 #3932 (61 connections now open)
      2015-09-24T12:46:51.575+0000 I ACCESS   [conn3930] Successfully authenticated as principal __system on local
      2015-09-24T12:46:51.685+0000 I ACCESS   [conn3932] Successfully authenticated as principal __system on local
      2015-09-24T12:46:51.689+0000 I ACCESS   [conn3931] Successfully authenticated as principal __system on local
      2015-09-24T12:46:51.873+0000 I REPL     [ReplicationExecutor] Standing for election
      2015-09-24T12:46:51.874+0000 I REPL     [ReplicationExecutor] not electing self, mms-dev-0.mms-dev.56035370e4b025426aa18f77.mongo.plumbing:27017 would veto with 'mms-dev-2.mms-dev.56035370e4b025426aa18f77.mongo.plumbing:27017 has lower priority of 1 than mms-dev-0.mms-dev.56035370e4b025426aa18f77.mongo.plumbing:27017 which has a priority of 10'
      2015-09-24T12:46:51.874+0000 I REPL     [ReplicationExecutor] not electing self, we are not freshest
      2015-09-24T12:46:51.874+0000 I REPL     [ReplicationExecutor] Standing for election
      2015-09-24T12:46:51.875+0000 I REPL     [ReplicationExecutor] not electing self, mms-dev-0.mms-dev.56035370e4b025426aa18f77.mongo.plumbing:27017 would veto with 'mms-dev-2.mms-dev.56035370e4b025426aa18f77.mongo.plumbing:27017 has lower priority of 1 than mms-dev-0.mms-dev.56035370e4b025426aa18f77.mongo.plumbing:27017 which has a priority of 10'
      2015-09-24T12:46:51.875+0000 I REPL     [ReplicationExecutor] not electing self, we are not freshest
      2015-09-24T12:46:51.879+0000 I NETWORK  [initandlisten] connection accepted from 54.204.233.120:47906 #3933 (62 connections now open)
      2015-09-24T12:46:51.900+0000 I NETWORK  [initandlisten] connection accepted from 172.31.3.213:50318 #3934 (63 connections now open)
      2015-09-24T12:46:51.948+0000 I REPL     [ReplicationExecutor] Standing for election
      2015-09-24T12:46:51.949+0000 I REPL     [ReplicationExecutor] not electing self, mms-dev-0.mms-dev.56035370e4b025426aa18f77.mongo.plumbing:27017 would veto with 'mms-dev-2.mms-dev.56035370e4b025426aa18f77.mongo.plumbing:27017 has lower priority of 1 than mms-dev-0.mms-dev.56035370e4b025426aa18f77.mongo.plumbing:27017 which has a priority of 10'
      2015-09-24T12:46:51.949+0000 I REPL     [ReplicationExecutor] not electing self, we are not freshest
      2015-09-24T12:46:52.532+0000 I REPL     [ReplicationExecutor] replSetElect voting yea for mms-dev-0.mms-dev.56035370e4b025426aa18f77.mongo.plumbing:27017 (5)
      2015-09-24T12:46:52.581+0000 I NETWORK  [initandlisten] connection accepted from 54.81.124.119:58045 #3935 (64 connections now open)
      2015-09-24T12:46:52.897+0000 I ACCESS   [conn3935] Successfully authenticated as principal __system on local
      2015-09-24T12:46:53.759+0000 I NETWORK  [initandlisten] connection accepted from 172.31.27.115:55657 #3936 (65 connections now open)
      2015-09-24T12:46:53.874+0000 I REPL     [ReplicationExecutor] Member mms-dev-0.mms-dev.56035370e4b025426aa18f77.mongo.plumbing:27017 is now in state PRIMARY
      2015-09-24T12:46:54.873+0000 I REPL     [ReplicationExecutor] syncing from: mms-dev-0.mms-dev.56035370e4b025426aa18f77.mongo.plumbing:27017
      2015-09-24T12:46:54.889+0000 I REPL     [SyncSourceFeedback] setting syncSourceFeedback to mms-dev-0.mms-dev.56035370e4b025426aa18f77.mongo.plumbing:27017
      2015-09-24T12:46:54.889+0000 F -        [ReplExecNetThread-6] Invalid access at address: 0
      2015-09-24T12:46:54.915+0000 F -        [ReplExecNetThread-6] Got signal: 11 (Segmentation fault).
      
       0x1264f72 0x12640d9 0x1264458 0x7f2dd47c1340 0x9a0388 0x9a040e 0x9a09c9 0x9a1760 0x9e4712 0x1044758 0x11feb70 0x11ff749 0x12002a0 0x7f2dd4f9ca40 0x7f2dd47b9182 0x7f2dd44e647d
      ----- BEGIN BACKTRACE -----
      {"backtrace":[{"b":"400000","o":"E64F72"},{"b":"400000","o":"E640D9"},{"b":"400000","o":"E64458"},{"b":"7F2DD47B1000","o":"10340"},{"b":"400000","o":"5A0388"},{"b":"400000","o":"5A040E"},{"b":"400000","o":"5A09C9"},{"b":"400000","o":"5A1760"},{"b":"400000","o":"5E4712"},{"b":"400000","o":"C44758"},{"b":"400000","o":"DFEB70"},{"b":"400000","o":"DFF749"},{"b":"400000","o":"E002A0"},{"b":"7F2DD4EEB000","o":"B1A40"},{"b":"7F2DD47B1000","o":"8182"},{"b":"7F2DD43EC000","o":"FA47D"}],"processInfo":{ "mongodbVersion" : "3.1.8", "gitVersion" : "a9c87bd28a3e9c7c28fa60df2d0861a607ce9a7f", "compiledModules" : [], "uname" : { "sysname" : "Linux", "release" : "3.13.0-61-generic", "version" : "#100-Ubuntu SMP Wed Jul 29 11:21:34 UTC 2015", "machine" : "x86_64" }, "somap" : [ { "elfType" : 2, "b" : "400000", "buildId" : "E05FD61AE55DAB585C597FD695BB3C9A38B4EBF5" }, { "b" : "7FFDC2AD6000", "elfType" : 3, "buildId" : "A3F2043AB5D2D7974286088B9241BE9E21C44687" }, { "b" : "7F2DD59D6000", "path" : "/lib/x86_64-linux-gnu/libssl.so.1.0.0", "elfType" : 3, "buildId" : "A20EFFEC993A8441FA17F2079F923CBD04079E19" }, { "b" : "7F2DD55FB000", "path" : "/lib/x86_64-linux-gnu/libcrypto.so.1.0.0", "elfType" : 3, "buildId" : "F000D29917E9B6E94A35A8F02E5C62846E5916BC" }, { "b" : "7F2DD53F3000", "path" : "/lib/x86_64-linux-gnu/librt.so.1", "elfType" : 3, "buildId" : "92FCF41EFE012D6186E31A59AD05BDBB487769AB" }, { "b" : "7F2DD51EF000", "path" : "/lib/x86_64-linux-gnu/libdl.so.2", "elfType" : 3, "buildId" : "C1AE4CB7195D337A77A3C689051DABAA3980CA0C" }, { "b" : "7F2DD4EEB000", "path" : "/usr/lib/x86_64-linux-gnu/libstdc++.so.6", "elfType" : 3, "buildId" : "4BF6F7ADD8244AD86008E6BF40D90F8873892197" }, { "b" : "7F2DD4BE5000", "path" : "/lib/x86_64-linux-gnu/libm.so.6", "elfType" : 3, "buildId" : "1D76B71E905CB867B27CEF230FCB20F01A3178F5" }, { "b" : "7F2DD49CF000", "path" : "/lib/x86_64-linux-gnu/libgcc_s.so.1", "elfType" : 3, "buildId" : "8D0AA71411580EE6C08809695C3984769F25725B" }, { "b" : "7F2DD47B1000", "path" : "/lib/x86_64-linux-gnu/libpthread.so.0", "elfType" : 3, "buildId" : "9318E8AF0BFBE444731BB0461202EF57F7C39542" }, { "b" : "7F2DD43EC000", "path" : "/lib/x86_64-linux-gnu/libc.so.6", "elfType" : 3, "buildId" : "30C94DC66A1FE95180C3D68D2B89E576D5AE213C" }, { "b" : "7F2DD5C35000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, "buildId" : "9F00581AB3C73E3AEA35995A0C50D24D59A01D47" } ] }}
       mongod(_ZN5mongo15printStackTraceERSo+0x32) [0x1264f72]
       mongod(+0xE640D9) [0x12640d9]
       mongod(+0xE64458) [0x1264458]
       libpthread.so.0(+0x10340) [0x7f2dd47c1340]
       mongod(_ZN5mongo14ConnectionPool25_destroyConnection_inlockEPSt4listINS0_14ConnectionInfoESaIS2_EESt14_List_iteratorIS2_E+0x18) [0x9a0388]
       mongod(_ZN5mongo14ConnectionPool24_cleanUpOlderThan_inlockENS_6Date_tEPSt4listINS0_14ConnectionInfoESaIS3_EE+0x5E) [0x9a040e]
       mongod(_ZN5mongo14ConnectionPool17acquireConnectionERKNS_11HostAndPortENS_6Date_tENSt6chrono8durationIlSt5ratioILl1ELl1000EEEE+0xA9) [0x9a09c9]
       mongod(_ZN5mongo14ConnectionPool13ConnectionPtrC1EPS0_RKNS_11HostAndPortENS_6Date_tENSt6chrono8durationIlSt5ratioILl1ELl1000EEEE+0x20) [0x9a1760]
       mongod(_ZN5mongo23RemoteCommandRunnerImpl10runCommandERKNS_8executor20RemoteCommandRequestE+0x92) [0x9e4712]
       mongod(_ZN5mongo8executor20NetworkInterfaceImpl14_runOneCommandEv+0x208) [0x1044758]
       mongod(_ZN5mongo10ThreadPool10_doOneTaskEPSt11unique_lockISt5mutexE+0x110) [0x11feb70]
       mongod(_ZN5mongo10ThreadPool13_consumeTasksEv+0xA9) [0x11ff749]
       mongod(_ZN5mongo10ThreadPool17_workerThreadBodyEPS0_RKSs+0x100) [0x12002a0]
       libstdc++.so.6(+0xB1A40) [0x7f2dd4f9ca40]
       libpthread.so.0(+0x8182) [0x7f2dd47b9182]
       libc.so.6(clone+0x6D) [0x7f2dd44e647d]
      -----  END BACKTRACE  -----
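      To line up the sequence of events on each node, the log lines can be split into timestamp, severity, component, thread context, and message. The sketch below is a rough Python parser written against the line shape seen in these snippets only; the regex and the names LOG_LINE and parse_line are assumptions for illustration, not an official log grammar. It picks out the fatal ("F") entries on mms-dev-2 and measures how long after "syncing from" the first one appears.

```python
import re
from datetime import datetime, timedelta

# Assumed line shape: <timestamp> <severity> <component> [<context>] <message>
LOG_LINE = re.compile(
    r"^(?P<ts>\S+)\s+(?P<severity>[A-Z])\s+(?P<component>\S+)\s+"
    r"\[(?P<context>[^\]]+)\]\s+(?P<message>.*)$"
)


def parse_line(line):
    """Parse one log line into a dict, or return None if it does not match."""
    m = LOG_LINE.match(line.strip())
    if m is None:
        return None
    entry = m.groupdict()
    entry["ts"] = datetime.strptime(entry["ts"], "%Y-%m-%dT%H:%M:%S.%f%z")
    return entry


# Three lines copied from the mms-dev-2 snippet above.
lines = [
    "2015-09-24T12:46:54.873+0000 I REPL     [ReplicationExecutor] syncing from: "
    "mms-dev-0.mms-dev.56035370e4b025426aa18f77.mongo.plumbing:27017",
    "2015-09-24T12:46:54.889+0000 F -        [ReplExecNetThread-6] Invalid access at address: 0",
    "2015-09-24T12:46:54.915+0000 F -        [ReplExecNetThread-6] Got signal: 11 (Segmentation fault).",
]

entries = [e for e in map(parse_line, lines) if e is not None]
fatal = [e for e in entries if e["severity"] == "F"]
delta = fatal[0]["ts"] - entries[0]["ts"]
print(fatal[0]["context"], delta)
```

      On these three lines the first fatal entry comes 16 ms after the node starts syncing from the new primary. Backtrace and JSON lines do not match the pattern and are skipped.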
      

            Assignee: Unassigned
            Reporter: Cailin Nelson (cailin.nelson@mongodb.com)
            Votes: 0
            Watchers: 5
