  1. Core Server
  2. SERVER-30455

heartbeats do not sleep after connection failure

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major - P3
    • None
    • Affects Version/s: 3.5.11
    • Component/s: Replication
    • Labels:
    • Replication
    • ALL

      I noticed this when I triggered a crash in signalDrainComplete().

      After certain types of node failure in a replica set, the remaining nodes can spin trying to reconnect to the failed node, with no sleep between connection attempts. This swamps the network and CPU.
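
To make the expected behavior concrete, here is a minimal sketch of a retry policy that always waits before the next heartbeat attempt, with the wait growing after each consecutive failure. It is not the server's code: `next_heartbeat_delay`, `simulate`, and all of the timing values are hypothetical names and numbers chosen for illustration.

```python
def next_heartbeat_delay(succeeded, consecutive_failures,
                         interval=2.0, min_retry=0.1, max_retry=2.0):
    """Return how many seconds to wait before the next heartbeat attempt.

    All defaults are illustrative assumptions, not values taken from the server.
    """
    if succeeded:
        # Healthy peer: use the regular heartbeat interval.
        return interval
    # Failed attempt: never retry immediately. Double the wait per
    # consecutive failure, capped at max_retry.
    delay = min_retry * (2 ** (consecutive_failures - 1))
    return min(delay, max_retry)


def simulate(attempt, duration, **policy):
    """Count heartbeat attempts made in `duration` seconds of simulated time.

    `attempt(now)` is a caller-supplied function returning True on success.
    """
    now = 0.0
    failures = 0
    attempts = 0
    while now < duration:
        ok = attempt(now)
        attempts += 1
        failures = 0 if ok else failures + 1
        now += next_heartbeat_delay(ok, failures, **policy)
    return attempts
```

With a peer that always refuses connections, `simulate(lambda now: False, 1.0)` makes 4 attempts in the first simulated second (at 0, 0.1, 0.3, and 0.7 s), rather than several per millisecond.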

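      Here is an example from running jstests/sharding/add_invalid_shard.js, in which the config server replica set exhibits the failure described. After c20011 aborts, c20012 and c20013 retry the connection to lazarus:20011 about once per millisecond or faster.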

      [js_test:add_invalid_shard] 2017-08-01T09:58:10.059-0400 c20011| 2017-08-01T09:58:10.059-0400 I INDEX    [rsSync] build index on: config.tags properties: { v: 2, key: { ns: 1, tag: 1 }, name: "ns_1_tag_1", ns: "config.tags" }
      [js_test:add_invalid_shard] 2017-08-01T09:58:10.059-0400 c20011| 2017-08-01T09:58:10.059-0400 I INDEX    [rsSync] 	 building index using bulk method; build may temporarily use up to 500 megabytes of RAM
      [js_test:add_invalid_shard] 2017-08-01T09:58:10.084-0400 c20011| 2017-08-01T09:58:10.083-0400 F -        [rsSync] Invariant failure static_cast<std::size_t>(size) < sizeof(oldestTSConfigString) src/mongo/db/storage/wiredtiger/wiredtiger_snapshot_manager.cpp 69
      [js_test:add_invalid_shard] 2017-08-01T09:58:10.084-0400 c20011| 2017-08-01T09:58:10.083-0400 F -        [rsSync]
      [js_test:add_invalid_shard] 2017-08-01T09:58:10.084-0400 c20011|
      [js_test:add_invalid_shard] 2017-08-01T09:58:10.084-0400 c20011| ***aborting after invariant() failure
      [js_test:add_invalid_shard] 2017-08-01T09:58:10.084-0400 c20011|
      [js_test:add_invalid_shard] 2017-08-01T09:58:10.084-0400 c20011|
      [js_test:add_invalid_shard] 2017-08-01T09:58:10.123-0400 c20011| 2017-08-01T09:58:10.122-0400 F -        [rsSync] Got signal: 6 (Aborted).
      [js_test:add_invalid_shard] 2017-08-01T09:58:10.123-0400 c20011|
      [js_test:add_invalid_shard] 2017-08-01T09:58:10.123-0400 c20011|  0x55c89dc9c923 0x55c89dc9c495 0x55c89dc9bd69 0x7fa5ba7055b0 0x7fa5ba3638df 0x7fa5ba3654da 0x55c89dc8dcee 0x55c89bdd8073 0x55c89bbcd981 0x55c89c2934ea 0x55c89c2930d5 0x55c89c28236f 0x55c89c28218f 0x55c89bc57816 0x55c89bbe11fa 0x55c89bbe2809 0x55c89bbe2772 0x55c89bbe2725 0x55c89bbe24e9 0x55c89dea1680 0x7fa5ba6fb73a 0x7fa5ba435e0f
      [js_test:add_invalid_shard] 2017-08-01T09:58:10.123-0400 c20011| ----- BEGIN BACKTRACE -----
      [js_test:add_invalid_shard] 2017-08-01T09:58:10.128-0400 c20011| {"backtrace":[{"b":"55C899B61000","o":"413B923","s":"_ZN5mongo15printStackTraceERSo"},{"b":"55C899B61000","o":"413B495"},{"b":"55C899B61000","o":"413AD69"},{"b":"7FA5BA6F4000","o":"115B0"},{"b":"7FA5BA32E000","o":"358DF","s":"gsignal"},{"b":"7FA5BA32E000","o":"374DA","s":"abort"},{"b":"55C899B61000","o":"412CCEE","s":"_ZN5mongo15invariantFailedEPKcS1_j"},{"b":"55C899B61000","o":"2277073","s":"_ZN5mongo25WiredTigerSnapshotManager20setCommittedSnapshotERKNS_12SnapshotNameENS_9TimestampE"},{"b":"55C899B61000","o":"206C981","s":"_ZN5mongo4repl39ReplicationCoordinatorExternalStateImpl23updateCommittedSnapshotENS0_12SnapshotInfoE"},{"b":"55C899B61000","o":"27324EA","s":"_ZN5mongo4repl26ReplicationCoordinatorImpl31_updateCommittedSnapshot_inlockENS0_12SnapshotInfoE"},{"b":"55C899B61000","o":"27320D5","s":"_ZN5mongo4repl26ReplicationCoordinatorImpl25_updateCommitPoint_inlockEv"},{"b":"55C899B61000","o":"272136F","s":"_ZN5mongo4repl26ReplicationCoordinatorImpl33_updateLastCommittedOpTime_inlockEv"},{"b":"55C899B61000","o":"272118F","s":"_ZN5mongo4repl26ReplicationCoordinatorImpl19signalDrainCompleteEPNS_16OperationContextEx"},{"b":"55C899B61000","o":"20F6816","s":"_ZN5mongo4repl8SyncTail16oplogApplicationEPNS0_22ReplicationCoordinatorE"},{"b":"55C899B61000","o":"20801FA","s":"_ZN5mongo4repl10RSDataSync4_runEv"},{"b":"55C899B61000","o":"2081809","s":"_ZNKSt12_Mem_fn_baseIMN5mongo4repl10RSDataSyncEFvvELb1EEclIJEvEEvPS2_DpOT_"},{"b":"55C899B61000","o":"2081772","s":"_ZNSt12_Bind_simpleIFSt7_Mem_fnIMN5mongo4repl10RSDataSyncEFvvEEPS3_EE9_M_invokeIJLm0EEEEvSt12_Index_tupleIJXspT_EEE"},{"b":"55C899B61000","o":"2081725","s":"_ZNSt12_Bind_simpleIFSt7_Mem_fnIMN5mongo4repl10RSDataSyncEFvvEEPS3_EEclEv"},{"b":"55C899B61000","o":"20814E9","s":"_ZNSt6thread5_ImplISt12_Bind_simpleIFSt7_Mem_fnIMN5mongo4repl10RSDataSyncEFvvEEPS5_EEE6_M_runEv"},{"b":"55C899B61000","o":"4340680"},{"b":"7FA5BA6F4000","o":"773A"},{"b":"7FA5BA3
2E000","o":"107E0F","s":"clone"}],"processInfo":{ "mongodbVersion" : "0.0.0", "gitVersion" : "unknown", "compiledModules" : [ "ninja", "enterprise" ], "uname" : { "sysname" : "Linux", "release" : "4.10.16-200.fc25.x86_64", "version" : "#1 SMP Mon May 15 15:19:52 UTC 2017", "machine" : "x86_64" }, "somap" : [ { "b" : "55C899B61000", "elfType" : 3, "buildId" : "63F0C194EA1B871BBC473EC6BC8402F9F4B92D8C" }, { "b" : "7FFD33FCE000", "path" : "linux-vdso.so.1", "elfType" : 3, "buildId" : "BDC63928341B8A303F90D00E7C9DDE0C718CBDF4" }, { "b" : "7FA5BD4C1000", "path" : "/lib64/libcurl.so.4", "elfType" : 3, "buildId" : "F12CDB86B1078D8C13433CC84E88A6EE8EEBA3DA" }, { "b" : "7FA5BD00A000", "path" : "/lib64/libnetsnmpmibs.so.30", "elfType" : 3, "buildId" : "64FE8F0ECC0C26481A0BDEE02858B801A15A54F2" }, { "b" : "7FA5BCDFB000", "path" : "/lib64/libsensors.so.4", "elfType" : 3, "buildId" : "7586282C44AC0E116ACA7BAA0E2E693AEB3A5A96" }, { "b" : "7FA5BCBF7000", "path" : "/lib64/libdl.so.2", "elfType" : 3, "buildId" : "261C6A25B339A25DF7C1F226CF1AB02D1A860577" }, { "b" : "7FA5BC983000", "path" : "/lib64/librpm.so.7", "elfType" : 3, "buildId" : "3F6D1822996ED19E103DEA49AD0F365E38C696C1" }, { "b" : "7FA5BC756000", "path" : "/lib64/librpmio.so.7", "elfType" : 3, "buildId" : "CEB315EC4E580FCB1C0D994254F47AB358662294" }, { "b" : "7FA5BC4E8000", "path" : "/lib64/libnetsnmpagent.so.30", "elfType" : 3, "buildId" : "098D6BFC2003696574C5E13659F82C51FAF605FA" }, { "b" : "7FA5BC2DD000", "path" : "/lib64/libwrap.so.0", "elfType" : 3, "buildId" : "7BE9ECD0172FD42BEBE50941194D11D38D4BBD7E" }, { "b" : "7FA5BBFD9000", "path" : "/lib64/libnetsnmp.so.30", "elfType" : 3, "buildId" : "ACE90B052EFCC9D8345693E29FDFFF60FBDAA27E" }, { "b" : "7FA5BBD67000", "path" : "/lib64/libssl.so.10", "elfType" : 3, "buildId" : "0C3C3999F04F0CB2BEEA7A9FFCA8442FC2039C28" }, { "b" : "7FA5BB906000", "path" : "/lib64/libcrypto.so.10", "elfType" : 3, "buildId" : "12049113E5F2E8208ADF8EAA4D5228725D47CF0E" }, { "b" : "7FA5BB5FD000", 
"path" : "/lib64/libm.so.6", "elfType" : 3, "buildId" : "650AADC59C53B979E7898D1B445028D062249E6A" }, { "b" : "7FA5BB3E0000", "path" : "/lib64/libsasl2.so.3", "elfType" : 3, "buildId" : "25505BDDB74FCE45AD73F714BDCC0E4378EE1A1C" }, { "b" : "7FA5BB18D000", "path" : "/lib64/libldap-2.4.so.2", "elfType" : 3, "buildId" : "FCB33008A87ABB518DCD43776C8B1EB855437E69" }, { "b" : "7FA5BAF7E000", "path" : "/lib64/liblber-2.4.so.2", "elfType" : 3, "buildId" : "2E2EB7D42BAD7300ED5129520B29A10A420B2555" }, { "b" : "7FA5BAD31000", "path" : "/lib64/libgssapi_krb5.so.2", "elfType" : 3, "buildId" : "53D56F818722973F2D9B3FD9404E0FC54463A102" }, { "b" : "7FA5BAB29000", "path" : "/lib64/librt.so.1", "elfType" : 3, "buildId" : "10083AE0A47C1A8DB5EBA146918DC14973C5359E" }, { "b" : "7FA5BA912000", "path" : "/lib64/libgcc_s.so.1", "elfType" : 3, "buildId" : "7F6DFB0BC6EFB047F1221C2E63619A4888DD5EF0" }, { "b" : "7FA5BA6F4000", "path" : "/lib64/libpthread.so.0", "elfType" : 3, "buildId" : "D47BE0768B0E30FD6E825F507615E2C0717858C1" }, { "b" : "7FA5BA32E000", "path" : "/lib64/libc.so.6", "elfType" : 3, "buildId" : "1A7E87FADEB1DA4619B1AA2E6DD2F80EE2784A8F" }, { "b" : "7FA5BD73F000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, "buildId" : "B47850F46E35503E606C67071868704A3654A253" }, { "b" : "7FA5BA109000", "path" : "/lib64/libnghttp2.so.14", "elfType" : 3, "buildId" : "227E7C32C911EF97D0DCFB9D6E3BAD3874B023D4" }, { "b" : "7FA5B9EEC000", "path" : "/lib64/libidn2.so.0", "elfType" : 3, "buildId" : "130D7F38CA0836A3545C2CDAD7532F4011908C9A" }, { "b" : "7FA5B9CBF000", "path" : "/lib64/libssh2.so.1", "elfType" : 3, "buildId" : "7C2CBA6CF2BDC4AB37661AAF1EF9D3CF4FA7853F" }, { "b" : "7FA5B9AB1000", "path" : "/lib64/libpsl.so.5", "elfType" : 3, "buildId" : "5BE6A37308F708448653C528C95C26A62B0CBA66" }, { "b" : "7FA5B9865000", "path" : "/lib64/libssl3.so", "elfType" : 3, "buildId" : "BE402847D9D8AD6565B6E60CE0508A2DE0133C28" }, { "b" : "7FA5B963E000", "path" : "/lib64/libsmime3.so", "elfType" : 
3, "buildId" : "1106A7FA180F7E2407F83AACF635368085787EBF" }, { "b" : "7FA5B9313000", "path" : "/lib64/libnss3.so", "elfType" : 3, "buildId" : "8C5D38A1F46B457C9876231E476DCCADF452E053" }, { "b" : "7FA5B90E5000", "path" : "/lib64/libnssutil3.so", "elfType" : 3, "buildId" : "7BA624633C0D1F16CB0BCC483A8DB6FCE31306AC" }, { "b" : "7FA5B8EE1000", "path" : "/lib64/libplds4.so", "elfType" : 3, "buildId" : "D951A0FB75BBFF9946CD835C1E349C75B951D09C" }, { "b" : "7FA5B8CDC000", "path" : "/lib64/libplc4.so", "elfType" : 3, "buildId" : "E21BD10A28B4950A33E95B31BDE282F6893C35F5" }, { "b" : "7FA5B8A9D000", "path" : "/lib64/libnspr4.so", "elfType" : 3, "buildId" : "650F0C41AD9E526E3033F04CB2374AAC54788A50" }, { "b" : "7FA5B87B7000", "path" : "/lib64/libkrb5.so.3", "elfType" : 3, "buildId" : "1ED4FF8BEDAE3459F8CEC47A0E2AE712A2C46A5B" }, { "b" : "7FA5B8586000", "path" : "/lib64/libk5crypto.so.3", "elfType" : 3, "buildId" : "9B979B5D307E015D970F55E65DD35A03F1C77ABC" }, { "b" : "7FA5B8382000", "path" : "/lib64/libcom_err.so.2", "elfType" : 3, "buildId" : "187A3021EE64DA2DE47E16EA6C4E9E28BA893AF9" }, { "b" : "7FA5B816C000", "path" : "/lib64/libz.so.1", "elfType" : 3, "buildId" : "919D7128D3D20B8904E76D7F7A35A4788EEA5E0B" }, { "b" : "7FA5B7D78000", "path" : "/lib64/libperl.so.5.24", "elfType" : 3, "buildId" : "96541951FE8D0444AA97D3B249664EEAF984BE1F" }, { "b" : "7FA5B7B5D000", "path" : "/lib64/libresolv.so.2", "elfType" : 3, "buildId" : "A5FCFAC12A7396936ECB0F652991CB5BBD3763DB" }, { "b" : "7FA5B7944000", "path" : "/lib64/libnsl.so.1", "elfType" : 3, "buildId" : "08110BF6186570915006C12EF3185552B074AADD" }, { "b" : "7FA5B770E000", "path" : "/lib64/libcrypt.so.1", "elfType" : 3, "buildId" : "97F68AD42EEF4B92A793235950425134924E927E" }, { "b" : "7FA5B750B000", "path" : "/lib64/libutil.so.1", "elfType" : 3, "buildId" : "695D5CB5CC8AB93A9E38F8766E61893CA463850F" }, { "b" : "7FA5B72FB000", "path" : "/lib64/libbz2.so.1", "elfType" : 3, "buildId" : "4F76B97284B76CA956B9634BC2AE1BDDB48E3304" }, 
{ "b" : "7FA5B70E3000", "path" : "/lib64/libelf.so.1", "elfType" : 3, "buildId" : "1D36DFDD9E21D4BECA5F96BDE662025FD6D5838B" }, { "b" : "7FA5B6EBD000", "path" : "/lib64/liblzma.so.5", "elfType" : 3, "buildId" : "02A6C8CE81EAB889BF04EDC75E3313CB742D790F" }, { "b" : "7FA5B6CB0000", "path" : "/lib64/libpopt.so.0", "elfType" : 3, "buildId" : "263688C8598A3DDE8B745604C6F927EDBE2E88BF" }, { "b" : "7FA5B6AAB000", "path" : "/lib64/libcap.so.2", "elfType" : 3, "buildId" : "9AD6870BD8403205830A4333A7FAC559CA4E912C" }, { "b" : "7FA5B68A2000", "path" : "/lib64/libacl.so.1", "elfType" : 3, "buildId" : "AA25ADFC0D8F48E66689CBDD7DFD58E765DB4C03" }, { "b" : "7FA5B6668000", "path" : "/lib64/liblua-5.3.so", "elfType" : 3, "buildId" : "3C3CAEBC0A6FF80DACEBE8998E189A9493524DDA" }, { "b" : "7FA5B62A5000", "path" : "/lib64/libdb-5.3.so", "elfType" : 3, "buildId" : "1006D020CF3A3C933E6F3892E49BE6A9C24A4107" }, { "b" : "7FA5B6096000", "path" : "/lib64/libkrb5support.so.0", "elfType" : 3, "buildId" : "B7CF10168953C83E85BB52EC9DA0C4B6773C0FB5" }, { "b" : "7FA5B5E92000", "path" : "/lib64/libkeyutils.so.1", "elfType" : 3, "buildId" : "83B5DC698C4FBF60FEBE682891F6F423FCB91479" }, { "b" : "7FA5B5B62000", "path" : "/lib64/libunistring.so.2", "elfType" : 3, "buildId" : "F2BCEEE554D34B60963C7E382A82CE2B9E817524" }, { "b" : "7FA5B595F000", "path" : "/lib64/libfreebl3.so", "elfType" : 3, "buildId" : "F884E9136989EFAC740585EF096F588A9F4D5F18" }, { "b" : "7FA5B575A000", "path" : "/lib64/libattr.so.1", "elfType" : 3, "buildId" : "7D295658A787A72E3DF7BD6C71E9A10AD347355D" }, { "b" : "7FA5B5533000", "path" : "/lib64/libselinux.so.1", "elfType" : 3, "buildId" : "4E80D147759A2B757DF396C40B079E8EA9CE2629" }, { "b" : "7FA5B52C1000", "path" : "/lib64/libpcre.so.1", "elfType" : 3, "buildId" : "CC2860E6444BEE693FD63BDCA3C788C28DBAD11C" } ] }}
      [js_test:add_invalid_shard] 2017-08-01T09:58:10.128-0400 c20011|  mongod(_ZN5mongo15printStackTraceERSo+0x33) [0x55c89dc9c923]
      [js_test:add_invalid_shard] 2017-08-01T09:58:10.129-0400 c20011|  mongod(+0x413B495) [0x55c89dc9c495]
      [js_test:add_invalid_shard] 2017-08-01T09:58:10.129-0400 c20011|  mongod(+0x413AD69) [0x55c89dc9bd69]
      [js_test:add_invalid_shard] 2017-08-01T09:58:10.129-0400 c20011|  libpthread.so.0(+0x115B0) [0x7fa5ba7055b0]
      [js_test:add_invalid_shard] 2017-08-01T09:58:10.129-0400 c20011|  libc.so.6(gsignal+0x9F) [0x7fa5ba3638df]
      [js_test:add_invalid_shard] 2017-08-01T09:58:10.129-0400 c20011|  libc.so.6(abort+0x16A) [0x7fa5ba3654da]
      [js_test:add_invalid_shard] 2017-08-01T09:58:10.130-0400 c20011|  mongod(_ZN5mongo15invariantFailedEPKcS1_j+0x17E) [0x55c89dc8dcee]
      [js_test:add_invalid_shard] 2017-08-01T09:58:10.130-0400 c20011|  mongod(_ZN5mongo25WiredTigerSnapshotManager20setCommittedSnapshotERKNS_12SnapshotNameENS_9TimestampE+0x273) [0x55c89bdd8073]
      [js_test:add_invalid_shard] 2017-08-01T09:58:10.130-0400 c20011|  mongod(_ZN5mongo4repl39ReplicationCoordinatorExternalStateImpl23updateCommittedSnapshotENS0_12SnapshotInfoE+0xA1) [0x55c89bbcd981]
      [js_test:add_invalid_shard] 2017-08-01T09:58:10.130-0400 c20011|  mongod(_ZN5mongo4repl26ReplicationCoordinatorImpl31_updateCommittedSnapshot_inlockENS0_12SnapshotInfoE+0x28A) [0x55c89c2934ea]
      [js_test:add_invalid_shard] 2017-08-01T09:58:10.130-0400 c20011|  mongod(_ZN5mongo4repl26ReplicationCoordinatorImpl25_updateCommitPoint_inlockEv+0x2D5) [0x55c89c2930d5]
      [js_test:add_invalid_shard] 2017-08-01T09:58:10.131-0400 c20011|  mongod(_ZN5mongo4repl26ReplicationCoordinatorImpl33_updateLastCommittedOpTime_inlockEv+0x4F) [0x55c89c28236f]
      [js_test:add_invalid_shard] 2017-08-01T09:58:10.131-0400 c20011|  mongod(_ZN5mongo4repl26ReplicationCoordinatorImpl19signalDrainCompleteEPNS_16OperationContextEx+0x50F) [0x55c89c28218f]
      [js_test:add_invalid_shard] 2017-08-01T09:58:10.131-0400 c20011|  mongod(_ZN5mongo4repl8SyncTail16oplogApplicationEPNS0_22ReplicationCoordinatorE+0x536) [0x55c89bc57816]
      [js_test:add_invalid_shard] 2017-08-01T09:58:10.131-0400 c20011|  mongod(_ZN5mongo4repl10RSDataSync4_runEv+0x59A) [0x55c89bbe11fa]
      [js_test:add_invalid_shard] 2017-08-01T09:58:10.131-0400 c20011|  mongod(_ZNKSt12_Mem_fn_baseIMN5mongo4repl10RSDataSyncEFvvELb1EEclIJEvEEvPS2_DpOT_+0x69) [0x55c89bbe2809]
      [js_test:add_invalid_shard] 2017-08-01T09:58:10.131-0400 c20011|  mongod(_ZNSt12_Bind_simpleIFSt7_Mem_fnIMN5mongo4repl10RSDataSyncEFvvEEPS3_EE9_M_invokeIJLm0EEEEvSt12_Index_tupleIJXspT_EEE+0x42) [0x55c89bbe2772]
      [js_test:add_invalid_shard] 2017-08-01T09:58:10.132-0400 c20011|  mongod(_ZNSt12_Bind_simpleIFSt7_Mem_fnIMN5mongo4repl10RSDataSyncEFvvEEPS3_EEclEv+0x15) [0x55c89bbe2725]
      [js_test:add_invalid_shard] 2017-08-01T09:58:10.132-0400 c20011|  mongod(_ZNSt6thread5_ImplISt12_Bind_simpleIFSt7_Mem_fnIMN5mongo4repl10RSDataSyncEFvvEEPS5_EEE6_M_runEv+0x19) [0x55c89bbe24e9]
      [js_test:add_invalid_shard] 2017-08-01T09:58:10.132-0400 c20011|  mongod(+0x4340680) [0x55c89dea1680]
      [js_test:add_invalid_shard] 2017-08-01T09:58:10.132-0400 c20011|  libpthread.so.0(+0x773A) [0x7fa5ba6fb73a]
      [js_test:add_invalid_shard] 2017-08-01T09:58:10.132-0400 c20011|  libc.so.6(clone+0x3F) [0x7fa5ba435e0f]
      [js_test:add_invalid_shard] 2017-08-01T09:58:10.132-0400 c20011| -----  END BACKTRACE  -----
      [js_test:add_invalid_shard] 2017-08-01T09:58:12.471-0400 c20012| 2017-08-01T09:58:12.470-0400 I NETWORK  [conn4] end connection 127.0.0.1:38436 (1 connection now open)
      [js_test:add_invalid_shard] 2017-08-01T09:58:12.471-0400 c20013| 2017-08-01T09:58:12.470-0400 I ASIO     [NetworkInterfaceASIO-Replication-0] Ending connection to host lazarus:20011 due to bad connection status; 0 connections to that host remain open
      [js_test:add_invalid_shard] 2017-08-01T09:58:12.471-0400 c20013| 2017-08-01T09:58:12.470-0400 I REPL_HB  [replexec-1] Error in heartbeat (requestId: 4) to lazarus:20011, response status: HostUnreachable: Connection reset by peer
      [js_test:add_invalid_shard] 2017-08-01T09:58:12.471-0400 c20013| 2017-08-01T09:58:12.471-0400 I ASIO     [NetworkInterfaceASIO-Replication-0] Failed to close stream: Transport endpoint is not connected
      [js_test:add_invalid_shard] 2017-08-01T09:58:12.471-0400 c20012| 2017-08-01T09:58:12.470-0400 I ASIO     [NetworkInterfaceASIO-Replication-0] Ending connection to host lazarus:20011 due to bad connection status; 0 connections to that host remain open
      [js_test:add_invalid_shard] 2017-08-01T09:58:12.472-0400 c20012| 2017-08-01T09:58:12.471-0400 I ASIO     [NetworkInterfaceASIO-Replication-0] Failed to close stream: Transport endpoint is not connected
      [js_test:add_invalid_shard] 2017-08-01T09:58:12.472-0400 c20012| 2017-08-01T09:58:12.470-0400 I REPL_HB  [replexec-1] Error in heartbeat (requestId: 4) to lazarus:20011, response status: HostUnreachable: Connection reset by peer
      [js_test:add_invalid_shard] 2017-08-01T09:58:12.472-0400 c20013| 2017-08-01T09:58:12.471-0400 I ASIO     [NetworkInterfaceASIO-Replication-0] Connecting to lazarus:20011
      [js_test:add_invalid_shard] 2017-08-01T09:58:12.472-0400 c20012| 2017-08-01T09:58:12.471-0400 I ASIO     [NetworkInterfaceASIO-Replication-0] Connecting to lazarus:20011
      [js_test:add_invalid_shard] 2017-08-01T09:58:12.472-0400 c20013| 2017-08-01T09:58:12.471-0400 I ASIO     [NetworkInterfaceASIO-Replication-0] Failed to connect to lazarus:20011 - HostUnreachable: Connection refused
      [js_test:add_invalid_shard] 2017-08-01T09:58:12.472-0400 c20013| 2017-08-01T09:58:12.471-0400 I ASIO     [NetworkInterfaceASIO-Replication-0] Dropping all pooled connections to lazarus:20011 due to failed operation on a connection
      [js_test:add_invalid_shard] 2017-08-01T09:58:12.473-0400 c20013| 2017-08-01T09:58:12.471-0400 I REPL_HB  [replexec-1] Error in heartbeat (requestId: 6) to lazarus:20011, response status: HostUnreachable: Connection refused
      [js_test:add_invalid_shard] 2017-08-01T09:58:12.473-0400 c20012| 2017-08-01T09:58:12.471-0400 I ASIO     [NetworkInterfaceASIO-Replication-0] Failed to connect to lazarus:20011 - HostUnreachable: Connection refused
      [js_test:add_invalid_shard] 2017-08-01T09:58:12.473-0400 c20012| 2017-08-01T09:58:12.471-0400 I ASIO     [NetworkInterfaceASIO-Replication-0] Dropping all pooled connections to lazarus:20011 due to failed operation on a connection
      [js_test:add_invalid_shard] 2017-08-01T09:58:12.473-0400 c20013| 2017-08-01T09:58:12.472-0400 I ASIO     [NetworkInterfaceASIO-Replication-0] Connecting to lazarus:20011
      [js_test:add_invalid_shard] 2017-08-01T09:58:12.473-0400 c20012| 2017-08-01T09:58:12.472-0400 I REPL_HB  [replexec-1] Error in heartbeat (requestId: 6) to lazarus:20011, response status: HostUnreachable: Connection refused
      [js_test:add_invalid_shard] 2017-08-01T09:58:12.473-0400 c20013| 2017-08-01T09:58:12.472-0400 I ASIO     [NetworkInterfaceASIO-Replication-0] Failed to connect to lazarus:20011 - HostUnreachable: Connection refused
      [js_test:add_invalid_shard] 2017-08-01T09:58:12.474-0400 c20013| 2017-08-01T09:58:12.472-0400 I ASIO     [NetworkInterfaceASIO-Replication-0] Dropping all pooled connections to lazarus:20011 due to failed operation on a connection
      [js_test:add_invalid_shard] 2017-08-01T09:58:12.474-0400 c20013| 2017-08-01T09:58:12.472-0400 I REPL_HB  [replexec-1] Error in heartbeat (requestId: 8) to lazarus:20011, response status: HostUnreachable: Connection refused
      [js_test:add_invalid_shard] 2017-08-01T09:58:12.474-0400 c20012| 2017-08-01T09:58:12.472-0400 I ASIO     [NetworkInterfaceASIO-Replication-0] Connecting to lazarus:20011
      [js_test:add_invalid_shard] 2017-08-01T09:58:12.474-0400 c20013| 2017-08-01T09:58:12.472-0400 I ASIO     [NetworkInterfaceASIO-Replication-0] Connecting to lazarus:20011
      [js_test:add_invalid_shard] 2017-08-01T09:58:12.474-0400 c20012| 2017-08-01T09:58:12.472-0400 I ASIO     [NetworkInterfaceASIO-Replication-0] Failed to connect to lazarus:20011 - HostUnreachable: Connection refused
      [js_test:add_invalid_shard] 2017-08-01T09:58:12.474-0400 c20012| 2017-08-01T09:58:12.472-0400 I ASIO     [NetworkInterfaceASIO-Replication-0] Dropping all pooled connections to lazarus:20011 due to failed operation on a connection
      [js_test:add_invalid_shard] 2017-08-01T09:58:12.474-0400 c20013| 2017-08-01T09:58:12.472-0400 I ASIO     [NetworkInterfaceASIO-Replication-0] Failed to connect to lazarus:20011 - HostUnreachable: Connection refused
      [js_test:add_invalid_shard] 2017-08-01T09:58:12.475-0400 c20012| 2017-08-01T09:58:12.472-0400 I REPL_HB  [replexec-1] Error in heartbeat (requestId: 8) to lazarus:20011, response status: HostUnreachable: Connection refused
      [js_test:add_invalid_shard] 2017-08-01T09:58:12.475-0400 c20013| 2017-08-01T09:58:12.472-0400 I ASIO     [NetworkInterfaceASIO-Replication-0] Dropping all pooled connections to lazarus:20011 due to failed operation on a connection
      [js_test:add_invalid_shard] 2017-08-01T09:58:12.475-0400 c20013| 2017-08-01T09:58:12.473-0400 I REPL_HB  [replexec-1] Error in heartbeat (requestId: 10) to lazarus:20011, response status: HostUnreachable: Connection refused
      [js_test:add_invalid_shard] 2017-08-01T09:58:12.475-0400 c20012| 2017-08-01T09:58:12.473-0400 I ASIO     [NetworkInterfaceASIO-Replication-0] Connecting to lazarus:20011
      [js_test:add_invalid_shard] 2017-08-01T09:58:12.475-0400 c20013| 2017-08-01T09:58:12.473-0400 I ASIO     [NetworkInterfaceASIO-Replication-0] Connecting to lazarus:20011
      [js_test:add_invalid_shard] 2017-08-01T09:58:12.475-0400 c20013| 2017-08-01T09:58:12.473-0400 I ASIO     [NetworkInterfaceASIO-Replication-0] Failed to connect to lazarus:20011 - HostUnreachable: Connection refused
      [js_test:add_invalid_shard] 2017-08-01T09:58:12.475-0400 c20012| 2017-08-01T09:58:12.473-0400 I ASIO     [NetworkInterfaceASIO-Replication-0] Failed to connect to lazarus:20011 - HostUnreachable: Connection refused
      [js_test:add_invalid_shard] 2017-08-01T09:58:12.476-0400 c20013| 2017-08-01T09:58:12.473-0400 I ASIO     [NetworkInterfaceASIO-Replication-0] Dropping all pooled connections to lazarus:20011 due to failed operation on a connection
      [js_test:add_invalid_shard] 2017-08-01T09:58:12.476-0400 c20012| 2017-08-01T09:58:12.473-0400 I ASIO     [NetworkInterfaceASIO-Replication-0] Dropping all pooled connections to lazarus:20011 due to failed operation on a connection
      [js_test:add_invalid_shard] 2017-08-01T09:58:12.476-0400 c20013| 2017-08-01T09:58:12.473-0400 I REPL_HB  [replexec-1] Error in heartbeat (requestId: 12) to lazarus:20011, response status: HostUnreachable: Connection refused
      [js_test:add_invalid_shard] 2017-08-01T09:58:12.476-0400 c20012| 2017-08-01T09:58:12.473-0400 I REPL_HB  [replexec-1] Error in heartbeat (requestId: 10) to lazarus:20011, response status: HostUnreachable: Connection refused
      [js_test:add_invalid_shard] 2017-08-01T09:58:12.476-0400 c20013| 2017-08-01T09:58:12.473-0400 I ASIO     [NetworkInterfaceASIO-Replication-0] Connecting to lazarus:20011
      [js_test:add_invalid_shard] 2017-08-01T09:58:12.476-0400 c20012| 2017-08-01T09:58:12.473-0400 I ASIO     [NetworkInterfaceASIO-Replication-0] Connecting to lazarus:20011
      [js_test:add_invalid_shard] 2017-08-01T09:58:12.477-0400 c20013| 2017-08-01T09:58:12.474-0400 I ASIO     [NetworkInterfaceASIO-Replication-0] Failed to connect to lazarus:20011 - HostUnreachable: Connection refused
      [js_test:add_invalid_shard] 2017-08-01T09:58:12.477-0400 c20013| 2017-08-01T09:58:12.474-0400 I ASIO     [NetworkInterfaceASIO-Replication-0] Dropping all pooled connections to lazarus:20011 due to failed operation on a connection
      [js_test:add_invalid_shard] 2017-08-01T09:58:12.477-0400 c20013| 2017-08-01T09:58:12.474-0400 I REPL_HB  [replexec-1] Error in heartbeat (requestId: 14) to lazarus:20011, response status: HostUnreachable: Connection refused
      [js_test:add_invalid_shard] 2017-08-01T09:58:12.477-0400 c20012| 2017-08-01T09:58:12.474-0400 I ASIO     [NetworkInterfaceASIO-Replication-0] Failed to connect to lazarus:20011 - HostUnreachable: Connection refused
      [js_test:add_invalid_shard] 2017-08-01T09:58:12.477-0400 c20012| 2017-08-01T09:58:12.474-0400 I ASIO     [NetworkInterfaceASIO-Replication-0] Dropping all pooled connections to lazarus:20011 due to failed operation on a connection
      [js_test:add_invalid_shard] 2017-08-01T09:58:12.477-0400 c20012| 2017-08-01T09:58:12.474-0400 I REPL_HB  [replexec-1] Error in heartbeat (requestId: 12) to lazarus:20011, response status: HostUnreachable: Connection refused
      [js_test:add_invalid_shard] 2017-08-01T09:58:12.477-0400 2017-08-01T09:58:12.474-0400 I NETWORK  [thread1] trying reconnect to 127.0.0.1:20011 (127.0.0.1) failed
      [js_test:add_invalid_shard] 2017-08-01T09:58:12.478-0400 c20013| 2017-08-01T09:58:12.474-0400 I ASIO     [NetworkInterfaceASIO-Replication-0] Connecting to lazarus:20011
      [js_test:add_invalid_shard] 2017-08-01T09:58:12.478-0400 2017-08-01T09:58:12.474-0400 W NETWORK  [thread1] Failed to connect to 127.0.0.1:20011, in(checking socket for error after poll), reason: Connection refused
      [js_test:add_invalid_shard] 2017-08-01T09:58:12.478-0400 2017-08-01T09:58:12.474-0400 I NETWORK  [thread1] reconnect 127.0.0.1:20011 (127.0.0.1) failed failed
      [js_test:add_invalid_shard] 2017-08-01T09:58:12.478-0400 c20013| 2017-08-01T09:58:12.474-0400 I ASIO     [NetworkInterfaceASIO-Replication-0] Failed to connect to lazarus:20011 - HostUnreachable: Connection refused
      [js_test:add_invalid_shard] 2017-08-01T09:58:12.478-0400 c20013| 2017-08-01T09:58:12.474-0400 I ASIO     [NetworkInterfaceASIO-Replication-0] Dropping all pooled connections to lazarus:20011 due to failed operation on a connection
      [js_test:add_invalid_shard] 2017-08-01T09:58:12.478-0400 c20012| 2017-08-01T09:58:12.474-0400 I ASIO     [NetworkInterfaceASIO-Replication-0] Connecting to lazarus:20011
      [js_test:add_invalid_shard] 2017-08-01T09:58:12.479-0400 c20013| 2017-08-01T09:58:12.474-0400 I REPL_HB  [replexec-1] Error in heartbeat (requestId: 16) to lazarus:20011, response status: HostUnreachable: Connection refused
      [js_test:add_invalid_shard] 2017-08-01T09:58:12.479-0400 c20012| 2017-08-01T09:58:12.475-0400 I ASIO     [NetworkInterfaceASIO-Replication-0] Failed to connect to lazarus:20011 - HostUnreachable: Connection refused
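
To quantify the spin from a log like the one above, the gaps between successive "Connecting to" lines can be measured per node. The following is a rough sketch that assumes the resmoke-style line layout shown in this ticket; `CONNECT_RE` and `connect_gaps_ms` are hypothetical helper names, and the sample lines are copied from the log above.

```python
import re
from collections import defaultdict
from datetime import datetime, timedelta

# Matches e.g. "c20013| 2017-08-01T09:58:12.471-0400 I ASIO [..] Connecting to host:port"
CONNECT_RE = re.compile(
    r"(?P<node>c\d+)\|\s+"
    r"(?P<ts>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3})[-+]\d{4}\s+"
    r"I ASIO\s+\[[^\]]+\] Connecting to (?P<target>\S+)"
)


def connect_gaps_ms(lines):
    """Map (node, target) to the gaps, in whole milliseconds, between
    consecutive connection attempts logged by that node to that target."""
    last_seen = {}
    gaps = defaultdict(list)
    for line in lines:
        m = CONNECT_RE.search(line)
        if not m:
            continue
        key = (m["node"], m["target"])
        ts = datetime.strptime(m["ts"], "%Y-%m-%dT%H:%M:%S.%f")
        if key in last_seen:
            gaps[key].append((ts - last_seen[key]) // timedelta(milliseconds=1))
        last_seen[key] = ts
    return dict(gaps)


SAMPLE = [
    "[js_test:add_invalid_shard] 2017-08-01T09:58:12.472-0400 c20013| 2017-08-01T09:58:12.471-0400 I ASIO     [NetworkInterfaceASIO-Replication-0] Connecting to lazarus:20011",
    "[js_test:add_invalid_shard] 2017-08-01T09:58:12.473-0400 c20013| 2017-08-01T09:58:12.472-0400 I ASIO     [NetworkInterfaceASIO-Replication-0] Connecting to lazarus:20011",
    "[js_test:add_invalid_shard] 2017-08-01T09:58:12.474-0400 c20013| 2017-08-01T09:58:12.472-0400 I ASIO     [NetworkInterfaceASIO-Replication-0] Connecting to lazarus:20011",
    "[js_test:add_invalid_shard] 2017-08-01T09:58:12.475-0400 c20013| 2017-08-01T09:58:12.473-0400 I ASIO     [NetworkInterfaceASIO-Replication-0] Connecting to lazarus:20011",
]

if __name__ == "__main__":
    for key, values in connect_gaps_ms(SAMPLE).items():
        print(key, values)
```

For the four sample lines, the gaps for c20013 come out as 1, 0, and 1 ms, which matches the no-sleep retry pattern described in this ticket.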
      

            Assignee: backlog-server-repl [DO NOT USE] Backlog - Replication Team
            Reporter: milkie@mongodb.com Eric Milkie
            Votes: 0
            Watchers: 6