[SERVER-24834] Two members of replica set threw error around the same time Created: 29/Jun/16  Updated: 14/Jul/16  Resolved: 29/Jun/16

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Michal Kralik Assignee: Unassigned
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
duplicates SERVER-24711 ASIO connections that have already ti... Closed
Operating System: ALL
Participants:

 Description   

We're running a three-member replica set (primary, secondary, and hidden). Two members threw an error at around the same time. One was the hidden member; we're not sure whether the other was the primary or the secondary.

Here's the log from one machine:

2016-06-29T13:44:35.367+0000 I REPL     [ReplicationExecutor] Error in heartbeat request to 32-1c.mongo.example.com:27017; HostUnreachable: Connection refused
2016-06-29T13:44:35.379+0000 I REPL     [ReplicationExecutor] Error in heartbeat request to 32-1c.mongo.example.com:27017; HostUnreachable: Connection refused
2016-06-29T13:44:35.391+0000 I REPL     [ReplicationExecutor] Error in heartbeat request to 32-1c.mongo.example.com:27017; HostUnreachable: Connection refused
2016-06-29T13:44:37.404+0000 I REPL     [ReplicationExecutor] Error in heartbeat request to 32-1c.mongo.example.com:27017; HostUnreachable: Connection refused
2016-06-29T13:44:37.419+0000 I REPL     [ReplicationExecutor] Error in heartbeat request to 32-1c.mongo.example.com:27017; HostUnreachable: Connection refused
2016-06-29T13:44:47.391+0000 I REPL     [ReplicationExecutor] Error in heartbeat request to 32-1c.mongo.example.com:27017; ExceededTimeLimit: Couldn't get a connection within the time limit
2016-06-29T13:44:47.701+0000 I REPL     [ReplicationExecutor] Error in heartbeat request to 32-1b.mongo.example.com:27017; ExceededTimeLimit: Operation timed out
2016-06-29T13:44:48.439+0000 I ACCESS   [conn20761] Successfully authenticated as principal ceecko on admin
2016-06-29T13:44:50.265+0000 I ACCESS   [conn20761] Successfully authenticated as principal ceecko on admin
2016-06-29T13:44:57.430+0000 I -        [NetworkInterfaceASIO-Replication-0] Invariant failure _connection.is_initialized() src/mongo/executor/network_interface_asio_operation.cpp 142
2016-06-29T13:44:57.430+0000 I -        [NetworkInterfaceASIO-Replication-0]
 
***aborting after invariant() failure
 
 
2016-06-29T13:44:57.457+0000 F -        [NetworkInterfaceASIO-Replication-0] Got signal: 6 (Aborted).
 
 0x131a0d2 0x1319229 0x1319a32 0x7ff12651f100 0x7ff1261835f7 0x7ff126184ce8 0x12a393b 0x10e779d 0x10c6773 0x10c776c 0x10c7cd7 0x10c8409 0x1335de1 0x1336001 0x133a19f 0x10d37e5 0x1b34290 0x7ff126517dc5 0x7ff126244ced
----- BEGIN BACKTRACE -----
{"backtrace":[{"b":"400000","o":"F1A0D2","s":"_ZN5mongo15printStackTraceERSo"},{"b":"400000","o":"F19229"},{"b":"400000","o":"F19A32"},{"b":"7FF126510000","o":"F100"},{"b":"7FF12614E000","o":"355F7","s":"gsignal"},{"b":"7FF12614E000","o":"36CE8","s":"abort"},{"b":"400000","o":"EA393B","s":"_ZN5mongo15invariantFailedEPKcS1_j"},{"b":"400000","o":"CE779D"},{"b":"400000","o":"CC6773"},{"b":"400000","o":"CC776C"},{"b":"400000","o":"CC7CD7"},{"b":"400000","o":"CC8409"},{"b":"400000","o":"F35DE1","s":"_ZN4asio6detail9scheduler10do_run_oneERNS0_11scoped_lockINS0_11posix_mutexEEERNS0_21scheduler_thread_infoERKSt10error_code"},{"b":"400000","o":"F36001","s":"_ZN4asio6detail9scheduler3runERSt10error_code"},{"b":"400000","o":"F3A19F","s":"_ZN4asio10io_service3runEv"},{"b":"400000","o":"CD37E5"},{"b":"400000","o":"1734290","s":"execute_native_thread_routine"},{"b":"7FF126510000","o":"7DC5"},{"b":"7FF12614E000","o":"F6CED","s":"clone"}],"processInfo":{ "mongodbVersion" : "3.2.7", "gitVersion" : "4249c1d2b5999ebbf1fdf3bc0e0e3b3ff5c0aaf2", "compiledModules" : [], "uname" : { "sysname" : "Linux", "release" : "3.10.0-327.18.2.el7.x86_64", "version" : "#1 SMP Thu May 12 11:03:55 UTC 2016", "machine" : "x86_64" }, "somap" : [ { "elfType" : 2, "b" : "400000", "buildId" : "05C2980D41C615E7C1AB7B5330630B8AB5F5B9D0" }, { "b" : "7FFE62690000", "elfType" : 3, "buildId" : "627B075D566CF4BFF68497DAB7DF9B024F8E5A83" }, { "b" : "7FF127438000", "path" : "/lib64/libssl.so.10", "elfType" : 3, "buildId" : "478D01A08B923A251D755BB421F3EBAF9F2982C1" }, { "b" : "7FF127050000", "path" : "/lib64/libcrypto.so.10", "elfType" : 3, "buildId" : "42AAFD25E9B5F4CE2EFE6309491445B1A92A575D" }, { "b" : "7FF126E48000", "path" : "/lib64/librt.so.1", "elfType" : 3, "buildId" : "CB0D2C9F29DBD13C47E7D2EEFB94B35835698CCA" }, { "b" : "7FF126C44000", "path" : "/lib64/libdl.so.2", "elfType" : 3, "buildId" : "091060A163E7EDA25572F3B1BAF2E8F80209C00E" }, { "b" : "7FF126942000", "path" : "/lib64/libm.so.6", "elfType" : 3, 
"buildId" : "F9DF294FB70243549DCB643F1322BB20E70E9FE8" }, { "b" : "7FF12672C000", "path" : "/lib64/libgcc_s.so.1", "elfType" : 3, "buildId" : "6AA1DCC4DE7F1836344949857FC2017278631FFD" }, { "b" : "7FF126510000", "path" : "/lib64/libpthread.so.0", "elfType" : 3, "buildId" : "723F0AC75EF88E778940AE8A8BC30141D85B116A" }, { "b" : "7FF12614E000", "path" : "/lib64/libc.so.6", "elfType" : 3, "buildId" : "088D48A9AB5A512D9F75BA3D66B6CF77EB6588F9" }, { "b" : "7FF1276A5000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, "buildId" : "09E1BB4D034C7263810A41100647068858A7ECB6" }, { "b" : "7FF125F02000", "path" : "/lib64/libgssapi_krb5.so.2", "elfType" : 3, "buildId" : "D46A230FFF4A7B808B3CFC213D31FCAC542FB504" }, { "b" : "7FF125C1D000", "path" : "/lib64/libkrb5.so.3", "elfType" : 3, "buildId" : "6D6136A0E795420B05854DEF13A10C226FE9CCB2" }, { "b" : "7FF125A19000", "path" : "/lib64/libcom_err.so.2", "elfType" : 3, "buildId" : "3A1166709F88740C49E060731832E3FAD2DFB66B" }, { "b" : "7FF1257E7000", "path" : "/lib64/libk5crypto.so.3", "elfType" : 3, "buildId" : "AA97A848DD7C9E57B06EC913E10D420AEBBCE027" }, { "b" : "7FF1255D1000", "path" : "/lib64/libz.so.1", "elfType" : 3, "buildId" : "1982C8CDAE90F898D1AD26DC07E807333B4789D0" }, { "b" : "7FF1253C2000", "path" : "/lib64/libkrb5support.so.0", "elfType" : 3, "buildId" : "AEF6C3D3C5152F339942041519A106FC055DAF71" }, { "b" : "7FF1251BE000", "path" : "/lib64/libkeyutils.so.1", "elfType" : 3, "buildId" : "2E01D5AC08C1280D013AAB96B292AC58BC30A263" }, { "b" : "7FF124FA4000", "path" : "/lib64/libresolv.so.2", "elfType" : 3, "buildId" : "D02DC134F38F06F3885231FD2486D5EF4796E5F9" }, { "b" : "7FF124D7F000", "path" : "/lib64/libselinux.so.1", "elfType" : 3, "buildId" : "82FF6B18E1E42825CC2D060F969479AD4AF2F62C" }, { "b" : "7FF124B1E000", "path" : "/lib64/libpcre.so.1", "elfType" : 3, "buildId" : "AE64AA461A26E01F60408013D361749D56DD0AE1" }, { "b" : "7FF1248F9000", "path" : "/lib64/liblzma.so.5", "elfType" : 3, "buildId" : 
"98131C9354279ABD39FD80D4BE5B3EC5678BD9E0" } ] }}
 
 mongod(_ZN5mongo15printStackTraceERSo+0x32) [0x131a0d2]
 mongod(+0xF19229) [0x1319229]
 mongod(+0xF19A32) [0x1319a32]
 libpthread.so.0(+0xF100) [0x7ff12651f100]
 libc.so.6(gsignal+0x37) [0x7ff1261835f7]
 libc.so.6(abort+0x148) [0x7ff126184ce8]
 mongod(_ZN5mongo15invariantFailedEPKcS1_j+0xCB) [0x12a393b]
 mongod(+0xCE779D) [0x10e779d]
 mongod(+0xCC6773) [0x10c6773]
 mongod(+0xCC776C) [0x10c776c]
 mongod(+0xCC7CD7) [0x10c7cd7]
 mongod(+0xCC8409) [0x10c8409]
 mongod(_ZN4asio6detail9scheduler10do_run_oneERNS0_11scoped_lockINS0_11posix_mutexEEERNS0_21scheduler_thread_infoERKSt10error_code+0x2F1) [0x1335de1]
 mongod(_ZN4asio6detail9scheduler3runERSt10error_code+0xC1) [0x1336001]
 mongod(_ZN4asio10io_service3runEv+0x2F) [0x133a19f]
 mongod(+0xCD37E5) [0x10d37e5]
 mongod(execute_native_thread_routine+0x20) [0x1b34290]
 libpthread.so.0(+0x7DC5) [0x7ff126517dc5]
 libc.so.6(clone+0x6D) [0x7ff126244ced]
-----  END BACKTRACE  -----
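
The heartbeat failures that precede the abort can be tallied per target host and error code to see which member became unreachable first. The following is only a minimal sketch: the regular expression is inferred from the log lines above, not taken from any MongoDB tooling, and the function name `summarize_heartbeat_errors` is hypothetical.

```python
import re
from collections import Counter

# Pattern inferred from the heartbeat failure lines in the log above
# (an assumption about the 3.2.x log format, not an official grammar).
HEARTBEAT_RE = re.compile(
    r"^(?P<ts>\S+) I REPL\s+\[ReplicationExecutor\] "
    r"Error in heartbeat request to (?P<host>\S+); "
    r"(?P<code>\w+): (?P<reason>.*)$"
)


def summarize_heartbeat_errors(lines):
    """Count heartbeat failures per (target host, error code)."""
    counts = Counter()
    for line in lines:
        match = HEARTBEAT_RE.match(line.strip())
        if match:
            counts[(match["host"], match["code"])] += 1
    return counts


# Four lines copied from the log above; the ACCESS line is ignored.
sample = [
    "2016-06-29T13:44:35.367+0000 I REPL     [ReplicationExecutor] Error in heartbeat request to 32-1c.mongo.example.com:27017; HostUnreachable: Connection refused",
    "2016-06-29T13:44:47.391+0000 I REPL     [ReplicationExecutor] Error in heartbeat request to 32-1c.mongo.example.com:27017; ExceededTimeLimit: Couldn't get a connection within the time limit",
    "2016-06-29T13:44:47.701+0000 I REPL     [ReplicationExecutor] Error in heartbeat request to 32-1b.mongo.example.com:27017; ExceededTimeLimit: Operation timed out",
    "2016-06-29T13:44:48.439+0000 I ACCESS   [conn20761] Successfully authenticated as principal ceecko on admin",
]

summary = summarize_heartbeat_errors(sample)
for (host, code), count in sorted(summary.items()):
    print(f"{host} {code} {count}")
```

Running the same function over the full log of each machine would show whether both members lost contact with the same peer before hitting the invariant.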

And here's the log from the other machine:

2016-06-29T13:27:44.438+0000 I REPL     [ReplicationExecutor] Error in heartbeat request to 32-1b.mongo.example.com:27017; ExceededTimeLimit: Operation timed out
2016-06-29T13:27:45.981+0000 I NETWORK  [conn118] end connection 149.202.164.250:48007 (4 connections now open)
2016-06-29T13:27:56.439+0000 I REPL     [ReplicationExecutor] Error in heartbeat request to 32-1b.mongo.example.com:27017; ExceededTimeLimit: Couldn't get a connection within the time limit
2016-06-29T13:28:07.997+0000 I NETWORK  [initandlisten] connection accepted from 149.202.164.250:48726 #119 (5 connections now open)
2016-06-29T13:28:08.110+0000 I ACCESS   [conn119] Successfully authenticated as principal __system on local
2016-06-29T13:28:08.441+0000 I REPL     [ReplicationExecutor] Error in heartbeat request to 32-1b.mongo.example.com:27017; ExceededTimeLimit: Couldn't get a connection within the time limit
2016-06-29T13:28:15.998+0000 I NETWORK  [initandlisten] connection accepted from 149.202.164.250:48725 #120 (6 connections now open)
2016-06-29T13:28:16.104+0000 I ACCESS   [conn120] Successfully authenticated as principal __system on local
2016-06-29T13:28:16.117+0000 I NETWORK  [conn120] end connection 149.202.164.250:48725 (5 connections now open)
2016-06-29T13:28:18.432+0000 I ACCESS   [conn56] Successfully authenticated as principal ceecko on admin
2016-06-29T13:28:18.445+0000 I -        [NetworkInterfaceASIO-Replication-0] Invariant failure _connection.is_initialized() src/mongo/executor/network_interface_asio_operation.cpp 142
2016-06-29T13:28:18.445+0000 I -        [NetworkInterfaceASIO-Replication-0]
 
***aborting after invariant() failure
 
 
2016-06-29T13:28:18.462+0000 F -        [NetworkInterfaceASIO-Replication-0] Got signal: 6 (Aborted).
 
 0x131a0d2 0x1319229 0x1319a32 0x7fcbacce3100 0x7fcbac9475f7 0x7fcbac948ce8 0x12a393b 0x10e779d 0x10c6773 0x10c776c 0x10c7cd7 0x10c8409 0x1335de1 0x1336001 0x133a19f 0x10d37e5 0x1b34290 0x7fcbaccdbdc5 0x7fcbaca08ced
----- BEGIN BACKTRACE -----
{"backtrace":[{"b":"400000","o":"F1A0D2","s":"_ZN5mongo15printStackTraceERSo"},{"b":"400000","o":"F19229"},{"b":"400000","o":"F19A32"},{"b":"7FCBACCD4000","o":"F100"},{"b":"7FCBAC912000","o":"355F7","s":"gsignal"},{"b":"7FCBAC912000","o":"36CE8","s":"abort"},{"b":"400000","o":"EA393B","s":"_ZN5mongo15invariantFailedEPKcS1_j"},{"b":"400000","o":"CE779D"},{"b":"400000","o":"CC6773"},{"b":"400000","o":"CC776C"},{"b":"400000","o":"CC7CD7"},{"b":"400000","o":"CC8409"},{"b":"400000","o":"F35DE1","s":"_ZN4asio6detail9scheduler10do_run_oneERNS0_11scoped_lockINS0_11posix_mutexEEERNS0_21scheduler_thread_infoERKSt10error_code"},{"b":"400000","o":"F36001","s":"_ZN4asio6detail9scheduler3runERSt10error_code"},{"b":"400000","o":"F3A19F","s":"_ZN4asio10io_service3runEv"},{"b":"400000","o":"CD37E5"},{"b":"400000","o":"1734290","s":"execute_native_thread_routine"},{"b":"7FCBACCD4000","o":"7DC5"},{"b":"7FCBAC912000","o":"F6CED","s":"clone"}],"processInfo":{ "mongodbVersion" : "3.2.7", "gitVersion" : "4249c1d2b5999ebbf1fdf3bc0e0e3b3ff5c0aaf2", "compiledModules" : [], "uname" : { "sysname" : "Linux", "release" : "3.10.0-327.18.2.el7.x86_64", "version" : "#1 SMP Thu May 12 11:03:55 UTC 2016", "machine" : "x86_64" }, "somap" : [ { "elfType" : 2, "b" : "400000", "buildId" : "05C2980D41C615E7C1AB7B5330630B8AB5F5B9D0" }, { "b" : "7FFF2F8F7000", "elfType" : 3, "buildId" : "627B075D566CF4BFF68497DAB7DF9B024F8E5A83" }, { "b" : "7FCBADBFC000", "path" : "/lib64/libssl.so.10", "elfType" : 3, "buildId" : "478D01A08B923A251D755BB421F3EBAF9F2982C1" }, { "b" : "7FCBAD814000", "path" : "/lib64/libcrypto.so.10", "elfType" : 3, "buildId" : "42AAFD25E9B5F4CE2EFE6309491445B1A92A575D" }, { "b" : "7FCBAD60C000", "path" : "/lib64/librt.so.1", "elfType" : 3, "buildId" : "CB0D2C9F29DBD13C47E7D2EEFB94B35835698CCA" }, { "b" : "7FCBAD408000", "path" : "/lib64/libdl.so.2", "elfType" : 3, "buildId" : "091060A163E7EDA25572F3B1BAF2E8F80209C00E" }, { "b" : "7FCBAD106000", "path" : "/lib64/libm.so.6", "elfType" : 3, 
"buildId" : "F9DF294FB70243549DCB643F1322BB20E70E9FE8" }, { "b" : "7FCBACEF0000", "path" : "/lib64/libgcc_s.so.1", "elfType" : 3, "buildId" : "6AA1DCC4DE7F1836344949857FC2017278631FFD" }, { "b" : "7FCBACCD4000", "path" : "/lib64/libpthread.so.0", "elfType" : 3, "buildId" : "723F0AC75EF88E778940AE8A8BC30141D85B116A" }, { "b" : "7FCBAC912000", "path" : "/lib64/libc.so.6", "elfType" : 3, "buildId" : "088D48A9AB5A512D9F75BA3D66B6CF77EB6588F9" }, { "b" : "7FCBADE69000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, "buildId" : "09E1BB4D034C7263810A41100647068858A7ECB6" }, { "b" : "7FCBAC6C6000", "path" : "/lib64/libgssapi_krb5.so.2", "elfType" : 3, "buildId" : "D46A230FFF4A7B808B3CFC213D31FCAC542FB504" }, { "b" : "7FCBAC3E1000", "path" : "/lib64/libkrb5.so.3", "elfType" : 3, "buildId" : "6D6136A0E795420B05854DEF13A10C226FE9CCB2" }, { "b" : "7FCBAC1DD000", "path" : "/lib64/libcom_err.so.2", "elfType" : 3, "buildId" : "3A1166709F88740C49E060731832E3FAD2DFB66B" }, { "b" : "7FCBABFAB000", "path" : "/lib64/libk5crypto.so.3", "elfType" : 3, "buildId" : "AA97A848DD7C9E57B06EC913E10D420AEBBCE027" }, { "b" : "7FCBABD95000", "path" : "/lib64/libz.so.1", "elfType" : 3, "buildId" : "1982C8CDAE90F898D1AD26DC07E807333B4789D0" }, { "b" : "7FCBABB86000", "path" : "/lib64/libkrb5support.so.0", "elfType" : 3, "buildId" : "AEF6C3D3C5152F339942041519A106FC055DAF71" }, { "b" : "7FCBAB982000", "path" : "/lib64/libkeyutils.so.1", "elfType" : 3, "buildId" : "2E01D5AC08C1280D013AAB96B292AC58BC30A263" }, { "b" : "7FCBAB768000", "path" : "/lib64/libresolv.so.2", "elfType" : 3, "buildId" : "D02DC134F38F06F3885231FD2486D5EF4796E5F9" }, { "b" : "7FCBAB543000", "path" : "/lib64/libselinux.so.1", "elfType" : 3, "buildId" : "82FF6B18E1E42825CC2D060F969479AD4AF2F62C" }, { "b" : "7FCBAB2E2000", "path" : "/lib64/libpcre.so.1", "elfType" : 3, "buildId" : "AE64AA461A26E01F60408013D361749D56DD0AE1" }, { "b" : "7FCBAB0BD000", "path" : "/lib64/liblzma.so.5", "elfType" : 3, "buildId" : 
"98131C9354279ABD39FD80D4BE5B3EC5678BD9E0" } ] }}
 mongod(_ZN5mongo15printStackTraceERSo+0x32) [0x131a0d2]
 mongod(+0xF19229) [0x1319229]
 mongod(+0xF19A32) [0x1319a32]
 libpthread.so.0(+0xF100) [0x7fcbacce3100]
 libc.so.6(gsignal+0x37) [0x7fcbac9475f7]
 libc.so.6(abort+0x148) [0x7fcbac948ce8]
 mongod(_ZN5mongo15invariantFailedEPKcS1_j+0xCB) [0x12a393b]
 mongod(+0xCE779D) [0x10e779d]
 mongod(+0xCC6773) [0x10c6773]
 mongod(+0xCC776C) [0x10c776c]
 mongod(+0xCC7CD7) [0x10c7cd7]
 mongod(+0xCC8409) [0x10c8409]
 mongod(_ZN4asio6detail9scheduler10do_run_oneERNS0_11scoped_lockINS0_11posix_mutexEEERNS0_21scheduler_thread_infoERKSt10error_code+0x2F1) [0x1335de1]
 mongod(_ZN4asio6detail9scheduler3runERSt10error_code+0xC1) [0x1336001]
 mongod(_ZN4asio10io_service3runEv+0x2F) [0x133a19f]
 mongod(+0xCD37E5) [0x10d37e5]
 mongod(execute_native_thread_routine+0x20) [0x1b34290]
 libpthread.so.0(+0x7DC5) [0x7fcbaccdbdc5]
 libc.so.6(clone+0x6D) [0x7fcbaca08ced]
-----  END BACKTRACE  -----
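
The JSON backtrace and the raw address list appear to describe the same frames: adding each frame's offset `o` to its base `b` gives the address printed in the log. That reading of `b` and `o` is an assumption inferred from the numbers in this ticket, and `frame_addresses` is a hypothetical helper. A minimal sketch using four frames copied from the first machine's backtrace:

```python
import json


def frame_addresses(backtrace_json):
    """Return (absolute address, symbol or None) for each backtrace frame.

    Assumes "b" is the load base of the module and "o" is the offset
    within it, both hexadecimal strings without a 0x prefix.
    """
    doc = json.loads(backtrace_json)
    return [
        (int(frame["b"], 16) + int(frame["o"], 16), frame.get("s"))
        for frame in doc["backtrace"]
    ]


# Four frames copied from the first machine's JSON backtrace.
sample_json = """
{"backtrace":[
  {"b":"400000","o":"F1A0D2","s":"_ZN5mongo15printStackTraceERSo"},
  {"b":"7FF12614E000","o":"36CE8","s":"abort"},
  {"b":"400000","o":"EA393B","s":"_ZN5mongo15invariantFailedEPKcS1_j"},
  {"b":"400000","o":"CE779D"}
]}
"""

frames = frame_addresses(sample_json)
for address, symbol in frames:
    print(hex(address), symbol or "(no symbol)")
```

The computed values match the addresses listed in the log (`0x131a0d2`, `0x7ff126184ce8`, `0x12a393b`, `0x10e779d`). The `mongod` frames have identical addresses on both machines, which is consistent with the same binary (same `buildId`) failing at the same call site.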



 Comments   
Comment by Ramon Fernandez Marina [ 29/Jun/16 ]

Sorry you're running into difficulties, ceecko@gmail.com. This bug was reported earlier in SERVER-24711, so I'm going to mark this ticket as a duplicate.

The fix for SERVER-24711 is included in development version 3.3.9, released yesterday. The fix is also being tested right now for stable version 3.2.8, which is scheduled for release in the coming weeks.

Thanks,
Ramón.
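
To check a deployment against the versions named in the comment above, a version string such as the `mongodbVersion` field in the backtrace can be compared with them. This is a rough sketch only: it assumes the fix lands in 3.2.8 as planned, makes no claim about release series the comment does not mention, and `expected_to_contain_fix` is a hypothetical helper.

```python
def parse_version(text):
    """Turn a dotted version string such as "3.2.7" into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


# Taken from the comment: fixed in 3.3.9 (development) and planned
# for 3.2.8 (stable). The 3.2.8 entry is an assumption until released.
FIXED_IN = {(3, 2): (3, 2, 8), (3, 3): (3, 3, 9)}


def expected_to_contain_fix(version_text):
    """Return True/False for the 3.2 and 3.3 series, None otherwise."""
    version = parse_version(version_text)
    fixed = FIXED_IN.get(version[:2])
    if fixed is None:
        return None  # series not mentioned in the ticket; no claim made
    return version >= fixed


# "3.2.7" is the mongodbVersion reported in both backtraces.
for candidate in ("3.2.7", "3.2.8", "3.3.9"):
    print(candidate, expected_to_contain_fix(candidate))
```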
