[SERVER-6673] Piping a String to SSL enabled mongos via openssl s_client causes a crash with no stack trace
Created: 31/Jul/12  Updated: 23/Feb/15  Resolved: 13/Aug/12

Status: Closed
Project: Core Server
Component/s: Networking, Sharding
Affects Version/s: 2.2.0-rc0
Fix Version/s: None

Type: Bug
Priority: Major - P3
Reporter: Adam Comerford
Assignee: Greg Studer
Resolution: Duplicate
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Ubuntu Linux 12.04 (mongos) - 11.04 (build and mongod), mongos/mongod 2.2.0-rc0 with SSL


Issue Links:
Depends
Duplicate
duplicates SERVER-6509 ignore SIGPIPE Closed
Operating System: Linux
Participants:

 Description   

When run on the same host, the following command crashed the mongos process immediately:

echo "GET /" | openssl s_client -connect localhost:27017

When connecting from a remote host, connection speed seems to matter: over a VPN the crash did not trigger at all, whereas from another, very close host this loop caused the crash after several iterations:

while true; do echo "GET /" | openssl s_client -connect localhost:27017; done

With strace attached to the running process, this is the output immediately before the crash:

accept(6, {sa_family=AF_INET, sin_port=htons(44893), sin_addr=inet_addr("10.7.100.20")}, [16]) = 30
setsockopt(30, SOL_TCP, TCP_NODELAY, [1], 4) = 0
setsockopt(30, SOL_SOCKET, SO_KEEPALIVE, [1], 4) = 0
getsockopt(30, SOL_TCP, TCP_KEEPIDLE, [7200], [4]) = 0
setsockopt(30, SOL_TCP, TCP_KEEPIDLE, [300], 4) = 0
getsockopt(30, SOL_TCP, TCP_KEEPINTVL, [75], [4]) = 0
write(1, "Wed Aug  1 01:33:08 [mongosMain]"..., 106) = 106
getrlimit(RLIMIT_STACK, {rlim_cur=8192*1024, rlim_max=RLIM_INFINITY}) = 0
clone(child_stack=0x7f03687ddfb0, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0x7f03687de9d0, tls=0x7f03687de700, child_tidptr=0x7f03687de9d0) = 22074
select(8, [6 7], NULL, NULL, {0, 10000}) = 1 (in [6], left {0, 1064})
accept(6, {sa_family=AF_INET, sin_port=htons(44894), sin_addr=inet_addr("10.7.100.20")}, [16]) = 30
setsockopt(30, SOL_TCP, TCP_NODELAY, [1], 4) = 0
setsockopt(30, SOL_SOCKET, SO_KEEPALIVE, [1], 4) = 0
getsockopt(30, SOL_TCP, TCP_KEEPIDLE, [7200], [4]) = 0
setsockopt(30, SOL_TCP, TCP_KEEPIDLE, [300], 4) = 0
getsockopt(30, SOL_TCP, TCP_KEEPINTVL, [75], [4]) = 0
write(1, "Wed Aug  1 01:33:08 [mongosMain]"..., 106) = 106
getrlimit(RLIMIT_STACK, {rlim_cur=8192*1024, rlim_max=RLIM_INFINITY}) = 0
clone(child_stack=0x7f03687ddfb0, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0x7f03687de9d0, tls=0x7f03687de700, child_tidptr=0x7f03687de9d0) = 22075
select(8, [6 7], NULL, NULL, {0, 10000} <unfinished ...>
+++ killed by SIGPIPE +++
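
Since the process leaves no stack trace, the strace termination marker is the only record of why it died. The following is a minimal sketch of how that marker could be pulled out of captured strace output. The function name `fatal_signal` is hypothetical, and the pattern assumes the `+++ killed by SIG... +++` line format shown above.

```python
import re

# Matches strace's termination marker, e.g. "+++ killed by SIGPIPE +++".
# Assumption: the marker starts a line, as in the trace above.
KILLED_RE = re.compile(r"^\+\+\+ killed by (SIG[A-Z0-9]+)", re.MULTILINE)


def fatal_signal(strace_text):
    """Return the name of the signal that killed the traced process, or None."""
    match = KILLED_RE.search(strace_text)
    return match.group(1) if match else None
```

Run against the trace above, this would report SIGPIPE as the cause of death, which is consistent with the absence of any error in the mongos log.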

The logs give no indication of a problem. The last 50 lines:

Wed Aug  1 00:02:40 [mongosMain] connection accepted from 192.168.100.20:48007 #497 (3 connections now open)
Wed Aug  1 00:02:40 [conn497] end connection 192.168.100.20:48007 (2 connections now open)
Wed Aug  1 00:02:40 [mongosMain] connection accepted from 192.168.100.20:48008 #498 (3 connections now open)
Wed Aug  1 00:02:40 [conn498] end connection 192.168.100.20:48008 (2 connections now open)
Wed Aug  1 00:02:40 [mongosMain] connection accepted from 192.168.100.20:48009 #499 (3 connections now open)
Wed Aug  1 00:02:40 [conn499] end connection 192.168.100.20:48009 (2 connections now open)
Wed Aug  1 00:02:40 [mongosMain] connection accepted from 192.168.100.20:48010 #500 (3 connections now open)
Wed Aug  1 00:02:40 [conn500] end connection 192.168.100.20:48010 (2 connections now open)
Wed Aug  1 00:02:40 [mongosMain] connection accepted from 192.168.100.20:48011 #501 (3 connections now open)
Wed Aug  1 00:02:40 [conn501] end connection 192.168.100.20:48011 (2 connections now open)
Wed Aug  1 00:02:40 [mongosMain] connection accepted from 192.168.100.20:48012 #502 (3 connections now open)
Wed Aug  1 00:02:40 [conn502] end connection 192.168.100.20:48012 (2 connections now open)
Wed Aug  1 00:02:40 [mongosMain] connection accepted from 192.168.100.20:48013 #503 (3 connections now open)
Wed Aug  1 00:02:40 [conn503] end connection 192.168.100.20:48013 (2 connections now open)
Wed Aug  1 00:02:40 [mongosMain] connection accepted from 192.168.100.20:48014 #504 (3 connections now open)
Wed Aug  1 00:02:40 [conn504] end connection 192.168.100.20:48014 (2 connections now open)
Wed Aug  1 00:02:41 [mongosMain] connection accepted from 192.168.100.20:48015 #505 (3 connections now open)
Wed Aug  1 00:02:41 [conn505] end connection 192.168.100.20:48015 (2 connections now open)
Wed Aug  1 00:02:41 [mongosMain] connection accepted from 192.168.100.20:48016 #506 (3 connections now open)
Wed Aug  1 00:02:41 [conn506] end connection 192.168.100.20:48016 (2 connections now open)
Wed Aug  1 00:02:41 [mongosMain] connection accepted from 192.168.100.20:48017 #507 (3 connections now open)
Wed Aug  1 00:02:41 [conn507] end connection 192.168.100.20:48017 (2 connections now open)
Wed Aug  1 00:02:41 [mongosMain] connection accepted from 192.168.100.20:48018 #508 (3 connections now open)
Wed Aug  1 00:02:41 [conn508] end connection 192.168.100.20:48018 (2 connections now open)
Wed Aug  1 00:02:41 [mongosMain] connection accepted from 192.168.100.20:48019 #509 (3 connections now open)
Wed Aug  1 00:02:41 [conn509] end connection 192.168.100.20:48019 (2 connections now open)
Wed Aug  1 00:02:41 [mongosMain] connection accepted from 192.168.100.20:48020 #510 (3 connections now open)
Wed Aug  1 00:02:41 [conn510] end connection 192.168.100.20:48020 (2 connections now open)
Wed Aug  1 00:02:41 [mongosMain] connection accepted from 192.168.100.20:48021 #511 (3 connections now open)
Wed Aug  1 00:02:41 [conn511] end connection 192.168.100.20:48021 (2 connections now open)
Wed Aug  1 00:02:41 [mongosMain] connection accepted from 192.168.100.20:48022 #512 (3 connections now open)
Wed Aug  1 00:02:41 [conn512] end connection 192.168.100.20:48022 (2 connections now open)
Wed Aug  1 00:02:41 [mongosMain] connection accepted from 192.168.100.20:48023 #513 (3 connections now open)
Wed Aug  1 00:02:41 [conn513] end connection 192.168.100.20:48023 (2 connections now open)
Wed Aug  1 00:02:41 [mongosMain] connection accepted from 192.168.100.20:48024 #514 (3 connections now open)
Wed Aug  1 00:02:41 [conn514] end connection 192.168.100.20:48024 (2 connections now open)
Wed Aug  1 00:02:41 [mongosMain] connection accepted from 192.168.100.20:48025 #515 (3 connections now open)
Wed Aug  1 00:02:41 [conn515] end connection 192.168.100.20:48025 (2 connections now open)
Wed Aug  1 00:02:41 [mongosMain] connection accepted from 192.168.100.20:48026 #516 (3 connections now open)
Wed Aug  1 00:02:41 [conn516] end connection 192.168.100.20:48026 (2 connections now open)
Wed Aug  1 00:02:41 [mongosMain] connection accepted from 192.168.100.20:48027 #517 (3 connections now open)
Wed Aug  1 00:02:41 [conn517] end connection 192.168.100.20:48027 (2 connections now open)
Wed Aug  1 00:02:41 [mongosMain] connection accepted from 192.168.100.20:48028 #518 (3 connections now open)
Wed Aug  1 00:02:41 [conn518] end connection 192.168.100.20:48028 (2 connections now open)
Wed Aug  1 00:02:41 [mongosMain] connection accepted from 192.168.100.20:48029 #519 (3 connections now open)
Wed Aug  1 00:02:41 [conn519] end connection 192.168.100.20:48029 (2 connections now open)
Wed Aug  1 00:02:41 [mongosMain] connection accepted from 192.168.100.20:48030 #520 (3 connections now open)
Wed Aug  1 00:02:41 [conn520] end connection 192.168.100.20:48030 (2 connections now open)
Wed Aug  1 00:02:41 [mongosMain] connection accepted from 192.168.100.20:48031 #521 (3 connections now open)
Wed Aug  1 00:02:41 [conn521] end connection 192.168.100.20:48031 (2 connections now open)



 Comments   
Comment by Adam Comerford [ 13/Aug/12 ]

Confirmed as fixed by the SIGPIPE changes in SERVER-6509.

Comment by Adam Comerford [ 13/Aug/12 ]

Confirmed this no longer crashes (after several days of constant testing).

Comment by Adam Comerford [ 10/Aug/12 ]

Testing of v2.2.0-rc1-pre- has not shown this crash after hundreds of iterations. I will leave it running overnight to confirm.

Comment by Adam Comerford [ 10/Aug/12 ]

If SERVER-6509 is the fix, then we can close this as a dupe. I still have my test shards running, so it is easy to rebuild and retest. I will do so shortly.

Comment by Greg Studer [ 10/Aug/12 ]

The problem is that we can get a SIGPIPE when using SSL, and we don't handle it on certain OSes.
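
To illustrate the failure mode described in this comment, here is a small Python sketch (not mongos code, and not the actual SERVER-6509 change, which is not shown in this ticket). It writes to a socket whose peer has already closed, once with the default SIGPIPE disposition and once with SIGPIPE ignored. The names `CHILD` and `run_writer` are made up for this example, and it assumes a POSIX system such as Linux.

```python
import signal
import subprocess
import sys

# Hypothetical child program: send on a socket whose peer has already closed.
# Python ignores SIGPIPE at startup, so the disposition is set explicitly.
CHILD = r"""
import signal, socket, sys
mode = signal.SIG_DFL if sys.argv[1] == "default" else signal.SIG_IGN
signal.signal(signal.SIGPIPE, mode)
a, b = socket.socketpair()
b.close()
try:
    a.send(b"GET /\n")
except BrokenPipeError:
    print("EPIPE")
"""


def run_writer(disposition):
    """Run the child with SIGPIPE set to 'default' or 'ignore'.

    Returns (returncode, stdout). A negative return code means the child
    was terminated by that signal number.
    """
    proc = subprocess.run(
        [sys.executable, "-c", CHILD, disposition],
        capture_output=True,
        text=True,
    )
    return proc.returncode, proc.stdout.strip()
```

With the default disposition the child is expected to die silently from the signal, much like the mongos process in the strace above. With SIGPIPE ignored, the same write fails with an ordinary EPIPE error that the program can handle and log.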

Generated at Thu Feb 08 03:12:22 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.