Type: Bug
Resolution: Duplicate
Priority: Major - P3
Affects Version/s: 2.2.1
Component/s: Replication
Environment: Amazon Linux 64 bit
Linux
When a replica set member is removed with rs.remove(), the primary crashes with a segmentation fault. This happens only when SSL is enabled; with SSL disabled, the member can be removed without issue.
Here is a log showing the seg fault:
Mon Feb 18 18:35:53 [journal] _groupCommit
Mon Feb 18 18:35:53 [journal] _groupCommit upgrade
Mon Feb 18 18:35:53 [journal] journal REMAPPRIVATEVIEW
Mon Feb 18 18:35:53 [journal] journal REMAPPRIVATEVIEW done startedAt: 3 n:2 0ms
Mon Feb 18 18:35:53 [journal] groupCommit end
Mon Feb 18 18:35:53 [rsHealthPoll] Sending command
to 10.4.230.134:27017 with $auth: {}
Mon Feb 18 18:35:54 [rsHealthPoll] Sending command
to 10.4.228.114:27017 with $auth: {}
Mon Feb 18 18:35:54 [conn18] runQuery called local.$cmd { count: "system.replset", query: {}, fields: {} }
Mon Feb 18 18:35:54 [conn18] run command local.$cmd { count: "system.replset", query: {}, fields: {} }
Mon Feb 18 18:35:54 [conn18] command local.$cmd command: { count: "system.replset", query: {}, fields: {} } ntoreturn:1 keyUpdates:0 locks(micros) r:28 reslen:48 0ms
Mon Feb 18 18:35:54 [conn18] runQuery called local.system.replset {}
Mon Feb 18 18:35:54 [conn18] query local.system.replset ntoreturn:1 keyUpdates:0 locks(micros) r:46 nreturned:1 reslen:208 0ms
Mon Feb 18 18:35:54 [conn18] runQuery called admin.$cmd { replSetReconfig: { _id: "triplink", version: 75238, members: [
,
{ _id: 3, host: "10.4.228.114:27017" } ] } }
Mon Feb 18 18:35:54 [conn18] run command admin.$cmd { replSetReconfig: { _id: "triplink", version: 75238, members: [
,
{ _id: 3, host: "10.4.228.114:27017" } ] } }
Mon Feb 18 18:35:54 [conn18] command: { replSetReconfig: { _id: "triplink", version: 75238, members: [
,
{ _id: 3, host: "10.4.228.114:27017" } ] } }
Mon Feb 18 18:35:54 [conn18] replSet replSetReconfig config object parses ok, 2 members specified
Mon Feb 18 18:35:54 [conn18] getMyAddrs(): [127.0.0.1] [10.4.232.178] [::1] [fe80::47b:33ff:fec4:80c9%eth0]
Mon Feb 18 18:35:54 [conn18] getallIPs("10.4.228.114"): [10.4.228.114]
Mon Feb 18 18:35:54 BackgroundJob starting: ConnectBG
Mon Feb 18 18:35:54 [conn18] Sending command
to 10.4.228.114:27017 with $auth: {}
Mon Feb 18 18:35:54 [conn18] replSet replSetReconfig [2]
Mon Feb 18 18:35:54 [conn18] replSet info saving a newer config version to local.system.replset
Mon Feb 18 18:35:54 [conn27] CoveredIndexMatcher::matches() {} 2:25f0 0
Mon Feb 18 18:35:54 [conn27] Matcher::matches() { ts: Timestamp 1361212554000|1, h: 4260825151744352383, v: 2, op: "n", ns: "", o:
}
Mon Feb 18 18:35:54 [conn27] CoveredIndexMatcher _docMatcher->matches() returns 1
Mon Feb 18 18:35:54 [conn27] getmore local.oplog.rs query: { ts:
} cursorid:2107813182278322768 ntoreturn:0 keyUpdates:0 numYields: 1 locks(micros) r:379 nreturned:1 reslen:117 4684ms
Mon Feb 18 18:35:54 [conn21] CoveredIndexMatcher::matches() {} 2:25f0 0
Mon Feb 18 18:35:54 [conn21] Matcher::matches() { ts: Timestamp 1361212554000|1, h: 4260825151744352383, v: 2, op: "n", ns: "", o:
}
Mon Feb 18 18:35:54 [conn21] CoveredIndexMatcher _docMatcher->matches() returns 1
Mon Feb 18 18:35:54 [conn21] getmore local.oplog.rs query: { ts:
} cursorid:2322837176604495716 ntoreturn:0 keyUpdates:0 locks(micros) r:198 nreturned:1 reslen:117 2410ms
Mon Feb 18 18:35:54 [conn18] replSet saveConfigLocally done
Mon Feb 18 18:35:54 [conn18] replSet attempting to relinquish
Mon Feb 18 18:35:54 [conn18] replSet relinquishing primary state
Mon Feb 18 18:35:54 [conn18] replSet SECONDARY
Mon Feb 18 18:35:54 [conn18] replSet closing client sockets after relinquishing primary
Mon Feb 18 18:35:54 Invalid access at address: 0x1d8 from thread: conn21
Mon Feb 18 18:35:54 Got signal: 11 (Segmentation fault).
Mon Feb 18 18:35:54 Backtrace:
0x9a0946 0x57d47d 0x57d7e7 0x7f2748a71500 0x7f274882ac1f 0x7f274882bb7d 0x7f2748828380 0x992a79 0x9961ef 0x98e89a 0x98fd87 0x7f2748a69851 0x7f274781911d
/usr/bin/mongod(_ZN5mongo15printStackTraceERSo+0x26) [0x9a0946]
/usr/bin/mongod(_ZN5mongo10abruptQuitEi+0x39d) [0x57d47d]
/usr/bin/mongod(_ZN5mongo24abruptQuitWithAddrSignalEiP7siginfoPv+0x247) [0x57d7e7]
/lib64/libpthread.so.0(+0xf500) [0x7f2748a71500]
/usr/lib64/libssl.so.10(ssl3_send_alert+0x4f) [0x7f274882ac1f]
/usr/lib64/libssl.so.10(ssl3_read_bytes+0x21d) [0x7f274882bb7d]
/usr/lib64/libssl.so.10(+0x22380) [0x7f2748828380]
/usr/bin/mongod(_ZN5mongo6Socket11unsafe_recvEPci+0x9) [0x992a79]
/usr/bin/mongod(_ZN5mongo6Socket4recvEPci+0x2f) [0x9961ef]
/usr/bin/mongod(_ZN5mongo13MessagingPort4recvERNS_7MessageE+0x8a) [0x98e89a]
/usr/bin/mongod(_ZN5mongo3pms9threadRunEPNS_13MessagingPortE+0x437) [0x98fd87]
/lib64/libpthread.so.0(+0x7851) [0x7f2748a69851]
/lib64/libc.so.6(clone+0x6d) [0x7f274781911d]
And here is the shell output after issuing the command:
triplink:PRIMARY> rs.remove("10.4.230.134:27017")
Mon Feb 18 18:35:54 DBClientCursor::init call() failed
Mon Feb 18 18:35:54 query failed : admin.$cmd { replSetReconfig: { _id: "triplink", version: 75238, members: [
,
{ _id: 3, host: "10.4.228.114:27017" } ] } } to: 127.0.0.1:27017
Mon Feb 18 18:35:54 Error: error doing query: failed src/mongo/shell/collection.js:155
Mon Feb 18 18:35:54 trying reconnect to 127.0.0.1:27017
Mon Feb 18 18:35:54 reconnect 127.0.0.1:27017 ok
Mon Feb 18 18:35:54 SSL Error ret: -1 err: 1 error:140790E5:SSL routines:SSL23_WRITE:ssl handshake failure
Mon Feb 18 18:35:54 Socket say send() errno:0 Success 127.0.0.1:27017
> exit
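For readers following the log: the rs.remove() call reaches the server as a replSetReconfig command carrying a new config version and a member list without the removed host. Below is a minimal sketch of how such a document could be derived from the current config. It is not the shell's actual implementation. The helper name buildRemoveConfig, the previous version number 75237, and the _id values and host of the first two members are assumptions made for illustration; only the set name, the new version 75238, the removed host, and member 3 come from the output above.

```javascript
// Hypothetical helper (not part of the mongo shell): builds the config that a
// replSetReconfig command would carry after dropping one member.
function buildRemoveConfig(currentConfig, hostToRemove) {
    // Keep every member except the one whose host matches.
    const members = currentConfig.members.filter(function (m) {
        return m.host !== hostToRemove;
    });
    if (members.length === currentConfig.members.length) {
        throw new Error("couldn't find " + hostToRemove + " in the member list");
    }
    // A reconfig must carry a higher version than the stored config.
    return Object.assign({}, currentConfig, {
        version: currentConfig.version + 1,
        members: members
    });
}

// Illustrative input. ASSUMED values: version 75237, the _id of the first two
// members, and that the first member is the primary at 10.4.232.178.
const current = {
    _id: "triplink",
    version: 75237,
    members: [
        { _id: 1, host: "10.4.232.178:27017" }, // assumed
        { _id: 2, host: "10.4.230.134:27017" }, // assumed _id
        { _id: 3, host: "10.4.228.114:27017" }
    ]
};

const next = buildRemoveConfig(current, "10.4.230.134:27017");
console.log(JSON.stringify({ replSetReconfig: next }));
```

In the report, the primary steps down and closes client sockets while applying this reconfig, and the segmentation fault occurs in a connection thread reading from an SSL socket at that moment.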
duplicates: SERVER-5487 Seg fault when shutting down replica set member - Subscription Build with SSL Enabled (Closed)