Core Server / SERVER-8643

Calling rs.remove() causes a segmentation fault when using SSL

Details

    • Bug
    • Status: Closed
    • Major - P3
    • Resolution: Duplicate
    • 2.2.1
    • None
    • Replication
    • None
    • Amazon Linux 64 bit
    • Linux
    • Steps to reproduce:

      1. Create a replica set and make sure all servers are using SSL
      2. Attempt to remove a member of the replica set using rs.remove()
      3. The primary member crashes with a segmentation fault


    Description

      When a replica set member is removed using rs.remove(), the primary member crashes with a segmentation fault. This only happens when SSL is in use. If SSL is disabled, the member can be removed without issue.

      Here is the log from the primary showing the segmentation fault:
      Mon Feb 18 18:35:53 [journal] _groupCommit
      Mon Feb 18 18:35:53 [journal] _groupCommit upgrade
      Mon Feb 18 18:35:53 [journal] journal REMAPPRIVATEVIEW
      Mon Feb 18 18:35:53 [journal] journal REMAPPRIVATEVIEW done startedAt: 3 n:2 0ms
      Mon Feb 18 18:35:53 [journal] groupCommit end
      Mon Feb 18 18:35:53 [rsHealthPoll] Sending command { replSetHeartbeat: "triplink", v: 75237, pv: 1, checkEmpty: false, from: "10.4.232.178:27017" } to 10.4.230.134:27017 with $auth: {}
      Mon Feb 18 18:35:54 [rsHealthPoll] Sending command { replSetHeartbeat: "triplink", v: 75237, pv: 1, checkEmpty: false, from: "10.4.232.178:27017" } to 10.4.228.114:27017 with $auth: {}
      Mon Feb 18 18:35:54 [conn18] runQuery called local.$cmd { count: "system.replset", query: {}, fields: {} }
      Mon Feb 18 18:35:54 [conn18] run command local.$cmd { count: "system.replset", query: {}, fields: {} }
      Mon Feb 18 18:35:54 [conn18] command local.$cmd command: { count: "system.replset", query: {}, fields: {} } ntoreturn:1 keyUpdates:0 locks(micros) r:28 reslen:48 0ms
      Mon Feb 18 18:35:54 [conn18] runQuery called local.system.replset {}
      Mon Feb 18 18:35:54 [conn18] query local.system.replset ntoreturn:1 keyUpdates:0 locks(micros) r:46 nreturned:1 reslen:208 0ms
      Mon Feb 18 18:35:54 [conn18] runQuery called admin.$cmd { replSetReconfig: { _id: "triplink", version: 75238, members: [ { _id: 1, host: "10.4.232.178:27017" }, { _id: 3, host: "10.4.228.114:27017" } ] } }
      Mon Feb 18 18:35:54 [conn18] run command admin.$cmd { replSetReconfig: { _id: "triplink", version: 75238, members: [ { _id: 1, host: "10.4.232.178:27017" }, { _id: 3, host: "10.4.228.114:27017" } ] } }
      Mon Feb 18 18:35:54 [conn18] command: { replSetReconfig: { _id: "triplink", version: 75238, members: [ { _id: 1, host: "10.4.232.178:27017" }, { _id: 3, host: "10.4.228.114:27017" } ] } }
      Mon Feb 18 18:35:54 [conn18] replSet replSetReconfig config object parses ok, 2 members specified
      Mon Feb 18 18:35:54 [conn18] getMyAddrs(): [127.0.0.1] [10.4.232.178] [::1] [fe80::47b:33ff:fec4:80c9%eth0]
      Mon Feb 18 18:35:54 [conn18] getallIPs("10.4.228.114"): [10.4.228.114]
      Mon Feb 18 18:35:54 BackgroundJob starting: ConnectBG
      Mon Feb 18 18:35:54 [conn18] Sending command { replSetHeartbeat: "triplink", v: -1, pv: 1, checkEmpty: false, from: "" } to 10.4.228.114:27017 with $auth: {}
      Mon Feb 18 18:35:54 [conn18] replSet replSetReconfig [2]
      Mon Feb 18 18:35:54 [conn18] replSet info saving a newer config version to local.system.replset
      Mon Feb 18 18:35:54 [conn27] CoveredIndexMatcher::matches() {} 2:25f0 0
      Mon Feb 18 18:35:54 [conn27] Matcher::matches() { ts: Timestamp 1361212554000|1, h: 4260825151744352383, v: 2, op: "n", ns: "", o: { msg: "Reconfig set", version: 75238 } }
      Mon Feb 18 18:35:54 [conn27] CoveredIndexMatcher _docMatcher->matches() returns 1
      Mon Feb 18 18:35:54 [conn27] getmore local.oplog.rs query: { ts: { $gte: new Date(5845321129736011777) } } cursorid:2107813182278322768 ntoreturn:0 keyUpdates:0 numYields: 1 locks(micros) r:379 nreturned:1 reslen:117 4684ms
      Mon Feb 18 18:35:54 [conn21] CoveredIndexMatcher::matches() {} 2:25f0 0
      Mon Feb 18 18:35:54 [conn21] Matcher::matches() { ts: Timestamp 1361212554000|1, h: 4260825151744352383, v: 2, op: "n", ns: "", o: { msg: "Reconfig set", version: 75238 } }
      Mon Feb 18 18:35:54 [conn21] CoveredIndexMatcher _docMatcher->matches() returns 1
      Mon Feb 18 18:35:54 [conn21] getmore local.oplog.rs query: { ts: { $gte: new Date(5845321129736011777) } } cursorid:2322837176604495716 ntoreturn:0 keyUpdates:0 locks(micros) r:198 nreturned:1 reslen:117 2410ms
      Mon Feb 18 18:35:54 [conn18] replSet saveConfigLocally done
      Mon Feb 18 18:35:54 [conn18] replSet attempting to relinquish
      Mon Feb 18 18:35:54 [conn18] replSet relinquishing primary state
      Mon Feb 18 18:35:54 [conn18] replSet SECONDARY
      Mon Feb 18 18:35:54 [conn18] replSet closing client sockets after relinquishing primary
      Mon Feb 18 18:35:54 Invalid access at address: 0x1d8 from thread: conn21

      Mon Feb 18 18:35:54 Got signal: 11 (Segmentation fault).

      Mon Feb 18 18:35:54 Backtrace:
      0x9a0946 0x57d47d 0x57d7e7 0x7f2748a71500 0x7f274882ac1f 0x7f274882bb7d 0x7f2748828380 0x992a79 0x9961ef 0x98e89a 0x98fd87 0x7f2748a69851 0x7f274781911d
      /usr/bin/mongod(_ZN5mongo15printStackTraceERSo+0x26) [0x9a0946]
      /usr/bin/mongod(_ZN5mongo10abruptQuitEi+0x39d) [0x57d47d]
      /usr/bin/mongod(_ZN5mongo24abruptQuitWithAddrSignalEiP7siginfoPv+0x247) [0x57d7e7]
      /lib64/libpthread.so.0(+0xf500) [0x7f2748a71500]
      /usr/lib64/libssl.so.10(ssl3_send_alert+0x4f) [0x7f274882ac1f]
      /usr/lib64/libssl.so.10(ssl3_read_bytes+0x21d) [0x7f274882bb7d]
      /usr/lib64/libssl.so.10(+0x22380) [0x7f2748828380]
      /usr/bin/mongod(_ZN5mongo6Socket11unsafe_recvEPci+0x9) [0x992a79]
      /usr/bin/mongod(_ZN5mongo6Socket4recvEPci+0x2f) [0x9961ef]
      /usr/bin/mongod(_ZN5mongo13MessagingPort4recvERNS_7MessageE+0x8a) [0x98e89a]
      /usr/bin/mongod(_ZN5mongo3pms9threadRunEPNS_13MessagingPortE+0x437) [0x98fd87]
      /lib64/libpthread.so.0(+0x7851) [0x7f2748a69851]
      /lib64/libc.so.6(clone+0x6d) [0x7f274781911d]

      And here is the shell output after issuing the command:
      triplink:PRIMARY> rs.remove("10.4.230.134:27017")
      Mon Feb 18 18:35:54 DBClientCursor::init call() failed
      Mon Feb 18 18:35:54 query failed : admin.$cmd { replSetReconfig: { _id: "triplink", version: 75238, members: [ { _id: 1, host: "10.4.232.178:27017" }, { _id: 3, host: "10.4.228.114:27017" } ] } } to: 127.0.0.1:27017
      Mon Feb 18 18:35:54 Error: error doing query: failed src/mongo/shell/collection.js:155
      Mon Feb 18 18:35:54 trying reconnect to 127.0.0.1:27017
      Mon Feb 18 18:35:54 reconnect 127.0.0.1:27017 ok
      Mon Feb 18 18:35:54 SSL Error ret: -1 err: 1 error:140790E5:SSL routines:SSL23_WRITE:ssl handshake failure
      Mon Feb 18 18:35:54 Socket say send() errno:0 Success 127.0.0.1:27017
      > exit
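
For readers following the log: the replSetReconfig document sent by conn18 is what one would expect rs.remove() to produce, namely the current config with the target host dropped and the version bumped from 75237 (the v in the heartbeats) to 75238. The sketch below illustrates that transformation in plain JavaScript. It is not the shell's actual source; buildConfigWithoutMember is a hypothetical helper, and the _id of the removed member (2) is an assumption, since it does not appear anywhere in the log.

```javascript
// Hypothetical helper (not the mongo shell's real implementation): derive the
// config document that a member removal submits via replSetReconfig.
function buildConfigWithoutMember(currentConfig, hostToRemove) {
  // Keep every member except the one whose host string matches.
  const members = currentConfig.members.filter(function (m) {
    return m.host !== hostToRemove;
  });
  if (members.length === currentConfig.members.length) {
    throw new Error("couldn't find " + hostToRemove + " in config");
  }
  // A reconfig must carry a higher version than the config it replaces.
  return {
    _id: currentConfig._id,
    version: currentConfig.version + 1,
    members: members
  };
}

// Config as it plausibly looked before the removal. Hosts and version come
// from the log; the _id of the removed member is assumed.
const current = {
  _id: "triplink",
  version: 75237,
  members: [
    { _id: 1, host: "10.4.232.178:27017" },
    { _id: 2, host: "10.4.230.134:27017" }, // _id assumed, not in the log
    { _id: 3, host: "10.4.228.114:27017" }
  ]
};

const next = buildConfigWithoutMember(current, "10.4.230.134:27017");
console.log(JSON.stringify(next));
```

The resulting document matches the one logged by conn18, which suggests the reconfig itself was well formed and that the crash occurs afterwards, when client sockets are closed after the primary relinquishes its state.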

            People

              milkie@mongodb.com Eric Milkie
              ryan.bunker Ryan Bunker