Core Server / SERVER-9052

During failover in a replica set, MongoDB crashes

    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • None
    • Affects Version/s: 2.2.3
    • Component/s: Stability
    • Environment: Linux

      Not sure exactly, but my guess is to set up one primary, one secondary, and an arbiter, and then cause the secondary to die. Hopefully the logs will give you enough information to determine what went wrong.

      We recently upgraded MongoDB from 2.2.2 to 2.2.3. After a couple of days of running, the arbiter could no longer contact the primary, and the secondary was elected to take over. At that point, however, MongoDB stopped responding and wrote a large number of errors to the log.

      (See attachments for the logs)

      The primary has these types of errors:

      problem detected during query over (DBNAME).(COLLECTION_NAME) :

      { $err: "not master and slaveOk=false", code: 13435 }

      [rsMgr] replSet can't see a majority, will not try to elect self

      recv(): message len XXX is too largeXX

      Assertion: 16141:cannot translate opcode 26975

      ... but see the log for more details.
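
      The "can't see a majority" message above follows from simple vote arithmetic. The following is a minimal sketch of that arithmetic for the three-member topology described in this report (primary, secondary, arbiter). The class and method names are hypothetical, and this is a simplified model rather than MongoDB's actual election code.

```java
// Simplified model of the replica set majority check.
// MajorityCheck, majority, and canSeeMajority are hypothetical names
// used for illustration; this is not MongoDB's implementation.
final class MajorityCheck {

    // Smallest number of votes that is strictly more than half.
    static int majority(int votingMembers) {
        return votingMembers / 2 + 1;
    }

    // A member may stand for election only if the members it can
    // currently see (counting itself) form a majority.
    static boolean canSeeMajority(int votingMembers, int visibleMembers) {
        return visibleMembers >= majority(votingMembers);
    }

    public static void main(String[] args) {
        int members = 3; // primary + secondary + arbiter, as in the report

        System.out.println("votes needed: " + majority(members));
        // A node cut off from both other members sees only itself.
        System.out.println("isolated node eligible: " + canSeeMajority(members, 1));
        // A node that can still reach the arbiter sees two members.
        System.out.println("node plus arbiter eligible: " + canSeeMajority(members, 2));
    }
}
```

      Under this model, a member that loses contact with both of the others sees one vote out of the two it needs, so it will not try to elect itself, while a member that can still reach the arbiter remains eligible.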

      The Java application server running on the same box as the primary MongoDB instance (o16.servername.com) uses ReadPreference.primaryPreferred() for its queries.

      The Java application server running on the same box as the secondary MongoDB instance (o15.servername.com) uses ReadPreference.nearest() for its queries, since it is physically across the country from the primary.
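
      To make the difference between the two read preferences concrete, here is a rough sketch of how each mode would pick a member. It is a simplified standalone model, not the Java driver's code: the ReadRouting class, its Member type, and the ping values are assumptions made for illustration, and the real driver's nearest mode selects among members within a latency window rather than always taking the single lowest ping.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Simplified model of read preference routing. All names here are
// hypothetical; this does not use or reproduce the MongoDB Java driver.
final class ReadRouting {

    enum Mode { PRIMARY_PREFERRED, NEAREST }

    // One data-bearing member as seen from a client (arbiters hold no data).
    static final class Member {
        final String host;
        final boolean primary;
        final boolean reachable;
        final long pingMillis; // assumed round-trip time, illustration only

        Member(String host, boolean primary, boolean reachable, long pingMillis) {
            this.host = host;
            this.primary = primary;
            this.reachable = reachable;
            this.pingMillis = pingMillis;
        }
    }

    static Optional<Member> choose(Mode mode, List<Member> members) {
        Comparator<Member> byPing = Comparator.comparingLong(m -> m.pingMillis);
        if (mode == Mode.PRIMARY_PREFERRED) {
            // Use the primary when it is reachable...
            Optional<Member> primary = members.stream()
                    .filter(m -> m.reachable && m.primary)
                    .findFirst();
            if (primary.isPresent()) {
                return primary;
            }
            // ...otherwise fall back to a reachable secondary.
            return members.stream()
                    .filter(m -> m.reachable && !m.primary)
                    .min(byPing);
        }
        // NEAREST: lowest ping among reachable members, primary or not.
        return members.stream().filter(m -> m.reachable).min(byPing);
    }

    public static void main(String[] args) {
        // Topology from the report, seen from the app server on o15;
        // the ping values are made up.
        List<Member> members = Arrays.asList(
                new Member("o16.servername.com", true, true, 70),
                new Member("o15.servername.com", false, true, 1));

        System.out.println(choose(Mode.PRIMARY_PREFERRED, members).get().host);
        System.out.println(choose(Mode.NEAREST, members).get().host);
    }
}
```

      In this model, primaryPreferred only falls back to a secondary when no primary is reachable, whereas nearest can read from the local secondary even while the primary is healthy.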

      P.S. After this happened, I updated my Java driver to mongo-java-driver:2.10.1 and am considering rebuilding using the latest openssl build (0.9.8e-26.el5_9.1).

            Assignee: Stephen Lee (stephen.lee)
            Reporter: Dave Claussen (dclaus1000)
            Votes: 0
            Watchers: 4
