  1. Core Server
  2. SERVER-60802

Primary node turns to ROLLBACK state permanently

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Incomplete
    • Affects Version/s: 4.2.15
    • Fix Version/s: None
    • Component/s: None
    • Labels: None
    • Operating System: ALL
    • Steps To Reproduce:
      1. The primary node goes down while it still holds writes that have not replicated to the secondary nodes.
      2. New data is written to the newly elected primary and replicates to the rest of the replica set.
      3. Restart the former primary node.

      Description

      We run a self-managed MongoDB replica set on AWS, consisting of 1 primary node and 2 secondary nodes on 3 r5 EC2 instances. Under a heavy workload, the primary node's memory utilization reached 100% and the instance crashed.

      After rebooting the instance, we restarted MongoDB. One of the secondary nodes had become the primary, as expected, and the former primary node then entered the ROLLBACK state. According to the docs at https://docs.mongodb.com/manual/core/replica-set-rollbacks/, this can happen when secondaries cannot keep up with the throughput of operations on the former primary. However, the node stayed stuck in that state: several rollback files were created under the rollback folder, and after that we saw no further rollback activity in the log.
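      For anyone trying to confirm the same symptom, a minimal sketch of how the stuck state could be detected from a `replSetGetStatus` result is shown here. It assumes the status document has already been fetched (for example with PyMongo's `client.admin.command("replSetGetStatus")`); the host names in the sample are hypothetical and are not taken from our cluster.

```python
# Sketch only: find members that report ROLLBACK in a replSetGetStatus document.
# In replSetGetStatus output, member state 9 corresponds to stateStr "ROLLBACK".
ROLLBACK_STATE = 9


def members_in_rollback(status):
    """Return the names of all members currently reported in ROLLBACK state."""
    return [
        member.get("name", "<unknown>")
        for member in status.get("members", [])
        if member.get("state") == ROLLBACK_STATE
        or member.get("stateStr") == "ROLLBACK"
    ]


# Hypothetical sample shaped like a replSetGetStatus result (host names invented).
sample_status = {
    "set": "rs0",
    "members": [
        {"name": "node-a:27017", "state": 9, "stateStr": "ROLLBACK"},
        {"name": "node-b:27017", "state": 1, "stateStr": "PRIMARY"},
        {"name": "node-c:27017", "state": 2, "stateStr": "SECONDARY"},
    ],
}

print(members_in_rollback(sample_status))
```

      Polling this at intervals and comparing the result against the timestamps of the rollback files would show whether the member is still making progress or has stalled.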

      In the end, we stopped MongoDB, cleared all data on the node, and started it again so that it resynced its data from the replica set.

              People

              Assignee:
              dmitry.agranat Dmitry Agranat
              Reporter:
              zijun.tian@tusimple.ai Zijun Tian
