  1. Core Server
  2. SERVER-4927

Slaves stop oplog sync if another slave uses fsyncLock


Details

    • Bug
    • Status: Closed
    • Major - P3
    • Resolution: Gone away
    • 2.0.2
    • None
    • Replication
    • Linux 2.6.32-38-server, Ubuntu 10.04, MongoDB 2.0.1, replica set with 4 nodes, NUMA, 2x Xeon E5620, 24 GB RAM
    • Replication
    • ALL

    Description

      This is our setup:
      2 shards, each a replica set with 4 nodes. One node per set is dedicated to backups (priority: 0, hidden: true).
      When we start a backup, we send the fsyncLock command to the backup node and then start an rsync of the filesystem.
      After the backup has finished, we send the fsyncUnlock command to the backup node.
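
      The backup sequence described above can be sketched roughly as follows. This is only an illustration: `run_locked_backup` and its three callables (`lock`, `copy_files`, `unlock`) are hypothetical placeholders for "send fsyncLock", "run rsync" and "send fsyncUnlock", not our actual backup script. The point it shows is that the unlock step should run even when the copy fails, so the backup node is never left locked.

```python
from typing import Callable


def run_locked_backup(
    lock: Callable[[], None],
    copy_files: Callable[[], None],
    unlock: Callable[[], None],
) -> None:
    """Lock the backup node, copy its files, and always unlock afterwards.

    All three callables are hypothetical hooks supplied by the caller.
    """
    lock()  # assumption: sends fsyncLock to the hidden backup node
    try:
        copy_files()  # assumption: rsync of the data directory
    finally:
        unlock()  # runs even if the copy raises, so the node is not left locked
```

      Passing the steps in as callables keeps the sketch independent of any particular driver or shell helper.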

      After a master switch in the replica set (due to an upgrade or a failure), some or all slaves stop syncing the oplog when the backup node starts its backup. This happens at exactly the moment we issue the fsyncLock command: the replication lag of the affected slaves is the same as that of the backup node. When the backup is finished, the other slaves start syncing again.
      db.currentOp() does not show the fsyncLock on the slaves, only on the backup node.
      To get rid of this problem we have to restart the non-backup slave. After the restart the slave runs well and no longer stops syncing together with the backup node.
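
      One way to spot the symptom (ordinary slaves lagging by the same amount as the locked backup node) is to compare member optimes from the output of replSetGetStatus. The helper below is a minimal sketch, not something we run in production: `secondary_lag_seconds` is a hypothetical name, and it assumes a status document whose members carry the `name`, `stateStr` and `optimeDate` fields, with exactly one member in the PRIMARY state.

```python
from typing import Dict


def secondary_lag_seconds(status: dict) -> Dict[str, float]:
    """Return the lag in seconds of each SECONDARY member behind the PRIMARY.

    `status` is assumed to look like a replSetGetStatus result: a dict with
    a "members" list whose entries have "name", "stateStr" and "optimeDate"
    (a datetime) fields.
    """
    members = status["members"]
    # Assumption: exactly one member currently reports itself as PRIMARY.
    primary = next(m for m in members if m["stateStr"] == "PRIMARY")
    return {
        m["name"]: (primary["optimeDate"] - m["optimeDate"]).total_seconds()
        for m in members
        if m["stateStr"] == "SECONDARY"
    }
```

      If a slave that was never locked reports the same growing lag as the backup node, it matches the behaviour described in this report.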

      This is the second time we have encountered this problem. Since this is our production environment, we do not want to force a master switch unless it is needed.

      The master switch in the replica set seems to be what triggers this problem.

      Regards,
      Steffen

            People

              backlog-server-repl Backlog - Replication Team
              steffen Steffen
              Votes: 0
              Watchers: 3
