  1. Core Server
  2. SERVER-7472

Replication lag can cause cluster to hang in migration critical section

    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • Fix Version/s: 2.2.2, 2.3.1
    • Affects Version/s: 2.2.0
    • Component/s: Sharding
    • Labels:
      None
    • Backwards Compatibility: Fully Compatible
    • Operating System: ALL

      In the critical section of a migration, we check that a majority of secondaries have received the migration writes. If the secondaries fall behind at that point, the migration can stay stuck in the critical section for a long time (we time out after 5 minutes). During that time, however, the whole cluster can become unusable, because setShardVersion commands block until the shard is out of the critical section.
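
      The wait described above can be sketched roughly as follows. This is an illustrative Python model only, not the server's actual code: the function names, the integer "optime" values, and the polling interval are all assumptions made for the example. The only figure taken from the description is the 5 minute timeout. It shows why lagging secondaries keep the caller blocked until the deadline passes.

```python
import time
from typing import Callable, Sequence


def majority_caught_up(member_optimes: Sequence[int], target: int) -> bool:
    """Return True when more than half of the polled nodes have applied `target`.

    `member_optimes` is a hypothetical list with one replication position per
    node being counted; real optimes are not plain integers.
    """
    caught_up = sum(1 for optime in member_optimes if optime >= target)
    return caught_up > len(member_optimes) // 2


def wait_for_majority(
    read_optimes: Callable[[], Sequence[int]],
    target: int,
    timeout_s: float = 300.0,  # 5 minutes, as stated in the description
    poll_s: float = 0.5,       # assumed polling interval
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Block until a majority has applied `target` or the timeout expires.

    Everything waiting on the critical section stays blocked for as long as
    this loop runs, which is the hang the ticket describes.
    """
    deadline = clock() + timeout_s
    while True:
        if majority_caught_up(read_optimes(), target):
            return True   # safe to leave the critical section
        if clock() >= deadline:
            return False  # gave up: replication lag outlasted the timeout
        sleep(poll_s)
```

      With nodes that never catch up, for example wait_for_majority(lambda: [10, 4, 4], 10), the call returns False only after the full timeout has elapsed. In this model, that whole interval is the window during which setShardVersion commands would be blocked.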

            Assignee:
            spencer@mongodb.com Spencer Brody (Inactive)
            Reporter:
            spencer@mongodb.com Spencer Brody (Inactive)
            Votes:
            0
            Watchers:
            2

              Created:
              Updated:
              Resolved: