  1. Core Server
  2. SERVER-14885

replica sets that disable chaining may have trouble electing a primary if members have different priorities

    • Replication
    • Fully Compatible
    • ALL

      This report comes mostly from code inspection. When chaining is disabled for a replica set, ReplSetImpl::getMemberToSyncTo only allows a secondary to sync from the primary. If the primary cannot be reached, no syncing happens.

      In consensus.cpp, an election will refuse to elect a member with a lower priority if a member with a higher priority exists and is within 10 seconds of being caught up.

      These two facts together can cause a replica set to never elect a primary.

      Take the following scenario. Chaining is disabled, and no primary exists. Member A has priority 10 (the highest in the set) and is 5 seconds behind member B, which has priority 1. B is the furthest along of all members. Neither A nor B will ever be elected. B won't be elected because the election algorithm will say "A is within 10 seconds and has a higher priority". A won't be elected because it is behind B and, because chaining is disallowed, it cannot replicate from B to catch up.
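      The scenario above can be checked with a small model. This is only a sketch of the two rules as described in this report, not the server code: `Member`, `can_be_elected`, and `sync_source` are hypothetical names, optimes are reduced to plain seconds, and treating "within 10 seconds" as an inclusive bound is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional

# The "within 10 seconds of being caught up" threshold from the report.
CATCH_UP_WINDOW_SECS = 10


@dataclass
class Member:
    name: str
    priority: int
    optime_secs: int          # seconds of oplog applied; larger is further along
    is_primary: bool = False


def can_be_elected(candidate: Member, members: List[Member]) -> bool:
    """Simplified election check (assumption: only these two rules matter)."""
    latest = max(m.optime_secs for m in members)
    # A member that is behind another member is not elected.
    if candidate.optime_secs < latest:
        return False
    # A higher-priority member that is nearly caught up blocks the election.
    for other in members:
        if other is candidate:
            continue
        if (other.priority > candidate.priority
                and latest - other.optime_secs <= CATCH_UP_WINDOW_SECS):
            return False
    return True


def sync_source(member: Member, members: List[Member],
                chaining_allowed: bool) -> Optional[Member]:
    """Simplified sync-source choice: with chaining off, only a primary qualifies."""
    others = [m for m in members if m is not member]
    for other in others:
        if other.is_primary:
            return other
    if not chaining_allowed:
        return None
    ahead = [m for m in others if m.optime_secs > member.optime_secs]
    return max(ahead, key=lambda m: m.optime_secs) if ahead else None


# The scenario from the report: no primary, A is 5 seconds behind B.
a = Member("A", priority=10, optime_secs=95)
b = Member("B", priority=1, optime_secs=100)
members = [a, b]

for m in members:
    src = sync_source(m, members, chaining_allowed=False)
    print(m.name, "electable:", can_be_elected(m, members),
          "sync source:", src.name if src else None)
```

      In this model neither member is electable and A has no sync source while chaining is disabled, so the state never changes. Passing `chaining_allowed=True` lets A sync from B, after which A would be caught up and electable.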

      I think the end result is that a primary never gets elected.

      I don't see any code that says "ignore the chainingAllowed bit and replicate off a secondary because a primary does not exist".
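      For illustration, the kind of fallback I was looking for could be sketched as follows. This is a hypothetical model, not a patch against ReplSetImpl::getMemberToSyncTo: `Node` and `choose_sync_source_with_fallback` are made-up names, and optimes are reduced to plain seconds.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    name: str
    optime_secs: int          # seconds of oplog applied; larger is further along
    is_primary: bool = False


def choose_sync_source_with_fallback(me: Node, nodes: List[Node],
                                     chaining_allowed: bool) -> Optional[Node]:
    """Hypothetical rule: honor the chaining setting only while a primary exists."""
    others = [n for n in nodes if n is not me]
    primary = next((n for n in others if n.is_primary), None)

    # Chaining disabled and a primary exists: sync from the primary only.
    if primary is not None and not chaining_allowed:
        return primary

    # Chaining allowed, or no primary at all: pick the most advanced node
    # that is ahead of us, ignoring the chaining setting in the latter case.
    ahead = [n for n in others if n.optime_secs > me.optime_secs]
    if ahead:
        return max(ahead, key=lambda n: n.optime_secs)
    return primary


# Same scenario as in the report: no primary, A is 5 seconds behind B.
node_a = Node("A", optime_secs=95)
node_b = Node("B", optime_secs=100)
source = choose_sync_source_with_fallback(node_a, [node_a, node_b],
                                          chaining_allowed=False)
print("A syncs from:", source.name if source else None)
```

      With a rule like this, A could catch up from B even though chaining is disabled, and the priority check would then no longer block the election.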

            Assignee:
            backlog-server-repl [DO NOT USE] Backlog - Replication Team
            Reporter:
            zardosht Zardosht Kasheff
            Votes:
            0
            Watchers:
            15
