  Core Server / SERVER-14982

replSetMaintenance command should not block

    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • Fix Version/s: 2.7.8
    • Affects Version/s: 2.4.9, 2.6.4
    • Component/s: Replication
    • Labels:
      None
    • Backwards Compatibility: Major Change
    • Operating System: ALL
    • Steps To Reproduce:
      1. Create a replica set. (I used Azure Windows without drive cache to make sure disk I/O was very slow.)
      2. Insert 10,000,000 records into a test collection, with random values for some field.
      3. Build a foreground index on that field. (I used a hashed index in my experiment.)
      4. When the index build starts on the secondaries (mongostat on that secondary was frozen), try to enable maintenance mode on a secondary.
      5. Observe that the command blocks and that heartbeats queue up in db.currentOp().
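
The steps above can be sketched as a small Python script. This is only an illustration, not the script used for the original report: the helper names, the field name `value`, and the batch size are hypothetical, and `reproduce` assumes it is handed pymongo-style objects (a collection with `insert_many` and `create_index`, and an admin database with `command`). Whether a given driver version can talk to a 2.4 or 2.6 server is not checked here.

```python
import random
from itertools import islice


def make_documents(count, field="value", seed=None):
    """Yield `count` documents with a random integer stored in `field`."""
    rng = random.Random(seed)
    for _ in range(count):
        yield {field: rng.randrange(2**31)}


def hashed_index_spec(field="value"):
    """Key specification for a hashed index on `field`."""
    return [(field, "hashed")]


def maintenance_command(enable=True):
    """Command document that turns maintenance mode on or off."""
    return {"replSetMaintenance": bool(enable)}


def reproduce(primary_coll, secondary_admin, total=10_000_000, batch=10_000):
    """Run steps 2 to 4 against pymongo-style objects (an assumption).

    primary_coll:    collection on the primary (insert_many, create_index)
    secondary_admin: admin database on a secondary (command)
    """
    docs = make_documents(total)
    while True:
        chunk = list(islice(docs, batch))
        if not chunk:
            break
        primary_coll.insert_many(chunk)

    # Step 3: index build issued on the primary; the secondaries build
    # it afterwards as part of replication.
    primary_coll.create_index(hashed_index_spec())

    # Step 4: per the report, this call is expected to block while the
    # secondary is busy with the index build.
    return secondary_admin.command(maintenance_command(True))
```

Step 5 is then a matter of running `db.currentOp()` on the secondary from a second connection while the call above is still pending.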

      The replSetMaintenance command appears to take some internal locks, so it can be blocked by operations such as foreground index builds (tested on 2.6.4 and 2.4.9).

      Further, while this command is queued, it causes replSetHeartbeat commands to queue up behind it, resulting in missed heartbeats.
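
One way to spot this condition is to filter the `inprog` array returned by `db.currentOp()`. The sketch below is a hedged illustration: it assumes each entry carries the command document under `query`, a boolean `waitingForLock`, and an `opid`, as the 2.x currentOp output is understood to do. The function name is hypothetical.

```python
def queued_replication_commands(inprog):
    """Return opids of replSetHeartbeat / replSetMaintenance operations
    that are reported as waiting for a lock.

    `inprog` is the list found under the "inprog" key of currentOp output
    (field names are assumptions about the 2.x output format).
    """
    watched = ("replSetHeartbeat", "replSetMaintenance")
    queued = []
    for op in inprog:
        query = op.get("query") or {}
        is_watched = any(name in query for name in watched)
        if is_watched and op.get("waitingForLock"):
            queued.append(op.get("opid"))
    return queued
```

A growing list across successive calls would suggest that heartbeats are piling up behind the blocked maintenance command.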

      Ideally, this command should be lock-free, so that the operator can reliably hide the node from application servers or mongos routers in critical circumstances, e.g. when the node is overloaded.

            Assignee:
            scotthernandez Scott Hernandez (Inactive)
            Reporter:
            alex.komyagin@mongodb.com Alexander Komyagin (Inactive)
            Votes:
            0
            Watchers:
            8

              Created:
              Updated:
              Resolved: