  1. Core Server
  2. SERVER-14982

replSetMaintenance command should not block

Details

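    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Fixed
    • Affects Version/s: 2.4.9, 2.6.4
    • Fix Version/s: 2.7.8
    • Component/s: Replication
    • Labels: None
    • Backwards Compatibility: Major Change
    • Operating System: ALL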
    • Steps To Reproduce:
      1. Create a replica set (I used Azure Windows without drive cache to make sure disk I/O is very slow).
      2. Insert 10000000 records into a test collection, with random values for some field.
      3. Build a foreground index on that field (I used a hashed index in my experiment).
      4. Once the index build has started on the secondaries (mongostat on that secondary was frozen), try setting maintenance mode on a secondary.
      5. Observe that the command is blocked and that heartbeats are queued in db.currentOp().
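
The steps above could be scripted roughly as follows in the mongo shell. This is only a sketch: the helper names, the collection name `records` and the field name `value` are illustrative assumptions, not taken from the ticket. The `db` handle is passed in explicitly so the helpers can also be exercised against a stub.

```javascript
// Sketch only: helper names, the collection "records" and the field "value"
// are illustrative assumptions, not taken from the ticket.

// Step 2: insert `count` documents with a random integer in one field.
function seedRecords(db, count) {
  for (var i = 0; i < count; i++) {
    db.records.insert({ value: Math.floor(Math.random() * 1e9) });
  }
}

// Step 3: hashed index build on that field (foreground is the default).
// ensureIndex is the shell helper of the 2.4/2.6 era.
function buildHashedIndex(db) {
  return db.records.ensureIndex({ value: "hashed" });
}

// Step 4: ask the secondary to enter maintenance mode.
function enterMaintenance(db) {
  return db.adminCommand({ replSetMaintenance: true });
}

// Step 5: from a db.currentOp() result, pick the operations that are
// waiting for a lock and whose command document is replSetHeartbeat
// or replSetMaintenance.
function blockedReplCommands(currentOpResult) {
  return (currentOpResult.inprog || []).filter(function (op) {
    var q = op.query || {};
    return op.waitingForLock === true &&
      ("replSetHeartbeat" in q || "replSetMaintenance" in q);
  });
}
```

One possible way to use it: run `seedRecords(db, 10000000)` and `buildHashedIndex(db)` against the primary, then, once the build reaches a secondary, call `enterMaintenance(db)` in one shell connected to that secondary and `blockedReplCommands(db.currentOp())` in another.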

    Description

      This command appears to take some internal locks, so it can be blocked by things like foreground index builds (tested on 2.6.4 and 2.4.9).

      Further, this command, when queued, causes replSetHeartbeat commands to queue up, resulting in missing heartbeats.
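
One way to picture how a single stalled command could make heartbeats pile up is a toy model with two locks. Everything here is an assumption made for illustration: the lock names, the two-lock structure and the idea that the maintenance command keeps a replication-level lock while it waits are hypothetical and not taken from the MongoDB source.

```javascript
// Toy model only: lock names and the two-lock structure are assumptions
// made for illustration; they are not taken from the MongoDB source.
function Lock(name) {
  this.name = name;
  this.holder = null;   // operation currently holding the lock
  this.waiters = [];    // operations queued behind the holder, in order
}

// Try to take the lock; returns true on success, otherwise queues the caller.
Lock.prototype.acquire = function (op) {
  if (this.holder === null) {
    this.holder = op;
    return true;
  }
  this.waiters.push(op);
  return false;
};

function simulate() {
  var dataLock = new Lock("data");      // held by the foreground index build
  var replLock = new Lock("replState"); // needed by replication commands

  dataLock.acquire("indexBuild");

  // The maintenance command takes the replication lock first, then
  // blocks on the data lock while still holding the replication lock.
  replLock.acquire("replSetMaintenance");
  dataLock.acquire("replSetMaintenance");

  // Incoming heartbeats now queue up behind the stalled command.
  for (var i = 1; i <= 3; i++) {
    replLock.acquire("replSetHeartbeat#" + i);
  }
  return { dataWaiters: dataLock.waiters, replWaiters: replLock.waiters };
}
```

In this model `simulate()` leaves the maintenance command waiting on the data lock and all three heartbeats waiting behind it, which is the shape of the queue the report describes.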

      Ideally, this command should be lock-free, so that an operator can effectively hide the node from application servers or mongos routers in critical circumstances, e.g. when the node is overloaded.

            People

              Assignee: Scott Hernandez (scotthernandez)
              Reporter: Alexander Komyagin (alex.komyagin@mongodb.com)
              Votes: 0
              Watchers: 8
