Core Server / SERVER-26799

ReplicaSetMonitor for a set with unreachable hosts continues to refresh (and log verbosely) long (e.g. 15 seconds) after ReplicaSetMonitor::remove() is called

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Fixed
    • Affects Version/s: 3.4.0-rc1
    • Fix Version/s: 3.4.0-rc3
    • Component/s: None
    • Labels: None
    • Backwards Compatibility: Fully Compatible
    • Sprint: Sharding 2016-11-21
    • Linked BF Score: 0

      Description

      This happens because remove() only drops the ReplicaSetMonitorManager's (shared_ptr) reference to the ReplicaSetMonitor; it does not cause the ReplicaSetMonitor's destructor to be called.
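
The shared_ptr behavior described above can be reproduced in isolation. The following is a minimal sketch, not the server code: `Monitor`, `Manager`, and `monitorOutlivesRemove` are hypothetical stand-ins, and it assumes that an in-flight refresh holds its own shared_ptr to the monitor.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for ReplicaSetMonitor; counts destructor calls.
struct Monitor {
    explicit Monitor(int* destroyed) : destroyed_(destroyed) {}
    ~Monitor() { ++*destroyed_; }
    int* destroyed_;
};

// Hypothetical stand-in for ReplicaSetMonitorManager.
struct Manager {
    std::map<std::string, std::shared_ptr<Monitor>> monitors;

    // Erases only the manager's own reference to the monitor.
    void remove(const std::string& name) { monitors.erase(name); }
};

// Returns true if the monitor survives remove() while another owner exists
// and is destroyed only once that last owner releases it.
bool monitorOutlivesRemove() {
    int destroyed = 0;
    Manager manager;
    manager.monitors["rs0"] = std::make_shared<Monitor>(&destroyed);

    // Assumption: a concurrent refresh keeps its own reference like this one.
    std::shared_ptr<Monitor> inFlight = manager.monitors["rs0"];

    manager.remove("rs0");
    const bool aliveAfterRemove = (destroyed == 0);  // destructor has not run

    inFlight.reset();  // the last owner lets go
    const bool destroyedAfterRelease = (destroyed == 1);

    assert(manager.monitors.empty());
    return aliveAfterRemove && destroyedAfterRelease;
}
```

Under these assumptions, the destructor runs only when the last outstanding reference is released, which would be consistent with refreshes continuing for some time after remove().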

      Is there any way for ReplicaSetMonitor::remove() to interrupt ongoing refresh cycle(s) on concurrent threads with a ShardNotFound error or, even better, a ShardRemoved-style error?
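
One possible shape for such an interruption, offered only as a sketch and not as the fix that was actually committed: remove() sets a flag on state shared with the refresh, and the refresh checks it before contacting each host. `MonitorState`, `RefreshStatus`, and `refreshAll` are hypothetical names invented for this illustration.

```cpp
#include <atomic>

// Hypothetical result type; kRemoved plays the role of a ShardRemoved-style error.
enum class RefreshStatus { kOk, kRemoved };

// Hypothetical state shared between the manager and in-flight refreshes.
struct MonitorState {
    std::atomic<bool> removed{false};
};

// Simulates one refresh pass over hostCount hosts. The removed flag is
// checked before each host so a concurrent remove() stops the pass early.
// *contacted counts how many hosts were (notionally) contacted.
RefreshStatus refreshAll(const MonitorState& state, int hostCount, int* contacted) {
    for (int i = 0; i < hostCount; ++i) {
        if (state.removed.load()) {
            return RefreshStatus::kRemoved;
        }
        ++*contacted;  // placeholder for contacting host i
    }
    return RefreshStatus::kOk;
}
```

In this sketch a remove() implementation would call `state.removed.store(true)` before dropping its reference, so a pass that is already running returns kRemoved at its next check instead of continuing to contact unreachable hosts.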

      Here's an example of refreshes continuing after we attempt to remove the ReplicaSetMonitor:
      https://logkeeper.mongodb.org/build/45b90edca90e143c2b59e6e42ad34c9d/test/5810e98cbe07c45f730ab605#L1120

      from this Evergreen run on October 26, 2016:
      https://evergreen.mongodb.com/task/mongodb_mongo_master_enterprise_rhel_62_64_bit_sharding_WT_67994f33a88f4aa70283e155c75f48ce997ccdc3_16_10_26_16_05_19

            People

            Assignee: Misha Tyulenev (misha.tyulenev)
            Reporter: Esha Maharishi (esha.maharishi)