Core Server / SERVER-12727

index building can make replica set member unreachable / unresponsive

    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • Affects Version/s: 2.4.9
    • Component/s: Replication
    • ALL

      We have a 3-member replica set with no arbiters. We built an index on a large collection (~40 GB, 17M docs) with background=True. This seemed to work fine on the primary, but 30 minutes later, when both secondaries were told to build the index, our replica set went down: the secondaries became entirely unresponsive and were unable to vote.
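
For reference, a background index build of the kind described above could be issued from PyMongo roughly as follows. This is only a minimal sketch: the helper name `build_background_index` and the field name used in the usage line are hypothetical and do not come from this report; only the `background=True` option is taken from the description.

```python
def build_background_index(collection, field):
    """Request a background index build on a single field.

    `collection` is assumed to be a pymongo Collection object and
    `field` a hypothetical field name; neither is taken from the report.
    """
    # 1 means ascending order. background=True asks the server to build
    # the index without blocking other operations on the primary for the
    # whole duration of the build.
    return collection.create_index([(field, 1)], background=True)
```

A call might look like `build_background_index(db.events, "user_id")`, where `db.events` and `user_id` are placeholder names.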

      There is already an issue relating to the behaviour of background indexes on secondaries, listed as FIXED for 2.5:
      https://jira.mongodb.org/browse/SERVER-2771
      However, it is not entirely clear how that issue has been fixed. Are indices now built in the background on secondaries, as they are on the primary? And/or are index builds done sequentially rather than at the same time on all secondaries? It would be good to have clarification on this.

      Separately from that issue, I believe the behaviour of secondaries while building foreground indices is not acceptable. It is fine that the database is locked, but the member shouldn't become entirely unresponsive for the time it takes to build the index.

            Assignee: Matt Dannenberg (matt.dannenberg)
            Reporter: John Greenall (johng)
            Votes: 2
            Watchers: 7

              Created:
              Updated:
              Resolved: