Type: Bug
Resolution: Done
Priority: Critical - P2
Fix Version/s: 3.0.4, 3.1.4
Affects Version/s: 3.0.1
Component/s: Replication
Labels: UT

Backwards Compatibility: Minor Change
Operating System: ALL
Backport Completed: 3.0.4
Linked BF Score: 0
CAR Domain/s: None

Aha! Reference: None
Tracking Level: None
Risk Status: None
Exec Notes: None
Goal Name(s): None
Goal Link: None

Issue Status as of Jun 09, 2015

ISSUE SUMMARY
On a MongoDB replica set, when a secondary node is running multiple background index builds on a given collection, metadata changes to that same collection may lead to a fatal error on the secondary node.

Metadata changes that may trigger this behavior include renaming and dropping the collection, and dropping the database that contains the collection.

USER IMPACT
If enough secondary nodes experience the error and shut down that a majority of voting nodes is no longer operational, the replica set cannot elect or keep a primary, leading to loss of write availability.
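The availability arithmetic behind this impact can be sketched as follows. This is an illustration only, not MongoDB code: the function names are hypothetical, and it assumes the usual rule that a replica set needs a strict majority of its voting members operational to have a primary.

```python
def majority(voting_members: int) -> int:
    """Smallest number of voting members that forms a strict majority."""
    return voting_members // 2 + 1


def can_accept_writes(voting_members: int, healthy_voting_members: int) -> bool:
    """True if enough voting members remain up to elect or keep a primary."""
    return healthy_voting_members >= majority(voting_members)


# Three-member set: losing one secondary is survivable, losing both is not.
one_secondary_down = can_accept_writes(voting_members=3, healthy_voting_members=2)
both_secondaries_down = can_accept_writes(voting_members=3, healthy_voting_members=1)
```

In a three-member set, for example, both secondaries hitting the fassert leaves one voting node out of three, which is below the majority of two.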

WORKAROUNDS
Avoid collection creation, drop, and rename operations while building indexes in the background on that same collection.

AFFECTED VERSIONS
MongoDB 3.0.0 through 3.0.3.

FIX VERSION
The fix is included in the 3.0.4 production release.

Original description

When multiple clients create and drop indexes with varying options on the same collection, there is a chance that secondaries will fassert when applying the oplog. Thus far, no problem has been observed on the primary.

Tested using 3.0.1 Enterprise. Known to occur on Ubuntu 12.01 and Windows 8.

Attached is the script used in each shell session. The "test.ts" collection had 250K small documents structured as {_id:ObjectId,server:int,cpu:int}; however, neither the structure nor the quantity of documents seems to be important, as other variations also trigger the fault. Background indexing appears to be a crucial requirement. The fault was originally observed on a sharded cluster with operations performed via a mongos, but a basic replica set is all that is needed.
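The attached script is not reproduced on this page. The following is a hypothetical sketch of the kind of workload described, not the reporter's script: it repeatedly creates and drops background indexes with varying key patterns and options. The names index_variations and churn_indexes, the chosen option sets, and the error handling are all assumptions. It expects a collection object exposing create_index and drop_index as PyMongo collections do, and should only ever be pointed at a disposable test replica set.

```python
import random

# Field names taken from the document shape in the report; option sets are assumed.
FIELDS = ["server", "cpu"]
OPTION_SETS = [{}, {"sparse": True}, {"unique": False}]


def index_variations():
    """Return (keys, options) pairs covering single-field and compound indexes."""
    specs = [[(field, direction)] for field in FIELDS for direction in (1, -1)]
    specs.append([("server", 1), ("cpu", 1)])
    return [(keys, opts) for keys in specs for opts in OPTION_SETS]


def churn_indexes(collection, rounds, seed=None, ignore_errors=(Exception,)):
    """Create then drop a randomly chosen background index, `rounds` times.

    Errors are ignored by default because concurrent clients running the same
    loop are expected to conflict with each other (assumed behavior).
    """
    rng = random.Random(seed)
    variations = index_variations()
    for _ in range(rounds):
        keys, opts = rng.choice(variations)
        try:
            collection.create_index(keys, background=True, **opts)
            collection.drop_index(keys)
        except ignore_errors:
            pass
```

To approximate the reported scenario, several processes would each call churn_indexes on the same collection of a test replica set (for instance the "test.ts" collection mentioned above) while the secondaries are monitored.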

Sometimes the secondaries can be restarted, recover, and rejoin normally. Sometimes they fassert again on every restart until they are resynced. Both results were observed in consecutive runs, with no known difference other than timing to explain the different recovery outcomes.

Also attached is log output of an example restart (on Windows) where the secondary could not recover.


Attachments: restart.log (7 kB, Apr 08 2015 05:13:52 AM UTC)

is duplicated by:
- SERVER-18762 Mongo 3.0 crashes while replicating map reduce collections (Closed)
- SERVER-19065 dropIndexes() produces "Assertion: 17348:cannot dropAllIndexes when index builds in progress" (Closed)

related to:
- SERVER-20010 Segfault while dropping an index that failed to build (Closed)

Assignee: Eric Milkie
Reporter: Andrew Ryder (Inactive)
Participants: Andrew Ryder, Eric Milkie, Githook User, Ramon Fernandez
Votes: 1
Watchers: 10

Created: Apr 08 2015 05:13:52 AM UTC
Updated: Feb 04 2016 10:12:27 PM UTC
Resolved: Jun 04 2015 01:16:28 PM UTC
