[SERVER-9856] No check for building identical background indexes concurrently Created: 04/Jun/13  Updated: 11/Jul/16  Resolved: 10/Jun/13

Status: Closed
Project: Core Server
Component/s: Index Maintenance
Affects Version/s: 2.4.4, 2.5.0
Fix Version/s: 2.4.5, 2.5.1

Type: Bug
Priority: Critical - P2
Reporter: Thomas Rueckstiess
Assignee: Eric Milkie
Resolution: Done
Votes: 4
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File dupl_background_index.js    
Issue Links:
Depends
Duplicate
is duplicated by SERVER-9904 Secondaries crash when too many indexes Closed
is duplicated by SERVER-9875 Possible to accidentally create dupli... Closed
is duplicated by SERVER-9995 corruption on primaries after upgrade... Closed
is duplicated by SERVER-10058 Possible race condition with ensureIn... Closed
is duplicated by SERVER-10684 Multiple indexes created when ensureI... Closed
is duplicated by SERVER-10895 Duplicate indexes being created at th... Closed
is duplicated by SERVER-12147 Removing duplicate index cause corrup... Closed
is duplicated by SERVER-14238 Mongodb server crash after update of ... Closed
Related
Operating System: ALL
Steps To Reproduce:
  1. insert large number of documents into a primary
  2. open two shells, start background index build of same index in both shells
  3. confirm with db.collection.getIndexes() that there are multiple identical indexes

Everything seems to work ok, until you

  • try to drop the index with db.collection.dropIndex(<index>)
  • try to drop all indexes with db.collection.dropIndexes()
  • try to drop the collection with db.collection.drop()

Each of these operations fails with the error shown in the description below. Afterwards, even hinting on that particular index fails with the same error. It seems the index does get deleted but is not properly cleaned up.
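
The check in step 3 can be scripted rather than done by eye. The following is a minimal sketch, not part of the original report: the helper name findDuplicateIndexes is hypothetical, and it assumes each index spec returned by db.collection.getIndexes() carries a `key` and a `name` field.

```javascript
// Hypothetical helper: group index specs by key pattern and return only
// the patterns that appear more than once.
function findDuplicateIndexes(indexes) {
  var byKey = {};
  indexes.forEach(function (spec) {
    // JSON.stringify preserves field order, which matters for compound keys.
    var k = JSON.stringify(spec.key);
    (byKey[k] = byKey[k] || []).push(spec.name);
  });
  return Object.keys(byKey)
    .filter(function (k) { return byKey[k].length > 1; })
    .map(function (k) { return { key: JSON.parse(k), names: byKey[k] }; });
}
```

In the shell this would be called as findDuplicateIndexes(db.collection.getIndexes()). An empty array means no key pattern occurs twice.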

Participants:

 Description   
Issue Status as of Sep 19, 2014

ISSUE SUMMARY
It is possible to build duplicate indexes on a collection when identical index builds run concurrently in the background, which causes index corruption. Running db.repairDatabase() fixes the index corruption.

USER IMPACT
When a collection with duplicate indexes is dropped, or one of the duplicate indexes is dropped, users may encounter an error message similar to the following:

Tue Jun  4 15:19:47.778 JavaScript execution failed: drop failed: {
        "nIndexesWas" : 2,
        "errmsg" : "exception: drop: dropIndexes for collection failed - consider trying repair  cause: BSONObj size: 0 (0x00000000) is invalid. Size must be between 0 and 16793600(16MB) First element: EOO",
        "code" : 12503,
        "ok" : 0
}

When upgrading to MongoDB 2.6.x, this index corruption is detected during the startup sequence and the server aborts as follows:

2014-10-15T12:17:47.641-0400 [IndexRebuilder] test.foo Invariant failure save == _entries.find( descriptor->indexName() ) src/mongo/db/catalog/index_catalog.cpp 156
2014-10-15T12:17:47.644-0400 [IndexRebuilder] test.foo 0x1006ac7ab 0x1006642c2 0x100653fc9 0x1000f5e02 0x1000f7d0f 0x1000df14a 0x1000e538e 0x1002936ef 0x100293111 0x100656f8b 0x1006e0bb5 0x7fff8b44f899 0x7fff8b44f72a 0x7fff8b453fc9 
 0   mongod                              0x00000001006ac7ab _ZN5mongo15printStackTraceERSo + 43
 1   mongod                              0x00000001006642c2 _ZN5mongo10logContextEPKc + 114
 2   mongod                              0x0000000100653fc9 _ZN5mongo15invariantFailedEPKcS1_j + 233
 3   mongod                              0x00000001000f5e02 _ZN5mongo12IndexCatalog24_setupInMemoryStructuresEPNS_15IndexDescriptorE + 748
 4   mongod                              0x00000001000f7d0f _ZN5mongo12IndexCatalog4initEv + 873
 5   mongod                              0x00000001000df14a _ZN5mongo10CollectionC2ERKNS_10StringDataEPNS_16NamespaceDetailsEPNS_8DatabaseE + 756
 6   mongod                              0x00000001000e538e _ZN5mongo8Database13getCollectionERKNS_10StringDataE + 322
 7   mongod                              0x00000001002936ef _ZN5mongo14IndexRebuilder7checkNSERKSt4listISsSaISsEE + 317
 8   mongod                              0x0000000100293111 _ZN5mongo14IndexRebuilder3runEv + 399
 9   mongod                              0x0000000100656f8b _ZN5mongo13BackgroundJob7jobBodyEv + 257
 10  mongod                              0x00000001006e0bb5 thread_proxy + 229
 11  libsystem_pthread.dylib             0x00007fff8b44f899 _pthread_body + 138
 12  libsystem_pthread.dylib             0x00007fff8b44f72a _pthread_struct_init + 0
 13  libsystem_pthread.dylib             0x00007fff8b453fc9 thread_start + 13
2014-10-15T12:17:47.644-0400 [IndexRebuilder] 
 
***aborting after invariant() failure

WORKAROUNDS
The recommended workaround for users running a replica set is to shut down the affected node and resync the node from the primary.

Users running a standalone server may temporarily convert the standalone node to a replica set, then add a new node to the replica set. Once the new node has finished its initial sync, shut it down and restart it as a standalone instance, and manually build the indexes that were corrupted on the primary. The new instance should now be used instead of the original one.

Alternatively, standalone servers can be recovered by running repairDatabase() with a 2.4-series MongoDB release, version 2.4.5 or newer. After the repair it is safe to upgrade to MongoDB 2.6. If you prefer to stay on MongoDB 2.4, upgrading to the latest 2.4.x release is highly recommended.

AFFECTED VERSIONS
MongoDB 2.4 production releases up to 2.4.4 are affected by this issue.

FIX VERSION
The fix is included in the 2.4.5 production release.

RESOLUTION DETAILS
Check in-progress indexes for duplicates in prepareToBuildIndex().
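
The idea behind the fix can be modeled in a few lines. This is an illustrative sketch only and not the server's C++ implementation: IndexBuildRegistry, prepareToBuild and finish are hypothetical names, and the model tracks in-progress builds only (checks against already completed indexes are out of scope here).

```javascript
// Illustrative model: remember which index builds are running and refuse
// a second build of the same key pattern on the same namespace.
function IndexBuildRegistry() {
  this.inProgress = {}; // "ns|keyPattern" -> true while a build is running
}

// Returns true if the build may start, false if an identical one is running.
IndexBuildRegistry.prototype.prepareToBuild = function (ns, keyPattern) {
  var id = ns + "|" + JSON.stringify(keyPattern);
  if (this.inProgress[id]) {
    return false; // identical build already in progress
  }
  this.inProgress[id] = true;
  return true;
};

// Called when a build completes or is aborted.
IndexBuildRegistry.prototype.finish = function (ns, keyPattern) {
  delete this.inProgress[ns + "|" + JSON.stringify(keyPattern)];
};
```

With this check in place, the second shell in the reproduction steps would be rejected instead of starting a parallel build of the same index.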

Original description

Since 2.4 it is possible to build background indexes concurrently. There is no check for whether a requested index is identical to one already being built, which can lead to the same index existing multiple times on a collection.

Trying to drop that index, or all indexes, or the collection leads to this error:

Tue Jun  4 15:19:47.778 JavaScript execution failed: drop failed: {
        "nIndexesWas" : 2,
        "errmsg" : "exception: drop: dropIndexes for collection failed - consider trying repair  cause: BSONObj size: 0 (0x00000000) is invalid. Size must be between 0 and 16793600(16MB) First element: EOO",
        "code" : 12503,
        "ok" : 0
}

Subsequent queries using the half-dropped index will result in errors as well.

In 2.4.x, the secondaries were unaffected by this because we didn't allow background index builds on secondaries, so the index was only present once on the secondary. In 2.5.0 the duplicate index carries over to the secondaries, which show the same behavior as the primaries.

This is a regression from 2.2 where a second background index build was not allowed and therefore the bug could not occur.

A db.repairDatabase() seems to fix the problem.



 Comments   
Comment by Daniel Pasette (Inactive) [ 01/Jul/13 ]

chengas123, we don't have a quick fix available to remove the extra indexes at this point other than doing a repair database. There are a couple of alternatives:

  • If you are running with a replica set, there is a possible workaround. In version 2.4, secondaries do not allow indexes to be built in the background, even if they were built in the background on the primary. Thus, secondaries should have no duplicate indexes. As an alternative to running db.repairDatabase on your primary, you can promote your secondary to primary and completely resync your old primary from scratch.
  • You can run repairDatabase even if you have low disk space by using another disk for the repair target. Shut down the server and then run: mongod --repair --repairpath <some other mount point>
Comment by Wiliam [ 01/Jul/13 ]

That would be great, Ben McCann. repairDatabase is an expensive task on large databases.

Comment by Ben McCann [ 01/Jul/13 ]

Can you guys also add a fix to allow proper deletion of the multiple indexes? My disk is 60% used, so I will not be able to run db.repairDatabase()

Comment by auto [ 19/Jun/13 ]

Author: Eric Milkie (milkie) <milkie@10gen.com>

Message: SERVER-9856 check in-progress indexes for duplicates in prepareToBuildIndex()
Branch: v2.4
https://github.com/mongodb/mongo/commit/af0b49cbb0ca6888d6f133081b7d41464bf4135f

Comment by auto [ 10/Jun/13 ]

Author: Eric Milkie (milkie) <milkie@10gen.com>

Message: SERVER-9856 check in-progress indexes for duplicates in prepareToBuildIndex()
Branch: master
https://github.com/mongodb/mongo/commit/1e65ddf0bf4fe11d2ab71cf04be808e9a10f342d
