Core Server / SERVER-9856

No check for building identical background indexes concurrently

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical - P2
    • Resolution: Fixed
    • Affects Version/s: 2.4.4, 2.5.0
    • Fix Version/s: 2.4.5, 2.5.1
    • Component/s: Indexing
    • Labels:
      None
    • Operating System:
      ALL
    • Steps To Reproduce:
      1. Insert a large number of documents into a collection on a primary.
      2. Open two shells and start a background build of the same index in both shells.
      3. Confirm with db.collection.getIndexes() that there are multiple identical indexes.

      Everything seems to work ok, until you

      • try to drop the index with db.collection.dropIndex(<index>)
      • try to drop all indexes with db.collection.dropIndexes()
      • try to drop the collection with db.collection.drop()

      After any of these operations fails, even hinting on that particular index fails with the same error. The index appears to be deleted but not properly cleaned up.
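
The check in step 3 can be automated. The following is a minimal sketch, assuming it is given the array returned by db.collection.getIndexes(). The helper name `findDuplicateIndexes` and the sample documents are hypothetical and exist only for illustration; they are not part of MongoDB.

```javascript
// Hypothetical helper: group index specs by key pattern and report every
// key pattern that appears more than once.
function findDuplicateIndexes(indexes) {
  var namesByKey = {};
  var order = [];
  for (var i = 0; i < indexes.length; i++) {
    // JSON.stringify keeps field order, which matters for compound keys.
    var sig = JSON.stringify(indexes[i].key);
    if (!namesByKey.hasOwnProperty(sig)) {
      namesByKey[sig] = [];
      order.push(sig);
    }
    namesByKey[sig].push(indexes[i].name);
  }
  var duplicates = [];
  for (var j = 0; j < order.length; j++) {
    var names = namesByKey[order[j]];
    if (names.length > 1) {
      duplicates.push({ key: JSON.parse(order[j]), names: names });
    }
  }
  return duplicates;
}

// Made-up sample shaped like getIndexes() output on an affected collection.
var sampleIndexes = [
  { v: 1, key: { _id: 1 }, ns: "test.foo", name: "_id_" },
  { v: 1, key: { a: 1 }, ns: "test.foo", name: "a_1", background: true },
  { v: 1, key: { a: 1 }, ns: "test.foo", name: "a_1", background: true }
];

var found = findDuplicateIndexes(sampleIndexes);
```

In the mongo shell, the same function could be called as findDuplicateIndexes(db.collection.getIndexes()) and the result inspected with printjson(). An empty array means no duplicate key patterns were found.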


      Description

      Issue Status as of Sep 19, 2014

      ISSUE SUMMARY
      It is possible to build duplicate indexes on a collection when identical indexes are built concurrently in the background, which causes index corruption. Running db.repairDatabase() fixes the index corruption.

      USER IMPACT
      When a collection with duplicate indexes is dropped, or one of the duplicate indexes is dropped, users may encounter an error message similar to the following:

      Tue Jun  4 15:19:47.778 JavaScript execution failed: drop failed: {
              "nIndexesWas" : 2,
              "errmsg" : "exception: drop: dropIndexes for collection failed - consider trying repair  cause: BSONObj size: 0 (0x00000000) is invalid. Size must be between 0 and 16793600(16MB) First element: EOO",
              "code" : 12503,
              "ok" : 0
      }
      

      When upgrading to MongoDB 2.6.x, this index corruption is detected during the startup sequence and the server aborts as follows:

      2014-10-15T12:17:47.641-0400 [IndexRebuilder] test.foo Invariant failure save == _entries.find( descriptor->indexName() ) src/mongo/db/catalog/index_catalog.cpp 156
      2014-10-15T12:17:47.644-0400 [IndexRebuilder] test.foo 0x1006ac7ab 0x1006642c2 0x100653fc9 0x1000f5e02 0x1000f7d0f 0x1000df14a 0x1000e538e 0x1002936ef 0x100293111 0x100656f8b 0x1006e0bb5 0x7fff8b44f899 0x7fff8b44f72a 0x7fff8b453fc9 
       0   mongod                              0x00000001006ac7ab _ZN5mongo15printStackTraceERSo + 43
       1   mongod                              0x00000001006642c2 _ZN5mongo10logContextEPKc + 114
       2   mongod                              0x0000000100653fc9 _ZN5mongo15invariantFailedEPKcS1_j + 233
       3   mongod                              0x00000001000f5e02 _ZN5mongo12IndexCatalog24_setupInMemoryStructuresEPNS_15IndexDescriptorE + 748
       4   mongod                              0x00000001000f7d0f _ZN5mongo12IndexCatalog4initEv + 873
       5   mongod                              0x00000001000df14a _ZN5mongo10CollectionC2ERKNS_10StringDataEPNS_16NamespaceDetailsEPNS_8DatabaseE + 756
       6   mongod                              0x00000001000e538e _ZN5mongo8Database13getCollectionERKNS_10StringDataE + 322
       7   mongod                              0x00000001002936ef _ZN5mongo14IndexRebuilder7checkNSERKSt4listISsSaISsEE + 317
       8   mongod                              0x0000000100293111 _ZN5mongo14IndexRebuilder3runEv + 399
       9   mongod                              0x0000000100656f8b _ZN5mongo13BackgroundJob7jobBodyEv + 257
       10  mongod                              0x00000001006e0bb5 thread_proxy + 229
       11  libsystem_pthread.dylib             0x00007fff8b44f899 _pthread_body + 138
       12  libsystem_pthread.dylib             0x00007fff8b44f72a _pthread_struct_init + 0
       13  libsystem_pthread.dylib             0x00007fff8b453fc9 thread_start + 13
      2014-10-15T12:17:47.644-0400 [IndexRebuilder] 
       
      ***aborting after invariant() failure
      

      WORKAROUNDS
      The recommended workaround for users running a replica set is to shut down the affected node and resync the node from the primary.

      Users running a standalone server may temporarily convert the standalone node to a replica set, then add a new node to the replica set. Once the new node has finished its initial sync, shut it down and restart it as a standalone instance, and manually build the indexes that were corrupted on the primary. The new instance should now be used instead of the original one.

      Alternatively, standalone servers can be recovered by running repairDatabase() using a 2.4-series MongoDB version 2.4.5 or newer. After the repair it is safe to upgrade to MongoDB 2.6. If running MongoDB 2.4 is preferred, it is highly recommended to upgrade to the latest 2.4.x release.

      AFFECTED VERSIONS
      MongoDB 2.4 production releases up to 2.4.4 are affected by this issue.

      FIX VERSION
      The fix is included in the 2.4.5 production release.
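
To check whether a given server version falls in the affected range, something like the following could be used. This is a sketch under stated assumptions: the helper names `compareVersions` and `isAffectedBySERVER9856` are hypothetical, version strings are assumed to be plain dotted numbers such as "2.4.4", and the affected range is taken from this ticket (2.4.0 through 2.4.4, plus the 2.5.0 development release).

```javascript
// Hypothetical helper: compare dotted version strings numerically.
// Returns -1, 0, or 1.
function compareVersions(a, b) {
  var pa = a.split(".");
  var pb = b.split(".");
  var n = Math.max(pa.length, pb.length);
  for (var i = 0; i < n; i++) {
    var x = parseInt(pa[i] || "0", 10);
    var y = parseInt(pb[i] || "0", 10);
    if (x !== y) {
      return x < y ? -1 : 1;
    }
  }
  return 0;
}

// Affected range as stated in this ticket: 2.4.0 - 2.4.4, and 2.5.0.
function isAffectedBySERVER9856(version) {
  if (compareVersions(version, "2.4.0") < 0) {
    return false; // 2.2 and earlier did not allow a second background build
  }
  if (compareVersions(version, "2.4.5") < 0) {
    return true; // 2.4.0 through 2.4.4
  }
  return compareVersions(version, "2.5.0") === 0; // fixed in 2.5.1
}
```

In the mongo shell, the version string of the connected server is available from db.version() and could be passed to the helper directly.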

      RESOLUTION DETAILS
      Check in-progress indexes for duplicates in prepareToBuildIndex().
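
The idea behind this check can be illustrated with a small model. This is a conceptual JavaScript sketch, not the server's C++ code: `IndexBuildRegistry`, `prepareToBuild`, and `finish` are hypothetical names, and the actual logic in prepareToBuildIndex() may differ in detail. The sketch assumes an index build is identified by its namespace and key pattern, and it rejects a request when an identical build is already in progress.

```javascript
// Conceptual model only; every name here is hypothetical.
function IndexBuildRegistry() {
  this.inProgress = {}; // signature -> true while a build is running
}

// A build is identified here by namespace plus key pattern.
IndexBuildRegistry.prototype.signature = function (ns, key) {
  return ns + "|" + JSON.stringify(key);
};

// Register a build, or throw if an identical build is already running.
// Indexes that already exist are assumed to be checked elsewhere and are
// not modeled here.
IndexBuildRegistry.prototype.prepareToBuild = function (ns, key) {
  var sig = this.signature(ns, key);
  if (this.inProgress.hasOwnProperty(sig)) {
    throw new Error("identical index build already in progress: " + sig);
  }
  this.inProgress[sig] = true;
  return sig;
};

// Remove a build from the in-progress set once it completes.
IndexBuildRegistry.prototype.finish = function (sig) {
  delete this.inProgress[sig];
};

// Two shells request the same background index; the second is rejected.
var registry = new IndexBuildRegistry();
var first = registry.prepareToBuild("test.foo", { a: 1 });
var secondRejected = false;
try {
  registry.prepareToBuild("test.foo", { a: 1 });
} catch (e) {
  secondRejected = true;
}

// A build with a different key pattern can still run concurrently.
var other = registry.prepareToBuild("test.foo", { b: 1 });
```

Without a check of this kind, both requests in the example would proceed, which corresponds to the duplicate-index situation described in this ticket.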

      Original description

      Since 2.4 it is possible to build background indexes concurrently. There is no check for whether an identical index is already being built, which can lead to a situation where the same index exists multiple times on a collection.

      Trying to drop that index, or all indexes, or the collection leads to this error:

      Tue Jun  4 15:19:47.778 JavaScript execution failed: drop failed: {
              "nIndexesWas" : 2,
              "errmsg" : "exception: drop: dropIndexes for collection failed - consider trying repair  cause: BSONObj size: 0 (0x00000000) is invalid. Size must be between 0 and 16793600(16MB) First element: EOO",
              "code" : 12503,
              "ok" : 0
      }
      

      Subsequent queries using the half-dropped index will result in errors as well.

      In 2.4.x, the secondaries were unaffected by this because we didn't allow background index builds on secondaries, so the index was present only once on each secondary. In 2.5.0 the duplicate index carries over to the secondaries, which then show the same behavior as the primary.

      This is a regression from 2.2 where a second background index build was not allowed and therefore the bug could not occur.

      Running db.repairDatabase() seems to fix the problem.
