Core Server / SERVER-12990

Abnormal termination of concurrent index builds can lead to a corrupt index catalog

    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • Fix Version/s: 2.4.10
    • Affects Version/s: 2.4.9, 2.4.10
    • Component/s: Index Maintenance
    • Operating System: ALL
    • Steps To Reproduce:

      Build a background index
      Build a foreground index while the background index is building
      When the foreground index build finishes, perform a killOp on the background index build

      Issue Status as of March 31, 2014

      ISSUE SUMMARY

      Building indexes concurrently can lead to a corrupt index catalog. In particular, the order of operations that expose this bug is:

      1. Start a background index build
      2. Start another index build (background or foreground)
      3. After the second index build completes, kill the background index build with db.killOp()

      After this series of steps, the index catalog is corrupted, and any change to the data in this collection or any call to stats() results in an error.

      USER IMPACT

      A node that ends up with a corrupt index catalog must be repaired or resynced from a healthy node.

      SOLUTION

      The index position of the background index needs to be re-calculated on failure as it may have changed. This allows the server to clean up the failed index build correctly.
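
The effect of a stale index position can be illustrated with a toy model. This is only a sketch, not the server's actual data structures: the catalog is a plain array, finished indexes are assumed to sit in front of in-progress builds, and every function name below is hypothetical. It compares a cleanup that trusts the slot recorded when the build started with one that looks the slot up again at failure time.

```javascript
// Toy model of an index catalog. This is NOT the server's data structure;
// all names below are hypothetical and exist only to illustrate the idea.
function makeCatalog() {
  // Assumption: finished indexes occupy slots [0, finished); in-progress builds follow.
  return { entries: ["_id_"], finished: 1 };
}

// Register an in-progress build and return the slot it was given.
function startBuild(catalog, name) {
  catalog.entries.push(name);
  return catalog.entries.length - 1;
}

// Mark a build as finished by moving it in front of all in-progress builds.
// Any in-progress build that sat before it shifts one slot to the right.
function finishBuild(catalog, name) {
  var from = catalog.entries.indexOf(name);
  catalog.entries.splice(from, 1);
  catalog.entries.splice(catalog.finished, 0, name);
  catalog.finished += 1;
}

// Buggy cleanup: trusts the slot recorded when the build started.
function cleanupWithCachedSlot(catalog, cachedSlot) {
  catalog.entries.splice(cachedSlot, 1);
}

// Fixed cleanup: recalculates the slot at failure time.
function cleanupWithRecalculatedSlot(catalog, name) {
  var slot = catalog.entries.indexOf(name);
  if (slot !== -1) {
    catalog.entries.splice(slot, 1);
  }
}

// Replays the sequence from this ticket against the toy catalog.
function scenario(cleanup) {
  var catalog = makeCatalog();
  var bgName = "x_1_fruits_1_transport_1";
  var fgName = "x_1_vegetables_1_transport_1";
  var bgSlot = startBuild(catalog, bgName); // background build starts
  startBuild(catalog, fgName);              // second build starts
  finishBuild(catalog, fgName);             // second build completes
  cleanup(catalog, bgSlot, bgName);         // background build is killed
  return catalog.entries;
}

var stale = scenario(function (c, slot, name) { cleanupWithCachedSlot(c, slot); });
var fixed = scenario(function (c, slot, name) { cleanupWithRecalculatedSlot(c, name); });

console.log(stale); // the finished index is removed, the killed build remains
console.log(fixed); // only the killed build is removed
```

In this model the cached slot points at the finished index once the second build completes, so the stale cleanup removes the wrong entry. Looking the entry up again by name removes only the killed build.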

      WORKAROUNDS

      It is advisable to build indexes one at a time, not concurrently.
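
One way to follow this workaround from a script is to check for a running index build before starting another. The helper below is a minimal sketch under an assumption: that in-progress index builds appear in the db.currentOp() output as insert operations on a namespace ending in system.indexes. The function name and the sample document are hypothetical, and the sample is not real server output.

```javascript
// Sketch only: assumes index builds show up in db.currentOp().inprog as
// inserts into <db>.system.indexes (an assumption about 2.4-era output).
function indexBuildsInProgress(currentOpResult) {
  var inprog = (currentOpResult && currentOpResult.inprog) || [];
  return inprog.filter(function (op) {
    return op.op === "insert" && /\.system\.indexes$/.test(String(op.ns));
  });
}

// Hypothetical sample shaped like a currentOp() result; not real server output.
var sample = {
  inprog: [
    { opid: 173, op: "insert", ns: "test.system.indexes" },
    { opid: 174, op: "query", ns: "test.test" }
  ]
};

var running = indexBuildsInProgress(sample);
console.log(running.map(function (op) { return op.opid; }));
```

In the mongo shell, the same helper could be called as indexBuildsInProgress(db.currentOp()), and a new ensureIndex() issued only when the returned array is empty.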

      AFFECTED VERSIONS

      All recent production release versions up to 2.4.9 are affected. The 2.6 series is unaffected.

      PATCHES

      The fix is included in the 2.4.10 production release.

      Original Description

      If you kill an in-progress background index build after a foreground index build has already completed successfully, the index catalog becomes corrupt.

      Commands issued to re-create:

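      shell1> db.test.ensureIndex({x:1,fruits:1,transport:1},{background:true});  // background build
      shell2> db.test.ensureIndex({x:1,vegetables:1,transport:1});                // foreground build; wait for it to finish
      shell3> db.currentOp()                                                      // find the opid of the background build
      shell3> db.killOp(173)                                                      // 173 was the background build's opid in this run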
      

      Result of collstats after killOp()

      > db.test.stats()
      {
      	"ns" : "test.test",
      	"count" : 1358323,
      	"size" : 262296864,
      	"avgObjSize" : 193.10345477474797,
      	"storageSize" : 335896576,
      	"numExtents" : 14,
      	"nindexes" : 2,
      	"lastExtentSize" : 92581888,
      	"paddingFactor" : 1,
      	"systemFlags" : 0,
      	"userFlags" : 0,
      	"errmsg" : "exception: BSONObj size: 0 (0x00000000) is invalid. Size must be between 0 and 16793600(16MB) First element: EOO",
      	"code" : 10334,
      	"ok" : 0
      }
      

      This does not affect 2.6RC0.

        1. mongod.log
          13 kB
        2. mongod-2.4.10RC.log
          13 kB
        3. server12990_killbgindex.js
          1 kB

            Assignee:
            eliot Eliot Horowitz (Inactive)
            Reporter:
            david.hows David Hows
            Votes:
            1
            Watchers:
            12
