  1. Core Server
  2. SERVER-10059

Possible race condition with ensureIndex (text only?)

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Duplicate
    • Affects Version/s: 2.4.4
    • Fix Version/s: None
    • Component/s: Index Maintenance
    • Labels:
      None
    • Environment:
      mongodb 2.4.4; pymongo 2.4.2
    • Operating System:
      ALL
    • Steps To Reproduce:
      I haven't yet tried to reproduce the issue, but here's a proposed technique:

      1. Deploy mongod 2.4.4 with textSearchEnabled.
      2. Create a database with a non-trivial amount of text to be searched.
      3. In two threads or processes, invoke ensureIndex on that text.
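A minimal sketch of step 3, untested against a real server. It assumes Python 3 (for `threading.Barrier`) and a pymongo 2.x collection object that still exposes `ensure_index`; the helper name `race_index_builds` and the `make_collection` factory are hypothetical, introduced only for illustration.

```python
import threading


def race_index_builds(make_collection, keys, n_workers=2, **index_opts):
    """Call ensure_index from n_workers threads at (nearly) the same moment.

    make_collection is a hypothetical zero-argument factory returning an
    object with an ensure_index(keys, **opts) method, e.g. a pymongo 2.x
    Collection. Each worker gets its own handle, like one per app instance.
    """
    barrier = threading.Barrier(n_workers)
    results = [None] * n_workers

    def worker(i):
        coll = make_collection()
        barrier.wait()  # line every worker up before issuing the call
        try:
            results[i] = ("ok", coll.ensure_index(keys, **index_opts))
        except Exception as exc:  # record the failure instead of losing it
            results[i] = ("error", exc)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


# Possible usage against a local mongod (database and collection names are
# placeholders, not taken from the report):
#
#   from pymongo import MongoClient
#   results = race_index_builds(
#       lambda: MongoClient()["test"]["docs"],
#       [("qsl", "text")],
#       background=True,
#   )
```

Separate processes would match the original deployment more closely than threads; the thread version is only the simpler thing to try first.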


      Description

      Today I encountered a situation where a brand new (test) database had become corrupted. I realized too late that I should have saved the state for analysis. Here's roughly what happened:

      1. Configured mongod with textSearchEnabled.
      2. Used mongorestore to populate the database with data from another database (1546 docs with 19MB of data).
      3. Used an application deployment system similar to Heroku (Velociraptor) to deploy two instances of the app against this database. During startup, the app runs the following on a collection in that restored db:

      coll.ensure_index([('qsl', 'text')], background=True)

      When I tried to query the text index, I got an error indicating a problem because there was "more than one text index" (I'm unsure about the wording).

      I thought, "how weird," queried the collection for its indexes and sure enough, it had two copies of the /exact same index/ (name, type, etc).
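For anyone who hits the same state, here is a small sketch of how the duplicate could be confirmed from raw index documents. It assumes the documents come from the `system.indexes` collection that MongoDB 2.4 exposes, with `ns` and `name` fields; the helper name `find_duplicate_indexes` is hypothetical. Reading the raw documents matters because a name-keyed view such as pymongo's `index_information()` would collapse two same-named entries into one.

```python
from collections import Counter


def find_duplicate_indexes(index_docs):
    """Return {(ns, name): count} for every index that is listed more than once.

    index_docs is an iterable of dicts shaped like system.indexes documents,
    i.e. each one has at least an "ns" and a "name" field.
    """
    counts = Counter((doc["ns"], doc["name"]) for doc in index_docs)
    return {key: n for key, n in counts.items() if n > 1}


# Possible usage (database and collection names are placeholders):
#
#   docs = list(db["system.indexes"].find({"ns": "mydb.mycoll"}))
#   print(find_duplicate_indexes(docs))
```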

      So I removed it, and the remove operation succeeded. Just one remove and both manifestations were gone. Great, I thought, problem solved, but I was wrong.

      Subsequently, whenever I tried to start the app (which would recreate the index), I got the same error reported here (https://groups.google.com/forum/#!topic/mongodb-user/5xNYJVJdp5c). Basically, it seems my database was corrupted.

      That's when I dropped the database and started again. This time, instead of starting the two apps in parallel, I only started one locally (so ensureIndex would have been called once first), and everything seems to be running swimmingly.

      Here's what I suspect happened:
      1. When the two app instances started up, they each invoked ensure_index at roughly the same time.
      2. mongod checked for an existing index on each request, and both checks found none.
      3. mongod then created the identical index twice.
      4. mongod corrupted the database when removing the index.
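The suspected sequence in steps 1 to 3 is a classic check-then-act race. The toy model below only illustrates that pattern; it is not how mongod is implemented, and the names `ToyCatalog`, `run`, and the index name `qsl_text` are made up for the example. A barrier forces the unlucky interleaving so the outcome is deterministic.

```python
import threading


class ToyCatalog:
    """Toy stand-in for an index catalog. NOT mongod's real implementation."""

    def __init__(self, n_callers, atomic):
        self.indexes = []
        self._atomic = atomic
        self._lock = threading.Lock()
        # Used only in the non-atomic path to force the bad interleaving.
        self._gate = threading.Barrier(n_callers)

    def ensure_index(self, name):
        if self._atomic:
            # Check and create happen under one lock, so only one caller wins.
            with self._lock:
                if name not in self.indexes:
                    self.indexes.append(name)
            return
        missing = name not in self.indexes  # the existence check
        self._gate.wait()  # every caller finishes checking before any creates
        if missing:
            self.indexes.append(name)  # each caller creates its own copy


def run(atomic, n_callers=2):
    """Have n_callers threads ensure the same index; return the catalog contents."""
    catalog = ToyCatalog(n_callers, atomic)
    threads = [
        threading.Thread(target=catalog.ensure_index, args=("qsl_text",))
        for _ in range(n_callers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return catalog.indexes


print(run(atomic=False))  # ['qsl_text', 'qsl_text']
print(run(atomic=True))   # ['qsl_text']
```

If the real cause is along these lines, the check and the create would need to happen as one atomic step on the server side, whatever mechanism mongod actually uses for that.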

      Alternatively, I am using pymongo 2.4.2, which is somewhat dated, and could be implicated.

      In any case, mongod is almost certainly implicated here.

      I would be shocked if this issue existed for standard indexes. I'm guessing that text indexes are somehow implicated (perhaps bypassing checks that standard indexes use).

      I'm reporting this here because I wanted to capture the issue while I still remember the details. I invite others to investigate further.

              People

              Assignee:
              Unassigned
              Reporter:
              Jason R. Coombs (jason.coombs@yougov.com)
              Votes:
              0
              Watchers:
              3

                Dates

                Created:
                Updated:
                Resolved: