  1. Core Server
  2. SERVER-10058

Possible race condition with ensureIndex (text only?)

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Major - P3
    • None
    • Affects Version/s: 2.4.4
    • Component/s: Index Maintenance
    • Labels:
      None
    • Environment:
      mongodb 2.4.4; pymongo 2.4.2
    • ALL

      I haven't yet tried to reproduce the issue, but here's a proposed technique:

      1. Deploy mongod 2.4.4 with textSearchEnabled.
      2. Create a database with a non-trivial amount of text to be searched.
      3. In two threads or processes, invoke ensureIndex on that text.
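
      Step 3 could be driven by a small harness like the one below. This is only a sketch: call_concurrently and ensure_text_index are hypothetical names, and the host, port, database, and collection names are assumptions. The ensure_index call is the same pymongo 2.x call the app makes at startup.

```python
import threading


def call_concurrently(fn, workers=2):
    """Run fn() in `workers` threads that are released together by a barrier.

    Returns a list holding each worker's return value or the exception it raised.
    """
    barrier = threading.Barrier(workers)
    results = [None] * workers

    def worker(slot):
        barrier.wait()  # line the threads up so the calls overlap as closely as possible
        try:
            results[slot] = fn()
        except Exception as exc:  # keep the error so the caller can inspect it
            results[slot] = exc

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


def ensure_text_index():
    """Hypothetical target; connection details and names are assumptions."""
    import pymongo  # imported lazily so the harness itself has no dependencies

    coll = pymongo.MongoClient("localhost", 27017)["testdb"]["docs"]
    # pymongo 2.x API, as used in the report
    return coll.ensure_index([("qsl", "text")], background=True)
```

      Calling call_concurrently(ensure_text_index) against a test mongod would issue both requests at nearly the same moment. Two separate processes, as in the original deployment, may be needed if threads in one process do not overlap closely enough.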

      Today I encountered a situation where a brand new (test) database had become corrupted. I realized too late that I should have saved the state for analysis. Here's roughly what happened:

      Configured mongod with textSearchEnabled.
      Used mongorestore to populate the database with data from another database (1546 docs with 19MB of data).
      Used an application deployment system similar to Heroku (Velociraptor) to deploy two instances of the app against this database. During startup, the app runs the following on a collection in that restored database:

      coll.ensure_index([('qsl', 'text')], background=True)

      When I tried to query the text index, I got an error indicating a problem because there was "more than one text index" (I'm unsure about the wording).

      I thought, "how weird," queried the collection for its indexes and sure enough, it had two copies of the /exact same index/ (name, type, etc).
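
      For anyone who hits this state again, a check along the following lines might capture it before cleanup. It is a sketch, not something I ran at the time: duplicate_indexes is a hypothetical helper, and the sample documents are made up to resemble system.indexes entries. Note that pymongo's index_information() returns a dict keyed by index name, so it would likely hide a duplicate.

```python
from collections import Counter


def duplicate_indexes(index_docs):
    """Return {(ns, name): count} for every index that is listed more than once."""
    counts = Counter((doc["ns"], doc["name"]) for doc in index_docs)
    return {key: n for key, n in counts.items() if n > 1}


# Sample documents shaped like system.indexes entries; the values are made up.
sample = [
    {"ns": "testdb.docs", "name": "_id_", "key": {"_id": 1}},
    {"ns": "testdb.docs", "name": "qsl_text", "key": {"_fts": "text", "_ftsx": 1}},
    {"ns": "testdb.docs", "name": "qsl_text", "key": {"_fts": "text", "_ftsx": 1}},
]

print(duplicate_indexes(sample))  # {('testdb.docs', 'qsl_text'): 2}
```

      Against a live 2.4 server, the input could presumably come from db["system.indexes"].find({"ns": "testdb.docs"}), with the namespace adjusted to the affected collection.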

      So I removed it, and the remove operation succeeded. Just one remove and both manifestations were gone. Great, I thought, problem solved, but I was wrong.

      Subsequently, whenever I tried to start the app (which would recreate the index), I got the same error reported here (https://groups.google.com/forum/#!topic/mongodb-user/5xNYJVJdp5c). Basically, it seems my database was corrupted.

      That's when I dropped the database and started again. This time, instead of starting the two apps in parallel, I only started one locally (so ensureIndex would have been called once first), and everything seems to be running swimmingly.

      Here's what I suspect happened:
      1. When the two app instances started up, they each invoked ensure_index at roughly the same time.
      2. mongod checked for an existing index on each request and found none in both cases.
      3. mongod created the identical index twice.
      4. mongod corrupted the database when removing the index.
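
      Steps 1 to 3 describe a classic check-then-create race. The toy model below illustrates the suspicion only; ToyCatalog and run are hypothetical and say nothing about how mongod actually implements index creation. A barrier forces both callers to finish the existence check before either one creates the index.

```python
import threading


class ToyCatalog:
    """Toy stand-in for an index catalog; not mongod's actual implementation."""

    def __init__(self, atomic):
        self.indexes = []
        self.atomic = atomic
        self.lock = threading.Lock()
        # Makes both callers finish the existence check before either creates.
        self.after_check = threading.Barrier(2)

    def ensure_index(self, name):
        if self.atomic:
            with self.lock:  # check and create happen as one step
                if name not in self.indexes:
                    self.indexes.append(name)
            return
        missing = name not in self.indexes  # step 2: both callers see "missing"
        self.after_check.wait()
        if missing:
            with self.lock:
                self.indexes.append(name)  # step 3: both callers create it


def run(atomic):
    """Call ensure_index from two threads and return the resulting index list."""
    catalog = ToyCatalog(atomic)
    threads = [
        threading.Thread(target=catalog.ensure_index, args=("qsl_text",))
        for _ in range(2)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return catalog.indexes


print(run(atomic=False))  # ['qsl_text', 'qsl_text']
print(run(atomic=True))   # ['qsl_text']
```

      If the suspicion is right, the text index path would behave like the non-atomic branch, while standard indexes would behave like the atomic one.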

      Alternatively, I am using pymongo 2.4.2, which is somewhat dated, and could be implicated.

      In any case, mongod is almost certainly implicated here.

      I would be shocked if this issue existed for standard indexes. I'm guessing that text indexes are somehow implicated (perhaps bypassing checks that standard indexes use).

      I'm reporting this here because I wanted to capture the issue while I still remember the details. I invite others to investigate further.

            Assignee:
            Unassigned
            Reporter:
            Jason R. Coombs (jason.coombs@yougov.com)
            Votes:
            0
            Watchers:
            3
