[SERVER-14738] Updates to documents with text-indexed fields may lead to incorrect entries
Created: 31/Jul/14  Updated: 11/Jul/16  Resolved: 01/Aug/14

Status: Closed
Project: Core Server
Component/s: Text Search
Affects Version/s: 2.4.10, 2.6.1, 2.7.4
Fix Version/s: 2.4.11, 2.6.4, 2.7.5

Type: Bug
Priority: Critical - P2
Reporter: Mathieu [X]
Assignee: J Rassi
Resolution: Done
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Duplicate
is duplicated by SERVER-13651 "btree: key+recloc already in index" ... Closed
Related
related to SERVER-14829 UpdateIndexData::clear() should reset... Closed
Tested
Operating System: ALL
Backport Completed:
Steps To Reproduce:

db.search.drop()
db.search.insert({ title: 'Power', language: 'fr' })
db.search.ensureIndex({ title: 'text' })
 
db.search.find({ $text: { $search: 'Power' } })
// Nothing... 
 
// When stemmed, 'Power' becomes 'pow'
db.search.find({ $text: { $search: 'pow' } })
// { "_id" : ObjectId("53da06bc72556e0e2e08d614"), "title" : "Power", "language" : "fr" }
 
// So I changed the language field to the correct one (en)
var id = db.search.findOne()._id
db.search.update({ _id: id }, { $set: { language: "en" } })
 
db.search.find({ $text: { $search: 'Power' } })
// Nothing again... (reindexing is not triggered)
 
db.search.update({ _id: id }, { $set: { title: "Power " } })
db.search.find({ $text: { $search: 'Power' } })
// I finally find my document (an update to the indexed field triggers reindexing with the correct language)
 

Participants:

 Description   
Issue Status as of Aug 04, 2014

ISSUE SUMMARY
An update to a text-indexed field may fail to update the text index. As a result a text search may not match the field contents, yielding incorrect search results.

For example, given a collection with a text index on field “title”:

> db.col.ensureIndex({title:"text"})

Inserting a document and searching for it produces the expected results:

> db.col.insert({title:"test"})
WriteResult({ "nInserted" : 1 })
> db.col.find({$text:{$search:"test"}})
{ "_id" : ObjectId("53df95d559c54fcf80f8e418"), "title" : "test" }

But when the text-indexed field is modified under the conditions listed under USER IMPACT below, queries may return incorrect results:

> db.col.update({title:"test"}, {$set:{title:"fail"}})
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })
> db.col.find({$text:{$search:"test"}})
{ "_id" : ObjectId("53df95d559c54fcf80f8e418"), "title" : "fail" }

At this stage, if the document grows sufficiently and needs to be moved, the data in the index entry no longer points to a valid document, and queries that hit the index return an error:

> db.col.update({}, {$set : { padding : new Array(512).join('x') }})
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })
> db.col.find({$text:{$search:"test"}})
error: {
        "$err" : "BSONObj size: -286331154 (0xEEEEEEEE) is invalid. Size must be between 0 and 16793600(16MB) First element: _id: ObjectId('53dfae9755c5ce157d6a8560')",
        "code" : 10334
}
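
The size reported in this error is the 32-bit pattern 0xEEEEEEEE read as a signed integer, and the upper bound in the message is 16 MiB plus 16 KiB. A minimal sketch of that arithmetic in plain JavaScript follows; the helper names are hypothetical and are not part of MongoDB or the mongo shell:

```javascript
// Hypothetical helpers (not MongoDB APIs) that check the numbers in the error text.

// Reinterpret an unsigned 32-bit value as a signed 32-bit integer.
function asSignedInt32(u) {
  return u | 0;
}

// Render a signed 32-bit integer as its unsigned hex bit pattern.
function asUnsignedHex(n) {
  return '0x' + (n >>> 0).toString(16).toUpperCase();
}

var reportedSize = asSignedInt32(0xEEEEEEEE);   // the size printed in the error
var reportedHex = asUnsignedHex(-286331154);    // the hex form printed in the error
var reportedLimit = 16 * 1024 * 1024 + 16 * 1024; // the upper bound printed in the error
```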

USER IMPACT
Users who update documents in a collection that contains a text index may see incorrect/incomplete search results.

Specifically, an update may cause corrupt index entries if all of the following conditions are met:

  • the update modifies a text-indexed field, and
  • the update does not change the size of any text-indexed values, and
  • the update is in-place (does not result in a document move), and
  • the update does not modify another index

None of the following operations trigger this bug:

  • an update that changes the size of a text-indexed value
  • an update that results in a document move
  • an update that modifies another index
  • an update that replaces the entire document
  • an insert, query, or delete operation
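
The two lists above can be read as a single predicate. The following is only an illustrative sketch in plain JavaScript: the function and every property name on the `update` object are hypothetical labels for the conditions in this ticket, not fields that MongoDB exposes.

```javascript
// Hypothetical predicate mirroring the conditions listed in this ticket.
// The caller describes an update with boolean flags; none of these flags
// exist in MongoDB itself.
function mayCorruptTextIndex(update) {
  return Boolean(
    update.modifiesTextIndexedField &&   // touches a text-indexed field
    !update.changesTextValueSize &&      // size of text-indexed values unchanged
    update.isInPlace &&                  // no document move
    !update.modifiesOtherIndex &&        // no other index is modified
    !update.replacesWholeDocument        // not a full-document replacement
  );
}

// The {$set: {title: "fail"}} update from the issue summary: "test" and "fail"
// have the same length, the document is not moved, and only the text index exists.
var summaryExample = mayCorruptTextIndex({
  modifiesTextIndexedField: true,
  changesTextValueSize: false,
  isInPlace: true,
  modifiesOtherIndex: false,
  replacesWholeDocument: false
});
```

Because all conditions must hold together, flipping any single flag in the example makes the predicate return false.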

WORKAROUNDS
No workarounds exist for this issue. To fix this issue, users must upgrade to 2.4.11 or 2.6.4 and then rebuild text indexes, either by dropping and creating each index, or by resyncing a new replica set member.
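
After upgrading, the drop-and-create rebuild might look like the sketch below. It is written for the 2.4/2.6-era mongo shell and assumes that `getIndexes()` reports a text index with the key `{ _fts: "text", _ftsx: 1 }`; the function name `rebuildTextIndex` is hypothetical, and the caller must supply the original key specification and options because the sketch does not try to reconstruct them.

```javascript
// Hypothetical helper for the mongo shell: drop any existing text index on a
// collection, then create it again from the key spec the caller provides.
// Assumption: text indexes appear in getIndexes() with key._fts === "text".
function rebuildTextIndex(coll, keySpec, options) {
  var dropped = [];
  coll.getIndexes().forEach(function (ix) {
    if (ix.key && ix.key._fts === 'text') {
      coll.dropIndex(ix.name);   // drop by index name
      dropped.push(ix.name);
    }
  });
  // ensureIndex is the index-creation call used elsewhere in this ticket.
  coll.ensureIndex(keySpec, options || {});
  return dropped;                // names of the indexes that were dropped
}
```

For the collection in the issue summary, this would be called as `rebuildTextIndex(db.col, { title: "text" })`. On large collections the rebuild can take a long time, so it is best scheduled accordingly.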

There is no simple way to identify whether or not a text index is affected by this issue. If any updates have been issued to documents in a collection with a text index, the index may have been impacted.

AFFECTED VERSIONS
MongoDB versions 2.4.0 through 2.4.10, and 2.6.0 through 2.6.3 are affected by this issue.

FIX VERSION
The fix is included in the 2.4.11 and 2.6.4 production releases.

RESOLUTION DETAILS
Correctly determine if update with text index is in-place.

Original description


 Comments   
Comment by Githook User [ 22/Aug/14 ]

Author: Kamran Khan (kkmongo) <kamran.khan@mongodb.com>

Message: SERVER-14738 Use language names instead of ISO codes in FTS tests

Closes #752

Signed-off-by: Benety Goh <benety@mongodb.com>
Branch: v2.4
https://github.com/mongodb/mongo/commit/c344f31ae584773c043abe3c718b4f7a151e4e28

Comment by Githook User [ 07/Aug/14 ]

Author: Jason Rassi (jrassi) <rassi@10gen.com>

Message: SERVER-14738 Correctly determine if update w/ text index is in-place

(backport of 1f00ffcd22e671f5adeece53c68b5e462ba01ec0)
Branch: v2.4
https://github.com/mongodb/mongo/commit/3e570fc232d1e678324b80e803dabba2e41da9a0

Comment by Ramon Fernandez Marina [ 07/Aug/14 ]

david.bachrach@staples.com, yes, you can re-index your documents after updating them to work around this issue (there's no need to drop the index first). This is not listed as a workaround because index rebuilds can take a long time for large collections, so it's not a universal workaround.

Inserting an additional field in your documents (e.g. {x:1}) and building an index on it (db.col.ensureIndex({x:1})) also avoids the issue altogether, but this is a bit of a hack so is not listed as a universal workaround either.

Comment by David Bachrach [ 07/Aug/14 ]

We have a collection with 8 documents. We can easily drop the index before doing updates and then recreate the index after doing the update. The documents are relatively static so won't be changing often. Is this an option until we schedule an upgrade to 2.4.11 or 2.6.4? I didn't see it listed as a workaround, but I'm curious whether it would work. Again: drop the index before doing any updates and create the index after any updates.

Thanks

Comment by David Bachrach [ 07/Aug/14 ]

That's what I was looking for: whether 2.4.8 was affected, which it looks like it is. Thanks

Comment by Githook User [ 04/Aug/14 ]

Author: Jason Rassi (jrassi) <rassi@10gen.com>

Message: SERVER-14738 UpdateIndexData::clear() reset all member variables
Branch: master
https://github.com/mongodb/mongo/commit/d7133440dc8f05f0514d2b056d8605513b6e4d1b

Comment by Githook User [ 01/Aug/14 ]

Author: Jason Rassi (jrassi) <rassi@10gen.com>

Message: SERVER-14738 Correctly determine if update w/ text index is in-place

(cherry picked from commit 1f00ffcd22e671f5adeece53c68b5e462ba01ec0)
Branch: v2.6
https://github.com/mongodb/mongo/commit/b0221913173ed2b3d85c9a77e71dc648606a0e3d

Comment by Githook User [ 01/Aug/14 ]

Author: Jason Rassi (jrassi) <rassi@10gen.com>

Message: SERVER-14738 Correctly determine if update w/ text index is in-place
Branch: master
https://github.com/mongodb/mongo/commit/1f00ffcd22e671f5adeece53c68b5e462ba01ec0

Comment by J Rassi [ 31/Jul/14 ]

Hi Mathieu_Laporte,

We are able to reproduce this issue. Thanks for reporting it. Please continue to watch this ticket for workaround information and updates on when a fix will be available.

~ Jason Rassi

Generated at Thu Feb 08 03:35:49 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.