[SERVER-12144] Incorrect results for $text queries against multikey compound text indexes
Created: 17/Dec/13  Updated: 11/Jul/16  Resolved: 13/Feb/14

Status: Closed
Project: Core Server
Component/s: Querying, Text Search
Affects Version/s: None
Fix Version/s: 2.6.0-rc0

Type: Bug
Priority: Major - P3
Reporter: Kay Kim (Inactive)
Assignee: hari.khalsa@10gen.com
Resolution: Done
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to DOCS-2689 pending Text Search features Closed
Participants:

 Description   

$text queries against a compound text index pass an incorrect predicate to the index when the index is multikey. Results are correct only when the index is not multikey (i.e. all documents have at most one indexable term).

Reproduce with the following:

function runTest(textData) {
    db.foo.drop();
    db.foo.insert({a: 17, b: "irrelevant"});
    db.foo.insert({a: 17, b: textData});
    db.foo.ensureIndex({a: 1, b: "text"});
    return db.foo.count({a: 17, $text: {$search: "foo"}});
}
 
print(runTest("foo")); // correct: outputs 1 (text index is not multikey)
print(runTest("foo bar")); // incorrect: outputs 2 (text index is multikey)
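
The difference between the two calls can be modelled outside the server. The following is a minimal sketch in plain JavaScript, assuming naive whitespace tokenization (the real text indexer also applies stemming and stop-word removal). The helpers termsOf, isMultikey, expectedCount and docsFor are hypothetical names for illustration only, not server or shell functions.

```javascript
// Hypothetical simplification of text tokenization: lowercase and split on whitespace.
function termsOf(text) {
    return text.toLowerCase().split(/\s+/).filter(function (t) { return t.length > 0; });
}

// In this model the index is multikey as soon as one document yields more than one distinct term.
function isMultikey(docs) {
    return docs.some(function (d) { return new Set(termsOf(d.b)).size > 1; });
}

// Count a document only if BOTH the equality on "a" and the text term match.
function expectedCount(docs, a, term) {
    return docs.filter(function (d) {
        return d.a === a && termsOf(d.b).indexOf(term) !== -1;
    }).length;
}

// The two documents inserted by runTest(textData) in the report.
function docsFor(textData) {
    return [{a: 17, b: "irrelevant"}, {a: 17, b: textData}];
}

console.log(isMultikey(docsFor("foo")));               // false
console.log(isMultikey(docsFor("foo bar")));           // true
console.log(expectedCount(docsFor("foo"), 17, "foo"));     // 1
console.log(expectedCount(docsFor("foo bar"), 17, "foo")); // 1
```

Under these assumptions the expected count is 1 in both cases, so the value 2 reported for "foo bar" means the text predicate was not applied.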

Output for the "correct" case:

2014-01-13T19:56:12.455-0500 [conn3] enumerator received root:
$and
    a $gt 0.0 First: 0 notFirst: full path: a
    TEXT : query=hello, language = , tag=First: notFirst: 0 full path: _fts
 
2014-01-13T19:56:12.455-0500 [conn3] Tagging memoID 0
2014-01-13T19:56:12.455-0500 [conn3] Enumerator: memo right before moving:
2014-01-13T19:56:12.455-0500 [conn3] [Node #0]: AND enumstate counter 0
choice 0:
	subnodes:
	idx[0]
		pos 0 pred a $gt 0.0
 
		pos 1 pred TEXT : query=hello, language = , tag=NULL

Output for the "incorrect" case:

2014-01-13T19:55:46.798-0500 [conn2] enumerator received root:
$and
    a $gt 0.0 First: 0 notFirst: full path: a
    TEXT : query=hello, language = , tag=First: notFirst: 0 full path: _fts
 
2014-01-13T19:55:46.798-0500 [conn2] Tagging memoID 0
2014-01-13T19:55:46.798-0500 [conn2] Enumerator: memo right before moving:
2014-01-13T19:55:46.798-0500 [conn2] [Node #0]: AND enumstate counter 0
choice 0:
	subnodes:
	idx[0]
		pos 0 pred a $gt 0.0

Original description below.

Given a collection inventory:

   { _id: 1, dept: "tech", description: "a fun green computer" }
   { _id: 2, dept: "tech", description: "a wireless red mouse" }
   { _id: 3, dept: "kitchen", description: "a green placemat" }
   { _id: 4, dept: "kitchen", description: "a red peeler" }
   { _id: 5, dept: "food", description: "a green apple" }
   { _id: 6, dept: "food", description: "a red potato" }

and the following compound index:

db.inventory.ensureIndex( { dept: 1, description: "text" } )

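The following query appears to OR the two conditions instead of ANDing them, returning every document in the listed departments whether or not it matches the text search: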

>    db.inventory.find( { dept: {$in:[ "kitchen", "food" ] }, $text: { $search: "green" } } )
{ "_id" : 5, "dept" : "food", "description" : "a green apple" }  
{ "_id" : 6, "dept" : "food", "description" : "a red potato" }   // This shouldn't return
{ "_id" : 3, "dept" : "kitchen", "description" : "a green placemat" }
{ "_id" : 4, "dept" : "kitchen", "description" : "a red peeler" }  // this shouldn't return either
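
For comparison, the result the query should produce when both conditions are ANDed can be modelled in memory. This is only a sketch in plain JavaScript, assuming a naive whitespace match on the search term. expectedMatches is a hypothetical helper, not part of the shell or the server.

```javascript
// The six documents from the inventory collection in the report.
var inventory = [
    { _id: 1, dept: "tech", description: "a fun green computer" },
    { _id: 2, dept: "tech", description: "a wireless red mouse" },
    { _id: 3, dept: "kitchen", description: "a green placemat" },
    { _id: 4, dept: "kitchen", description: "a red peeler" },
    { _id: 5, dept: "food", description: "a green apple" },
    { _id: 6, dept: "food", description: "a red potato" }
];

// Keep a document only if its dept is in the list AND its description contains the term.
function expectedMatches(docs, depts, term) {
    return docs.filter(function (d) {
        var inDept = depts.indexOf(d.dept) !== -1;
        var hasTerm = d.description.toLowerCase().split(/\s+/).indexOf(term) !== -1;
        return inDept && hasTerm;
    });
}

var ids = expectedMatches(inventory, ["kitchen", "food"], "green").map(function (d) {
    return d._id;
});
console.log(ids); // [ 3, 5 ]
```

Only _id 3 and _id 5 satisfy both conditions, which matches the comments in the output above marking _id 4 and _id 6 as incorrect.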



 Comments   
Comment by Githook User [ 13/Feb/14 ]

Author: Hari Khalsa (hkhalsa) <hkhalsa@10gen.com>

Message: SERVER-12354 SERVER-12144 plan compound text correctly
Branch: master
https://github.com/mongodb/mongo/commit/abc8fd203e7f3e031bc991e27cf36128e9f5792a

Comment by Githook User [ 10/Feb/14 ]

Author: Hari Khalsa (hkhalsa) <hkhalsa@10gen.com>

Message: SERVER-12144 better text query plan tests
Branch: master
https://github.com/mongodb/mongo/commit/76d93c0efaf07e2ba3f3841a7d08e77ec4483b75
