- Type: Bug
- Resolution: Done
- Priority: Major - P3
- Affects Version/s: None
- Component/s: manual
- Labels: None
The "Match Phrases" section in tutorial/search-for-text.txt could use clarification, specifically to indicate that the compound OR query includes individual terms from phrases (i.e., to clarify that "foo bar \"baz\"" is not a search for ((foo OR bar) AND ("baz"))).
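The rule described above can be sketched as a small stand-alone model. This is only an illustration, not the server's implementation: the helper names `parseSearch` and `matches` are hypothetical, and the sketch assumes plain whitespace tokenization with no stemming, stop words, negation, or scoring.

```javascript
// Simplified model of the matching rule discussed in this ticket.
// Assumptions (not from the server code): whitespace tokenization,
// no stemming, no stop words, no negation, no scoring.

// Split a search string into OR'ed terms and required phrases.
function parseSearch(search) {
  const terms = [];
  const phrases = [];
  const re = /"([^"]*)"|(\S+)/g;
  let m;
  while ((m = re.exec(search)) !== null) {
    if (m[1] !== undefined) {
      const phrase = m[1].toLowerCase();
      phrases.push(phrase);
      // Words inside a phrase also join the OR'ed term list.
      terms.push(...phrase.split(/\s+/).filter(Boolean));
    } else {
      terms.push(m[2].toLowerCase());
    }
  }
  return { terms, phrases };
}

// A document matches if it contains at least one term
// AND every quoted phrase.
function matches(text, search) {
  const { terms, phrases } = parseSearch(search);
  const lower = text.toLowerCase();
  const words = lower.split(/\s+/).filter(Boolean);
  const anyTerm = terms.some((t) => words.includes(t));
  const allPhrases = phrases.every((p) => lower.includes(p));
  return anyTerm && allPhrases;
}

// The two "fulltext" values that appear in this ticket.
console.log(matches("cat search_model_a", 'dog "search_model_a"'));   // true
console.log(matches("dog search_model_b", '"dog" "search_model_a"')); // false
console.log(matches("dog search_model_b", '"dog" "search_model_b"')); // true
```

Under this model, a document without "dog" can still match 'dog "search_model_a"', because the phrase's own word satisfies the OR part while the phrase itself satisfies the AND part.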
Original description below (see linked mailing list discussion in comments):
—
according to the docs at http://docs.mongodb.org/manual/single/index.html#document-tutorial/enable-text-search
searching for
"corto largo \"and tomorrow\""
should perform something like
(corto OR largo OR tomorrow) AND ("and tomorrow")
but it does not. for example, my corpus has ONE document with the word 'dog' in it
> db.search_indices.runCommand( 'text', { search: 'dog', limit: 42} )
{
    "queryDebugString" : "dog||||||",
    "language" : "english",
    "results" : [
        {
            "score" : 0.75,
            "obj" : {
                "_id" : ObjectId("51bb42dfaf481c7aa4001b3d"),
                "context_type" : "B",
                "context_id" : ObjectId("51bb42dfaf481c7aa4001b3c"),
                "title" : "search_model_b",
                "keywords" : [ null ],
                "fulltext" : "dog search_model_b"
            }
        }
    ],
    "stats" : {
        "nscanned" : 1,
        "nscannedObjects" : 0,
        "n" : 1,
        "nfound" : 1,
        "timeMicros" : 73
    },
    "ok" : 1
}
however, if i expand the search by including another term, and keep the phrase, we can see that a logical OR occurs - not an AND
> db.search_indices.runCommand( 'text', { search: 'dog "search_model_a"', limit: 1} )
{
    "queryDebugString" : "dog|search_model_a||||search_model_a||",
    "language" : "english",
    "results" : [
        {
            "score" : 110.75000000000001,
            "obj" : {
                "_id" : ObjectId("51bb42dfaf481c7aa4001b3b"),
                "context_type" : "A",
                "context_id" : ObjectId("51bb42dfaf481c7aa4001b3a"),
                "title" : "search_model_a",
                "keywords" : [ null ],
                "fulltext" : "cat search_model_a"
            }
        }
    ],
    "stats" : {
        "nscanned" : 12,
        "nscannedObjects" : 0,
        "n" : 1,
        "nfound" : 1,
        "timeMicros" : 104
    },
    "ok" : 1
}
notice the result was returned not because
'dog AND search_model_a'
but because
'dog OR search_model_a'
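The queryDebugString values in these outputs hint at how the search string is split up. A minimal sketch for reading them, assuming the layout is four groups separated by "||" with "|" between entries: the first and third groups look like terms and phrases judging by the outputs in this ticket, while the names given to the second and fourth groups (negated terms and negated phrases) are an assumption, since both are empty in every output shown. `parseQueryDebugString` is a hypothetical helper, not part of the shell.

```javascript
// Split a queryDebugString into its groups.
// Assumed layout, inferred from the outputs in this ticket:
//   terms || negated terms || phrases || negated phrases
// The "negated" names are an assumption; those groups are always
// empty in the outputs shown here.
function parseQueryDebugString(debug) {
  const [terms = "", negatedTerms = "", phrases = "", negatedPhrases = ""] =
    debug.split("||");
  // An empty group means an empty list, not a list with one empty entry.
  const list = (s) => (s === "" ? [] : s.split("|"));
  return {
    terms: list(terms),
    negatedTerms: list(negatedTerms),
    phrases: list(phrases),
    negatedPhrases: list(negatedPhrases),
  };
}

// Debug string from the 'dog "search_model_a"' query above.
const parsedDebug = parseQueryDebugString("dog|search_model_a||||search_model_a||");
console.log(parsedDebug.terms);   // [ 'dog', 'search_model_a' ]
console.log(parsedDebug.phrases); // [ 'search_model_a' ]
```

Read this way, the phrase's word appears in the term group as well as the phrase group, which fits a query of the form (dog OR search_model_a) AND ("search_model_a") rather than a plain OR.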
using ONLY phrases has the expected results
> db.search_indices.runCommand( 'text', { search: '"dog" "search_model_a"', limit: 1} )
{
    "queryDebugString" : "dog|search_model_a||||dog|search_model_a||",
    "language" : "english",
    "results" : [ ],
    "stats" : {
        "nscanned" : 12,
        "nscannedObjects" : 0,
        "n" : 0,
        "nfound" : 0,
        "timeMicros" : 133
    },
    "ok" : 1
}
> db.search_indices.runCommand( 'text', { search: '"dog" "search_model_b"', limit: 1} )
{
    "queryDebugString" : "dog|search_model_b||||dog|search_model_b||",
    "language" : "english",
    "results" : [
        {
            "score" : 112.50000000000003,
            "obj" : {
                "_id" : ObjectId("51bb42dfaf481c7aa4001b3d"),
                "context_type" : "B",
                "context_id" : ObjectId("51bb42dfaf481c7aa4001b3c"),
                "title" : "search_model_b",
                "keywords" : [ null ],
                "fulltext" : "dog search_model_b"
            }
        }
    ],
    "stats" : {
        "nscanned" : 12,
        "nscannedObjects" : 0,
        "n" : 1,
        "nfound" : 1,
        "timeMicros" : 109
    },
    "ok" : 1
}
BUG in the docs? or in the code?