[SERVER-9063] Integrate text search into normal query system
Created: 21/Mar/13  Updated: 02/Feb/16  Resolved: 17/Dec/13

Status: Closed
Project: Core Server
Component/s: Text Search
Affects Version/s: 2.4.0
Fix Version/s: 2.5.5

Type: New Feature
Priority: Major - P3
Reporter: Robert Dickinson
Assignee: J Rassi
Resolution: Done
Votes: 23
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
is depended on by DOCS-2380 Document : Integrate text-search into... Closed
is depended on by DRIVERS-199 Integrate text search into normal que... Closed
is depended on by JAVA-1043 Add support for full text search to Q... Closed
Duplicate
is duplicated by SERVER-10240 text query/command does not yield pro... Closed
is duplicated by SERVER-9392 Create compatible text-search query s... Closed
is duplicated by SERVER-10238 add ability to get total match count ... Closed
Related
related to SERVER-8882 Rename text search filter to match to... Closed
is related to SERVER-12277 Don't generate plans for $text querie... Closed
is related to SERVER-11675 Aggregation framework support for tex... Closed
is related to SERVER-8419 More explain-like fields in "stats" f... Closed
is related to SERVER-9772 Add options for text command to enabl... Closed
is related to SERVER-12089 Error message unclear for $text query... Closed
is related to SERVER-12148 Migrate "text" command jstests to use... Closed
Participants:
Linked BF Score: 0

 Description   

db.coll.find({$text: {$search: "\"a phrase\" term1 term2 -negterm",
                      $language: "spanish"},
              name: /a.*/},
             {description: 1, _id: 0})
       .sort({date: 1}).skip(10).limit(10);
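
The $search string in the query above mixes a quoted phrase, plain terms, and a negated term. As a rough illustration of how those three parts differ, here is a naive splitter in plain JavaScript. It is only a sketch: splitSearchString is a hypothetical helper, not the server's parser, and it ignores stemming, stop words, and the $language setting.

```javascript
// Naive illustration only; the server's text-search parsing is more involved.
function splitSearchString(search) {
  const phrases = [];
  // Pull out double-quoted phrases first, leaving the rest for term splitting.
  const rest = search.replace(/"([^"]*)"/g, (_, phrase) => {
    phrases.push(phrase);
    return " ";
  });
  const terms = [];
  const negated = [];
  for (const token of rest.split(/\s+/).filter(Boolean)) {
    if (token.startsWith("-") && token.length > 1) {
      negated.push(token.slice(1)); // "-negterm" excludes matching documents
    } else {
      terms.push(token);
    }
  }
  return { phrases, terms, negated };
}

console.log(splitSearchString('"a phrase" term1 term2 -negterm'));
```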

original:

I have been using text search against a large database of news feeds with some success. The problem is that being able to get results ordered only by score is not sufficient. What I mainly need to do, and I think a lot of others will want to do, is qualify a document based on whether it meets the text search criteria (as one possible filter, to be combined with others), but return a result set ordered by some other criterion (like date), with the same kind of paging support that we have with queries in general.

So I guess specifically, this just means having a query filter to indicate whether a document does or doesn't match a given text search string.

In my case, I have millions of news feed items from hundreds of feeds. If I want to search Slashdot for the term "iPhone", what I want is to see the news items from that feed that contain "iPhone" in order of most recent first. With text search as it is now, I cannot do that.

Users don't want text search only for weighted results that find the most relevant documents; they also want to be able to use text search criteria as additional functionality in their current applications and queries.
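
With the $text operator shown at the top of the description, the Slashdot/"iPhone" case might be expressed roughly as below. This is a sketch under assumptions: the "feed" and "date" field names, the "items" collection, and the buildFeedSearch helper are all hypothetical, and the collection is assumed to already have a text index.

```javascript
// Hypothetical helper; field names "feed" and "date" are assumptions.
function buildFeedSearch(feed, term, page, pageSize) {
  return {
    // $text is one filter among others rather than the thing that orders results.
    filter: { $text: { $search: term }, feed: feed },
    // Most recent first, independent of the text score.
    sort: { date: -1 },
    skip: page * pageSize,
    limit: pageSize,
  };
}

const firstPage = buildFeedSearch("Slashdot", "iPhone", 0, 10);
console.log(JSON.stringify(firstPage));

// In the mongo shell the pieces would be passed along roughly as:
//   db.items.find(firstPage.filter).sort(firstPage.sort)
//           .skip(firstPage.skip).limit(firstPage.limit)
```

Keeping the query parts in a plain object makes the paging arithmetic easy to check without a running server.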



 Comments   
Comment by Githook User [ 18/Dec/13 ]

Author: Jason Rassi (jrassi) <rassi@10gen.com>

Message: SERVER-9063 SERVER-10026 Remove out-of-date $meta sort test

Sort on textScore without projection no longer supported. Negative
test to be added as part of SERVER-12038, when validation
functionality is complete.
Branch: master
https://github.com/mongodb/mongo/commit/140ccec2dc0933afc6f680457304f156df9db560
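
The commit message above says that a sort on textScore needs an accompanying projection. A minimal sketch of that query shape follows, assuming the $meta: "textScore" form implied by the commit messages in this thread. The "score" field name and the textScoreQuery helper are hypothetical.

```javascript
// "score" is an arbitrary field name chosen for this sketch.
function textScoreQuery(search) {
  const meta = { $meta: "textScore" };
  return {
    filter: { $text: { $search: search } },
    // The same metadata field is projected and then used as the sort key.
    projection: { score: meta },
    sort: { score: meta },
  };
}

const scored = textScoreQuery("iPhone");
console.log(JSON.stringify(scored));

// In the mongo shell, roughly:
//   db.coll.find(scored.filter, scored.projection).sort(scored.sort)
```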

Comment by Githook User [ 17/Dec/13 ]

Author: Jason Rassi (jrassi) <rassi@10gen.com>

Message: SERVER-9063 Correctly sort on text metadata for sharded queries
Branch: master
https://github.com/mongodb/mongo/commit/b44edc42e42524d2f8bf66d389eeb6d973fbf40e

Comment by Githook User [ 13/Dec/13 ]

Author: Jason Rassi (jrassi) <rassi@10gen.com>

Message: SERVER-9063 SERVER-10026 Rename document metadata "text" to "textScore"
Branch: master
https://github.com/mongodb/mongo/commit/1d034bab093f4f7d3b264c50e78207b94a0e267e

Comment by Githook User [ 02/Dec/13 ]

Author: Jason Rassi (jrassi) <rassi@10gen.com>

Message: SERVER-11924 SERVER-9063 SERVER-10026 No shard-passthru for fts test

Pending mongos support, disabling fts_score_sort.js in sharded
passthrough.
Branch: master
https://github.com/mongodb/mongo/commit/02db387c70f8e4e828c5297ae9fa06e667421c37

Comment by Githook User [ 02/Dec/13 ]

Author: Hari Khalsa (hkhalsa) <hkhalsa@10gen.com>

Message: SERVER-9063 SERVER-10026 can sort on doc metadata, remove implicit text sort
Branch: master
https://github.com/mongodb/mongo/commit/b3ffbcfb747ef3e6efa2b75dc409592087b207ce

Comment by auto [ 15/Oct/13 ]

Author: Jason Rassi (jrassi) <rassi@10gen.com>

Message: SERVER-9063 Better error message in TextMatchExpression::matchesSingleElement()
Branch: master
https://github.com/mongodb/mongo/commit/2a1753c33971126e48705d0364a99977e30f5559

Comment by hari.khalsa@10gen.com [ 15/Oct/13 ]

alerner gianfranco rassi@10gen.com Not for 2.5.3 but we should do it for 2.5.4.

Comment by Gianfranco Palumbo [ 15/Oct/13 ]

Will $text support explain()?

Comment by J Rassi [ 12/Oct/13 ]

The above commits implement $text with a blocking stage. As with $near, a $text predicate currently implies a sort order and a limit of 100, both of which can be overridden. Plans can only be generated for queries with a $text predicate if a text index exists. Attempting a $text query with the old query framework returns the cryptic error message "assertion src/mongo/db/matcher/expression_text.cpp:45" to the user. Including a $text predicate disallows collection scans and forces the planner to use the text index.

Comment by auto [ 12/Oct/13 ]

Author: Jason Rassi (jrassi) <rassi@10gen.com>

Message: SERVER-9063 Don't generate text plans if >1 text index
Branch: master
https://github.com/mongodb/mongo/commit/e9542d111dcd02f93113ad448ca15a1c9b95f1e7

Comment by auto [ 12/Oct/13 ]

Author: Jason Rassi (jrassi) <rassi@10gen.com>

Message: SERVER-9063 Fix TextStage to correctly handle returning no results
Branch: master
https://github.com/mongodb/mongo/commit/b1b968db74a834692e48fbd613d9d9fa01dd4029

Comment by auto [ 12/Oct/13 ]

Author: Jason Rassi (jrassi) <rassi@10gen.com>

Message: SERVER-9063 Fix uninitialized member in StageDebugCmd::parseQuery
Branch: master
https://github.com/mongodb/mongo/commit/7a677b1d6a5f4ac115d69957a18671f77d889dbb

Comment by auto [ 12/Oct/13 ]

Author: Jason Rassi (jrassi) <rassi@10gen.com>

Message: SERVER-9063 Correct type of return value in buildStages error case
Branch: master
https://github.com/mongodb/mongo/commit/535665987c26169e5bdb942371365b03f328e348

Comment by auto [ 12/Oct/13 ]

Author: Jason Rassi (jrassi) <rassi@10gen.com>

Message: SERVER-9063 Add new query operator $text
Branch: master
https://github.com/mongodb/mongo/commit/34b88976c91fdeec5ff0b8816d35deef72f3767e

Comment by auto [ 12/Oct/13 ]

Author: Jason Rassi (jrassi) <rassi@10gen.com>

Message: SERVER-9063 Add new plan stage TextStage
Branch: master
https://github.com/mongodb/mongo/commit/a1d5f24d910d1407b8c166d99f0df99640845a2e

Comment by auto [ 12/Oct/13 ]

Author: Jason Rassi (jrassi) <rassi@10gen.com>

Message: SERVER-9063 Add new match expression TextMatchExpression
Branch: master
https://github.com/mongodb/mongo/commit/08fde2e6a23a98d8ca1ef9593891f4de027d2474

Comment by Tyler Brock [ 24/May/13 ]

It would be great if we could have the text search filter not necessarily produce a score for a given document and simply match for doing operations like counting/grouping matches. Perhaps the server would be smart enough to do that if the score field was turned off in a projection or not implicitly included.

Comment by Antoine Girbal [ 24/May/13 ]

Yes, this is a big limitation right now.
Ideally you could create a compound index that includes the text index, for example

{ txt: "text", date: 1 }

or

{ txt: "text", popularity: 1 }

Then you could just search and iterate over the documents efficiently on that field.

Comment by Suren [ 11/Apr/13 ]

The API would be readable if it were something like this:

db.collections.search({filter: {}, text: "search string"})

db.collections.search({filter: {}, text: "search string"}).skip(n).limit(n)

Comment by Sam Martin [ 11/Apr/13 ]

The text search functionality has meant I no longer need to duplicate information into a string array and multi-key index, and whilst slower in a few use cases, overall the text search is much, much faster especially where the terms searched for do not exactly match the indexed values.

The downside is that pre-filtering the documents for a simple text search can only be done on a single equality condition, which for me means changing the schema and denormalising a flag to indicate which documents are subject to the search.

My requirement would be to search documents matching a particular type,
e.g.

{ filter : { doctype : { $in : [ObjectId("..."), ObjectId("...")] } } },
{ search : "my search term" }

...

If the text command could return a cursor so that clients can skip/take results, that would also make it more useful.

Generated at Thu Feb 08 03:19:15 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.