[DOCS-3710] How do MongoDB text queries score documents? Created: 04/Jul/14  Updated: 11/Jan/17  Resolved: 09/Jul/14

Status: Closed
Project: Documentation
Component/s: manual
Affects Version/s: mongodb-2.6
Fix Version/s: 01112017-cleanup

Type: Task
Priority: Blocker - P1
Reporter: Antonio Quintana
Assignee: Unassigned
Resolution: Won't Fix
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Participants:
Days since reply: 9 years, 32 weeks ago

 Description   

I created a text index this way:

db.myCollection.ensureIndex(
  {firstName:"text", lastName:"text", email:"text", birthDate:"text"},
  {name:"text_index", weights:{lastName:2, email:2}}
)

so that I can set a threshold (maybe 3) as the minimum score at which two records count as "almost the same".

Now I create example/dummy documents, for instance:

{firstName:"ROBERT", lastName:"LANE", email: "imnotxavier@mail.com"}

and run this query:

db.myCollection.find({$text:{$search: "ROBERT LANE"}},{score:{$meta:"textScore"}})

I get this:

{ "_id" : ObjectId("53b6c9f12a48e559dec0f6ab"), "firstName" : "ROBERT", "lastName" : "LANE", "email" : "imnotxavier@mail.com", "score" : 3.3000000000000003 }

Why does MongoDB do this instead of returning a document with score==3?

I need to know a little more about the details of the scoring system in order to correctly set a threshold and make better decisions.
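One way to apply such a threshold on the server side is an aggregation pipeline that matches on $text, projects the textScore, and then filters on it. The sketch below only builds the pipeline; `thresholdPipeline` is a hypothetical helper name, and the projected fields are taken from the example document above.

```javascript
// Hypothetical helper (not part of MongoDB): builds an aggregation pipeline
// that keeps only documents whose text score is at least minScore.
function thresholdPipeline(search, minScore) {
  return [
    // $text must appear in the first $match stage of the pipeline.
    { $match: { $text: { $search: search } } },
    // Expose the relevance score as a regular field.
    {
      $project: {
        firstName: 1,
        lastName: 1,
        email: 1,
        score: { $meta: "textScore" }
      }
    },
    // Apply the threshold to the projected score.
    { $match: { score: { $gte: minScore } } }
  ];
}
```

In the mongo shell this would be used as db.myCollection.aggregate(thresholdPipeline("ROBERT LANE", 3)). Because scores are floating-point values, a threshold slightly below the intended cutoff may be safer than an exact comparison.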



 Comments   
Comment by Kay Kim (Inactive) [ 09/Jul/14 ]

Closing this ticket: no change to the manual is required, because the information relates to internal mechanics. We can reassess at a later date how much, if any, of these internal mechanics we should document.

Comment by Kay Kim (Inactive) [ 09/Jul/14 ]

Hi Antonio –
Scoring takes several other factors into account, such as the total field length, whether the match is on the untokenized form of the word, and how many times the word appears in the field.

Regards,

Kay Kim
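The factors listed above can be sketched as a small scoring model that reproduces the 3.3 from the example. This is an assumption, not documented behavior: the per-term formula (weight × frequency × length coefficient × a 1.1 boost when the whole field equals the term), the simplified tokenizer (lowercasing and splitting on non-alphanumerics, with no stemming or stop words), and the names `scoreField` and `textScore` are all illustrative.

```javascript
// Hypothetical model of per-field scoring; the constants are assumptions.
function scoreField(raw, weight) {
  const lower = raw.toLowerCase();
  const tokens = lower.split(/[^a-z0-9]+/).filter(Boolean);
  const stats = new Map();
  for (const t of tokens) {
    const s = stats.get(t) || { count: 0, freq: 0, exp: 0 };
    // Each repeat of a term contributes half as much as the previous one.
    s.exp = s.exp ? s.exp * 2 : 1;
    s.count += 1;
    s.freq += 1 / s.exp;
    stats.set(t, s);
  }
  const scores = new Map();
  for (const [term, s] of stats) {
    // Terms that make up more of the field get a coefficient closer to 1.
    const coeff = (0.5 * s.count) / tokens.length + 0.5;
    // Small boost when the term is the entire, untokenized field value.
    const adjustment = lower === term ? 1.1 : 1;
    scores.set(term, weight * s.freq * coeff * adjustment);
  }
  return scores;
}

// Sum the scores of every searched term across all indexed fields.
function textScore(doc, weights, search) {
  const terms = new Set(search.toLowerCase().split(/\s+/).filter(Boolean));
  let total = 0;
  for (const [field, weight] of Object.entries(weights)) {
    if (typeof doc[field] !== "string") continue; // field absent in this doc
    for (const [term, score] of scoreField(doc[field], weight)) {
      if (terms.has(term)) total += score;
    }
  }
  return total;
}

const exampleDoc = { firstName: "ROBERT", lastName: "LANE", email: "imnotxavier@mail.com" };
const exampleWeights = { firstName: 1, lastName: 2, email: 2, birthDate: 1 };
console.log(textScore(exampleDoc, exampleWeights, "ROBERT LANE")); // 3.3000000000000003
```

Under this model, "ROBERT" scores 1 × 1.1 and "LANE" scores 2 × 1.1 because each term is the whole field value, and the trailing digits come from floating-point addition of 1.1 and 2.2. A field holding several words would score lower per term, so a fixed threshold of 3 would not behave the same way on longer fields.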
