[SERVER-45363] Issue with mongodb text indexes and weights when using wildcard specifier Created: 04/Jan/20  Updated: 29/Oct/23  Resolved: 27/Jan/20

Status: Closed
Project: Core Server
Component/s: Text Search
Affects Version/s: None
Fix Version/s: 4.2.4, 4.3.3, 3.6.18, 4.0.17

Type: Bug Priority: Major - P3
Reporter: David Lynch Assignee: Ted Tuckman
Resolution: Fixed Votes: 0
Labels: qopt-team
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
depends on SERVER-85089 User Summary for SERVER-45363 Closed
Backwards Compatibility: Major Change
Operating System: ALL
Backport Requested:
v4.2, v4.0, v3.6
Steps To Reproduce:

Consider the following:

db.createCollection('animals');
db.animals.createIndex({
    "$**": "text"
}, {
    name: "fullTextIndex",
    weights: {
        name: 500
    },
    default_language: "english"
});

When searching the animals collection, I would expect the textScore for a match on "name" to be 500 times higher than for a match on any other field covered by the wildcard ("$**": "text"). Checking the index definition confirms these weights:

 

db.animals.getIndexes();
 
[
    {
        "v" : 2,
        "key" : {
            "_id" : 1
        },
        "name" : "_id_",
        "ns" : "TextWeightBug.animals"
    },
    {
        "v" : 2,
        "key" : {
            "_fts" : "text",
            "_ftsx" : 1
        },
        "name" : "fullTextIndex",
        "ns" : "TextWeightBug.animals",
        "weights" : {
            "$**" : 1,
            "name" : 500
        },
        "default_language" : "english",
        "language_override" : "language",
        "textIndexVersion" : 3
    }
]

Note that the wildcard above does have the expected weight ("$**" : 1).

However, the following example shows that this weight is not what gets applied:

db.animals.insertOne({ name: 'Spot', guardian: 'Kevin' });
db.animals.aggregate([
    { $match: { $text: { $search: "Kevin" } } },
    { $sort: { score: { $meta: "textScore" } } },
    { $project: { name: 1, score: { $meta: "textScore" } } }
]);

returns:

 

{
    "_id" : ObjectId("5e0b7055bea20a3a5b9530a3"),
    "name" : "Spot",
    "score" : 550.0
}

 

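Here I was expecting a score of 1.1, since the only field to match ("guardian") was matched via the wildcard. Instead the score is 550, which is 1.1 multiplied by the weight of 500. I am really confused about how the 500 weight could be applied to a match on the "guardian" field.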

I've tested this code with 3.6.16 and 4.0 and get the same results.

 

Also posted here: https://stackoverflow.com/questions/59556771/issue-with-mongodb-text-indexes-and-weights-when-using-wildcard-specifier

 

Sprint: Query 2020-01-27, Query 2020-02-10
Participants:

 Description   
Issue Status as of March 6 2020

ISSUE SUMMARY
When inserting a document into a collection with a text index, the weight value for each field in that document is reflected in scores stored inside the index key. If a text index combines the wildcard specifier ("$**") with one or more explicitly weighted fields, the index will assign an incorrect weight to any field whose name is lexicographically smaller (alphabetically earlier, so attribute < name < profession) than the name of a field with a specified weight. Instead of receiving the default weight for the wildcard, such a field is assigned the weight of the next specified field in the index.
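The behaviour described above can be pictured with a small standalone model. This is only a sketch under assumptions, not the server's code: the function names `buggyWeight` and `fixedWeight`, the sorted `weights` array, and the constant names are hypothetical, and the base score of 1.1 is simply the unweighted score the reporter expected for the "Kevin" match.

```javascript
// Hypothetical model of the weight lookup; NOT the actual mongod implementation.
const DEFAULT_WILDCARD_WEIGHT = 1;

// Explicit weights from the ticket's index, kept sorted by field name.
const weights = [["name", 500]];

// Described buggy behaviour: use the first weighted field whose name sorts
// at or after the document field, even when the names differ.
function buggyWeight(field) {
  const hit = weights.find(([name]) => name >= field);
  return hit ? hit[1] : DEFAULT_WILDCARD_WEIGHT;
}

// Described fixed behaviour: only an exact field-name match gets the
// explicit weight; everything else falls back to the wildcard default.
function fixedWeight(field) {
  const hit = weights.find(([name]) => name === field);
  return hit ? hit[1] : DEFAULT_WILDCARD_WEIGHT;
}

// Unweighted score the reporter expected for the single-term match.
const baseScore = 1.1;

console.log(buggyWeight("guardian") * baseScore); // 550
console.log(fixedWeight("guardian") * baseScore); // 1.1
console.log(buggyWeight("profession"));           // 1
```

In this model "guardian" sorts before "name" and so inherits the weight of 500, while "profession" sorts after it and keeps the default, which matches the ordering example given in the summary.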

USER IMPACT
Documents in collections with text indexes that have both a wildcard match and a weighted field may have incorrect text scores in the index, and therefore return incorrect scores when querying and projecting or sorting on { $meta: "textScore" }.

RECOVERY STEPS
Dropping and recreating a text index on a version of mongod that includes the patch will fix this issue. This should be done for every wildcard text index with custom weights on specific fields.
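For the index from the steps to reproduce, the drop-and-recreate step could look roughly like the sketch below. The helper name `rebuildFullTextIndex` is hypothetical; it only wraps the shell's `dropIndex` and `createIndex` collection helpers and reuses the index name and options shown in this ticket. It assumes it is run against a mongod that already includes the patch.

```javascript
// Sketch: rebuild the ticket's wildcard text index on a patched mongod.
// `coll` is a collection handle exposing the shell helpers dropIndex and
// createIndex, for example db.animals in the mongo shell.
function rebuildFullTextIndex(coll) {
  // Remove the index whose stored scores may carry the wrong weights.
  coll.dropIndex("fullTextIndex");

  // Recreate it with the same definition so scores are recomputed.
  return coll.createIndex(
    { "$**": "text" },
    {
      name: "fullTextIndex",
      weights: { name: 500 },
      default_language: "english"
    }
  );
}
```

In the mongo shell this would be invoked as rebuildFullTextIndex(db.animals), and repeated with the appropriate name and options for each wildcard text index that has custom weights.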

AFFECTED VERSIONS
This affects all versions prior to 4.3.3.

FIX VERSION
The fix will be included in 4.3.3, 4.2.4, 4.0.17 and 3.6.18.



 Comments   
Comment by Githook User [ 06/Mar/20 ]

Author:

{'name': 'Ted Tuckman', 'username': 'TedTuckman', 'email': 'ted.tuckman@mongodb.com'}

Message: SERVER-45363 Base weight for text index on exact match not possible match

(cherry picked from commit 4bb2ad4c48c07d267c98f5443e0984a5e1ef7209)
Branch: v3.6
https://github.com/mongodb/mongo/commit/fcf0d0ed1ac4bb728cfbcc5597587c703a7f2323

Comment by Githook User [ 05/Mar/20 ]

Author:

{'name': 'Ted Tuckman', 'username': 'TedTuckman', 'email': 'ted.tuckman@mongodb.com'}

Message: SERVER-45363 Base weight for text index on exact match not possible match

(cherry picked from commit 4bb2ad4c48c07d267c98f5443e0984a5e1ef7209)
Branch: v4.2
https://github.com/mongodb/mongo/commit/210c31bd821cea37506b00f803911c91f337cab2

Comment by Githook User [ 05/Mar/20 ]

Author:

{'name': 'Ted Tuckman', 'username': 'TedTuckman', 'email': 'ted.tuckman@mongodb.com'}

Message: SERVER-45363 Base weight for text index on exact match not possible match

(cherry picked from commit 4bb2ad4c48c07d267c98f5443e0984a5e1ef7209)
Branch: v4.0
https://github.com/mongodb/mongo/commit/4d3c2f2c5e6e988c8f19d1231e8d8551170260de

Comment by Githook User [ 27/Jan/20 ]

Author:

{'username': 'TedTuckman', 'name': 'Ted Tuckman', 'email': 'ted.tuckman@mongodb.com'}

Message: SERVER-45363 Base weight for text index on exact match not possible match
Branch: master
https://github.com/mongodb/mongo/commit/4bb2ad4c48c07d267c98f5443e0984a5e1ef7209

Comment by Danny Hatcher (Inactive) [ 06/Jan/20 ]

Thanks for your report; I'll forward to the appropriate team.
