Core Server: SERVER-45363

Issue with mongodb text indexes and weights when using wildcard specifier


    Details

    • Backwards Compatibility:
      Major Change
    • Operating System:
      ALL
    • Backport Requested:
      v4.2, v4.0, v3.6
    • Steps To Reproduce:

      Consider the following:

      db.createCollection('animals');
      db.animals.createIndex({
          "$**": "text"
      }, {
          name: "fullTextIndex",
          weights: {
              name: 500
          },
          default_language: "english"
      });

      When searching the animals collection, I would expect text scores for "name" matches to be 500 times higher than scores for other fields matched by the wildcard ("$**": "text"). Checking the index confirms the weights:

       

      db.animals.getIndexes();
       
      [
          {
              "v" : 2,
              "key" : {
                  "_id" : 1
              },
              "name" : "_id_",
              "ns" : "TextWeightBug.animals"
          },
          {
              "v" : 2,
              "key" : {
                  "_fts" : "text",
                  "_ftsx" : 1
              },
              "name" : "fullTextIndex",
              "ns" : "TextWeightBug.animals",
              "weights" : {
                  "$**" : 1,
                  "name" : 500
              },
              "default_language" : "english",
              "language_override" : "language",
              "textIndexVersion" : 3
          }
      ]

      Note that the wildcard above does have the expected weight ("$**" : 1).

      However, the following example shows that this weight is not what gets applied:

       

      db.animals.insertOne({ name: 'Spot', guardian: 'Kevin' });
      db.animals.aggregate([
          { $match: { $text: { $search: "Kevin" } } },
          { $sort: { score: { $meta: "textScore" } } },
          { $project: { name: 1, score: { $meta: "textScore" } } }
      ]);

      returns:

       

      {
          "_id" : ObjectId("5e0b7055bea20a3a5b9530a3"),
          "name" : "Spot",
          "score" : 550.0
      }

       

      Here I was expecting a score of 1.1, since the only field to match ("guardian") was matched via the wildcard. How does the 500 weight end up being applied to the "guardian" field match?

      I've tested this code with 3.6.16 and 4.0 and get the same results.

       

      Also posted here: https://stackoverflow.com/questions/59556771/issue-with-mongodb-text-indexes-and-weights-when-using-wildcard-specifier

       

    • Sprint:
      Query 2020-01-27, Query 2020-02-10

      Description

      Issue Status as of March 6 2020

      ISSUE SUMMARY
      When a document is inserted into a collection with a text index, the weight of each field in that document is reflected in the scores stored inside the index key. If the text index combines the wildcard specifier with one or more weighted fields, the index assigns an incorrect weight to any field whose name is lexicographically smaller (alphabetically earlier, so attribute < name < profession) than the name of a field with a specified weight. Instead of receiving the default wildcard weight, such a field is assigned the weight of the next specified field in the index.
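The behavior described in this summary can be illustrated with a toy model. This is only a sketch of the reported symptom, not the server's implementation: the function names `buggyWeight` and `fixedWeight` are hypothetical, and the idea that the faulty lookup behaves like "first weighted field sorting at or after this one" is an assumption inferred from the wording above.

```javascript
// Toy model of the symptom described in the issue summary.
// NOT the server's code: names and structure are hypothetical.

// Assumed faulty lookup: take the first explicitly weighted field that
// sorts at or after `field`, without checking that it is the same field.
function buggyWeight(field, weights) {
  const wildcard = weights["$**"] ?? 1;
  const specified = Object.keys(weights).filter((k) => k !== "$**").sort();
  const next = specified.find((k) => k >= field);
  return next !== undefined ? weights[next] : wildcard;
}

// Intended lookup: only an exact field-name match gets the custom weight;
// everything else falls back to the wildcard weight.
function fixedWeight(field, weights) {
  const wildcard = weights["$**"] ?? 1;
  const isSpecified =
    field !== "$**" && Object.prototype.hasOwnProperty.call(weights, field);
  return isSpecified ? weights[field] : wildcard;
}

// Weights as reported by getIndexes() in the steps to reproduce.
const weights = { "$**": 1, name: 500 };
for (const field of ["guardian", "name", "profession"]) {
  console.log(field, buggyWeight(field, weights), fixedWeight(field, weights));
}
```

Under this model "guardian" sorts before "name" and picks up weight 500 instead of 1, which is consistent with the reported score of 550 where 1.1 was expected, while "profession" sorts after "name" and keeps the wildcard weight.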

      USER IMPACT
      Documents in collections with text indexes that have both a wildcard match and a weighted field may have incorrect text scores in the index, and therefore get incorrect scores back when querying and projecting or sorting on {$meta: "textScore" }.

      RECOVERY STEPS
      Dropping and recreating a text index on a version of mongod that includes the patch will fix this issue. This should be done for every wildcard text index with custom weights on specific fields.
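The recovery step could be scripted roughly as follows. This is a sketch under stated assumptions: it is meant to be run in the mongo shell against a mongod that already includes the patch, the helper names `findAffectedTextIndexes` and `rebuildTextIndex` are hypothetical, and the key pattern is assumed to be the plain wildcard (`{ "$**": "text" }`) as in the report. An index with a different key pattern or additional options would need its original definition supplied instead. `getIndexes()`, `dropIndex()` and `createIndex()` are the standard collection methods.

```javascript
// Select text index specs (as returned by getIndexes()) that combine the
// wildcard with at least one explicitly weighted field.
function findAffectedTextIndexes(indexSpecs) {
  return indexSpecs.filter((spec) => {
    const fields = Object.keys(spec.weights || {});
    return fields.includes("$**") && fields.some((f) => f !== "$**");
  });
}

// Drop one affected index and recreate it with the same name, custom
// weights and language settings. Assumes a plain wildcard key pattern.
function rebuildTextIndex(coll, spec) {
  const customWeights = {};
  for (const [field, weight] of Object.entries(spec.weights)) {
    if (field !== "$**") customWeights[field] = weight;
  }
  const options = { name: spec.name, weights: customWeights };
  if (spec.default_language) options.default_language = spec.default_language;
  if (spec.language_override) options.language_override = spec.language_override;

  coll.dropIndex(spec.name);
  return coll.createIndex({ "$**": "text" }, options);
}
```

In the shell this might be invoked as `findAffectedTextIndexes(db.animals.getIndexes()).forEach((s) => rebuildTextIndex(db.animals, s));`. The collection is passed in as a parameter so the helpers can be tried against a stub before touching real data.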

      AFFECTED VERSIONS
      This affects all versions prior to 4.3.3.

      FIX VERSION
      The fix will be included in 4.3.3, 4.2.4, 4.0.17 and 3.6.18.

            People

            Assignee:
            ted.tuckman Ted Tuckman
            Reporter:
            dave@sparkie.io David Lynch