[SERVER-29346] Why is MongoDB not using index correctly? Is this bug? Created: 24/May/17  Updated: 21/Jun/17  Resolved: 24/May/17

Status: Closed
Project: Core Server
Component/s: Querying
Affects Version/s: 3.4.2
Fix Version/s: None

Type: Question Priority: Major - P3
Reporter: Michal Bigos Assignee: Mark Agarunov
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
duplicates SERVER-29347 Why is MongoDB treating queries diffe... Closed
Participants:

 Description   

I have a collection where I have a data in multilanguage form like this:

    > db.names.findOne();
    {
      "_id" : ObjectId("591dbcd877d4ede22d765247"),
      "name" : [
        {
          "locale" : "de",
          "text" : "Brunhilde"
        },
        {
          "locale" : "sk",
          "text" : "Šimon"
        }
      ]
    }

I have defined an indices for each language as:

    db.names.createIndex({ 'name.locale': 1, 'name.text': 1 }, { name: 'name_de', partialFilterExpression: { 'name.locale': 'de' }, collation: { locale: 'de' } });
    db.names.createIndex({ 'name.locale': 1, 'name.text': 1 }, { name: 'name_sk', partialFilterExpression: { 'name.locale': 'sk' }, collation: { locale: 'sk' } });

When I'm quering the collection as:

    > db.names.find({ 'name.locale': 'de', 'name.text': 'Rüdi' }).collation({ locale: 'de' });

The query uses index, but explain plan looks a bit weird:

    ...
    "executionStages" : {
      "stage" : "FETCH",
      "filter" : {
        "name.text" : {
          "$eq" : "Rüdi"
        }
      },
      "nReturned" : 1,
      "executionTimeMillisEstimate" : 0,
      "works" : 10,
      "advanced" : 1,
      "needTime" : 8,
      "needYield" : 0,
      "saveState" : 0,
      "restoreState" : 0,
      "isEOF" : 1,
      "invalidates" : 0,
      "docsExamined" : 9,
      "alreadyHasObj" : 0,
      "inputStage" : {
        "stage" : "IXSCAN",
        "nReturned" : 9,
        "executionTimeMillisEstimate" : 0,
        "works" : 10,
        "advanced" : 9,
        "needTime" : 0,
        "needYield" : 0,
        "saveState" : 0,
        "restoreState" : 0,
        "isEOF" : 1,
        "invalidates" : 0,
        "keyPattern" : {
          "name.locale" : 1,
          "name.text" : 1
        },
        "indexName" : "name_de",
        "collation" : {
          "locale" : "de",
          "caseLevel" : false,
          "caseFirst" : "off",
          "strength" : 3,
          "numericOrdering" : false,
          "alternate" : "non-ignorable",
          "maxVariable" : "punct",
          "normalization" : false,
          "backwards" : false,
          "version" : "57.1"
        },
        "isMultiKey" : true,
        "multiKeyPaths" : {
          "name.locale" : [
            "name"
          ],
          "name.text" : [
            "name"
          ]
        },
        "isUnique" : false,
        "isSparse" : false,
        "isPartial" : true,
        "indexVersion" : 2,
        "direction" : "forward",
        "indexBounds" : {
          "name.locale" : [
            "[\"/1\u0001\u0006\u0001\u0006\", \"/1\u0001\u0006\u0001\u0006\"]"
          ],
          "name.text" : [
            "[MinKey, MaxKey]"
          ]
        },
        "keysExamined" : 9,
        "seeks" : 1,
        "dupsTested" : 9,
        "dupsDropped" : 0,
        "seenInvalidated" : 0
      }
    }

It makes an IXSCAN, but then in FETCH stage MongoDB filters thru "name.text".

On other hand when I make query where "name.text" is used to sort the results as:

    > db.names.find({ 'name.locale': 'de' }).sort({ 'name.text': 1 }).collation({ locale: "de" });

The explain plan looks like:

    ...
    "executionStats" : {
    "executionSuccess" : true,
    "nReturned" : 9,
    "executionTimeMillis" : 1,
    "totalKeysExamined" : 9,
    "totalDocsExamined" : 9,
    "executionStages" : {
      "stage" : "FETCH",
      "nReturned" : 9,
      "executionTimeMillisEstimate" : 0,
      "works" : 10,
      "advanced" : 9,
      "needTime" : 0,
      "needYield" : 0,
      "saveState" : 0,
      "restoreState" : 0,
      "isEOF" : 1,
      "invalidates" : 0,
      "docsExamined" : 9,
      "alreadyHasObj" : 0,
      "inputStage" : {
        "stage" : "IXSCAN",
        "nReturned" : 9,
        "executionTimeMillisEstimate" : 0,
        "works" : 10,
        "advanced" : 9,
        "needTime" : 0,
        "needYield" : 0,
        "saveState" : 0,
        "restoreState" : 0,
        "isEOF" : 1,
        "invalidates" : 0,
        "keyPattern" : {
          "name.locale" : 1,
          "name.text" : 1
        },
        "indexName" : "name_de",
        "collation" : {
          "locale" : "de",
          "caseLevel" : false,
          "caseFirst" : "off",
          "strength" : 3,
          "numericOrdering" : false,
          "alternate" : "non-ignorable",
          "maxVariable" : "punct",
          "normalization" : false,
          "backwards" : false,
          "version" : "57.1"
        },
        "isMultiKey" : true,
        "multiKeyPaths" : {
          "name.locale" : [
            "name"
          ],
          "name.text" : [
            "name"
          ]
        },
        "isUnique" : false,
        "isSparse" : false,
        "isPartial" : true,
        "indexVersion" : 2,
        "direction" : "forward",
        "indexBounds" : {
          "name.locale" : [
            "[\"/1\u0001\u0006\u0001\u0006\", \"/1\u0001\u0006\u0001\u0006\"]"
          ],
          "name.text" : [
            "[MinKey, MaxKey]"
          ]
        },
        "keysExamined" : 9,
        "seeks" : 1,
        "dupsTested" : 9,
        "dupsDropped" : 0,
        "seenInvalidated" : 0
      }
    }

Now the IXSCAN is followed by FETCH without filter.

Question is: why in the first case the filter in FETCH is used and it is not enought to use just index?

In both cases it the `indexBounds` in explain plan are the same, that can explain why in the 2nd query no filter is necessary. I didn't find any constraints in MongoDB documentation which can explain this behavior.



 Comments   
Comment by Mark Agarunov [ 24/May/17 ]

Hello michal bigos,

This ticket appears to be a duplicate of SERVER-29347. As there is more discussion in the other ticket, I have closed this as a duplicate. Please watch SERVER-29347 for further updates on this issue.

Thanks,
Mark

Comment by Michal Bigos [ 24/May/17 ]

Additional information, when I try to use alternation of my index as:

    > db.names.createIndex({ 'name.text': 1 }, { name: 'name_de', partialFilterExpression: { 'name.locale': 'de' }, collation: { locale: 'de' } });

i.e. omitting the field `name.locale` from index. It gives and interesting results, bud sadly still wrong ones. First query explain plan is:

    ...
    "executionStats" : {
      "executionSuccess" : true,
      "nReturned" : 1,
      "executionTimeMillis" : 0,
      "totalKeysExamined" : 1,
      "totalDocsExamined" : 1,
      "executionStages" : {
        "stage" : "FETCH",
        "filter" : {
          "name.locale" : {
            "$eq" : "de"
          }
        },
        "nReturned" : 1,
        "executionTimeMillisEstimate" : 0,
        "works" : 2,
        "advanced" : 1,
        "needTime" : 0,
        "needYield" : 0,
        "saveState" : 0,
        "restoreState" : 0,
        "isEOF" : 1,
        "invalidates" : 0,
        "docsExamined" : 1,
        "alreadyHasObj" : 0,
        "inputStage" : {
          "stage" : "IXSCAN",
          "nReturned" : 1,
          "executionTimeMillisEstimate" : 0,
          "works" : 2,
          "advanced" : 1,
          "needTime" : 0,
          "needYield" : 0,
          "saveState" : 0,
          "restoreState" : 0,
          "isEOF" : 1,
          "invalidates" : 0,
          "keyPattern" : {
            "name.text" : 1
          },
          "indexName" : "name_de",
          "collation" : {
            "locale" : "de",
            "caseLevel" : false,
            "caseFirst" : "off",
            "strength" : 3,
            "numericOrdering" : false,
            "alternate" : "non-ignorable",
            "maxVariable" : "punct",
            "normalization" : false,
            "backwards" : false,
            "version" : "57.1"
          },
          "isMultiKey" : true,
          "multiKeyPaths" : {
            "name.text" : [
              "name"
            ]
          },
          "isUnique" : false,
          "isSparse" : false,
          "isPartial" : true,
          "indexVersion" : 2,
          "direction" : "forward",
          "indexBounds" : {
            "name.text" : [
              "[\"KQ/9\u0001D�\u0006\u0001�\b\", \"KQ/9\u0001D�\u0006\u0001�\b\"]"
            ]
          },
          "keysExamined" : 1,
          "seeks" : 1,
          "dupsTested" : 1,
          "dupsDropped" : 0,
          "seenInvalidated" : 0
        }
      }
    }
    ...

It still does filter in FETCH but via `name.locale` and not via `name.text`. The `indexBounds` changed too and contains `name.text` pattern.
Second query explain plan is:

    ...
    "executionStats" : {
      "executionSuccess" : true,
      "nReturned" : 9,
      "executionTimeMillis" : 0,
      "totalKeysExamined" : 18,
      "totalDocsExamined" : 9,
      "executionStages" : {
        "stage" : "FETCH",
        "filter" : {
          "name.locale" : {
            "$eq" : "de"
          }
        },
        "nReturned" : 9,
        "executionTimeMillisEstimate" : 0,
        "works" : 19,
        "advanced" : 9,
        "needTime" : 9,
        "needYield" : 0,
        "saveState" : 0,
        "restoreState" : 0,
        "isEOF" : 1,
        "invalidates" : 0,
        "docsExamined" : 9,
        "alreadyHasObj" : 0,
        "inputStage" : {
          "stage" : "IXSCAN",
          "nReturned" : 9,
          "executionTimeMillisEstimate" : 0,
          "works" : 19,
          "advanced" : 9,
          "needTime" : 9,
          "needYield" : 0,
          "saveState" : 0,
          "restoreState" : 0,
          "isEOF" : 1,
          "invalidates" : 0,
          "keyPattern" : {
            "name.text" : 1
          },
          "indexName" : "name_de",
          "collation" : {
            "locale" : "de",
            "caseLevel" : false,
            "caseFirst" : "off",
            "strength" : 3,
            "numericOrdering" : false,
            "alternate" : "non-ignorable",
            "maxVariable" : "punct",
            "normalization" : false,
            "backwards" : false,
            "version" : "57.1"
          },
          "isMultiKey" : true,
          "multiKeyPaths" : {
            "name.text" : [
              "name"
            ]
          },
          "isUnique" : false,
          "isSparse" : false,
          "isPartial" : true,
          "indexVersion" : 2,
          "direction" : "forward",
          "indexBounds" : {
            "name.text" : [
              "[MinKey, MaxKey]"
            ]
          },
          "keysExamined" : 18,
          "seeks" : 1,
          "dupsTested" : 18,
          "dupsDropped" : 9,
          "seenInvalidated" : 0
        }
      }
    }
    ...

The `indexBounds` pattern contains min/max ranges of `name.text` (which is clear, while used in sort) and FETCH contains the filter. But ... result is wrong, documents are no more sorted by german translation as before:

    ...
    { "_id" : ObjectId("591dbcd877d4ede22d76524f"), "name" : [ { "locale" : "de", "text" : "Mathias" }, { "locale" : "sk", "text" : "Abel" } ] }
    { "_id" : ObjectId("591dbcd877d4ede22d76524e"), "name" : [ { "locale" : "de", "text" : "Rudi" }, { "locale" : "sk", "text" : "Ábel" } ] }
    { "_id" : ObjectId("591dbcd877d4ede22d765248"), "name" : [ { "locale" : "de", "text" : "Brünhild" }, { "locale" : "sk", "text" : "Žofia" } ] }
    { "_id" : ObjectId("591dbcd877d4ede22d765247"), "name" : [ { "locale" : "de", "text" : "Brunhilde" }, { "locale" : "sk", "text" : "Šimon" } ] }
    { "_id" : ObjectId("591dbcd877d4ede22d765249"), "name" : [ { "locale" : "de", "text" : "Cäcilia" }, { "locale" : "sk", "text" : "Chruňo" } ] }
    { "_id" : ObjectId("591dbcd877d4ede22d76524a"), "name" : [ { "locale" : "de", "text" : "Caecilia" }, { "locale" : "sk", "text" : "Hana" } ] }
    { "_id" : ObjectId("591dbcd877d4ede22d76524b"), "name" : [ { "locale" : "de", "text" : "Günther" }, { "locale" : "sk", "text" : "Cecília" } ] }
    { "_id" : ObjectId("591dbcd877d4ede22d76524c"), "name" : [ { "locale" : "de", "text" : "Gunther" }, { "locale" : "sk", "text" : "Zina" } ] }
    { "_id" : ObjectId("591dbcd877d4ede22d76524d"), "name" : [ { "locale" : "de", "text" : "Rüdi" }, { "locale" : "sk", "text" : "Stano" } ] }
    ...

IMHO there is really something weird by using this index in MongoDB.

Generated at Thu Feb 08 04:20:34 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.