[SERVER-83322] Compound Wildcard Index bounds reverts to [MinKey, MaxKey] when filter contains fields outside CWI Created: 16/Nov/23  Updated: 22/Jan/24

Status: Needs Verification
Project: Core Server
Component/s: None
Affects Version/s: 7.0.1
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Michael Dahl Assignee: Backlog - Triage Team
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Assigned Teams:
Server Triage
Operating System: ALL
Participants:

 Description   

I'm observing a weird behaviour with CWI's, where queries containing filters beyond those covered by the CWI, results in the `indexBounds` to revert to '[MinKey, MaxKey]`.

Say I have CWI:

db.cwi.createIndex({ TenantId: 1, 'CustomFields.$**': 1, Date: 1 })

If I query only using fields contained in the CWI:

db.cwi.explain().find({
    TenantId: 1,
    'CustomFields.Prop': 2,
    Date: { $gte: ISODate('2023-01-01'), $lte: ISODate('2024-01-01') },
})

I get this winning query plan:

{
  "stage": "FETCH",
  "planNodeId": new NumberInt("2"),
  "inputStage": {
    "stage": "IXSCAN",
    "planNodeId": new NumberInt("1"),
    "keyPattern": {
      "TenantId": new NumberInt("1"),
      "$_path": new NumberInt("1"),
      "CustomFields.Prop": new NumberInt("1"),
      "Date": new NumberInt("1")
    },
    "indexName": "TenantId_1_CustomFields.$**_1_Date_1",
    "isMultiKey": false,
    "multiKeyPaths": {
      "TenantId": [
      ],
      "$_path": [
      ],
      "CustomFields.Prop": [
      ],
      "Date": [
      ]
    },
    "isUnique": false,
    "isSparse": true,
    "isPartial": false,
    "indexVersion": new NumberInt("2"),
    "direction": "forward",
    "indexBounds": {
      "TenantId": [
        "[1, 1]"
      ],
      "$_path": [
        "[\"CustomFields.Prop\", \"CustomFields.Prop\"]"
      ],
      "CustomFields.Prop": [
        "[2, 2]"
      ],
      "Date": [
        "[new Date(1672531200000), new Date(1704067200000)]"
      ]
    }
  }
}

But if I add an extra criteria, say `

{ Foo: 'Bar' }

`:

db.cwi.explain().find({
    TenantId: 1,
    'CustomFields.Prop': 2,
    Date: { $gte: ISODate('2023-01-01'), $lte: ISODate('2024-01-01') },
    Foo: 'Bar'
})

I suddenly get a query plan where the index bounds on `CustomFields.Prop` and `Date` are both `[MinKey, MaxKey]`:

{
  "stage": "FETCH",
  "planNodeId": new NumberInt("2"),
  "filter": {
    "$and": [
      {
        "CustomFields.Prop": {
          "$eq": new NumberInt("2")
        }
      },
      {
        "Date": {
          "$lte": new ISODate("2024-01-01T00:00:00.000Z")
        }
      },
      {
        "Date": {
          "$gte": new ISODate("2023-01-01T00:00:00.000Z")
        }
      },
      {
        "Foo": {
          "$eq": "Bar"
        }
      }
    ]
  },
  "inputStage": {
    "stage": "IXSCAN",
    "planNodeId": new NumberInt("1"),
    "keyPattern": {
      "TenantId": new NumberInt("1"),
      "$_path": new NumberInt("1"),
      "Date": new NumberInt("1")
    },
    "indexName": "TenantId_1_CustomFields.$**_1_Date_1",
    "isMultiKey": false,
    "multiKeyPaths": {
      "TenantId": [
      ],
      "$_path": [
      ],
      "Date": [
      ]
    },
    "isUnique": false,
    "isSparse": true,
    "isPartial": false,
    "indexVersion": new NumberInt("2"),
    "direction": "forward",
    "indexBounds": {
      "TenantId": [
        "[1, 1]"
      ],
      "$_path": [
        "[MinKey, MaxKey]"
      ],
      "Date": [
        "[MinKey, MaxKey]"
      ]
    }
  }
}

Note the index bounds and the `filter` on the outer fetch stage which is suddenly doing the work of filtering on `CustomFields.Prop` and `Date`.

I see no reason why this is the case, an optimal query plan would be as before with the CWI filtering by all fields it covers, and then only a filter on the fetch stag on `

{ Foo: 'Bar' }

`.



 Comments   
Comment by Michael Dahl [ 22/Jan/24 ]

Is there any further info I can provide to help identify the reason behind this?

We really would love to deploy CWI's in our production code, as we have a set of 20 indexes which all could be replaced by a single CWI - but when we tried doing this the performance tanked, and looking into the profiler output I observed that MongoDB was not using the CWI under any circumstance.

This led me to try and investigate why this was, and at least I came up with this behaviour which I found weird.

It could explain why it wasn't being used as we typically have a lot of criteria in our filters many of which would not be covered by the CWI, but the CWI should be able to shave off a huge amount of false-positives.

Comment by Michael Dahl [ 16/Nov/23 ]

If I "dumb" down the index to a regular wildcard index only, i.e.:

db.cwi.createIndex({ 'CustomFields.$**': 1 })

And run the same query, I get this winning query plan:

{
  "stage": "FETCH",
  "planNodeId": new NumberInt("2"),
  "filter": {
    "$and": [
      {
        "Date": {
          "$lte": new ISODate("2024-01-01T00:00:00.000Z")
        }
      },
      {
        "Date": {
          "$gte": new ISODate("2023-01-01T00:00:00.000Z")
        }
      },
      {
        "Foo": {
          "$eq": "Bar"
        }
      },
      {
        "TenantId": {
          "$eq": new NumberInt("1")
        }
      }
    ]
  },
  "inputStage": {
    "stage": "IXSCAN",
    "planNodeId": new NumberInt("1"),
    "keyPattern": {
      "$_path": new NumberInt("1"),
      "CustomFields.Prop": new NumberInt("1")
    },
    "indexName": "CustomFields.$**_1",
    "isMultiKey": false,
    "multiKeyPaths": {
      "$_path": [
      ],
      "CustomFields.Prop": [
      ]
    },
    "isUnique": false,
    "isSparse": true,
    "isPartial": false,
    "indexVersion": new NumberInt("2"),
    "direction": "forward",
    "indexBounds": {
      "$_path": [
        "[\"CustomFields.Prop\", \"CustomFields.Prop\"]"
      ],
      "CustomFields.Prop": [
        "[2, 2]"
      ]
    }
  }
}

Which is what I would expect, it uses the wildcard index to filter all that it can, and then do the rest as a filter on the fetch stage.

So it would seem to perhaps be related to CWI's.

Generated at Thu Feb 08 06:51:51 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.