[SERVER-67012] Command error on specific combination of compound index Created: 03/Jun/22  Updated: 29/Oct/23  Resolved: 28/Jul/22

Status: Closed
Project: Core Server
Component/s: Query Execution
Affects Version/s: 4.2.20
Fix Version/s: 4.4.16

Type: Bug Priority: Major - P3
Reporter: Manuel Cabral Assignee: Denis Grebennicov
Resolution: Fixed Votes: 0
Labels: compound-index, index
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: PNG File CompassError.png     File DataExample.json     Text File mongoLogs.txt    
Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: QE 2022-08-08
Participants:

 Description   

I have a collection with an index declared as: ({siteId: 1, startDate:1, endDate: 1}, {collation: {"locale": "en"}}).

Some aggregate queries generate database errors, such as:

 

{
  "aggregate" : "MyColl",
  "pipeline" : [{
      "$match" : {
        "siteId": {
          "$oid": "5bfeb296db127e1168efbf13"
        },
        "$or" : [{
            "startDate" : null
          }, {
            "startDate" : {
              "$gte" : "2022-01-01"
            }
          }]
      }
    }, {
      "$sort" : {
        "startDate" : 1,
        "endDate" : 1
      }
    }],
  "collation" : {
    "locale" : "en"
  }
}

The database responds:

 

{
  "ok" : 0.0,
  "errmsg" : "assertion src/mongo/bson/bsonobjbuilder.h:195",
  "code" : 8,
  "codeName" : "UnknownError"
}

It only occurs when there are matches where the startDate field is null (undefined).



 Comments   
Comment by Denis Grebennicov [ 28/Jul/22 ]

The associated backports to 4.4 and 4.2 are merged. The fix will be present with the new version release.

Comment by Denis Grebennicov [ 26/Jul/22 ]

Thank you for reporting this manuel.cabral@vooban.com and chris.kelly@mongodb.com for investigating it.

After me having a look at it, it seems to be a fixed issue, which we didn’t backport to 4.4 and 4.2 versions. The bugfix can be found here: SERVER-64403.

Backporting the fix to 4.4 and 4.2 versions.

Comment by Chris Kelly [ 14/Jul/22 ]

Hi Manuel,

I'm not sure why it wasn't triggering for me the first time I looked at this. I was able to reproduce it this time by:

> creating a fresh collection

> inserting all of your data 

> creating the index only after data was inserted (via CLI)

> attempting the aggregation again (to then see it fail)

What's weird is it doesn't seem to even be using that index at the time it errors out.

{
  "queryPlanner": {
    "plannerVersion": 1,
    "namespace": "test.test",
    "indexFilterSet": false,
    "parsedQuery": {},
    "optimizedPipeline": true,
    "winningPlan": {
      "stage": "COLLSCAN",
      "direction": "forward"
    },
    "rejectedPlans": []
  },
  "executionStats": {
    "executionSuccess": true,
    "nReturned": 12,
    "executionTimeMillis": 0,
    "totalKeysExamined": 0,
    "totalDocsExamined": 12,
    "executionStages": {
      "stage": "COLLSCAN",
      "nReturned": 12,
      "executionTimeMillisEstimate": 0,
      "works": 14,
      "advanced": 12,
      "needTime": 1,
      "needYield": 0,
      "saveState": 0,
      "restoreState": 0,
      "isEOF": 1,
      "direction": "forward",
      "docsExamined": 12
    },
    "allPlansExecution": []
  },
  "serverInfo": {
    "host": "XXXXXXXXXXXX",
    "port": 27017,
    "version": "4.2.20",
    "gitVersion": "15c0712952c356cb711c13a42cb3bca8617d4ebc"
  },
  "ok": 1
}
 

But then, subsequent attempts to run that same aggregation will succeed (and will use the index)

{
  "stages": [
    {
      "$cursor": {
        "query": {
          "siteId": {
            "$oid": "629a2dbf122876e605f2cdd1"
          },
          "$or": [
            {
              "startDate": null
            },
            {
              "startDate": {
                "$gte": "2022-01-01"
              }
            }
          ]
        },
        "queryPlanner": {
          "plannerVersion": 1,
          "namespace": "test.test",
          "indexFilterSet": false,
          "parsedQuery": {
            "$and": [
              {
                "$or": [
                  {
                    "startDate": {
                      "$eq": null
                    }
                  },
                  {
                    "startDate": {
                      "$gte": "2022-01-01"
                    }
                  }
                ]
              },
              {
                "siteId": {
                  "$eq": {
                    "$oid": "629a2dbf122876e605f2cdd1"
                  }
                }
              }
            ]
          },
          "queryHash": "46943446",
          "planCacheKey": "396A2F2A",
          "winningPlan": {
            "stage": "FETCH",
            "filter": {
              "$or": [
                {
                  "startDate": {
                    "$eq": null
                  }
                },
                {
                  "startDate": {
                    "$gte": "2022-01-01"
                  }
                }
              ]
            },
            "inputStage": {
              "stage": "IXSCAN",
              "keyPattern": {
                "siteId": 1,
                "startDate": 1,
                "endDate": 1
              },
              "indexName": "siteId_1_startDate_1_endDate_1",
              "collation": {
                "locale": "en",
                "caseLevel": false,
                "caseFirst": "off",
                "strength": 3,
                "numericOrdering": false,
                "alternate": "non-ignorable",
                "maxVariable": "punct",
                "normalization": false,
                "backwards": false,
                "version": "57.1"
              },
              "isMultiKey": false,
              "multiKeyPaths": {
                "siteId": [],
                "startDate": [],
                "endDate": []
              },
              "isUnique": false,
              "isSparse": false,
              "isPartial": false,
              "indexVersion": 2,
              "direction": "forward",
              "indexBounds": {
                "siteId": [
                  "[ObjectId('629a2dbf122876e605f2cdd1'), ObjectId('629a2dbf122876e605f2cdd1')]"
                ],
                "startDate": [
                  "[MinKey, MaxKey]"
                ],
                "endDate": [
                  "[MinKey, MaxKey]"
                ]
              }
            }
          },
          "rejectedPlans": []
        },
        "executionStats": {
          "executionSuccess": true,
          "nReturned": 1,
          "executionTimeMillis": 0,
          "totalKeysExamined": 1,
          "totalDocsExamined": 1,
          "executionStages": {
            "stage": "FETCH",
            "filter": {
              "$or": [
                {
                  "startDate": {
                    "$eq": null
                  }
                },
                {
                  "startDate": {
                    "$gte": "2022-01-01"
                  }
                }
              ]
            },
            "nReturned": 1,
            "executionTimeMillisEstimate": 0,
            "works": 2,
            "advanced": 1,
            "needTime": 0,
            "needYield": 0,
            "saveState": 1,
            "restoreState": 1,
            "isEOF": 1,
            "docsExamined": 1,
            "alreadyHasObj": 0,
            "inputStage": {
              "stage": "IXSCAN",
              "nReturned": 1,
              "executionTimeMillisEstimate": 0,
              "works": 2,
              "advanced": 1,
              "needTime": 0,
              "needYield": 0,
              "saveState": 1,
              "restoreState": 1,
              "isEOF": 1,
              "keyPattern": {
                "siteId": 1,
                "startDate": 1,
                "endDate": 1
              },
              "indexName": "siteId_1_startDate_1_endDate_1",
              "collation": {
                "locale": "en",
                "caseLevel": false,
                "caseFirst": "off",
                "strength": 3,
                "numericOrdering": false,
                "alternate": "non-ignorable",
                "maxVariable": "punct",
                "normalization": false,
                "backwards": false,
                "version": "57.1"
              },
              "isMultiKey": false,
              "multiKeyPaths": {
                "siteId": [],
                "startDate": [],
                "endDate": []
              },
              "isUnique": false,
              "isSparse": false,
              "isPartial": false,
              "indexVersion": 2,
              "direction": "forward",
              "indexBounds": {
                "siteId": [
                  "[ObjectId('629a2dbf122876e605f2cdd1'), ObjectId('629a2dbf122876e605f2cdd1')]"
                ],
                "startDate": [
                  "[MinKey, MaxKey]"
                ],
                "endDate": [
                  "[MinKey, MaxKey]"
                ]
              },
              "keysExamined": 1,
              "seeks": 1,
              "dupsTested": 0,
              "dupsDropped": 0
            }
          },
          "allPlansExecution": []
        }
      }
    },
    {
      "$sort": {
        "sortKey": {
          "startDate": 1,
          "endDate": 1
        }
      }
    }
  ],
  "serverInfo": {
    "host": "XXXXXXXXXXXXXXX",
    "port": 27017,
    "version": "4.2.20",
    "gitVersion": "15c0712952c356cb711c13a42cb3bca8617d4ebc"
  },
  "ok": 1
}

I'll pass this along to QE to see if they may know what causes it to sporadically use the index (or not) and whether that can precipitate this error. The affected code looks to be here.

I have also been attempting this on other versions to see if I can replicate the behavior there. So far, I don't see it happening in 5.0.9. If you are able to reproduce this in other versions, please let us know!

Thank you for your report!

Christopher

Comment by Manuel Cabral [ 05/Jul/22 ]

Hi Chris,

I am using exactly the same dataset as the one provided. I have the same error when running it in Compass:

Comment by Chris Kelly [ 30/Jun/22 ]

Hi Manuel,

Can you elaborate on this:
the pipeline fails using this condition with the match with the specific siteId: ObjectId("629a28f3810498dcb1954637")

Because I was able to get output from your sample dataset using that object Id instead:

{
  "_id": {
    "$oid": "629a2d69810498dcb195492b"
  },
  "siteId": {
    "$oid": "629a28f3810498dcb1954637"
  },
  "startDate": "2022-06-03T15:48:57.4208360Z"
}
... etc

I have also tried it as the first query ran on startup of the DB, so it would be the first thing using that index. Just to verify: is the sample dataset you provided the one that is triggering this behavior?
It looks like something vaguely similar has been reported before in SERVER-14093 which is pretty old, however the code around that hasn't changed much in a while. Are you able to trigger this behavior running this aggregation manually on the container? As in, with mongosh, legacy mongo shell, or Compass. 

For what it's worth, I also had to manually rewrite your aggregation pipeline to get it to work on Compass for me. I'm not 100% sure how you are passing in that aggregation pipeline, so I'd just want to make sure we're on the same page with it. 

Christopher

Comment by Manuel Cabral [ 21/Jun/22 ]

I did some tests again and it seems that my assumption was incorrect. I'm not sure of the failing condition, but the pipeline fails using this condition with the match with the specific siteId: ObjectId("629a28f3810498dcb1954637") (the id in the description was from a previous test with a different dataset)

I'm running the database in a Docker container (official image). I attached the container logs. Strangely enough, if some similar queries are run before it, the aggregation pipeline succeeds. It only seems to fail if it is the first query using the index since the server has been started.

I reproduced the error with two drivers: Node and C# (both latest versions).

Comment by Chris Kelly [ 21/Jun/22 ]

Hi Manuel,

Thanks for your report. I've gone ahead and tried what you've suggested on MongoDB 4.2.20, but am failing to observe your problem. Specifically, I am:

  1. Creating a single node MongoDB instance running 4.2.20
  2. Inserting your sample data
  3. Creating an index:

    {siteId: 1, startDate:1, endDate: 1}, {collation: {"locale": "en"}})

  4. Running this aggregation pipeline query on an entry with a null/undefined startDate:

     [{
         $match: {
             siteId: ObjectId('629a2dbf122876e605f2cdd1'),
             $or: [{
                     startDate: null
                 },
                 {
                      startDate: {
                         $gte: '2022-01-01'
                     }
                 }
             ]
         }
     }, {
         $sort: {
             startDate: 1,
             endDate: 1
         }
     }]
     

  5. Observing a successful response of:

    {
      "_id": {
        "$oid": "629a2dbf810498dcb195494c"
      },
      "siteId": {
        "$oid": "629a2dbf122876e605f2cdd1"
      }
    } 

In order to look into this further I am going to need the following information:

  • the mongod logs covering the event (you can upload them to this ticket)
  • a description of how you are running this aggregation pipeline (from legacy mongo shell? mongosh? a driver?)

Regards,
Christopher

Generated at Thu Feb 08 06:07:03 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.