[SERVER-83624] Search with $in and $regex returns empty result Created: 28/Nov/23  Updated: 16/Jan/24  Resolved: 16/Jan/24

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 7.0.2
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Mikhail Lepeshkin Assignee: Alison Rhea Thorne
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Operating System: ALL
Steps To Reproduce:

We can reproduce it on our database with 180k records. But I failed to build synthetic test with this case.

Participants:

 Description   

We have a collection 'plain_time_series'. The typical documents in this collection looks like this:

{
    "_id" : ObjectId("6565b44aad131a54a85cc60c"),
    "name" : "EU-taxonomy Aligned % (SFDR)",
    "companyId" : ObjectId("65577ce17124dd000199e6d9"),
    "attributes" : [],
    "expression" : null
} 

 The following query returns 0 records, but should return 4:

db.getCollection('plain_time_series').find({
  $or: [
    { $and: [
        { name: { $regex: /^\s*EU-taxonomy Aligned % \(SFDR\)\s*$/i } },
        { companyId: { $in: [ ObjectId("65577ce17124dd000199e6d9"), ObjectId("622212e633075400010c9977"), ObjectId("62811ba5541e3600019730d0"), ObjectId("6222112f21573e00018ef332") ] } } 
      ]
    },
    { $and: [
        { name: { $regex: /^\s*EU-Taxonomy Aligned \(SFDR\)\s*$/i } },
        { companyId: { $in: [ ObjectId("65577ce17124dd000199e6d9"), ObjectId("622212e633075400010c9977"), ObjectId("62811ba5541e3600019730d0"), ObjectId("6222112f21573e00018ef332") ] } } 
      ]
    },
    { $and: [
        { name: { $regex: /^\s*EU-taxonomy Eligible % \(SFDR\)\s*$/i } },
        { companyId: { $in: [ ObjectId("65577ce17124dd000199e6d9"), ObjectId("622212e633075400010c9977"), ObjectId("62811ba5541e3600019730d0"), ObjectId("6222112f21573e00018ef332") ] } } 
      ]
    },
    { $and: [
        { name: { $regex: /^\s*EU-Taxonomy Eligible \(SFDR\)\s*$/i } },
        { companyId: { $in: [ ObjectId("65577ce17124dd000199e6d9"), ObjectId("622212e633075400010c9977"), ObjectId("62811ba5541e3600019730d0"), ObjectId("6222112f21573e00018ef332") ] } } 
      ]
    }
  ]
}) 

The other similar queries sometimes returns expected result and sometimes not. 

Adding {{.limit(<any number>) }}seems to solve the issue. 

Below you can find the winning query plan.

{
    "stage" : "FETCH",
    "planNodeId" : 2,
    "filter" : {
        "$or" : [ 
            {
                "name" : {
                    "$in" : [ 
                        /^\\s*EU-Taxonomy Eligible \\(SFDR\\)\\s*$/i, 
                        /^\\s*EU-taxonomy Eligible % \\(SFDR\\)\\s*$/i
                    ]
                }
            }, 
            {
                "name" : {
                    "$in" : [ 
                        /^\\s*EU-Taxonomy Aligned \\(SFDR\\)\\s*$/i, 
                        /^\\s*EU-taxonomy Aligned % \\(SFDR\\)\\s*$/i
                    ]
                }
            }
        ]
    },
    "inputStage" : {
        "stage" : "IXSCAN",
        "planNodeId" : 1,
        "keyPattern" : {
            "companyId" : 1
        },
        "indexName" : "companyId_1",
        "isMultiKey" : false,
        "multiKeyPaths" : {
            "companyId" : []
        },
        "isUnique" : false,
        "isSparse" : false,
        "isPartial" : false,
        "indexVersion" : 2,
        "direction" : "forward",
        "indexBounds" : {
            "companyId" : [ 
                "[ObjectId('6222112f21573e00018ef332'), ObjectId('6222112f21573e00018ef332')]"
                "[ObjectId('622212e633075400010c9977'), ObjectId('622212e633075400010c9977')]"
                "[ObjectId('62811ba5541e3600019730d0'), ObjectId('62811ba5541e3600019730d0')]"
                "[ObjectId('65577ce17124dd000199e6d9'), ObjectId('65577ce17124dd000199e6d9')]"
            ]
        }
    }
}



 Comments   
Comment by Alison Rhea Thorne [ 16/Jan/24 ]

We haven’t heard back from you for some time, so I’m going to close this ticket. If this is still an issue for you, please provide additional information and we will reopen the ticket.

Comment by Alison Rhea Thorne [ 03/Jan/24 ]

We still need additional information to diagnose the problem. If this is still an issue for you, would you please provide the following information?

1. More information on the timeseries that this impacts? Notably specifically timeseries configuration and indexes.
2. Can you provide the .explain() output for your failing query, as well as an example with .limit()?

Comment by Alison Rhea Thorne [ 12/Dec/23 ]

Hello mlepeshkin,

In the interest of attempting to replicate and troubleshoot this issue, can you provide the following information:

1. More information on the timeseries that this impacts? Notably specifically timeseries configuration and indexes.
2. Can you provide the .explain() output for your failing query, as well as an example with .limit()?

Thank you

Generated at Thu Feb 08 06:52:43 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.