[SERVER-14210] Using regex with the $and operator on the same field more than once Created: 09/Jun/14  Updated: 10/Dec/14  Resolved: 09/Jun/14

Status: Closed
Project: Core Server
Component/s: Querying
Affects Version/s: 2.6.1
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Eriol Gjergji Assignee: David Storch
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Operating System: ALL
Steps To Reproduce:

I have a simple array of strings within a document, which is indexed, next I do this:

db.listings.find({ $and: [ {suggestion: /^Mercedes/}, {suggestion: /^Ser/} ] }, { suggestion: 1}).explain()

On the explain I get this:

"indexBounds" : {
        "suggestion" : [ 
            [ 
                "Mercedes", 
                "Mercedet"
            ], 
            [ 
                /^Mercedes/, 
                /^Mercedes/
            ]
        ]
    },

but if I put the $or instead of $and I get this:

 "indexBounds" : {
        "suggestion" : [ 
            [ 
                "Mercedes", 
                "Mercedet"
            ], 
            [ 
                "Ser", 
                "Ses"
            ], 
            [ 
                /^Mercedes/, 
                /^Mercedes/
            ], 
            [ 
                /^Ser/, 
                /^Ser/
            ]
        ]
    }

Im querying the same document and the same array on both cases

Participants:

 Description   

Hello,

not sure if this is an issue, but I didn't found anything useful on the docs...

can I query same field using regex and $and operator? when I add the explain() I dont see the second regex shown there



 Comments   
Comment by Eriol Gjergji [ 09/Jun/14 ]

Hey Dave,
sounds good,

Thanks,
Eriol.

Comment by David Storch [ 09/Jun/14 ]

Hi eriol,

Yes, it is perfectly valid to use the $and operator with two regexes over the same field:

> t = db.t
> t.drop()
> t.ensureIndex({suggestion: 1})
> t.save({suggestion: ["Mercedes", "Ser"]})
> t.find({ $and: [ {suggestion: /^Mercedes/}, {suggestion: /^Ser/} ] }, { suggestion: 1})
{ "_id" : ObjectId("5395ce9de228c9ec7226b513"), "suggestion" : [ "Mercedes", "Ser" ] }
> t.find({ $and: [ {suggestion: /^Mercedes/}, {suggestion: /^Ser/} ] }, { suggestion: 1}).explain()
{
	"cursor" : "BtreeCursor suggestion_1",
	"isMultiKey" : true,
	"n" : 1,
	"nscannedObjects" : 1,
	"nscanned" : 2,
	"nscannedObjectsAllPlans" : 1,
	"nscannedAllPlans" : 5,
	"scanAndOrder" : false,
	"indexOnly" : false,
	"nYields" : 0,
	"nChunkSkips" : 0,
	"millis" : 4,
	"indexBounds" : {
		"suggestion" : [
			[
				"Mercedes",
				"Mercedet"
			],
			[
				/^Mercedes/,
				/^Mercedes/
			]
		]
	},
	"server" : "Macintosh.local:27017",
	"filterSet" : false
}

Note that the matching document gets returned as expected, even though the index bounds include only the Mercedes regex.

The reason that the index bounds look odd is an internal detail of how the query planner constructs bounds for multikey indices. Since the matching documents have multiple index keys, the query execution must proceed as follows:

  1. Use the index bounds to answer one of the predicates. In this example, the /^Mercedes/ regex is answered via scanning the appropriate index bounds.
  2. Fetch the corresponding documents keyed by the index scan above. These documents match the /^Mercedes/ regex, but not necessarily the /^Ser/ regex.
  3. Filter the result set according to /^Ser/.

I'm closing as Works as Designed, as this is expected behavior. I hope the explanation was helpful, and please let me know if you have further related questions.

Best,
Dave

Generated at Thu Feb 08 03:34:10 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.