Type: Improvement
Resolution: Duplicate
Priority: Minor - P4
Fix Version/s: None
Affects Version/s: 3.4.13
Component/s: Performance, Querying
Labels: None
Confidence Status: None
Work Order: 3
CAR Domain/s: None
Aha! Reference: None
Tracking Level: None
Risk Status: None
Exec Notes: None
Goal Name(s): None
Goal Link: None

Hi there

We have a collection with a large number of documents and indexes. When we run the following query:

db.getCollection("mycollection").find({"account_id": 123,"term": /^mohamed ali/ }).count()

we see that it uses this index:

	{
		"v" : 2,
		"key" : {
			"account_id" : 1,
			"term" : 1,
			"created_at" : -1
		},
		"name" : "_account_id__term__created_at",
		"ns" : "search_service.mycollection",
		"sparse" : false,
		"background" : true
	},

This is fine. It leads to the following query shape being stored in the plan cache:

	{
		"query" : {
			"account_id" : 123,
			"term" : /^mohamed ali/
		},
		"sort" : {

		},
		"projection" : {

		}
	}

This is fine as well.

But as soon as we switch to a regex without the leading caret (that is, one that no longer means "begins with"), performance drops drastically:

db.getCollection("mycollection").find({"account_id": 123,"term": /mohamed ali/ }).count()
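As background for the slowdown: a case-sensitive regex anchored with ^ can be turned into a narrow range on the index's term key, whereas an un-anchored regex has to be tested against every term for the account. The sketch below is a simplified model of that difference, not MongoDB's actual implementation. The function names and the sample terms are made up for illustration.

```javascript
// Toy model only: derive scan bounds on a string key from a regex.
// regexIndexBounds, scan and the sample terms are hypothetical.
function regexIndexBounds(regex) {
  const src = regex.source;
  // Treat only a case-sensitive pattern of the form ^literal as a prefix match.
  const isSimplePrefix = regex.flags === "" && /^\^[A-Za-z0-9 ]+$/.test(src);
  if (!isSimplePrefix) {
    // Un-anchored: no usable range, every key has to be examined.
    return { tight: false, lower: "", upper: null };
  }
  const prefix = src.slice(1);
  // Upper bound: the prefix with its last character incremented by one.
  const last = prefix.charCodeAt(prefix.length - 1);
  const upper = prefix.slice(0, -1) + String.fromCharCode(last + 1);
  return { tight: true, lower: prefix, upper };
}

// Stand-in for the "term" values of one account_id.
const terms = [
  "ali mohamed",
  "mohamed ali",
  "mohamed ali khan",
  "mohamed salah",
  "said mohamed ali",
].sort();

function scan(regex) {
  const bounds = regexIndexBounds(regex);
  let examined = 0;
  let matched = 0;
  for (const t of terms) {
    // With tight bounds, keys outside [lower, upper) are never examined.
    if (bounds.tight && (t < bounds.lower || t >= bounds.upper)) continue;
    examined++;
    if (regex.test(t)) matched++;
  }
  return { examined, matched };
}

console.log(scan(/^mohamed ali/)); // { examined: 2, matched: 2 }
console.log(scan(/mohamed ali/));  // { examined: 5, matched: 3 }
```

In this model the anchored query examines only the keys that can match, while the un-anchored query examines every key, which is the kind of gap that grows with the size of the collection.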

The issue is that (for whatever reason) the planner does not reuse the cached plan for this shape, even though the two queries differ only in the regex and so have the same query shape. Instead it picks a poorer index. Worse, this new (poor) choice is then cached and used for subsequent /^mohamed ali/ queries, which perform much worse than they did originally.

In short: after a flush of the query plan cache, the "begins with" version of the query performs well until a single search is run without the caret. From then on, the begins-with version performs badly too.
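The sequence described above can be reproduced in a toy model. The sketch below assumes that the cache key is built from field names and value types only (so both regexes share one entry), and that a cached plan is replaced only when a run needs more than a fixed multiple of the cached work. OTHER_INDEX, the work numbers, and the factor of 10 are assumptions for illustration, not values taken from the server.

```javascript
// Toy model only, not MongoDB's planner. Everything except GOOD_INDEX is hypothetical.
const GOOD_INDEX = "_account_id__term__created_at";
const OTHER_INDEX = "hypothetical_other_index";
const REPLAN_FACTOR = 10; // assumed replanning threshold

// Cache key: field names plus value types; the regex pattern is ignored.
function shapeKey(query) {
  return Object.keys(query)
    .sort()
    .map((f) => `${f}:${query[f] instanceof RegExp ? "regex" : typeof query[f]}`)
    .join(",");
}

// Made-up cost table: work units each plan needs for each kind of regex.
function works(plan, query) {
  const anchored = query.term.source.startsWith("^");
  if (plan === GOOD_INDEX) return anchored ? 2 : 1000;
  return 500; // the other index costs the same either way
}

const planCache = new Map();

function run(query) {
  const key = shapeKey(query);
  const entry = planCache.get(key);
  if (entry) {
    const w = works(entry.plan, query);
    // Keep the cached plan unless it is far more expensive than when cached.
    if (w <= REPLAN_FACTOR * entry.works) return { plan: entry.plan, works: w };
  }
  // Plan from scratch (or replan): pick the cheapest plan and cache it.
  const plan = [GOOD_INDEX, OTHER_INDEX].reduce((a, b) =>
    works(a, query) <= works(b, query) ? a : b
  );
  const w = works(plan, query);
  planCache.set(key, { plan, works: w });
  return { plan, works: w };
}

const r1 = run({ account_id: 123, term: /^mohamed ali/ }); // fast: good index, 2 works
const r2 = run({ account_id: 123, term: /mohamed ali/ });  // replans, caches other index
const r3 = run({ account_id: 123, term: /^mohamed ali/ }); // now stuck on the other index
console.log(r1, r2, r3);
```

In this model the third run needs 500 work units instead of 2, but 500 is well below ten times the cached value, so the entry is never replaced. That is the same pattern as in the report: one un-anchored search is enough to degrade the anchored searches that follow.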

Is this expected behaviour?

Many thanks

duplicates

SERVER-32452 Replanning may not occur when a plan with an extremely high 'works' value is cached (Closed)

related to

SERVER-66015 Auto-parameterization works incorrectly for indexed regular expression predicates (Closed)

SERVER-33678 Make regex indexability a factor of query shapes (Closed)

Assignee: Chris Harris
Reporter: Oliver Butterfield
Participants: Chris Harris, David Storch, Oliver Butterfield
Votes: 0
Watchers: 6

Created: Feb 27 2018 03:23:56 PM UTC
Updated: Apr 27 2022 07:29:54 PM UTC
Resolved: Mar 01 2018 10:29:12 PM UTC
