[SERVER-61687] Extend index multikeyness metadata for positional paths Created: 22/Nov/21  Updated: 06/Dec/22

Status: Backlog
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Timour Katchaounov Assignee: Backlog - Query Optimization
Resolution: Unresolved Votes: 0
Labels: indexv3
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Assigned Teams:
Query Optimization
Participants:

 Description   

The fix for SERVER-57588 checks whether an $elemMatch applies to a field that is a positional path component of an indexed path. If so, the index cannot be used for the $elemMatch because index key bounds cannot be formed. That check is too coarse: it can prevent index use even when the parent path does not contain arrays. In that case the positional component is not an array index but an ordinary field name, and the index can still be used.

 

To refine that patch and make the index multikeyness metadata more complete, it is necessary to simplify the metadata and collect multikey information even when the index path contains positional components.



 Comments   
Comment by David Storch [ 01/Dec/21 ]

To provide a bit more detail, this ticket suggests changing the definition of multikeyness metadata for index key patterns with positional components such as {"a.0": 1}. To provide a concrete example, imagine that you have the index {"a.0": 1} and the user inserts the document {a: [1, 2, 3]}. For all index versions existing today, path "a" will not be considered multikey in this scenario. This means that for indexes with positional path components, the query optimizer cannot rely on the multikeyness metadata in order to determine whether "a" is ever an array. The whole purpose of the multikeyness metadata is to be able to detect definitively when an indexed path component never contains an array, since this can permit certain beneficial optimizations (such as tighter index bounds). Therefore, it would be an improvement to track paths containing arrays as multikey even if those arrays are positionally indexed.
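
The difference between the two definitions can be illustrated with a toy model in Python. This is only a sketch, not the server's key generation code: the function name array_prefixes and the track_positional flag are hypothetical, and the model ignores cases the real server must handle, such as a numeric component that could match either an array position or a field named "0" inside array elements.

```python
def array_prefixes(doc, index_path, track_positional):
    """Toy model: return the indexed path prefixes at which `doc` holds an array.

    track_positional=False mimics the current behaviour described above (an
    array that is indexed positionally is not recorded); True mimics the
    proposed behaviour. Both names are hypothetical.
    """
    parts = index_path.split(".")
    found = set()

    def walk(value, consumed):
        # `value` is what the document holds at the prefix parts[:consumed].
        prefix = ".".join(parts[:consumed])
        if isinstance(value, list):
            positional = consumed < len(parts) and parts[consumed].isdigit()
            if positional:
                if track_positional:
                    found.add(prefix)
                pos = int(parts[consumed])
                if pos < len(value):
                    walk(value[pos], consumed + 1)
            else:
                # Ordinary multikey case: record the array, descend into subdocuments.
                found.add(prefix)
                for element in value:
                    if isinstance(element, dict):
                        walk(element, consumed)
        elif isinstance(value, dict) and consumed < len(parts):
            key = parts[consumed]
            if key in value:
                walk(value[key], consumed + 1)

    walk(doc, 0)
    return found


# Index {"a.0": 1}, document {a: [1, 2, 3]}
current = array_prefixes({"a": [1, 2, 3]}, "a.0", track_positional=False)
proposed = array_prefixes({"a": [1, 2, 3]}, "a.0", track_positional=True)

# Same index, but "0" is a regular field name: no array under either definition.
plain = array_prefixes({"a": {"0": 5}}, "a.0", track_positional=True)
```

In this model, current is empty while proposed contains "a". Only under the proposed definition does an empty result guarantee that "a" is never an array.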

After some discussion, we've decided that this change would require a bump of the index version, so we have labeled this ticket "indexv3". The old definition of multikeyness for positional path components would continue to apply for v:2 and earlier indexes, but the new definition would apply for v:3 indexes. This way, the optimizer could know to interpret the multikeyness metadata differently for v:3 indexes. Also, it means that we could delete any code associated with the old multikeyness definition once v:2 indexes are no longer supported by the system.
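
As a rough sketch of how the optimizer might interpret the metadata differently per index version, consider the Python function below. It is illustrative only: v:3 indexes do not exist yet, and the function name and parameters are hypothetical rather than taken from the server code.

```python
def may_contain_array(index_version, index_path, prefix, multikey_prefixes):
    """Hypothetical check: could `prefix` of `index_path` ever hold an array?

    `multikey_prefixes` is the set of prefixes the index metadata marks as
    multikey. For v:2 and earlier indexes with positional components, the
    metadata does not record positionally indexed arrays, so the answer must
    be conservative.
    """
    has_positional = any(part.isdigit() for part in index_path.split("."))
    if index_version < 3 and has_positional:
        # Old definition: absence from the metadata proves nothing.
        return True
    # Proposed v:3 definition (or no positional components): trust the metadata.
    return prefix in multikey_prefixes


# Index {"a.0": 1} whose metadata lists no multikey paths.
old_answer = may_contain_array(2, "a.0", "a", set())
new_answer = may_contain_array(3, "a.0", "a", set())
```

Here old_answer is True (the optimizer must assume "a" may be an array), while new_answer is False, which would permit optimizations such as tighter index bounds.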
