[SERVER-17853] Allow more complex expressions to be specified in partial index document filter Created: 01/Apr/15  Updated: 06/Apr/23

Status: Backlog
Project: Core Server
Component/s: Index Maintenance, Querying
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: J Rassi Assignee: Backlog - Query Optimization
Resolution: Unresolved Votes: 39
Labels: storch
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
is duplicated by SERVER-22194 partialFilterExpression error - Index... Closed
is duplicated by SERVER-26405 Query planner does not consider $elem... Closed
Related
related to SERVER-32540 Make partial index subset analysis co... Backlog
related to SERVER-18884 Allow $in to be specified in partial ... Closed
is related to SERVER-17858 Partial indexes should support more c... Closed
Assigned Teams:
Query Optimization
Participants:
Case:

 Description   

The 3.1.1 version of partial indexes supports document filters composed using the following query operators:

  • $exists (true only)
  • $gt, $gte, $lt, $lte
  • $eq
  • $type
  • $and (top-level only)

Users should be able to create partial indexes with more complex expressions for the document filter. We should consider allowing the following:

  • $or
  • $elemMatch
  • $geoWithin
  • 'false' as a value for $exists
  • multiple levels for $and (simple normalization, unless other logical operators introduced)
  • $ne and $not

Support for $in and $regex is being tracked in SERVER-18884.



 Comments   
Comment by Asya Kamsky [ 22/Jun/21 ]

turivishal@gmail.com

As I mentioned in the forums, the partialFilterExpression applies to the document (to decide whether the document will be indexed). It does not apply to which array elements will be indexed, so I don't believe that having more complex partial filter expression would help you, at least with current semantics.

Comment by Vishal Turi [ 22/Jun/21 ]

I have a similar issue with Unique index + Partial filter expression $type: string in an array of object property,

and also Added in MongoDB Developer Forum as well,

Inserted Record: 

db.getCollection('collection').insert({
  "members": [{ "userId": "abc1" }, { "userId": null }]
})

Created Index:

db.getCollection('collection').createIndex(
  { "members.userId": 1 },
  {
    "unique" : true,
    "partialFilterExpression" : {
      "members.userId" : { "$type" : "string" }
    }
  }
)

Insert New document with a null value:

db.getCollection('collection').insert({
  "members": [{ "userId": "abc2" }, { "userId": null }]
})

It Throws Error:

>> E11000 duplicate key error collection: sample.collection index: members.userId_1 dup key: { members.userId: null }

Comment by Asya Kamsky [ 03/Dec/20 ]

fowiflowi@gmail.com
> I would like to create a unique index only for documents that are not deleted (e. g. isDeleted:
{ $ne: true })

That would only be possible if all non-deleted documents have the field set to true, but if when creating documents you set isDeleted:false then you can use a partial index with isDeleted:false right now.

Comment by Nikolay Symotiyk [ 23/Nov/20 ]

Here is another case with soft delete: I would like to create a unique index only for documents that are not deleted (e. g. isDeleted: { $ne: true })

Comment by Marc Knaup [ 15/Sep/20 ]

We'd also love to use $ne because $exists: true includes null values which we can exclude by using $ne: null.

Comment by Asya Kamsky [ 26/Jun/20 ]

srperetz have you considered two indexes, one on a and b with filter b exists true and the second on a and c with filter c exists true? Without knowing your full use case it’s hard to be sure but it seems like that should work, no? If the queries are always on three fields then the two indexes would be one on a, b, c and the other on a, c, b.

Comment by Anas Mansouri [ 16/Jun/20 ]

Any update on this ?

Comment by Mike Zraly [ 05/Sep/18 ]

FYI the use case mentioned by Sam Peretz comes up naturally when we are sharding a collection and add the shard key to a sparse compound index to keep covered queries covered.  Since the shard key is always present the index will no longer be sparse.  Since the original index was compound and $or is not supported in partialFilterExpressions there seems to be no way to recover sparseness using a partial index.

Comment by Sam Peretz [ 05/Sep/18 ]

Our use case for $or is where we have a large percentage of documents that are missing certain fields, and we want an index with a filter expression like:

{a: 1, b: 1, c: 1}, {partialFilterExpression:  {$or [{b: {$exists: true}}, {c: {$exists: true}}]}

where a is a field that is always present in all documents, but b and c are fields that are frequently missing.

Does anyone have any suggested workaround for this use case?

Comment by Jeffrey Johnson [ 28/Mar/16 ]

I would agree that $ne and $not operators can be useful. My use case is where over 99% of the documents in a collection have the same value for a certain field. We want to query on any value that is not the common value for that field.

db.collection.ensureIndex({a: 1}, {partialFilterExpression: {b: {$ne: "common_value"}}});
db.collection.find({a: 1, b: "other_value"}); // Should be able to use the above index. 
db.collection.find({a: 1, b: {$ne: "common_value"}}); // Should be able to use the above index. 

Adding $in support would probably work for the use case as long as the non-common values are limited, but it would entail the downside of dropping and rebuilding the index whenever a new value is stored.

Edit: added last example statement with $ne comparison

Comment by João França [ 02/Mar/16 ]

The $ne and $not operators can also be useful.

Comment by J Rassi [ 09/Jun/15 ]

I split out $in and $regex support into SERVER-18884 (these are the simplest of the additional proposed operators).

Generated at Thu Feb 08 03:45:48 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.