[SERVER-44836] count_documents with filter does not use the partialFilterExpression index right when the keys in the filter are indexed Created: 26/Nov/19  Updated: 27/Nov/19  Resolved: 27/Nov/19

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Tudor Aursulesei Assignee: Eric Sedor
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
duplicates SERVER-28889 Partial index shouldn't do fetch on c... Closed
Operating System: ALL
Participants:

 Description   

Hello,

I have a large sharded collection with similar documents. A part of them have a field called "tag", which is always a string. I've indexed the field using partialFilterExpression:

{"v" : 2,
 "key" : { "tag" : 1 },
 "name" : "tag",
 "ns" : "database.collection",
 "partialFilterExpression" : {
 "tag" : { "$exists" : true } }
 }

 

However, when i run database.collection.count_documents({"tag": "value"}), the result takes a long time to complete. Running current_op shows me the query running:

"secs_running" : NumberLong(616),
 "microsecs_running" : NumberLong(616856589),
 "op" : "getmore",
 "ns" : "database.collection",
 "command" : {
  "getMore" : NumberLong("4801338963376724146"),
  "collection" : "collection",
 "planSummary" : "IXSCAN { tag: 1 }",
 "cursor" : {
 "cursorId" : NumberLong("4801338963376724146"),
 "createdDate" : ISODate("2019-11-26T13:23:57.281Z"),
 "lastAccessDate" : ISODate("2019-11-26T13:23:57.282Z"),
 "nDocsReturned" : NumberLong(0),
 "nBatchesReturned" : NumberLong(0),
 "noCursorTimeout" : false,
 "tailable" : false,
 "awaitData" : false,
 "originatingCommand" : {
 "$truncated" : "{ $truncated: \"{ aggregate: \"collection\", pipeline: [ { $match: { tag: \"value\" } }, { $group: { _id: { $const: 1 }, n: { $sum: { $const: 1 } } } } ], fro...\" }"
 },

 So i'm running an aggregate, which i know can be slow. I remember running these kind of queries very fast, when the field was indexed. Is this because the index is a partialFilterExpression? If so, it looks like a bug, because i think it should be able to use the information from that index in order to complete a simple count operation.

 

I'm running mongod 4.2.1, and the count was issued with pymongo 3.9.0.



 Comments   
Comment by Eric Sedor [ 27/Nov/19 ]

Thanks for following up thestick613.

Comment by Tudor Aursulesei [ 27/Nov/19 ]

Hello,

These two don't work any better:

database.collection.count_documents({"tag": {"$exists": true, "$eq": "value"}})

database.collection.count_documents({"$and": [{"tag": {"$exists": true}}, \{"tag": "value"}]})

 They both result in a IXSCAN, but i was expecting a COUNT_SCAN. I then found SERVER-28889, which is a duplicate of my issue. It looks like partial indexes don't really do much.

Comment by Eric Sedor [ 26/Nov/19 ]

thestick613,

Please see this section of the partial index documentation as it is intended that a query on {"tag": "value"} would not use a partial index with a partialFilterExpression of {{{"tag" :{ "$exists" : true }}}. The query must contain the partialFilterExpression for a partial index to be eligible for use.

For troubleshooting, I encourage you to ask our community by posting on the mongodb-user group or on Stack Overflow with the mongodb tag.

Sincerely,
Eric

Generated at Thu Feb 08 05:07:07 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.