[SERVER-68672] Query using shard key and document ID does not use available index Created: 09/Aug/22 Updated: 30/Aug/22 Resolved: 30/Aug/22 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | 4.2.21 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Shachar Hochberg | Assignee: | Eric Sedor |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Attachments: |
|
||||||||
| Issue Links: |
|
||||||||
| Operating System: | ALL | ||||||||
| Participants: | |||||||||
| Description |
|
I have a sharded collection with around ~ 200M documents, and an index on this collection: {datasetId: 1, _id: 1}, where datasetId is the shard key on the collection. When I query on the collection with the following query:
The query does not use the aforementioned index, it uses the {_id: 1} index, and takes a very long time to run, while greatly increasing the mongo servers cpu usage.
Here are the query execution stats for a datasetId where ~ 40K documents satisfy the query, and an _id which is the oldest available that has specifed datasetId - which took 28.6 seconds:
If I hint the index on the same query it uses it and runs very quickly scanning only the needed documents:
Using any other operator than $gt /$gte works ($lt / $lte on the newest available ObjectId, $ne - all use the index with no need for a hint), the greater than operators are the only ones that do not use the index (example of a $lt query):
I cannot hint at the index each in query of this nature, Why do only the $gt and $gte operators not use the index? seems they should act like any other operator and use the index if it is available? |
| Comments |
| Comment by Eric Sedor [ 11/Aug/22 ] |
|
Hi shachar.h@dataloop.ai, this looks very similar to Can you revisit your use-case in the context of the configuration option and the workarounds discussed on |