[SERVER-77949] Investigate generalisation of DISTINCT_SCAN for clustered collection _id subfields Created: 09/Jun/23 Updated: 22/Jun/23 |
|
| Status: | Backlog |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Jordi Olivares Provencio | Assignee: | Backlog - Query Optimization |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Assigned Teams: | Query Optimization |
| Participants: |
| Description |
|
To get a list of all pre-image collections present in config.system.preimages, we perform something similar to a DISTINCT_SCAN on _id.nsUUID. The problem is that DISTINCT_SCAN currently falls back to a full collection scan, because the collection has no indexes and we are scanning a subfield of _id. We can work around this in our use case because we know the format of _id and that nsUUID is the first field of the identifier, so we can safely enumerate the nsUUID values using a simple RecordCursor::seekNear. Ideally, we would prefer to use DISTINCT_SCAN. |
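The seek-based workaround described above can be illustrated with a minimal, self-contained sketch. It is not the server code: a `std::map` stands in for the clustered collection, and `std::map::upper_bound` plays the role that a seek such as RecordCursor::seekNear plays in the description. The names `Key`, `Store` and `distinctPrefixes`, and the `(nsUUID, timestamp)` key layout, are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a clustered collection: records are ordered by a
// compound key whose first component plays the role of _id.nsUUID.
using Key = std::pair<std::string, std::uint64_t>;  // (nsUUID, timestamp)
using Store = std::map<Key, std::string>;

// Returns each distinct first key component in key order. Instead of walking
// every record, it performs one seek per distinct value. If `seeks` is
// non-null, it receives the number of seeks performed.
std::vector<std::string> distinctPrefixes(const Store& store,
                                          std::size_t* seeks = nullptr) {
    std::vector<std::string> out;
    std::size_t seekCount = 0;
    auto it = store.begin();
    while (it != store.end()) {
        const std::string& prefix = it->first.first;
        out.push_back(prefix);
        // Seek past every record that shares this prefix by asking for the
        // first key strictly greater than (prefix, max timestamp).
        auto next = store.upper_bound(
            Key{prefix, std::numeric_limits<std::uint64_t>::max()});
        ++seekCount;
        // The seek must land on a strictly larger prefix or on the end.
        assert(next == store.end() || next->first.first > prefix);
        it = next;
    }
    if (seeks != nullptr) {
        *seeks = seekCount;
    }
    return out;
}
```

With this approach the cost grows with the number of distinct leading values rather than with the number of records, which is the property a generalised DISTINCT_SCAN over an _id prefix would need to provide. The sketch relies on the distinct field being the leading component of the ordered key, the same assumption the description makes about nsUUID.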
| Comments |
| Comment by Kyle Suarez [ 14/Jun/23 ] |
|
This is a request to improve the plan generated by the optimizer, so I am sending it to the Query Optimization team. |