[SERVER-83757] [CQF] Remember chunk boundaries for the duration of the query Created: 30/Nov/23 Updated: 18/Jan/24 Resolved: 18/Jan/24 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | 7.3.0-rc0 |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Svilen Mihaylov (Inactive) | Assignee: | David Percy |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Backwards Compatibility: | Fully Compatible |
| Sprint: | QO 2023-12-11, QO 2023-12-25, QO 2024-01-08, QO 2024-01-22 |
| Participants: |
| Description |
|
We can remember the chunk starting points and keep them consistent for the duration of the query, in the hope that this will achieve a greater degree of stability and repeatability in the sampling process. Before we issue any sampling queries, we issue a “chunk collection” query. Using the example above, this query will return 5 record ids, which we then store in a vector. We’ll modify the sampling query above to replace the PhysicalScan and its Limit 5 with a ValueScan node that sequentially returns the stored record ids. The rest of the sampling plan remains unchanged. |
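
The idea described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the server's implementation: the collection is modeled as a Python list whose indexes stand in for record ids, and the names `collect_chunk_starts`, `sample_from_chunks`, and `estimate_selectivity` are hypothetical. It shows the chunk starting points being collected once and then reused for every predicate, so a conjunction can never be estimated as more selective than one of its parts.

```python
import random


def collect_chunk_starts(num_records, num_chunks, chunk_size, rng):
    """Hypothetical "chunk collection" step: choose the chunk starting
    record ids once, before any sampling query runs."""
    last_start = max(num_records - chunk_size, 0)
    # Starts are distinct, but chunks may still overlap in this simple sketch.
    return sorted(rng.sample(range(last_start + 1), num_chunks))


def sample_from_chunks(records, chunk_starts, chunk_size):
    """Stand-in for the ValueScan-driven plan: walk the remembered starting
    points in order and read chunk_size records from each one."""
    sample = []
    for start in chunk_starts:
        sample.extend(records[start:start + chunk_size])
    return sample


def estimate_selectivity(sample, predicate):
    """Fraction of sampled documents that satisfy the predicate."""
    if not sample:
        return 0.0
    return sum(1 for doc in sample if predicate(doc)) / len(sample)


rng = random.Random(42)
# Made-up documents; fields "a" and "c" are illustrative only.
records = [
    {"a": rng.randint(0, 9), "c": rng.choice(["US", "CA", "MX"])}
    for _ in range(1000)
]

# Collect the 5 chunk starts once and keep them for the whole "query".
chunk_starts = collect_chunk_starts(len(records), num_chunks=5, chunk_size=10, rng=rng)
sample = sample_from_chunks(records, chunk_starts, chunk_size=10)

# Both predicates are evaluated against the same remembered chunks.
sel_c = estimate_selectivity(sample, lambda d: d["c"] == "US")
sel_both = estimate_selectivity(sample, lambda d: d["c"] == "US" and d["a"] < 5)
```

Because both estimates are computed over the same sample, `sel_both <= sel_c` always holds here. With an independent sample per predicate that ordering is not guaranteed.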
| Comments |
| Comment by Githook User [ 18/Jan/24 ] |
|
Author: {'name': 'David Percy', 'email': 'david.percy@mongodb.com', 'username': 'dpercy'} Message: When a query has multiple predicates, and the sampling CE method is Using a separate sample for each predicate can lead to "contradictory" matches more documents than {c: "US"} – this is a bad estimate because filter cannot possibly increase the number of This commit prevents these contradictory estimates by (conceptually) Physically, we materialize only a handful of record IDs: one per chunk, GitOrigin-RevId: 36b07d142f0cf65d4b554160d67e152c64964262 |
| Comment by Githook User [ 18/Jan/24 ] |
|
Author: {'name': 'David Percy', 'email': 'david.percy@mongodb.com', 'username': 'dpercy'} Message: In a previous ticket GitOrigin-RevId: bd0a5cdba6720cac56d60b4f5848ac13166789df |