[SERVER-84955] Investigate how index scan depends on number of selected documents Created: 03/Oct/22 Updated: 12/Jan/24 Resolved: 13/Oct/22 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Task | Priority: | Major - P3 |
| Reporter: | Alexander Ignatyev | Assignee: | Ruoxin Xu |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | M6 | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Attachments: |
|
||||||||
| Issue Links: |
|
||||||||
| Sprint: | QO 2022-10-17 | ||||||||
| Participants: | |||||||||
| Description |
|
It appears that Index Scan has a hidden initialization cost that is not revealed directly by our regression models but shows up indirectly as different values of the n_processed coefficient. Design an experiment in which the queries select different numbers of documents (10k, 20k, 30k, ..., 150k), and try to keep the same number of queries for every point. |
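One possible way to lay out such an experiment is sketched below. It is only an illustration under stated assumptions: the indexed integer field `a` holding each value in `0..collection_size-1` exactly once, the collection size of 1,000,000, the 20 queries per point, and the helper name `build_experiment` are all hypothetical and do not come from the actual calibration code.

```python
import random


def build_experiment(points, queries_per_point, collection_size, seed=0):
    """Return a list of query descriptions, the same number for every point.

    Assumes (hypothetically) an indexed integer field `a` that holds each
    value in 0..collection_size-1 exactly once, so a half-open range of
    width n selects exactly n documents.
    """
    rng = random.Random(seed)  # fixed seed keeps the plan reproducible
    plan = []
    for n_docs in points:
        for _ in range(queries_per_point):
            # Pick a random start so the range stays inside the collection.
            lo = rng.randrange(0, collection_size - n_docs + 1)
            plan.append({
                "n_docs": n_docs,
                "filter": {"a": {"$gte": lo, "$lt": lo + n_docs}},
            })
    return plan


# Points from the description: 10k, 20k, ..., 150k selected documents.
points = list(range(10_000, 150_001, 10_000))
plan = build_experiment(points, queries_per_point=20, collection_size=1_000_000)
```

Randomizing the start of each range spreads the queries over the index, so that any difference between points is more likely to reflect the number of selected documents than a particular key range.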
| Comments |
| Comment by Alexander Ignatyev [ 12/Oct/22 ] |
|
Yes, we do. Per our offline discussion, we need to at least try calibrating on queries that have smaller values of n_processed. That is closer to our usual OLTP workload. |
| Comment by Ruoxin Xu [ 12/Oct/22 ] |
|
The experiments show that the more documents are returned (larger “n_processed”), the smaller the coefficient (the average cost to process one document). This is expected, given a potential hidden initialization cost. For example, in the experiments, when n_processed is above 1e6 the cost is 0.0055, whereas when n_processed is smaller (below 5000) the cost is 0.0433. As discussed, do we want to use more selective queries in calibration? Cc: alexander.ignatyev@mongodb.com |
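The effect described above can be reproduced with a small synthetic model. The sketch below assumes the true cost has the form `fixed + per_doc * n_processed`; the constants `FIXED_COST` and `PER_DOC_COST` are made up for illustration and are not measurements from the experiments. Fitting without an intercept folds the fixed cost into the per-document coefficient, which inflates it for small `n_processed`, while fitting with an intercept recovers both terms.

```python
def fit_through_origin(ns, costs):
    """Least-squares slope for cost = coef * n (no intercept term)."""
    return sum(n * c for n, c in zip(ns, costs)) / sum(n * n for n in ns)


def fit_with_intercept(ns, costs):
    """Ordinary least squares for cost = intercept + slope * n."""
    m = len(ns)
    mean_n = sum(ns) / m
    mean_c = sum(costs) / m
    sxx = sum((n - mean_n) ** 2 for n in ns)
    sxy = sum((n - mean_n) * (c - mean_c) for n, c in zip(ns, costs))
    slope = sxy / sxx
    return mean_c - slope * mean_n, slope


# Hypothetical constants, chosen only to illustrate the shape of the effect.
FIXED_COST = 100.0    # assumed one-off initialization cost per scan
PER_DOC_COST = 0.005  # assumed true cost of processing one document


def synthetic_cost(n):
    return FIXED_COST + PER_DOC_COST * n


small = [1_000, 2_000, 3_000, 4_000, 5_000]
large = [1_000_000, 2_000_000, 3_000_000]

# Without an intercept, the fixed cost leaks into the coefficient.
coef_small = fit_through_origin(small, [synthetic_cost(n) for n in small])
coef_large = fit_through_origin(large, [synthetic_cost(n) for n in large])

# With an intercept, the fixed and per-document parts are separated.
all_n = small + large
intercept, slope = fit_with_intercept(all_n, [synthetic_cost(n) for n in all_n])

print(f"no-intercept coefficient, small n: {coef_small:.4f}")
print(f"no-intercept coefficient, large n: {coef_large:.4f}")
print(f"with intercept: fixed={intercept:.2f}, per_doc={slope:.4f}")
```

If the real data follow a similar shape, calibrating on more selective queries (or adding an explicit startup term to the model) would be two ways to keep the fixed cost from distorting the per-document coefficient.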