[SERVER-82887] [CQF] Make sampling in chunks the sampling approach Created: 07/Nov/23  Updated: 29/Nov/23  Resolved: 20/Nov/23

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 7.3.0-rc0

Type: Improvement Priority: Major - P3
Reporter: Svilen Mihaylov (Inactive) Assignee: Daniel Segel
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
is depended on by SERVER-78764 Revisit sampling CE scan-from-start i... Closed
Backwards Compatibility: Fully Compatible
Participants:

 Description   

Determine the best chunk size and implement it as the default setting. Tweak tests as needed.
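
For illustration, sampling in chunks can be sketched roughly as follows. This is a minimal sketch and not the server's implementation: the function name `sample_in_chunks`, its parameters, and the use of an in-memory list are assumptions made for the example. Only the default chunk size of 10 comes from the commit on this ticket.

```python
import random


def sample_in_chunks(collection, sample_size, chunk_size=10, rng=None):
    """Draw sample_size items by reading runs of consecutive items.

    Hypothetical helper: instead of picking every item at an independent
    random position, pick a few random start positions and read
    chunk_size consecutive items from each one.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    rng = rng or random.Random()
    n = len(collection)
    if sample_size >= n:
        # The sample would cover everything, so return the whole collection.
        return list(collection)

    num_chunks = -(-sample_size // chunk_size)  # ceiling division
    # Keep start positions far enough from the end that a chunk is full
    # whenever the collection has at least chunk_size items.
    max_start = max(1, n - chunk_size + 1)

    sample = []
    for _ in range(num_chunks):
        start = rng.randrange(max_start)
        sample.extend(collection[start:start + chunk_size])
    # Chunks may overlap; trim any excess from the last chunk.
    return sample[:sample_size]
```

In this sketch a sample of 100 items needs 10 random seeks rather than 100, at the cost of items within a chunk being adjacent and therefore possibly correlated.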



 Comments   
Comment by Githook User [ 29/Nov/23 ]

Author:

{'name': 'David Percy', 'email': 'david.percy@mongodb.com', 'username': 'dpercy'}

Message: SERVER-83310 Make range_descending.js stable by adjusting distribution

The purpose of this test is to confirm correct results when we choose an
IndexScan where some fields are descending. The goal is not to confirm
that a particular plan wins.

The query we're testing has two predicates, on 'a' and 'b', and in the
example data these predicates are fully correlated. When we enable
sampling in pairs, we make separate estimates for 'a' and 'a ^ b'.
Because sampling is random, it's possible for the 'a ^ b' estimate to be
higher than the 'a' estimate, which makes the optimizer think it's
beneficial to defer filtering on 'b' as long as possible.

This commit fixes the test by changing the data distribution so that
filtering on 'b' is always beneficial, even once we've already filtered
on 'a'. (As it turns out, SERVER-82887 already included this change.)

It also improves the error messages in navigateToPath(), to disambiguate
which component of the path was missing (as in .child.child.child) and
to print the overall plan in addition to the subtree.
Branch: master
https://github.com/mongodb/mongo/commit/20514a684a85a15470b0b1f907c426ef9e415b1c
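
The estimate inversion described in the commit message above can be reproduced with a small simulation. This is a sketch under stated assumptions, not the optimizer's code: it assumes each estimate is computed from its own independent random sample, and the names `estimate_selectivity` and `count_inversions`, the data, and the sample size of 50 are all hypothetical.

```python
import random


def estimate_selectivity(docs, predicate, sample_size, rng):
    """Estimate the fraction of docs matching predicate from a random sample."""
    sample = [rng.choice(docs) for _ in range(sample_size)]
    return sum(1 for d in sample if predicate(d)) / sample_size


def count_inversions(trials=1000, seed=0):
    """Count trials where the 'a and b' estimate exceeds the 'a' estimate."""
    rng = random.Random(seed)
    # Fully correlated data: 'b' always equals 'a', so the predicate on
    # both fields matches exactly the same documents as the one on 'a'.
    docs = [{"a": i, "b": i} for i in range(1000)]

    def pred_a(d):
        return d["a"] < 100

    def pred_a_and_b(d):
        return d["a"] < 100 and d["b"] < 100

    inversions = 0
    for _ in range(trials):
        # Assumption: each estimate draws a separate sample.
        est_a = estimate_selectivity(docs, pred_a, 50, rng)
        est_ab = estimate_selectivity(docs, pred_a_and_b, 50, rng)
        if est_ab > est_a:
            inversions += 1
    return inversions
```

Both predicates have the same true selectivity here, so any trial counted by `count_inversions()` is purely sampling noise. A test that asserts on which plan wins is exposed to that noise, whereas a test that only checks query results is not.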

Comment by Githook User [ 20/Nov/23 ]

Author:

{'name': 'Daniel Segel', 'email': 'daniel.segel@mongodb.com', 'username': 'dhsegel'}

Message: SERVER-82887 Set default sampling chunk size to 10
Branch: master
https://github.com/mongodb/mongo/commit/d1decdc80862f3d5f86693ddbb76b0599a4bdbf7

Generated at Thu Feb 08 06:50:34 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.