[SERVER-82978] [CQF] jstests/cqf/analyze/ce_sample_rate.js failed with parameterization enabled Created: 08/Nov/23  Updated: 21/Nov/23  Resolved: 21/Nov/23

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 7.3.0-rc0

Type: Task Priority: Major - P3
Reporter: Jess Balint Assignee: Ben Shteinfeld
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
depends on SERVER-83100 [CQF] Use-after-free in aggregation w... Closed
Assigned Teams:
Query Optimization
Backwards Compatibility: Fully Compatible
Sprint: QO 2023-11-27
Participants:

 Comments   
Comment by Githook User [ 21/Nov/23 ]

Author:

{'name': 'Ben Shteinfeld', 'email': 'ben.shteinfeld@mongodb.com', 'username': 'bshteinfeld'}

Message: SERVER-82978 Disable parameterization when histogram CE is enabled
Branch: master
https://github.com/mongodb/mongo/commit/753c16d9f82e877763d18a53bf9e127b73b44f6b

Comment by Ben Shteinfeld [ 17/Nov/23 ]

Abandoned https://github.com/10gen/mongo/pull/16905 in favor of https://github.com/10gen/mongo/pull/17022. See PRs themselves for details.

Comment by Ben Shteinfeld [ 10/Nov/23 ]

After SERVER-83100 was resolved, this test was failing due to a mismatch between the CE produced by histogram estimation and the actual number of documents returned via executionStats. The cause is that parameterization puts ABT expressions that are not Constants into interval bounds. When a bound is not a Constant, the histogram estimator falls back to the heuristic estimator, so we return a heuristic CE for the predicate, and the test fails because it expects a more accurate estimate.

Fixing this will require us to design a mechanism for CE to get access to constants before parameterization. One way we could do this is to add another child to FunctionCall[getParam] to hold this constant. We will need to consider how this interacts with interval simplifications, which generate expressions as bounds. Since the CE module is estimating cardinality for a particular value of the parameter, we could replace the FunctionCall[getParam] with the constant and perform constant folding to get a constant for the bound, then proceed with CE as usual. I suspect we'll need to do this for both histogram and sampling-based CE.
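
To make the substitute-then-fold idea concrete, here is a rough Python sketch of it. This is a toy model and not the server's ABT code: the classes Constant, GetParam and BinaryOp, the functions substitute_and_fold and choose_estimator, and the parameter map are all hypothetical names made up for illustration. It only shows that once the parameter is replaced by its value and the bound is folded, the bound becomes a Constant again and histogram CE no longer has to fall back.

```python
from dataclasses import dataclass
from typing import Dict, Union
import operator


# Hypothetical stand-ins for ABT nodes; the real optimizer types differ.
@dataclass(frozen=True)
class Constant:
    value: int


@dataclass(frozen=True)
class GetParam:
    param_id: int


@dataclass(frozen=True)
class BinaryOp:
    op: str
    left: "Expr"
    right: "Expr"


Expr = Union[Constant, GetParam, BinaryOp]

# Operators this toy folder knows how to evaluate.
_OPS = {"Add": operator.add, "Sub": operator.sub, "Mult": operator.mul}


def substitute_and_fold(expr: Expr, params: Dict[int, int]) -> Expr:
    """Replace known parameters with constants, then fold constant subtrees."""
    if isinstance(expr, Constant):
        return expr
    if isinstance(expr, GetParam):
        # Leave the node untouched if no value is known for this parameter.
        if expr.param_id in params:
            return Constant(params[expr.param_id])
        return expr
    left = substitute_and_fold(expr.left, params)
    right = substitute_and_fold(expr.right, params)
    if isinstance(left, Constant) and isinstance(right, Constant) and expr.op in _OPS:
        return Constant(_OPS[expr.op](left.value, right.value))
    return BinaryOp(expr.op, left, right)


def choose_estimator(bound: Expr, params: Dict[int, int]) -> str:
    """Use histogram CE only when the bound folds to a Constant."""
    folded = substitute_and_fold(bound, params)
    return "histogram" if isinstance(folded, Constant) else "heuristic"


# A bound such as an interval simplification might produce: getParam(0) + 1.
bound = BinaryOp("Add", GetParam(0), Constant(1))
print(choose_estimator(bound, {0: 41}))  # parameter value known
print(choose_estimator(bound, {}))       # parameter value unknown
```

With the parameter value available the bound folds to a single Constant and the histogram path is taken; without it the bound stays an expression and the heuristic fallback described above applies.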
