[SERVER-70582] [CQF] Sampling CE may cause a MONGO_UNREACHABLE to be reached / segfault in traverseF Created: 14/Oct/22  Updated: 13/Dec/22  Resolved: 13/Dec/22

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Militsa Sotirova Assignee: Drew Paroski
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File server-70582-debug.patch, HTML File server-70582-stack
Issue Links:
Depends
Duplicate: duplicates SERVER-70885 Ensure complex SBE expressions proper... (Closed)
Related: related to SERVER-68516 [CQF] Always translate projections to... (Closed)
Operating System: ALL
Sprint: QE 2022-10-31, QE 2022-11-14, QE 2022-11-28, QE 2022-12-12, QE 2022-12-26
Participants:

Description

During SERVER-68516, an integration test with the following pipeline was written:

coll.aggregate([
    {$project: {a: {$literal: [1, 2, 3, 4]}}},
    {$match: {a: {$elemMatch: {$gte: 2, $lte: 3}}}}
])
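For reference, the intended result of this pipeline can be modeled in plain JavaScript. This is only a sketch of the query semantics, not server or SBE code: `docs` is a hypothetical stand-in for the test collection, and `elemMatch` is an illustrative helper rather than a real API. Because $literal gives every document the same array, and the elements 2 and 3 satisfy both bounds, every input document should match.

```javascript
// Plain-JavaScript model of the pipeline's intended semantics; not server code.
// `docs` is a hypothetical stand-in for the documents in the test collection.
const docs = [{_id: 0}, {_id: 1}, {_id: 2}];

// $project with $literal: every document receives the same constant array
// (the _id field is kept by default).
const projected = docs.map((doc) => ({_id: doc._id, a: [1, 2, 3, 4]}));

// $elemMatch: the field must be an array, and at least one single element
// must satisfy all of the listed conditions at once.
const elemMatch = (value, pred) => Array.isArray(value) && value.some(pred);

// $match: keep documents where some element x of `a` has 2 <= x <= 3.
const matched = projected.filter((doc) =>
  elemMatch(doc.a, (x) => x >= 2 && x <= 3)
);

console.log(matched.length); // 3: every document matches
```

Under this model the predicate never depends on the stored documents, so any failure is about how the plan is executed, not about the data.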

On Evergreen (I cannot reproduce this locally), this test (in jstests/cqf/computed_projection.js) fails when the default for internalQueryCardinalityEstimatorMode is set to sampling, as seen in this patch. From the logs, it appears that we are in the sampling CE code when the MONGO_UNREACHABLE here (in the SBE codebase) is reached.

Setting that default to heuristic, as done in this patch, results in that test passing.

While investigating which type is passed into typeToTags() such that we end up at the MONGO_UNREACHABLE, I found that the type printed in the Evergreen logs was "unknown tag", which seems to come from this printer function. However, every case of the TypeTags enum is handled in that function (except for EndOfShallowValues, which equals MaxKey), so it is not clear what is being passed into typeToTags() when we reach the MONGO_UNREACHABLE.
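The reasoning above can be illustrated with a small JavaScript model. It is a sketch under stated assumptions, not the SBE code: the `TypeTags` table below is hypothetical and heavily abbreviated, its numeric values are invented, and `printTag` only mimics the shape of a printer with a fallback branch. It shows that if every enumerator is handled, the fallback string can only appear for a value that is not a member of the enum at all.

```javascript
// Hypothetical, abbreviated tag table. The real TypeTags enum has many more
// entries, and these numeric values are invented for illustration.
const TypeTags = Object.freeze({Nothing: 0, NumberInt32: 1, Array: 2, MaxKey: 3});

// Model of a printer that handles every enumerator and falls back otherwise.
function printTag(tag) {
  switch (tag) {
    case TypeTags.Nothing: return "Nothing";
    case TypeTags.NumberInt32: return "NumberInt32";
    case TypeTags.Array: return "Array";
    case TypeTags.MaxKey: return "MaxKey";
    default: return "unknown tag";
  }
}

// Every declared enumerator prints a real name...
const allCovered = Object.values(TypeTags).every(
  (tag) => printTag(tag) !== "unknown tag"
);

// ...so only a value outside the enum reaches the fallback branch.
const stray = 200; // hypothetical value that is not any enumerator

console.log(allCovered);      // true
console.log(printTag(stray)); // unknown tag
```

This does not say where such a value would come from; it only narrows the question to how a non-enumerator value ends up in the tag slot.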



Comments
Comment by Kyle Suarez [ 17/Oct/22 ]

Moving from Cascades to SBE Perf as it looks related to the traverseF work done in SBE Perf. From the top of Svilen's stack trace:

std::pair::pair<…> stl_pair.h:261
mongo::sbe::value::Array::getAt value.h:839
mongo::sbe::value::ArrayEnumerator::getViewOfValue value.cpp:857
mongo::sbe::vm::ByteCode::traverseFInArray vm.cpp:1038
mongo::sbe::vm::ByteCode::traverseF vm.cpp:1022
mongo::sbe::vm::ByteCode::runInternal vm.cpp:5558
mongo::sbe::vm::ByteCode::run vm.cpp:5919
mongo::sbe::vm::ByteCode::runPredicate vm.cpp:5939
mongo::sbe::FilterStage::getNext filter.h:128
mongo::sbe::HashAggStage::open hash_agg.cpp:350
mongo::optimizer::cascades::CESamplingTransportImpl::estimateSelectivity ce_sampling.cpp:286
