- Type: Bug
- Resolution: Unresolved
- Priority: Major - P3
- Affects Version/s: None
- Component/s: None
- Query Optimization
The query_cbr_histogram_aggregation_passthrough suite is OOM-ing in Evergreen. The apparent reason is that the passthrough constructs a large number of histograms, and the pipeline used during histogram creation leaks memory.
To reproduce:
1. Run:
buildscripts/resmoke.py run --suites=query_cbr_histogram_aggregation_passthrough "--mongodSetParameters={heapProfilingEnabled: true, heapProfilingSampleIntervalBytes: 10000}" --storageEngineCacheSizeGB=1 > out-with-histogramCE.log 2>&1
and wait for it to finish, or run it for at least 10 minutes.
2. Download the t2 tool (https://github.com/10gen/t2/releases) and run it. You will get an empty window.
3. Drag the diagnostics file from /data/db/job0/resmoke/diagnostic.data and drop it onto the t2 window. The tool will load the diagnostics and display it.
4. Scroll down to the "heap profiler" section and observe that the top two most-consuming stack traces show a gradual memory increase. Each stack will have a number associated with it, different on each execution, such as stack230.
5. In the log produced by resmoke.py that we captured above, we can fish for the definition of stack230:
grep 'stackNum":230,' out-with-histogramCE.log
6. Using a JSON beautifier such as https://jsonformatter.org/, obtain the readable stack trace:
{
  "0": "tcmalloc::tcmalloc_internal::SampleifyAllocation()",
  "1": "tcmalloc::tcmalloc_internal::alloc_small_sampled_hooks_or_perthread<>()",
  "2": "calloc",
  "3": "__wt_calloc",
  "4": "__wt_stat_dsrc_init",
  "5": "__wt_conn_dhandle_open",
  "6": "__wt_session_get_dhandle",
  "7": "__wt_schema_create",
  "8": "__create_colgroup",
  "9": "__create_table",
  "10": "__wt_schema_create",
  "11": "__wt_session_create",
  "12": "__session_create",
  "13": "mongo::WiredTigerSession::create<>()",
  "14": "mongo::WiredTigerIndex::create()",
  "15": "mongo::WiredTigerKVEngine::createSortedDataInterface()",
  "16": "mongo::durable_catalog::createIndex()",
  "17": "mongo::CollectionImpl::prepareForIndexBuild()",
  "18": "mongo::IndexBuildBlock::init()",
  "19": "mongo::IndexBuildBlock::buildEmptyIndex()",
  "20": "mongo::DatabaseImpl::_createCollection()",
  "21": "mongo::DatabaseImpl::userCreateNS()",
  "22": "mongo::write_ops_exec::(anonymous namespace)::makeCollection()::$_0::operator()()",
  "23": "mongo::write_ops_exec::(anonymous namespace)::makeCollection()",
  "24": "mongo::write_ops_exec::performUpdates()",
  "25": "mongo::NonShardServerProcessInterface::update()",
  "26": "std::_Function_handler<>::_M_invoke()",
  "27": "mongo::MergeProcessor::flush()",
  "28": "mongo::DocumentSourceMerge::flush()",
  "29": "mongo::DocumentSourceWriter<>::doGetNext()",
  "30": "mongo::exec::agg::Pipeline::getNext()",
  "31": "mongo::PlanExecutorPipeline::_tryGetNext()",
  "32": "mongo::PlanExecutorPipeline::_getNext()",
  "33": "mongo::PlanExecutorPipeline::getNext()",
  "34": "mongo::(anonymous namespace)::executeUntilFirstBatch()",
  "35": "mongo::(anonymous namespace)::_runAggregate()",
  "36": "mongo::runAggregate()",
  "37": "mongo::(anonymous namespace)::PipelineCommand::Invocation::run()",
  "38": "mongo::CommandHelpers::runCommandInvocation()",
  "39": "mongo::(anonymous namespace)::RunCommandAndWaitForWriteConcern::_runImpl()",
  "40": "mongo::(anonymous namespace)::RunCommandImpl::run()",
  "41": "mongo::(anonymous namespace)::ExecCommandDatabase::_commandExec()",
  "42": "mongo::(anonymous namespace)::executeCommand()",
  "43": "mongo::ServiceEntryPointShardRole::handleRequest()",
  "44": "mongo::(anonymous namespace)::loopbackBuildResponse()",
  "45": "mongo::DBDirectClient::_call()",
  "46": "mongo::DBClientBase::runCommandWithTarget()",
  "47": "mongo::DBClientBase::runCommandWithTarget()",
  "48": "mongo::DBClientBase::runCommand()",
  "49": "mongo::(anonymous namespace)::CmdAnalyze::Invocation::typedRun()",
  "50": "mongo::CommandHelpers::runCommandInvocation()",
  "51": "mongo::(anonymous namespace)::RunCommandAndWaitForWriteConcern::_runImpl()",
  "52": "mongo::(anonymous namespace)::RunCommandImpl::run()",
  "53": "mongo::(anonymous namespace)::ExecCommandDatabase::_commandExec()",
  "54": "mongo::(anonymous namespace)::executeCommand()",
  "55": "mongo::ServiceEntryPointShardRole::handleRequest()",
  "56": "mongo::transport::SessionWorkflow::Impl::_dispatchWork()",
  "57": "mongo::future_details::FutureImpl<>::then<>()::{lambda()#1}::operator()()",
  "58": "mongo::transport::SessionWorkflow::Impl::_doOneIteration()",
  "59": "mongo::unique_function<>::makeImpl<>()::SpecificImpl::call()",
  "60": "mongo::ClientStrand::run<>()",
  "61": "mongo::unique_function<>::makeImpl<>()::SpecificImpl::call()",
  "62": "mongo::transport::service_executor_synchronous_detail::ServiceExecutorSyncImpl::SharedState::WorkerThreadInfo::run()",
  "63": "mongo::unique_function<>::makeImpl<>()::SpecificImpl::call()"
}
In other words, the output of the $merge operator used during histogram construction appears to be retained in memory.
7. Disable histogram construction while keeping everything else in the passthrough the same, and observe that memory usage is flat, both as reported by top and as reported by t2:
diff --git a/buildscripts/resmokeconfig/suites/query_cbr_histogram_aggregation_passthrough.yml b/buildscripts/resmokeconfig/suites/query_cbr_histogram_aggregation_passthrough.yml
index bbed244c1ad..ed9db4ee3dc 100644
--- a/buildscripts/resmokeconfig/suites/query_cbr_histogram_aggregation_passthrough.yml
+++ b/buildscripts/resmokeconfig/suites/query_cbr_histogram_aggregation_passthrough.yml
@@ -42,11 +42,11 @@ executor:
       objcheck: ""
       eval: |
         await import("jstests/libs/override_methods/detect_spawning_own_mongod.js");
-        await import("jstests/libs/override_methods/implicit_histograms.js");
+#        await import("jstests/libs/override_methods/implicit_histograms.js");
   fixture:
     class: MongoDFixture
     mongod_options:
       set_parameters:
         enableTestCommands: 1
         # Automatic mode will fallback when a predicate can't be estimated via histogram
-        planRankerMode: "automaticCE"
+#        planRankerMode: "automaticCE"
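Steps 5 and 6 can also be scripted instead of grep-plus-web-beautifier. A minimal sketch: scan the resmoke log for the line defining a given stack number and pretty-print it. The sample log line below is made up for illustration; real heapProfiler log lines are one JSON document per line with a similar, but not necessarily identical, field layout.

```python
import json

def find_stack(log_lines, stack_num):
    """Return the parsed JSON log document that defines the given heap-profiler stack."""
    needle = f'"stackNum":{stack_num}'
    for line in log_lines:
        if needle in line:
            # Each server log line is a single JSON document; parse from the first brace.
            return json.loads(line[line.index("{"):])
    return None

# Hypothetical log line for illustration only; the real attr layout may differ.
sample = ('{"t":{"$date":"2025-01-01T00:00:00.000+00:00"},"s":"I","c":"-",'
          '"msg":"heapProfile stack","attr":{"stackNum":230,'
          '"stack":["calloc","__wt_calloc","__wt_stat_dsrc_init"]}}')

doc = find_stack([sample], 230)
print(json.dumps(doc["attr"], indent=2))  # readable stack trace
```

To use it against the captured log, pass `open("out-with-histogramCE.log")` as `log_lines`.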
- related to: SERVER-109880 Disable the *cbr_histogram* passthroughs (Closed)