[SERVER-53090] [SBE] Fix crash when running "bestbuy_agg_query_comparison.js" Created: 27/Nov/20  Updated: 29/Oct/23  Resolved: 10/Dec/20

Status: Closed
Project: Core Server
Component/s: Querying
Affects Version/s: None
Fix Version/s: 4.9.0

Type: Bug Priority: Major - P3
Reporter: Drew Paroski Assignee: Drew Paroski
Resolution: Fixed Votes: 0
Labels: qexec-team
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
is depended on by SERVER-51655 Investigate sys-perf benchmark perfor... Closed
Related
Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Query 2020-12-14
Participants:
Linked BF Score: 67

 Description   

When I ran the "bestbuy_agg_query_comparison.js" benchmark (from https://github.com/10gen/workloads) several times with useAgg=false and SBE mode enabled, the benchmark sometimes completed successfully, but other times mongod crashed with a segfault. In every crash I observed, the segfault appeared to occur while a cursor was being killed, and it always happened inside QuerySolutionNode's destructor.

Below is an example of a stacktrace where the segfault happened. In this particular example, I noticed that the 'children' vector looked corrupted, because _M_end_of_storage wasn't aligned to an 8-byte boundary (in fact, _M_end_of_storage was equal to _M_finish+1). I also noticed that the 'filter' pointer was corrupted (the pointer was equal to 1). Based on this case and on evidence from other crashes, I have a hunch that we have a use-after-free bug where one or more fields are being incremented after the memory has been freed.
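The arithmetic behind that observation can be checked with a small sketch. The addresses are copied from the gdb output in this ticket; the helper names (isAligned8, isOneMoreThan) and the constant names are hypothetical and exist only for this illustration. It shows that both corrupted fields are exactly one greater than a plausible value, which is consistent with the "fields being incremented" hunch but does not prove it.

```cpp
#include <cassert>
#include <cstdint>

// Values copied from the gdb session in this ticket.
constexpr std::uint64_t kStart = 0x555da2900da8;         // children._M_start
constexpr std::uint64_t kFinish = 0x555da2900db0;        // children._M_finish
constexpr std::uint64_t kEndOfStorage = 0x555da2900db1;  // children._M_end_of_storage
constexpr std::uint64_t kFilterPtr = 0x1;                // filter's _M_head_impl

// Hypothetical helper: true if 'addr' sits on an 8-byte boundary, as the
// bounds of a vector of 64-bit pointers would.
constexpr bool isAligned8(std::uint64_t addr) {
    return addr % 8 == 0;
}

// Hypothetical helper: true if 'observed' is exactly 'expected' plus one.
constexpr bool isOneMoreThan(std::uint64_t observed, std::uint64_t expected) {
    return observed == expected + 1;
}

// _M_start and _M_finish are aligned and 8 bytes apart (one element).
static_assert(isAligned8(kStart) && isAligned8(kFinish), "begin/end are aligned");
static_assert(kFinish - kStart == 8, "vector holds exactly one pointer");

// _M_end_of_storage is misaligned and equals _M_finish + 1.
static_assert(!isAligned8(kEndOfStorage), "capacity end is misaligned");
static_assert(isOneMoreThan(kEndOfStorage, kFinish), "capacity end is finish + 1");

// The filter pointer equals a null pointer value (0) plus one.
static_assert(isOneMoreThan(kFilterPtr, 0), "filter is null + 1");
```

Under the assumption that the vector's capacity originally equaled its size and that 'filter' was originally null, each field looks like a valid value that was bumped by one.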

This crash also happened when I was running the "bestbuy_query" suite on Evergreen with SBE enabled: https://spruce.mongodb.com/task/sys_perf_linux_1_node_replSet_bestbuy_query_patch_a194505325087b1e841fdee55c51312a042ce9d2_5fbed6b7c9ec444787969814_20_11_25_22_19_45/logs?execution=1

(Note: The "bestbuy_query" suite on Evergreen is just a wrapper around the "bestbuy_agg_query_comparison.js" benchmark with useAgg=false.)

Stacktrace:

(gdb) p &this->children->_M_impl->_M_end_of_storage
$19 = (std::_Vector_base<mongo::QuerySolutionNode*, std::allocator<mongo::QuerySolutionNode*> >::pointer *) 0x555da6e49eb8
(gdb) p &this->filter
$20 = (std::unique_ptr<mongo::MatchExpression, std::default_delete<mongo::MatchExpression> > *) 0x555da6e49ec0
(gdb) p this->children->_M_impl
$21 = {<std::allocator<mongo::QuerySolutionNode*>> = {<__gnu_cxx::new_allocator<mongo::QuerySolutionNode*>> = {<No data fields>}, <No data fields>}, _M_start = 0x555da2900da8, _M_finish = 0x555da2900db0,
 _M_end_of_storage = 0x555da2900db1}
(gdb) p this->filter
$22 = {_M_t = {
 _M_t = {<std::_Tuple_impl<0, mongo::MatchExpression*, std::default_delete<mongo::MatchExpression> >> = {<std::_Tuple_impl<1, std::default_delete<mongo::MatchExpression> >> = {<std::_Head_base<1, std::default_delete<mongo::MatchExpression>, true>> = {<std::default_delete<mongo::MatchExpression>> = {<No data fields>}, <No data fields>}, <No data fields>}, <std::_Head_base<0, mongo::MatchExpression*, false>> = {
 _M_head_impl = 0x1}, <No data fields>}, <No data fields>}}}
(gdb) bt
#0 std::unique_ptr<mongo::MatchExpression, std::default_delete<mongo::MatchExpression> >::~unique_ptr (
 this=0x555da6e49ec0, __in_chrg=<optimized out>)
 at /opt/mongodbtoolchain/revisions/e9f291f28d89b4dfef68c7b43084178a10bd3734/stow/gcc-v3.etX/include/c++/8.2.0/bits/unique_ptr.h:274
#1 mongo::QuerySolutionNode::~QuerySolutionNode (this=0x555da6e49ea0, __in_chrg=<optimized out>)
 at src/mongo/db/query/query_solution.h:121
#2 0x0000555d9ef5175b in mongo::FetchNode::~FetchNode (this=0x555da6e49ea0, __in_chrg=<optimized out>)
 at src/mongo/db/query/query_solution.h:625
#3 mongo::FetchNode::~FetchNode (this=0x555da6e49ea0, __in_chrg=<optimized out>)
 at src/mongo/db/query/query_solution.h:625
#4 0x0000555d9ec51d12 in std::default_delete<mongo::QuerySolutionNode>::operator() (
 this=0x555da6e4a580, __ptr=<optimized out>)
 at /opt/mongodbtoolchain/revisions/e9f291f28d89b4dfef68c7b43084178a10bd3734/stow/gcc-v3.etX/include/c++/8.2.0/bits/unique_ptr.h:347
#5 std::unique_ptr<mongo::QuerySolutionNode, std::default_delete<mongo::QuerySolutionNode> >::~unique_ptr (this=0x555da6e4a580, __in_chrg=<optimized out>)
 at /opt/mongodbtoolchain/revisions/e9f291f28d89b4dfef68c7b43084178a10bd3734/stow/gcc-v3.etX/include/c++/8.2.0/bits/unique_ptr.h:274
#6 mongo::QuerySolution::~QuerySolution (this=0x555da6e4a560, __in_chrg=<optimized out>)
 at src/mongo/db/query/query_solution.h:311
#7 std::default_delete<mongo::QuerySolution>::operator() (this=0x555dacbf0bb0, __ptr=0x555da6e4a560)
 at /opt/mongodbtoolchain/revisions/e9f291f28d89b4dfef68c7b43084178a10bd3734/stow/gcc-v3.etX/include/c++/8.2.0/bits/unique_ptr.h:81
#8 std::unique_ptr<mongo::QuerySolution, std::default_delete<mongo::QuerySolution> >::~unique_ptr (
 this=0x555dacbf0bb0, __in_chrg=<optimized out>)
 at /opt/mongodbtoolchain/revisions/e9f291f28d89b4dfef68c7b43084178a10bd3734/stow/gcc-v3.etX/include/c++/8.2.0/bits/unique_ptr.h:274
#9 mongo::PlanExecutorSBE::~PlanExecutorSBE (this=0x555dacbf0b00, __in_chrg=<optimized out>)
 at src/mongo/db/query/plan_executor_sbe.h:43
#10 mongo::PlanExecutorSBE::~PlanExecutorSBE (this=0x555dacbf0b00, __in_chrg=<optimized out>)
 at src/mongo/db/query/plan_executor_sbe.h:43
#11 0x0000555d9ebbcca5 in mongo::PlanExecutor::Deleter::operator() (execPtr=0x555dacbf0b00,
 this=0x555da753dab0) at src/mongo/db/query/plan_executor.h:127
#12 std::unique_ptr<mongo::PlanExecutor, mongo::PlanExecutor::Deleter>::~unique_ptr (
 this=0x555da753dab0, __in_chrg=<optimized out>)
 at /opt/mongodbtoolchain/revisions/e9f291f28d89b4dfef68c7b43084178a10bd3734/stow/gcc-v3.etX/include/c++/8.2.0/bits/unique_ptr.h:274
#13 mongo::ClientCursor::~ClientCursor() () at src/mongo/db/clientcursor.cpp:110
#14 0x0000555d9ebc1f65 in mongo::ClientCursor::Deleter::operator() (this=0x7f7324f324e0,
 cursor=0x555da753d900) at src/mongo/db/clientcursor.h:310
#15 std::unique_ptr<mongo::ClientCursor, mongo::ClientCursor::Deleter>::~unique_ptr (
 this=0x7f7324f324e0, __in_chrg=<optimized out>)
 at /opt/mongodbtoolchain/revisions/e9f291f28d89b4dfef68c7b43084178a10bd3734/stow/gcc-v3.etX/include/c++/8.2.0/bits/unique_ptr.h:274
#16 mongo::CursorManager::killCursor(mongo::OperationContext*, long long, bool) ()
 at src/mongo/db/cursor_manager.cpp:433
#17 0x0000555d9ecbd248 in mongo::(anonymous namespace)::killCursorIfAuthorized(mongo::OperationContext*, long long) () at src/mongo/db/run_op_kill_cursors.cpp:80
#18 0x0000555d9ecbd675 in mongo::runOpKillCursors (opCtx=opCtx@entry=0x555dacbef780, numCursorIds=1,
 idsArray=0x555da6e4a940 "#\027\342\001\363\254\250\067\066\064\064.3 Mi\020XS\241]U")
 at src/mongo/db/run_op_kill_cursors.cpp:94
#19 0x0000555d9e273358 in mongo::(anonymous namespace)::receivedKillCursors(mongo::OperationContext*, mongo::Message const&) () at src/mongo/db/service_entry_point_common.cpp:1942
#20 0x0000555d9e265f62 in mongo::(anonymous namespace)::FireAndForgetOpRunner::runSync() ()
 at src/mongo/db/service_entry_point_common.cpp:2220
#21 0x0000555d9e264422 in mongo::(anonymous namespace)::SynchronousOpRunner::run() ()
 at src/mongo/db/service_entry_point_common.cpp:2101
#22 0x0000555d9e263918 in mongo::ServiceEntryPointCommon::handleRequest(mongo::OperationContext*, mongo::Message const&, std::unique_ptr<mongo::ServiceEntryPointCommon::Hooks const, std::default_delete<mongo::ServiceEntryPointCommon::Hooks const> >)::{lambda()#1}::operator()() const ()
 at src/mongo/db/service_entry_point_common.cpp:2339
..

 



 Comments   
Comment by Githook User [ 10/Dec/20 ]

Author: Drew Paroski (paroski) <drew.paroski@mongodb.com>

Message: SERVER-53090 [SBE] Fix crash when running "bestbuy_agg_query_comparison.js"
Branch: master
https://github.com/mongodb/mongo/commit/e7697afed598f3faa1173cd92c06919432a8b2c2
