[SERVER-56048] [SBE][unittest] db_s_shard_server_test fails due to unsupported QuerySolutionNode in stage builder Created: 12/Apr/21  Updated: 29/Oct/23  Resolved: 15/Apr/21

Status: Closed
Project: Core Server
Component/s: Query Execution
Affects Version/s: None
Fix Version/s: 5.0.0-rc0

Type: Bug Priority: Major - P3
Reporter: Anton Korshunov Assignee: Eric Cox (Inactive)
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: Text File server-56048.patch    
Issue Links:
Related
related to SERVER-56082 [sbe][sharding_op_query] tassert beca... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Query Execution 2021-04-19
Participants:

 Description   

When SBE is turned on by default, the "ShardLocalTest/FindManyWithLimit" test fails with an error:

"FAIL","attr":{"test":"FindManyWithLimit","type":"TestAssertionFailureException","error":"Expected ::mongo::Status::OK() == (swt.getStatus()) (OK == Location4822884 Unsupported QSN in SBE stage builder: ENSURE_SORTED



 Comments   
Comment by Githook User [ 15/Apr/21 ]

Author:

{'name': 'Eric Cox', 'email': 'eric.cox@mongodb.com', 'username': 'ericox'}

Message: SERVER-56048 Fallback to the classic engine when a plan has ENSURE_SORTED node
Branch: master
https://github.com/mongodb/mongo/commit/05ec56e3519f1cf98053e5bd365e8160aa80a4d1

Comment by Eric Cox (Inactive) [ 13/Apr/21 ]

One observation so far: according to this backtrace, the offending query that produces an ENSURE_SORTED QSN goes through _getExecutorFind():

#4 mongo::stage_builder::SlotBasedStageBuilder::build (this=0x55555691bc00, root=<optimized out>, reqs=...) at src/mongo/db/query/sbe_stage_builder.cpp:1851 
#5 0x00007ffff62fd722 in mongo::stage_builder::SlotBasedStageBuilder::build (this=<optimized out>, root=<optimized out>) at src/mongo/db/query/sbe_stage_builder.cpp:361 
#6 0x00007ffff63988d8 in mongo::stage_builder::buildSlotBasedExecutableTree (opCtx=0x55555696b000, collection=..., cq=..., solution=..., yieldPolicy=<optimized out>) at src/mongo/db/query/stage_builder_util.cpp:76 
#7 0x00007ffff62ced34 in mongo::(anonymous namespace)::SlotBasedPrepareExecutionHelper::buildExecutableTree (this=<optimized out>, solution=...) at src/mongo/db/query/get_executor.cpp:917 
#8 0x00007ffff62cd1ec in mongo::(anonymous namespace)::PrepareExecutionHelper<std::pair<std::unique_ptr<mongo::sbe::PlanStage, std::default_delete<mongo::sbe::PlanStage> >, mongo::stage_builder::PlanStageData>, mongo::(anonymous namespace)::SlotBasedPrepareExecutionResult>::prepare (this=0x7fffffff9f80) at src/mongo/db/query/get_executor.cpp:683 
#9 0x00007ffff62bf8cb in mongo::(anonymous namespace)::getSlotBasedExecutor (opCtx=<optimized out>, collection=<optimized out>, cq=..., requestedYieldPolicy=<optimized out>, plannerOptions=<optimized out>) at src/mongo/db/query/get_executor.cpp:1125 
#10 mongo::getExecutor (opCtx=<optimized out>, collection=<optimized out>, canonicalQuery=..., yieldPolicy=<optimized out>, plannerOptions=<optimized out>) at src/mongo/db/query/get_executor.cpp:1197 
#11 0x00007ffff62c1028 in mongo::(anonymous namespace)::_getExecutorFind (opCtx=<optimized out>, collection=<optimized out>, canonicalQuery=..., yieldPolicy=mongo::PlanYieldPolicy::YieldPolicy::YIELD_AUTO, plannerOptions=140737488325888) at src/mongo/db/query/get_executor.cpp:1219 
#12 mongo::getExecutorFind (opCtx=0x55555696b000, collection=0x7fffffffa860, canonicalQuery=..., permitYield=<optimized out>, plannerOptions=0) at src/mongo/db/query/get_executor.cpp:1233 
#13 0x00007fffee58e6a2 in mongo::(anonymous namespace)::FindCmd::Invocation::run (this=0x555556914540, opCtx=0x55555696b000, result=0x5555569145e0) at src/mongo/db/commands/find_cmd.cpp:450 
#14 0x00007ffff3eddf00 in mongo::CommandHelpers::runCommandInvocation (opCtx=0x55555696b000, request=..., invocation=0x555556914540, response=0x5555569145e0) at src/mongo/db/commands.cpp:199

We used to have an explicit check to see if the find query had an ENSURE_SORTED QSN, using the bool !cq->getFindCommand().getNtoreturn(). That check was removed under SERVER-54410 in favor of checking whether the find query was a legacy OP_QUERY-style request. The bool isNotLegacy does not evaluate to false in this case, so we lost the bail-out for the ENSURE_SORTED node:

Thread 1 "db_s_shard_serv" hit Breakpoint 1, mongo::(anonymous namespace)::isQuerySbeCompatible (opCtx=0x7ffff5401220, cq=0x7ffff54596a0, plannerOptions=0) at src/mongo/db/query/get_executor.cpp:1180
1180	    const bool isNotLegacy = !CurOp::get(opCtx)->isLegacyQuery();
(gdb) n
(gdb) print isNotLegacy
$1 = true
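
The commit message on this ticket describes the fix as falling back to the classic engine when a plan has an ENSURE_SORTED node. A minimal, self-contained sketch of that idea follows. It is not the server's code: StageType, QuerySolutionNode, hasNode, chooseEngine and makeNode are simplified, hypothetical stand-ins, and the real check in get_executor.cpp may be structured differently.

```cpp
#include <memory>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-ins for the server's query solution types.
enum class StageType { kCollScan, kIxScan, kFetch, kLimit, kEnsureSorted };

struct QuerySolutionNode {
    StageType type;
    std::vector<std::unique_ptr<QuerySolutionNode>> children;
};

enum class Engine { kClassic, kSbe };

// Returns true if any node in the tree rooted at `node` has the given type.
bool hasNode(const QuerySolutionNode& node, StageType type) {
    if (node.type == type) {
        return true;
    }
    for (const auto& child : node.children) {
        if (hasNode(*child, type)) {
            return true;
        }
    }
    return false;
}

// Inspect the plan itself rather than a property of the request:
// if the plan contains a node the SBE stage builder cannot handle,
// bail out to the classic engine instead of failing at build time.
Engine chooseEngine(const QuerySolutionNode& root, bool sbeEnabled) {
    if (!sbeEnabled || hasNode(root, StageType::kEnsureSorted)) {
        return Engine::kClassic;
    }
    return Engine::kSbe;
}

// Helper (hypothetical) for building small single-child plans.
std::unique_ptr<QuerySolutionNode> makeNode(
    StageType type, std::unique_ptr<QuerySolutionNode> child = nullptr) {
    auto node = std::make_unique<QuerySolutionNode>();
    node->type = type;
    if (child) {
        node->children.push_back(std::move(child));
    }
    return node;
}
```

In this sketch, a plan such as ENSURE_SORTED -> FETCH -> IXSCAN selects Engine::kClassic even with SBE enabled, while LIMIT -> COLLSCAN selects Engine::kSbe. Checking the plan tree directly does not depend on whether the request was flagged as legacy, which is the condition that stopped triggering the bail-out here.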

The repro was straightforward: all we needed to do was hack query_feature_flag.idl, run the db_s_shard_server_test binary with --suite ShardLocalTest --filter FindManyWithLimit in gdb, and get a backtrace when we trip the tassert condition.

Generated at Thu Feb 08 05:38:10 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.