[SERVER-31212] JSON Schema parser and MatchExpressions must gracefully handle nested schemas Created: 21/Sep/17  Updated: 06/Dec/22

Status: Backlog
Project: Core Server
Component/s: Querying
Affects Version/s: None
Fix Version/s: None

Type: Task Priority: Major - P3
Reporter: Kyle Suarez Assignee: Backlog - Query Optimization
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
Assigned Teams:
Query Optimization
Sprint: Query 2017-10-02
Participants:

 Description   

It seems that the JSON Schema parser is susceptible to stack overflow if given a sufficiently nested schema. We definitely can't allow the server to crash when parsing a schema, so recursion must be limited in some fashion.

Reproduction steps vary based on the hardware of the machine, but this schema was able to trigger it at least once on my workstation: https://gist.github.com/ksuarz/d5caa6fc8a9fb003e22cf92eb78596a4



 Comments   
Comment by Kyle Suarez [ 29/Sep/17 ]

Further investigation confirms that this is a stack overflow: the server receives a SIGSEGV because its stack has grown so large it exceeds its allowed memory area. However, this problem is probably quite difficult to encounter in the wild. A production-quality server with decent hardware and a release build should not run into this overflow; I can only reproduce it when compiling the server with all optimizations disabled.

Per in-person conversation with david.storch, we're going to put this ticket back on the backlog and consider it when we fix other stack overflow-related issues.

Comment by Kyle Suarez [ 25/Sep/17 ]

I wrote a quick integration test:

diff --git a/src/mongo/db/nesting_depth_test.cpp b/src/mongo/db/nesting_depth_test.cpp
index fd14418..abca5c5 100644
--- a/src/mongo/db/nesting_depth_test.cpp
+++ b/src/mongo/db/nesting_depth_test.cpp
@@ -471,6 +471,39 @@ TEST_F(NestingDepthFixture, CannotReplaceArrayToExceedDepthLimit) {
         &updateCmd, largeButValidDepth, BSONDepth::getMaxDepthForUserStorage() + 1);
     assertWriteError(kCollectionName, updateCmd.obj(), ErrorCodes::Overflow);
 }
+
+/**
+ * Appends a recursive JSON Schema to 'builder' that is nested 'depth' levels deep. Doesn't include
+ * the "$jsonSchema" at the top level.
+ */
+void appendJSONSchemaHelper(BSONObjBuilder* builder, size_t depth) {
+    if (depth == 0) {
+        return;
+    }
+
+    BSONObjBuilder properties(builder->subobjStart("properties"));
+    appendJSONSchemaHelper(&properties, depth - 1);
+    properties.doneFast();
+}
+
+void appendFindCommandWithJSONSchema(BSONObjBuilder* builder, size_t depth) {
+    builder->append("find", kCollectionName);
+    BSONObjBuilder filter(builder->subobjStart("filter"));
+    {
+        BSONObjBuilder schema(filter.subobjStart("$jsonSchema"));
+        appendJSONSchemaHelper(&schema, depth - 1);
+        schema.doneFast();
+    }
+    filter.doneFast();
+    builder->doneFast();
+}
+
+TEST_F(NestingDepthFixture, CanUseNestedJSONSchemaAtDepthLimit) {
+    const auto largeButValidDepth = BSONDepth::getMaxDepthForUserStorage() - 4;
+    BSONObjBuilder findCmd;
+    appendFindCommandWithJSONSchema(&findCmd, largeButValidDepth);
+    assertCommandOK(kCollectionName, findCmd.obj());
+}
 }  // namespace
 }  // namespace executor
 }  // namespace mongo

I ran the nesting depth test against a mongod running on log level 5 and captured this backtrace from LLDB:

(lldb) thread backtrace
error: mongod 0x0e052ec0: DW_TAG_member '_M_local_buf' refers to type 0x0e05b08c which extends beyond the bounds of 0x0e052eb8
* thread #45: tid = 15463, 0x00005555598df22c mongod`MallocBlock::ProcessFreeQueue(b=0x00007ffff36298c0, size=568, max_free_queue_size=10485760) + 92 at debugallocation.cc:585, name = 'conn18', stop reason = address access protected (fault address: 0x7ffff7e53ea8)
  * frame #0: 0x00005555598df22c mongod`MallocBlock::ProcessFreeQueue(b=0x00007ffff36298c0, size=568, max_free_queue_size=10485760) + 92 at debugallocation.cc:585
    frame #1: 0x00005555598e6993 mongod`MallocBlock::Deallocate(this=0x00007ffff36298c0, type=-271733872, given_size=0) + 307 at debugallocation.cc:571
    frame #2: 0x00005555598dc20a mongod`DebugDeallocate(ptr=0x00007ffff36298e0, type=-271733872, given_size=0) + 346 at debugallocation.cc:1039
    frame #3: 0x000055555998499e mongod`tc_free(ptr=0x00007ffff36298e0) + 110 at debugallocation.cc:1229
    frame #4: 0x000055555722daef mongod`mongo::intrusive_ptr_release(h=0x00007ffff36298e0) + 47 at shared_buffer.h:117
    frame #5: 0x000055555722da9a mongod`boost::intrusive_ptr<mongo::SharedBuffer::Holder>::~intrusive_ptr(this=0x00007ffff7e54cd0) + 42 at intrusive_ptr.hpp:97
    frame #6: 0x000055555722d9e8 mongod`boost::intrusive_ptr<mongo::SharedBuffer::Holder>::operator=(this=0x00007ffff7e54ec0, rhs=0x00007ffff7e54d50) + 72 at intrusive_ptr.hpp:121
    frame #7: 0x000055555722d91f mongod`mongo::SharedBuffer::operator=(this=0x00007ffff7e54ec0, (null)=0x00007ffff7e54d50) + 47 at shared_buffer.h:42
    frame #8: 0x0000555559713762 mongod`mongo::SharedBufferAllocator::free(this=0x00007ffff7e54ec0) + 66 at builder.h:99
    frame #9: 0x00005555597136f8 mongod`mongo::_BufBuilder<mongo::SharedBufferAllocator>::reset(this=0x00007ffff7e54ec0, maxSize=10) + 72 at builder.h:178
    frame #10: 0x000055555971369b mongod`mongo::StringBuilderImpl<mongo::SharedBufferAllocator>::reset(this=0x00007ffff7e54ec0, maxSize=10) + 27 at builder.h:474
    frame #11: 0x000055555971316d mongod`mongo::escape[abi:cxx11](sd=(_data = "properties", _size = 10), escape_slash=false) + 109 at stringutils.cpp:191
 
(271 frames of jsonString)
 
    frame #283: 0x00005555596d16db mongod`mongo::BSONObj::jsonString[abi:cxx11](this=0x00007ffff7ecdfc8, format=Strict, pretty=0, isArray=false) const + 587 at bsonobj.cpp:126
    frame #284: 0x000055555930c130 mongod`mongo::JSONSchemaParser::parse(schema=BSONObj at 0x00007ffff7ecdfc8, ignoreUnknownKeywords=false) + 320 at json_schema_parser.cpp:1558
    frame #285: 0x00005555592a1575 mongod`mongo::MatchExpressionParser::_parse(this=0x00007ffff7ece370, obj=0x00007ffff332cae0, collator=0x0000000000000000, expCtx=0x00007ffff7ecea38, allowedFeatures=18446744073709551607, topLevel=true) + 7749 at expression_parser.cpp:621
    frame #286: 0x0000555557609692 mongod`mongo::MatchExpressionParser::parse(obj=0x00007ffff332cae0, collator=0x0000000000000000, expCtx=0x00007ffff7ecea38, extensionsCallback=0x00007ffff7ecea40, allowedFeatures=18446744073709551607) + 210 at expression_parser.h:137
    frame #287: 0x00005555591b843d mongod`mongo::CanonicalQuery::canonicalize(opCtx=0x00007ffff2ff1a20, qr=unique_ptr<mongo::QueryRequest, std::default_delete<mongo::QueryRequest> > at 0x00007ffff7ecea30, expCtx=0x00007ffff7ecea38, extensionsCallback=0x00007ffff7ecea40, allowedFeatures=18446744073709551607) + 861 at canonical_query.cpp:150
    frame #288: 0x00005555575ea296 mongod`mongo::FindCmd::run(this=0x000055555af28228, opCtx=0x00007ffff2ff1a20, dbname=0x00007ffff7ecf0e8, cmdObj=0x00007ffff7ed0b78, result=0x00007ffff7ecf618) + 1334 at find_cmd.cpp:275
    frame #289: 0x000055555904a89b mongod`mongo::BasicCommand::enhancedRun(this=0x000055555af28228, opCtx=0x00007ffff2ff1a20, request=0x00007ffff7ed0b78, result=0x00007ffff7ecf618) + 155 at commands.cpp:390
    frame #290: 0x0000555559048d9d mongod`mongo::Command::publicRun(this=0x000055555af28228, opCtx=0x00007ffff2ff1a20, request=0x00007ffff7ed0b78, result=0x00007ffff7ecf618) + 77 at commands.cpp:328
    frame #291: 0x000055555758c986 mongod`mongo::(anonymous namespace)::runCommandImpl(opCtx=0x00007ffff2ff1a20, command=0x000055555af28228, request=0x00007ffff7ed0b78, replyBuilder=0x00007ffff0f050e0, startOperationTime=(_time = 0)) + 1766 at service_entry_point_mongod.cpp:446
    frame #292: 0x000055555758a94c mongod`mongo::(anonymous namespace)::execCommandDatabase(opCtx=0x00007ffff2ff1a20, command=0x000055555af28228, request=0x00007ffff7ed0b78, replyBuilder=0x00007ffff0f050e0) + 7804 at service_entry_point_mongod.cpp:714
    frame #293: 0x0000555557587f33 mongod`mongo::(anonymous namespace)::runCommands(this=0x00007ffff7ed0ea0)::$_4::operator()() const + 3091 at service_entry_point_mongod.cpp:831
    frame #294: 0x0000555557585097 mongod`mongo::(anonymous namespace)::runCommands(opCtx=0x00007ffff2ff1a20, message=0x00007ffff29ff2c8) + 103 at service_entry_point_mongod.cpp:787
    frame #295: 0x0000555557583e19 mongod`mongo::ServiceEntryPointMongod::handleRequest(this=0x00007ffff5d2c2c0, opCtx=0x00007ffff2ff1a20, m=0x00007ffff29ff2c8) + 1081 at service_entry_point_mongod.cpp:1092
    frame #296: 0x00005555575a1f56 mongod`mongo::ServiceStateMachine::_processMessage(this=0x00007ffff29ff230, guard=0x00007ffff7ed20f0) + 646 at service_state_machine.cpp:307
    frame #297: 0x000055555759e951 mongod`mongo::ServiceStateMachine::_runNextInGuard(this=0x00007ffff29ff230, guard=0x00007ffff7ed20f0) + 1585 at service_state_machine.cpp:401
    frame #298: 0x000055555759f16f mongod`mongo::ServiceStateMachine::runNext(this=0x00007ffff29ff230) + 159 at service_state_machine.cpp:365
    frame #299: 0x00005555575a1158 mongod`mongo::ServiceStateMachine::scheduleNext(this=0x00007ffff62916a0)::$_4::operator()() const + 24 at service_state_machine.cpp:429
    frame #300: 0x00005555575a1115 mongod`void mongo::ServiceStateMachine::_scheduleFunc<mongo::ServiceStateMachine::scheduleNext(mongo::transport::ServiceExecutor::ScheduleFlags)::$_4>(this=0x00007ffff62916a0)::$_4&&, mongo::transport::ServiceExecutor::ScheduleFlags)::{lambda()#1}::operator()() const + 21 at service_state_machine.h:155
    frame #301: 0x00005555575a0f5d mongod`std::_Function_handler<void (), void mongo::ServiceStateMachine::_scheduleFunc<mongo::ServiceStateMachine::scheduleNext(mongo::transport::ServiceExecutor::ScheduleFlags)::$_4>(mongo::ServiceStateMachine::scheduleNext(mongo::transport::ServiceExecutor::ScheduleFlags)::$_4&&, mongo::transport::ServiceExecutor::ScheduleFlags)::{lambda()#1}>::_M_invoke(__functor=0x00007ffff7ed23b0) + 29 at functional:1871
    frame #302: 0x00005555574b27be mongod`std::function<void ()>::operator(this=0x00007ffff7ed23b0)() const + 62 at functional:2267
    frame #303: 0x0000555558e9e607 mongod`mongo::transport::ServiceExecutorSynchronous::schedule(this=0x00007ffff5d15d60, task=mongo::transport::ServiceExecutor::Task at 0x00007ffff7ed23b0, flags=kMayRecurse)>, mongo::transport::ServiceExecutor::ScheduleFlags) + 295 at service_executor_synchronous.cpp:114
    frame #304: 0x000055555759f556 mongod`void mongo::ServiceStateMachine::_scheduleFunc<mongo::ServiceStateMachine::scheduleNext(mongo::transport::ServiceExecutor::ScheduleFlags)::$_4>(this=0x00007ffff29ff230, func=0x00007ffff7ed2400, flags=kMayRecurse)::$_4&&, mongo::transport::ServiceExecutor::ScheduleFlags) + 150 at service_state_machine.h:154
    frame #305: 0x000055555759e253 mongod`mongo::ServiceStateMachine::scheduleNext(this=0x00007ffff29ff230, flags=kMayRecurse) + 51 at service_state_machine.cpp:429
    frame #306: 0x000055555759d955 mongod`mongo::ServiceStateMachine::_sourceCallback(this=0x00007ffff29ff230, status=Status at 0x00007ffff7ed2990) + 549 at service_state_machine.cpp:228
    frame #307: 0x000055555759e5e6 mongod`mongo::ServiceStateMachine::_runNextInGuard(this=0x00007ffff29ff230, guard=0x00007ffff7ed2b00) + 710 at service_state_machine.cpp:388
    frame #308: 0x000055555759f16f mongod`mongo::ServiceStateMachine::runNext(this=0x00007ffff29ff230) + 159 at service_state_machine.cpp:365
    frame #309: 0x00005555575a1158 mongod`mongo::ServiceStateMachine::scheduleNext(this=0x00007ffff120b2a0)::$_4::operator()() const + 24 at service_state_machine.cpp:429
    frame #310: 0x00005555575a1115 mongod`void mongo::ServiceStateMachine::_scheduleFunc<mongo::ServiceStateMachine::scheduleNext(mongo::transport::ServiceExecutor::ScheduleFlags)::$_4>(this=0x00007ffff120b2a0)::$_4&&, mongo::transport::ServiceExecutor::ScheduleFlags)::{lambda()#1}::operator()() const + 21 at service_state_machine.h:155
    frame #311: 0x00005555575a0f5d mongod`std::_Function_handler<void (), void mongo::ServiceStateMachine::_scheduleFunc<mongo::ServiceStateMachine::scheduleNext(mongo::transport::ServiceExecutor::ScheduleFlags)::$_4>(mongo::ServiceStateMachine::scheduleNext(mongo::transport::ServiceExecutor::ScheduleFlags)::$_4&&, mongo::transport::ServiceExecutor::ScheduleFlags)::{lambda()#1}>::_M_invoke(__functor=0x00007ffff2a16480) + 29 at functional:1871
    frame #312: 0x00005555574b27be mongod`std::function<void ()>::operator(this=0x00007ffff2a16480)() const + 62 at functional:2267
    frame #313: 0x0000555558e9f099 mongod`mongo::transport::ServiceExecutorSynchronous::schedule(this=0x00007ffff6299340)>, mongo::transport::ServiceExecutor::ScheduleFlags)::$_2::operator()() const + 185 at service_executor_synchronous.cpp:131
    frame #314: 0x0000555558e9ee0d mongod`std::_Function_handler<void (), mongo::transport::ServiceExecutorSynchronous::schedule(std::function<void ()>, mongo::transport::ServiceExecutor::ScheduleFlags)::$_2>::_M_invoke(__functor=0x00007ffff120b610) + 29 at functional:1871
    frame #315: 0x00005555574b27be mongod`std::function<void ()>::operator(this=0x00007ffff120b610)() const + 62 at functional:2267
    frame #316: 0x00005555595a7d7c mongod`mongo::(anonymous namespace)::runFunc(ctx=0x00007ffff120b610) + 76 at service_entry_point_utils.cpp:55
    frame #317: 0x00007ffff708f6aa libpthread.so.0`start_thread + 202
    frame #318: 0x00007ffff6dc513d libc.so.6`clone + 109

I'm still trying to understand why the last frames are in TCMalloc code. Unsure if the warning regarding the DW_TAG_member extending out-of-bounds is relevant – like, is part of the stack trace being cut off?

*EDIT*: I tried to step over to the next instruction and got

(lldb) thread step-over
Process 15349 stopped
* thread #45: tid = 15463, 0x00005555598df22c mongod`MallocBlock::ProcessFreeQueue(b=0x00007ffff36298c0, size=568, max_free_queue_size=10485760) + 92 at debugallocation.cc:585, name = conn18, stop reason = address access protected (fault address: 0x7ffff7e53ea8)

The "address access protected" suggests to me this is a stack overflow / stack too large problem, if we've reached the outer extreme of our memory segment?

Comment by Kyle Suarez [ 22/Sep/17 ]

nicholas.zolnierz suggested that I investigate the failure with a debugger. I ran the server under

lldb -- ./mongod -vvvvv

and discovered that the process crash occurred not during parsing, but when printing a debugString at log level 5.

1558
    LOG(5) << "Parsing JSON Schema: " << schema.jsonString();
1559
 
1560
    auto translation = _parse(""_sd, schema, ignoreUnknownKeywords);
1561
    if (shouldLog(logger::LogSeverity::Debug(5)) && translation.isOK()) {
1562
        LOG(5) << "Translated schema match expression: " << translation.getValue()->toString();
1563
    }

While the stated problem definitely still exists in the parser, MatchExpression::debugString() or its underlying string builder should have a protection against this kind of thing. david.storch, should we lump that under here or should I file a separate ticket for that?

Comment by Kyle Suarez [ 21/Sep/17 ]

As discussed with david.storch, converting JSONSchemaParser::parse to iterate (with a heap-based stack) and not recurse would be a big win, though at the cost of (possibly significant) readability.

Generated at Thu Feb 08 04:26:20 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.