[SERVER-82410] DocumentSourceListSearchIndexes should hold owned copy of command object Created: 24/Oct/23  Updated: 21/Dec/23  Resolved: 25/Oct/23

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 7.0.2, 6.0.11, 7.1.0
Fix Version/s: 7.2.0-rc0, 6.0.12, 7.0.4

Type: Bug Priority: Major - P3
Reporter: James Wahlin Assignee: James Wahlin
Resolution: Fixed Votes: 0
Labels: bkp
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Fix
Related
related to SERVER-83137 Consider additional validation of siz... Closed
related to SERVER-82945 Increase sharded search / vectorSearc... Backlog
is related to SERVER-74863 Implement $listSearchIndexes aggregat... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v7.1, v7.0, v6.0
Sprint: QI 2023-10-30
Participants:
Case:

 Description   
Issue summary

Symptoms & User-visible impact:

Sharded clusters on versions 6.0.7-6.0.11 and 7.0.0-7.0.3 can see mongod process crashes as a result of a use-after-free memory issue in the $listSearchIndexes aggregation stage.

Key Diagnostics:

  • Server mongod process may crash with an Invalid Access at address error and Segmentation fault.
  • Log file may contain an Invariant failure referring to the file src/mongo/util/future_impl.h, such as:

msg":"Invariant failure","attr":{"expr":"!callback","file":"src/mongo/util/future_impl.h","line":443}}

  • Segmentation faults with backtraces including DocumentSourceListSearchIndexes, such as:

_ZN5mongo18stack_trace_detail12_GLOBAL__N_117getStackTraceImplERKNS1_7OptionsE.constprop.205
_ZN5mongo15printStackTraceEv
abruptQuitWithAddrSignal
_L_unlock_13
__memcpy_ssse3_back
_ZN5mongo31DocumentSourceListSearchIndexes9doGetNextEv
_ZN5mongo14DocumentSource7getNextEv
_ZN5mongo8Pipeline7getNextEv
_ZN5mongo20PlanExecutorPipeline11_tryGetNextEv
_ZN5mongo20PlanExecutorPipeline8_getNextEv
_ZN5mongo20PlanExecutorPipeline15getNextDocumentEPNS_8DocumentEPNS_8RecordIdE
_ZN5mongo20PlanExecutorPipeline7getNextEPNS_7BSONObjEPNS_8RecordIdE
_ZN5mongo12_GLOBAL__N_110GetMoreCmd10Invocation28acquireLocksAndIterateCursorEPNS_16OperationContextEPNS_3rpc21ReplyBuilderInterfaceERNS_15ClientCursorPinEPNS_5CurOpE
_ZN5mongo12_GLOBAL__N_110GetMoreCmd10Invocation3runEPNS_16OperationContextEPNS_3rpc21ReplyBuilderInterfaceE
_ZN5mongo14CommandHelpers20runCommandInvocationEPNS_16OperationContextERKNS_12OpMsgRequestEPNS_17CommandInvocationEPNS_3rpc21ReplyBuilderInterfaceE
_ZN5mongo14CommandHelpers20runCommandInvocationESt10shared_ptrINS_23RequestExecutionContextEES1_INS_17CommandInvocationEENS_9transport15ServiceExecutor14ThreadingModelE
_ZN5mongo12_GLOBAL__N_120runCommandInvocationESt10shared_ptrINS_23RequestExecutionContextEES1_INS_17CommandInvocationEE
_ZN5mongo12_GLOBAL__N_114RunCommandImpl11_runCommandEv
_ZN5mongo12_GLOBAL__N_132RunCommandAndWaitForWriteConcern24_runCommandWithFailPointEv
_ZN5mongo12_GLOBAL__N_132RunCommandAndWaitForWriteConcern8_runImplEv
_ZN5mongo12_GLOBAL__N_114RunCommandImpl3runEv
_ZN5mongo12_GLOBAL__N_119ExecCommandDatabase12_commandExecEv
_ZZN5mongo12_GLOBAL__N_114executeCommandESt10shared_ptrINS0_13HandleRequest16ExecutionContextEEENUlvE0_clEv
_ZZN5mongo15unique_functionIFvPNS_14future_details15SharedStateBaseEEE8makeImplIZNS1_10FutureImplINS1_8FakeVoidEE16makeContinuationIvZZNOS9_4thenIZNS_12_GLOBAL__N_114executeCommandESt10shared_ptrINSC_13HandleRequest16ExecutionContextEEEUlvE0_EEDaOT_ENKUlvE1_clEvEUlPNS1_15SharedStateImplIS8_EESN_E_EENS7_ISI_EEOT0_EUlS3_E_EEDaSJ_EN12SpecificImpl4callEOS3_
_ZN5mongo14future_details15SharedStateBase20transitionToFinishedEv
_ZN5mongo14future_details10FutureImplINS0_8FakeVoidEE11generalImplIZNOS3_17propagateResultToEPNS0_15SharedStateImplIS2_EEEUlOS2_E_ZNOS3_17propagateResultToES7_EUlONS_6StatusEE0_ZNOS3_17propagateResultToES7_EUlvE1_EEDaOT_OT0_OT1_
_ZZN5mongo15unique_functionIFvPNS_14future_details15SharedStateBaseEEE8makeImplIZNS1_10FutureImplINS1_8FakeVoidEE16makeContinuationIvZZNOS9_4thenIZNS_12_GLOBAL__N_114executeCommandESt10shared_ptrINSC_13HandleRequest16ExecutionContextEEEUlvE_EEDaOT_ENKUlvE1_clEvEUlPNS1_15SharedStateImplIS8_EESN_E_EENS7_ISI_EEOT0_EUlS3_E_EEDaSJ_EN12SpecificImpl4callEOS3_
_ZN5mongo14future_details15SharedStateBase20transitionToFinishedEv
_ZN5mongo12_GLOBAL__N_114executeCommandESt10shared_ptrINS0_13HandleRequest16ExecutionContextEE
_ZN5mongo12_GLOBAL__N_116receivedCommandsESt10shared_ptrINS0_13HandleRequest16ExecutionContextEE
_ZN5mongo12_GLOBAL__N_115CommandOpRunner3runEv
_ZN5mongo23ServiceEntryPointCommon13handleRequestEPNS_16OperationContextERKNS_7MessageESt10unique_ptrIKNS0_5HooksESt14default_deleteIS8_EE
_ZN5mongo23ServiceEntryPointMongod13handleRequestEPNS_16OperationContextERKNS_7MessageE
_ZN5mongo9transport19ServiceStateMachine4Impl14processMessageEv
_ZN5mongo9transport19ServiceStateMachine4Impl12startNewLoopERKNS_6StatusE
_ZZN5mongo15unique_functionIFvNS_6StatusEEE8makeImplIZNS_9transport19ServiceStateMachine4Impl15scheduleNewLoopES1_EUlS1_E_EEDaOT_EN12SpecificImpl4callEOS1_
_ZZN5mongo15unique_functionIFvNS_6StatusEEE8makeImplIZNS_9transport26ServiceExecutorSynchronous18runOnDataAvailableERKSt10shared_ptrINS5_7SessionEES3_EUlS1_E_EEDaOT_EN12SpecificImpl4callEOS1_
_ZZN5mongo15unique_functionIFvvEE8makeImplIZNS_9transport15ServiceExecutor8scheduleENS0_IFvNS_6StatusEEEEEUlvE_EEDaOT_EN12SpecificImpl4callEv
_ZZN5mongo15unique_functionIFvvEE8makeImplIZNS_9transport26ServiceExecutorSynchronous12scheduleTaskES2_NS4_15ServiceExecutor13ScheduleFlagsEEUlvE0_EEDaOT_EN12SpecificImpl4callEv
_ZZN5mongo15unique_functionIFvvEE8makeImplIZNS_25launchServiceWorkerThreadES2_EUlvE2_EEDaOT_EN12SpecificImpl4callEv
_ZN5mongo12_GLOBAL__N_17runFuncEPv
start_thread
clone

  • Compass users may have observed an increase in failures caused by this issue since the Compass 1.40.0 release, which started offering support for Search indexes.
  • In some situations, the server mongod process may become deadlocked after failing to process the abort signal due to SERVER-82459.

Root cause:

A use-after-free error in the $listSearchIndexes aggregation pipeline stage left a data structure pointing to a freed location in heap memory. This allows a wide variety of failure scenarios, in which threads unrelated to the operation that issued $listSearchIndexes can terminate the server process with Invalid Access segmentation faults.
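
The ownership problem named in the ticket title can be illustrated with a minimal, standalone C++ sketch. It does not use the server's actual classes: ListIndexesStageSketch, makeStageFromTemporaryRequest, and the command string are hypothetical stand-ins, and std::string stands in for an owned copy of the command object. The sketch assumes, as the backtraces in this ticket suggest, that the stage reads the command again during a later getMore, after the buffer it was built from has been released.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <string_view>

// Hypothetical stand-in for a pipeline stage that keeps the command it was
// built from so that it can read the command again on a later call.
class ListIndexesStageSketch {
public:
    // Copies the bytes, so the stage no longer depends on the caller's buffer.
    explicit ListIndexesStageSketch(std::string_view cmd) : _cmd(cmd) {}

    const std::string& command() const { return _cmd; }

private:
    // Owned copy. A std::string_view member here would be left dangling
    // once the caller's buffer is freed.
    std::string _cmd;
};

// Builds a stage from a request buffer that is freed before the stage is used,
// mimicking a cursor that outlives the request that created it.
inline ListIndexesStageSketch makeStageFromTemporaryRequest() {
    auto request = std::make_unique<std::string>("{listSearchIndexes: 'coll'}");
    ListIndexesStageSketch stage{*request};
    request.reset();  // the original request buffer is gone from here on
    return stage;
}
```

Because the stage copies the bytes at construction, reading command() after the request buffer is released stays valid. Holding only a view of the caller's buffer would reproduce the kind of dangling reference described above.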

Fix or Remediations:

  • Upgrade to 6.0.12 or 7.0.4 when available.
  • Reduce or stop the use of the $listSearchIndexes aggregation stage.
    • This may require using a pre-1.40.0 version of Compass until updating to a version of MongoDB containing a fix for this issue.

Previous Description

The current copy of the command object is unowned and at risk of being destroyed while held.



 Comments   
Comment by James Wahlin [ 25/Oct/23 ]

For reference, here is the relevant portion of the stack trace tied to this bug:

/data/mci/78d61958954f3efb14daad35a28f21e7/src/src/mongo/util/stacktrace_posix.cpp:492:44: mongo::printStackTrace()
 /data/mci/78d61958954f3efb14daad35a28f21e7/src/src/mongo/util/signal_handlers_synchronous.cpp:295:28: abruptQuitWithAddrSignal
 ??:0:0: ??
 ??:0:0: ??
 /usr/include/bits/string3.h:51:33: memcpy
 /data/mci/78d61958954f3efb14daad35a28f21e7/src/src/mongo/bson/util/builder.h:397:19: mongo::BasicBufBuilder<mongo::SharedBufferAllocator>::appendBuf(void const*, unsigned long)
 /data/mci/78d61958954f3efb14daad35a28f21e7/src/src/mongo/bson/util/builder.h:395:10: mongo::BasicBufBuilder<mongo::SharedBufferAllocator>::appendBuf(void const*, unsigned long)
 /data/mci/78d61958954f3efb14daad35a28f21e7/src/src/mongo/bson/bsonobjbuilder.h:165:9: mongo::BSONObjBuilderBase<mongo::BSONObjBuilder, mongo::BufBuilder>::append(mongo::StringData, mongo::BSONObj)
 /data/mci/78d61958954f3efb14daad35a28f21e7/src/src/mongo/db/modules/enterprise/src/search/document_source_list_search_indexes.cpp:69:39: mongo::DocumentSourceListSearchIndexes::doGetNext()
 /data/mci/78d61958954f3efb14daad35a28f21e7/src/src/mongo/db/pipeline/document_source.h:356:30: mongo::DocumentSource::getNext()
 /data/mci/78d61958954f3efb14daad35a28f21e7/src/src/mongo/db/pipeline/pipeline.cpp:504:48: mongo::Pipeline::getNext()
 /data/mci/78d61958954f3efb14daad35a28f21e7/src/src/mongo/db/pipeline/plan_executor_pipeline.cpp:133:31: mongo::PlanExecutorPipeline::_tryGetNext()
 /data/mci/78d61958954f3efb14daad35a28f21e7/src/src/mongo/db/pipeline/plan_executor_pipeline.cpp:121:32: mongo::PlanExecutorPipeline::_getNext()
 /data/mci/78d61958954f3efb14daad35a28f21e7/src/src/mongo/db/pipeline/plan_executor_pipeline.cpp:103:30: mongo::PlanExecutorPipeline::getNextDocument(mongo::Document*, mongo::RecordId*)
 /data/mci/78d61958954f3efb14daad35a28f21e7/src/src/mongo/db/pipeline/plan_executor_pipeline.cpp:84:37: mongo::PlanExecutorPipeline::getNext(mongo::BSONObj*, mongo::RecordId*)
 /data/mci/78d61958954f3efb14daad35a28f21e7/src/src/mongo/db/commands/getmore_cmd.cpp:387:72: generateBatch
 /data/mci/78d61958954f3efb14daad35a28f21e7/src/src/mongo/db/commands/getmore_cmd.cpp:664:56: mongo::(anonymous namespace)::GetMoreCmd::Invocation::acquireLocksAndIterateCursor(mongo::OperationContext*, mongo::rpc::ReplyBuilderInterface*, mongo::ClientCursorPin&, mongo::CurOp*)
 /data/mci/78d61958954f3efb14daad35a28f21e7/src/src/mongo/db/commands/getmore_cmd.cpp:777:41: mongo::(anonymous namespace)::GetMoreCmd::Invocation::run(mongo::OperationContext*, mongo::rpc::ReplyBuilderInterface*)
 /data/mci/78d61958954f3efb14daad35a28f21e7/src/src/mongo/db/commands.cpp:213:20: mongo::CommandHelpers::runCommandInvocation(mongo::OperationContext*, mongo::OpMsgRequest const&, mongo::CommandInvocation*, mongo::rpc::ReplyBuilderInterface*) 
