  1. Core Server
  2. SERVER-91512

BlockToRowStage::doSaveState discards the unowned values while still deblocking values

    • Query Optimization
    • Fully Compatible
    • ALL
    • v8.0, v7.3

      NOTE: Consider tuning internalDocumentSourceCursorInitialBatchSize and internalDocumentSourceCursorBatchSizeBytes to reproduce the issue with fewer documents.

      1. Compile mongod with ASAN
        1. buildscripts/scons.py --variables-files=etc/scons/mongodbtoolchain_stable_clang.vars --link-model=static --sanitize=address --allocator=system --opt=debug --dbg=on --ninja ICECC=icecc CCACHE=ccache
        2. Start mongod
      2. Run the attached script SERVER-91512.js to create the time series collection
      3. Run the aggregation pipeline
        1. db.fuzzer_coll.aggregate([
              {$match: {$nor: [{"str": {$gt: "Credit"}}]}},
              {$sort: {_id: 1}},
              {$group: {_id: NumberLong("-1457"), num: {$first: {$bitAnd: [NumberLong("16007")]}}}}
          ]);
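
The NOTE above names two server parameters but gives no values. The following is a minimal sketch of how they might be lowered from the shell. It assumes both parameters can be set at runtime with the setParameter command; if the server rejects that, they would have to be passed at startup with --setParameter instead. The helper name buildTuningCommands and the values 1 and 1024 are placeholders for illustration, not values taken from this ticket.

```javascript
// Hypothetical helper (not part of the ticket): builds one setParameter
// command document per parameter named in the NOTE.
function buildTuningCommands(initialBatchSize, batchSizeBytes) {
  return [
    { setParameter: 1, internalDocumentSourceCursorInitialBatchSize: initialBatchSize },
    { setParameter: 1, internalDocumentSourceCursorBatchSizeBytes: batchSizeBytes },
  ];
}

// Placeholder values: small batches so that DocumentSourceCursor yields
// after only a few documents. Tune them until the ASAN report appears.
const commands = buildTuningCommands(1, 1024);

// Inside the shell, `db` is defined and the commands are sent to the server.
// Outside the shell this block is skipped, so the snippet stays runnable.
if (typeof db !== "undefined") {
  for (const cmd of commands) {
    db.adminCommand(cmd);
  }
}
```

Run this against the ASAN build before the aggregation in the last step.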
    • 200

      The root cause appears to be the interaction between disableSlotAccess and saveState in the block_to_row stage. Normally, the block_to_row stage makes copies of unowned values in doSaveState. However, the aggregation pipeline from the fuzzer test prevented block_to_row from making those copies through the following sequence of events:

      1. DocumentSourceCursor::loadBatch() decides to call _exec->releaseAllAcquiredResources()
      2. PlanExecutorSBE::saveState() decides to save state with discardSlotState = true
      3. CanChangeState::saveState() calls disableSlotAccess() for each stage
      4. CanChangeState::saveState() calls doSaveState() for each stage
      5. BlockToRowStage::doSaveState decides not to make copies of the unowned values because slot access has been disabled.
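
The sequence above can be illustrated with a toy model. This is not MongoDB code: the classes Block and ToyBlockToRow, the function run, and the skipCopyWhenSlotsDisabled flag are all hypothetical, and a "freed" flag stands in for the memory that ASAN would report as used after free. The second mode exists only as a contrast and is not claimed to be the fix applied in the server.

```javascript
// Toy model only. Every name here is hypothetical.

// A block of values owned by the storage layer; freed when resources are released.
class Block {
  constructor(values) {
    this.values = values;
    this.freed = false;
  }
  at(i) {
    if (this.freed) throw new Error("use after free");
    return this.values[i];
  }
  free() {
    this.freed = true;
  }
}

// A stage that hands out the block's values one row at a time.
class ToyBlockToRow {
  constructor(block) {
    this.block = block;
    this.index = 0;
    this.slotAccessEnabled = true;
    this.ownedRows = null; // filled in when doSaveState makes copies
  }
  getNext() {
    const i = this.index++;
    return this.ownedRows ? this.ownedRows[i] : this.block.at(i);
  }
  disableSlotAccess() {
    this.slotAccessEnabled = false;
  }
  doSaveState(skipCopyWhenSlotsDisabled) {
    if (skipCopyWhenSlotsDisabled && !this.slotAccessEnabled) {
      return; // step 5: the unowned values are discarded instead of copied
    }
    this.ownedRows = this.block.values.slice(); // copy while the block is still alive
  }
}

function run(skipCopyWhenSlotsDisabled) {
  const block = new Block([10, 20, 30]);
  const stage = new ToyBlockToRow(block);
  const seen = [stage.getNext()];                // the stage is mid-block
  stage.disableSlotAccess();                     // step 3
  stage.doSaveState(skipCopyWhenSlotsDisabled);  // steps 4 and 5
  block.free();                                  // resources released (step 1)
  try {
    seen.push(stage.getNext(), stage.getNext()); // keep deblocking after the yield
    return { ok: true, seen };
  } catch (e) {
    return { ok: false, seen, error: e.message };
  }
}

console.log(run(true));  // the stage reads from the freed block and fails
console.log(run(false)); // the stage reads from its own copies
```

In the first call the stage has consumed one row, skips the copy, and then touches the freed block. In the second call it copies before the block is released and finishes the remaining rows.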

      This problem seems to affect only block_to_row, because the other SBE stages are not in the middle of iterating over data (e.g. within a block) when slot access is disabled. For those stages it is therefore safe to discard the values in saveState() once access has been disabled, but the same assumption does not hold for BlockToRowStage.

            Assignee:
            hana.pearlman@mongodb.com Hana Pearlman
            Reporter:
            chii.huang@mongodb.com Chi-I Huang
            Votes:
            0
            Watchers:
            9
