Introduce MongoExtensionDistributedPlanLogic to the Extensions API


    • Type: Task
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • Query Integration
    • 3
    • TBD

      MongoExtensionDistributedPlanLogic

      MongoExtensionDistributedPlanLogic is the abstraction that represents what is required to execute an extension stage in a sharded environment. It closely resembles the DistributedPlanLogic abstraction, which is exposed by DocumentSource::distributedPlanLogic and is used during distributed planning to determine the split point of the pipeline: which parts of the pipeline should run on the shards, and which should run on the merging node. The entries below describe the functions that MongoExtensionDistributedPlanLogic exposes.
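To make the idea of a split point concrete, here is a minimal toy model, not the server's implementation. It assumes that stages before the first stage carrying plan logic stay on the shards, that this stage contributes at most one shards stage plus its merging stages, and that everything after it runs on the merging node. The names `PlanLogic`, `Stage`, `SplitPipeline` and `split` are hypothetical, and stages are plain strings rather than BSON.

```cpp
#include <optional>
#include <string>
#include <vector>

// Hypothetical, simplified model of one stage's distributed plan logic.
struct PlanLogic {
    std::optional<std::string> shardsStage;  // component that runs on each shard, if any
    std::vector<std::string> mergingStages;  // components that run on the merging node
};

struct Stage {
    std::string name;
    std::optional<PlanLogic> logic;  // empty: the stage can run entirely on the shards
};

struct SplitPipeline {
    std::vector<std::string> shardsPart;
    std::vector<std::string> mergingPart;
};

// Splits the pipeline at the first stage that carries plan logic.
SplitPipeline split(const std::vector<Stage>& pipeline) {
    SplitPipeline result;
    bool splitFound = false;
    for (const Stage& stage : pipeline) {
        if (splitFound) {
            // Everything after the split point runs on the merging node.
            result.mergingPart.push_back(stage.name);
        } else if (!stage.logic) {
            // No special requirements: keep the stage on the shards.
            result.shardsPart.push_back(stage.name);
        } else {
            splitFound = true;
            if (stage.logic->shardsStage) {
                result.shardsPart.push_back(*stage.logic->shardsStage);
            }
            for (const std::string& merging : stage.logic->mergingStages) {
                result.mergingPart.push_back(merging);
            }
        }
    }
    return result;
}
```

In this model, a pipeline of `filter`, an extension stage with one shards component and one merging component, and `project` yields `filter` plus the shards component on the shards, and the merging component plus `project` on the merging node.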

      Function Name | Return Type | Parameters | Description
      destroy
      • Return type: (none listed)
      • Parameters: (none listed)
      Destroys the object and frees related resources.
      get_shards_pipeline
      • Return type: MongoExtensionStatus*
      • Parameters: MongoExtensionByteBuf** (output)
      Returns the pipeline to execute on each shard in parallel.

      On success, if the stage has a component that can run on the shards, populates the MongoExtensionByteBuf pointer with a serialized BSON message containing the shards pipeline. For now, this is restricted to a single shards stage, for parity with the DistributedPlanLogic shardsStage. If a future extension stage needs to return more than one shards stage, we will remove that restriction and make the corresponding modifications to DistributedPlanLogic.

      Ownership of the buffer is transferred to the Host.

      The MongoExtensionByteBuf will not be allocated if the stage must run exclusively on the merging node.
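The output-parameter convention described above (buffer populated on success, left unallocated when there is nothing to return, ownership passing to the Host) can be sketched as follows. This is only an illustration under stated assumptions: `Status`, `ByteBuf`, `GetBufFn` and `fetch` are hypothetical stand-ins, a status code of 0 is assumed to mean success, plain `delete` stands in for whatever release mechanism the real API uses, and the stage object that the real functions would presumably receive is omitted.

```cpp
#include <cstdint>
#include <memory>
#include <optional>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical stand-ins for MongoExtensionStatus and MongoExtensionByteBuf.
struct Status { int code; };  // assumption: 0 means success
struct ByteBuf { std::vector<std::uint8_t> bytes; };

// Simplified shape of a getter that reports through an output parameter.
using GetBufFn = Status* (*)(ByteBuf** output);

// Host-side helper: calls the getter, takes ownership of whatever comes back,
// and maps "buffer not allocated" to an empty optional.
std::optional<std::vector<std::uint8_t>> fetch(GetBufFn fn) {
    ByteBuf* raw = nullptr;  // stays null if the extension allocates nothing
    std::unique_ptr<Status> status(fn(&raw));
    std::unique_ptr<ByteBuf> owned(raw);  // the Host now owns the buffer
    if (!status || status->code != 0) {
        throw std::runtime_error("extension call failed");
    }
    if (!owned) {
        return std::nullopt;  // nothing to run on this side of the split
    }
    return std::move(owned->bytes);
}

// Example extension side: a stage with a shards component.
Status* shardsCapable(ByteBuf** output) {
    // The five bytes of an empty BSON document stand in for a real pipeline.
    *output = new ByteBuf{{0x05, 0x00, 0x00, 0x00, 0x00}};
    return new Status{0};
}

// Example extension side: a stage that must run only on the merging node.
Status* mergeOnly(ByteBuf** output) {
    (void)output;  // the buffer is deliberately left unallocated
    return new Status{0};
}
```

Initializing the output pointer to null before the call lets the Host distinguish "nothing to run here" from a populated buffer without a separate flag.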

      get_merging_pipeline
      • Return type: MongoExtensionStatus*
      • Parameters: MongoExtensionByteBuf** (output)
      Returns the stage(s) that will run on the merging node, or nullptr if nothing needs to run on the merging node.

      On success, if the stage has a component that can run on the merging node, populates the MongoExtensionByteBuf pointer with a serialized BSON message containing the merge pipeline.

      Ownership of the buffer is transferred to the Host.

      The MongoExtensionByteBuf will not be allocated if nothing can run on the merging node.

      get_sort_pattern
      • Return type: MongoExtensionStatus*
      • Parameters: MongoExtensionByteBuf** (output)
      Returns which fields are ascending and which are descending when merging streams together.

      On success, if SortKey metadata must be added by the stage during execution, the MongoExtensionByteBuf is populated with a serialized BSON message containing “SortPattern”.

      Ownership of the buffer is transferred to the Host.

      The MongoExtensionByteBuf will not be allocated if no sort key metadata is added by the stage.
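As an illustration of why the merging node needs to know which fields are ascending and which are descending, the following toy sketch merges two shard streams that are each already sorted by the same pattern. It makes several assumptions: documents are modelled as maps from field name to integer rather than BSON, the pattern is a list of (field, direction) pairs with 1 for ascending and -1 for descending, and `Doc`, `SortPattern`, `sortsBefore` and `mergeStreams` are hypothetical names.

```cpp
#include <algorithm>
#include <iterator>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy document: field name -> integer value (real documents are BSON).
using Doc = std::map<std::string, int>;
// Toy sort pattern: ordered (field, direction) pairs; 1 = ascending, -1 = descending.
using SortPattern = std::vector<std::pair<std::string, int>>;

// Returns true if `a` sorts strictly before `b` under `pattern`.
bool sortsBefore(const Doc& a, const Doc& b, const SortPattern& pattern) {
    for (const auto& [field, direction] : pattern) {
        const int lhs = a.at(field);
        const int rhs = b.at(field);
        if (lhs != rhs) {
            return direction > 0 ? lhs < rhs : lhs > rhs;
        }
    }
    return false;  // equal on every sort field
}

// Merges two shard streams that are each already sorted by `pattern`.
std::vector<Doc> mergeStreams(const std::vector<Doc>& shardA,
                              const std::vector<Doc>& shardB,
                              const SortPattern& pattern) {
    std::vector<Doc> merged;
    merged.reserve(shardA.size() + shardB.size());
    std::merge(shardA.begin(), shardA.end(), shardB.begin(), shardB.end(),
               std::back_inserter(merged),
               [&pattern](const Doc& a, const Doc& b) {
                   return sortsBefore(a, b, pattern);
               });
    return merged;
}
```

With a pattern of descending `score` then ascending `id`, the merged output preserves the global order without re-sorting, which is the property the merging node relies on.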

      As part of this ticket:
      1. Add the above structure + vtable to the public API header
      2. Implement a Host adapter for DistributedPlanLogic
      3. Implement an SDK adapter for DistributedPlanLogic
      4. Add unit tests covering each of the functions implemented by this new abstraction.
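For item 1, the structure and vtable might look roughly like the sketch below. Only the type and function names come from this ticket. The layout (an object holding a pointer to its vtable), the `MongoExtensionDistributedPlanLogicVTable` name, the `void` return type of `destroy`, and the leading object-pointer parameter on each function are all assumptions, and the status and buffer types are left as opaque forward declarations.

```cpp
extern "C" {

// Opaque placeholders; the real definitions belong to the public API header.
struct MongoExtensionStatus;
struct MongoExtensionByteBuf;

struct MongoExtensionDistributedPlanLogicVTable;

// Assumed layout: the object carries only a pointer to its vtable.
struct MongoExtensionDistributedPlanLogic {
    const MongoExtensionDistributedPlanLogicVTable* vtable;
};

struct MongoExtensionDistributedPlanLogicVTable {
    // Destroys the object and frees related resources.
    void (*destroy)(MongoExtensionDistributedPlanLogic* logic);

    // Each getter leaves *output unallocated when it has nothing to return;
    // ownership of a populated buffer passes to the Host.
    MongoExtensionStatus* (*get_shards_pipeline)(
        const MongoExtensionDistributedPlanLogic* logic,
        MongoExtensionByteBuf** output);

    MongoExtensionStatus* (*get_merging_pipeline)(
        const MongoExtensionDistributedPlanLogic* logic,
        MongoExtensionByteBuf** output);

    MongoExtensionStatus* (*get_sort_pattern)(
        const MongoExtensionDistributedPlanLogic* logic,
        MongoExtensionByteBuf** output);
};

}  // extern "C"
```

Keeping the object down to a single vtable pointer would let the Host and SDK adapters in items 2 and 3 wrap it without knowing the extension's concrete type.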

            Assignee:
            Unassigned
            Reporter:
            Santiago Roche