  Core Server / SERVER-59951

Make object form of the '_id' group-by expression work to handle multiple group-by keys.

    • Type: Improvement
    • Resolution: Fixed
    • Priority: Major - P3
    • Fix Version/s: 5.3.0
    • Affects Version/s: None
    • Component/s: None
    • Labels: None
    • Fully Compatible
    • QE 2021-11-15, QE 2021-11-29, QE 2021-12-13, QE 2021-12-27, QE 2022-01-10, QE 2022-01-24

      Currently, the $group SBE stage builder (SlotBasedStageBuilder::buildGroup) does not support the object form of the '_id' group-by expression, such as _id: {a: "$a", b: "$b"}. Two ideas have been proposed so far.
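
      As a point of reference, the intended result of an object-form _id can be modeled in a few lines of Python. This is only a toy model of the query's semantics, not the SBE or classic implementation: the helper names (field_path, group_count) and the sample documents are made up for illustration, and only top-level field paths that exist in every document are handled.

```python
def field_path(doc, expr):
    """Evaluate a "$name" field-path expression against a document.

    Toy assumption: the path is top-level and present in the document.
    """
    assert expr.startswith("$")
    return doc[expr[1:]]


def group_count(docs, id_spec):
    """Model {$group: {_id: id_spec, count: {$sum: 1}}} for an object-form _id."""
    counts = {}
    for doc in docs:
        # One group per distinct combination of the group-by values.
        key = tuple(field_path(doc, expr) for expr in id_spec.values())
        counts[key] = counts.get(key, 0) + 1
    # Each result carries an _id document whose fields follow the spec order.
    return [
        {"_id": dict(zip(id_spec.keys(), key)), "count": n}
        for key, n in counts.items()
    ]


docs = [
    {"a": 1, "b": "x"},
    {"a": 1, "b": "x"},
    {"a": 1, "b": "y"},
]
results = group_count(docs, {"a": "$a", "b": "$b"})
```

      Here the two documents with a = 1 and b = "x" fall into one group, and the third document forms its own group.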

      1) One idea is to pass the ExpressionObject through from DocumentSourceGroup to the GroupNode QSN and on to SlotBasedStageBuilder::buildGroup, and to implement the expression walker for ExpressionObject.

      • We would get $object implementation almost for free too.
      • The group stage's group key becomes a full document with field names, which would perform worse than the multiple group-by keys used in the second idea.
      • We would get the same behavior as the classic engine for generated _id documents.
      • We need to modify DocumentSourceGroup and GroupNode, which would lead to cascading changes elsewhere.
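
      To make idea 1 concrete, the sketch below models an "expression walker" as a recursive function over a nested spec and uses the whole evaluated document as the group key. It is a Python toy, not MongoDB code: walk_object, freeze, and group_by_document are hypothetical names, and only field paths, nested objects, and constants are handled.

```python
def walk_object(spec, doc):
    """Recursively evaluate an object-form expression against `doc`."""
    if isinstance(spec, dict):
        # Object expression: evaluate each field, keeping the spec's order.
        return {name: walk_object(sub, doc) for name, sub in spec.items()}
    if isinstance(spec, str) and spec.startswith("$"):
        # Field path (toy assumption: top-level and present in the document).
        return doc[spec[1:]]
    return spec  # anything else is treated as a constant


def freeze(value):
    """Turn a (nested) document into a hashable key that keeps field names."""
    if isinstance(value, dict):
        return tuple((name, freeze(v)) for name, v in value.items())
    return value


def group_by_document(docs, id_spec):
    """Count docs per group, keyed by the full evaluated _id document."""
    counts = {}
    for doc in docs:
        key = freeze(walk_object(id_spec, doc))
        counts[key] = counts.get(key, 0) + 1
    return counts


docs = [{"a": 1, "b": 2}, {"a": 1, "b": 2}, {"a": 3, "b": 2}]
counts = group_by_document(docs, {"a": "$a", "nested": {"b": "$b"}})
```

      Every key in counts carries its field names alongside the values, which is the per-key overhead the performance bullet above refers to.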

      2) Another idea is to manually insert a mkbson stage that composes an _id document out of the multiple group-by key expressions before returning a result document.

      • We don't need to modify DocumentSourceGroup and GroupNode. We just need to add some logic to SlotBasedStageBuilder::buildGroup.
      • The group stage's group key becomes multiple group-by keys without field names, which would perform better than the first idea.
      • We would not get the same behavior as the classic engine, since the passed-through group-by key expressions do not follow the original field order of the _id document specification.
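
      Idea 2 can be sketched in the same spirit: group on a bare tuple of values, then compose the _id document afterwards in a step that plays the role of mkbson. This is a Python toy with hypothetical names (hash_agg, compose_id). To show the ordering caveat, the key expressions are assumed to reach the builder sorted by field name; the ticket does not say what order they actually arrive in.

```python
def hash_agg(docs, key_exprs):
    """Count docs per distinct tuple of group-by values (no field names)."""
    counts = {}
    for doc in docs:
        # Toy assumption: every expression is a top-level "$field" path.
        key = tuple(doc[expr[1:]] for expr in key_exprs)
        counts[key] = counts.get(key, 0) + 1
    return counts


def compose_id(field_names, key):
    """mkbson-like step: pair field names with key values to form _id."""
    return dict(zip(field_names, key))


id_spec = {"b": "$b", "a": "$a"}  # the user wrote "b" before "a"

# Assumed for illustration: the spec's order is lost on the way to the builder.
arrived = sorted(id_spec.items())
names = [name for name, _ in arrived]
exprs = [expr for _, expr in arrived]

docs = [{"a": 1, "b": 2}, {"a": 1, "b": 2}]
results = [
    {"_id": compose_id(names, key), "count": n}
    for key, n in hash_agg(docs, exprs).items()
]
```

      The composed _id has the right fields and values, but its fields come out as a, b rather than the b, a order written in the specification.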

      We could also follow a hybrid approach that combines the best of the two approaches above, although the code changes would be larger.

      1. Pass the ExpressionObject through from DocumentSourceGroup to the GroupNode QSN and on to SlotBasedStageBuilder::buildGroup, and implement the expression walker for ExpressionObject.
      2. Extract the slots and expressions for the _id document fields from the SBE PlanStage tree generated from ExpressionObject, and pass them to makeHashAgg inside SlotBasedStageBuilder::buildGroup.
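
      A minimal sketch of the hybrid, under the same toy assumptions: walk the object spec once in its original order, hand out one hypothetical "slot" per field, group on the slot values only, and rebuild _id from the recorded field names. extract_key_slots and build_group are illustrative names; the real makeHashAgg signature does not appear in this ticket and is not modeled here.

```python
def extract_key_slots(id_spec):
    """Walk the _id spec in order; return [(field_name, slot_id, expr), ...]."""
    return [
        (name, slot_id, expr)
        for slot_id, (name, expr) in enumerate(id_spec.items())
    ]


def build_group(docs, id_spec):
    """Count docs per group using slot values as a multi-part key."""
    slots = extract_key_slots(id_spec)
    counts = {}
    for doc in docs:
        # Fill the slots (toy assumption: top-level "$field" paths only),
        # then group on their values alone, without field names.
        slot_values = {slot: doc[expr[1:]] for _, slot, expr in slots}
        key = tuple(slot_values[slot] for _, slot, _ in slots)
        counts[key] = counts.get(key, 0) + 1
    # Rebuild _id in the original spec order from the recorded field names.
    names = [name for name, _, _ in slots]
    return [{"_id": dict(zip(names, key)), "count": n} for key, n in counts.items()]


docs = [{"a": 1, "b": 2}, {"a": 1, "b": 2}, {"a": 5, "b": 2}]
results = build_group(docs, {"b": "$b", "a": "$a"})
```

      The hash table is keyed by plain value tuples, as in idea 2, while the rebuilt _id keeps the b, a field order of the specification, as in idea 1.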

      Pros:

      • We would get the same behavior as the classic engine for generated _id documents.
      • We would get better performance for the group stage.
      • We would get $object implementation almost for free too.

      Cons:

      • The code changes would be larger: more changes to SlotBasedStageBuilder::buildGroup, DocumentSourceGroup, and GroupNode, plus the resulting cascading changes.

            Assignee: Yoon Soo Kim (yoonsoo.kim@mongodb.com)
            Reporter: Yoon Soo Kim (yoonsoo.kim@mongodb.com)
            Votes: 0
            Watchers: 5
