Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-51303

Lookup stage followed by $match on type uses wrong field in matching condition

    XMLWordPrintable

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker - P1
    • Resolution: Fixed
    • Affects Version/s: 4.2.9, 4.4.1, 3.6.20, 4.0.21
    • Fix Version/s: 4.8.1, 4.4.2, 4.2.11, 3.6.21, 4.0.22
    • Component/s: None
    • Labels:
    • Backwards Compatibility:
      Fully Compatible
    • Operating System:
      ALL
    • Backport Requested:
      v4.4, v4.2, v4.0, v3.6
    • Steps To Reproduce:
      Hide

      See below.

      Show
      See below.
    • Sprint:
      Query 2020-10-19, Query 2020-11-02, Query 2020-11-16

      Description

      Issue Status as of Nov 12, 2020

      ISSUE DESCRIPTION AND IMPACT

      Matching by $type on a looked up field after a $lookup stage in an aggregation operation leads to incorrect results, because the $match on $type is pushed into the $lookup without removing the reference to the new field created by $lookup. This failure is unique to the $type operator.

      For example, the following operation matches on "lookedup.name" after a $lookup into the "lookedup" field:

      [
        {
          $lookup: {
            from: "second",
            localField: "field",
            foreignField: "foreignField",
            as: "lookedup",
          },
        },
        { $unwind: "$lookedup" },
        { $match: { "lookedup.name": { $type: "string" } } },
      ]
      

      MongoDB incorrectly optimizes the aggregation pipeline by attempting to match on "lookedup.name" in the "second" collection, where "lookedup.name" does not exist.

      The correct behavior is for the optimized lookup stage to match on “name” instead.

      DIAGNOSIS AND AFFECTED VERSIONS

      All versions since 3.4.0 are affected by this issue.

      Running the .explain() method will help identify the incorrect matching stage.

      db.first.explain().aggregate([
        {
          $lookup: {
            from: "second",
            localField: "field",
            foreignField: "foreignField",
            as: "lookedup",
          },
        },
        { $unwind: "$lookedup" },
        { $match: { "lookedup.name": { $type: "string" } } },
      ]);
      

      If the query optimizer moves the $match stage into the $lookup stage while still referencing the field created by $lookup, the operation is impacted. For example:

      {
        $lookup: {
          from: "second",
          as: "lookedup",
          localField: "field",
          foreignField: "foreignField",
          unwinding: { preserveNullAndEmptyArrays: false },
          matching: { "lookedup.name": { $type: [2] } },
        },
      };
      

      where

      matching: {“lookedup.name”: { $type: [2] } }
      

      should be

      matching: {“name”: { $type: [2] } }
      

      REMEDIATION AND WORKAROUNDS
      This fix will be included in 4.4.2, 4.2.11, 4.0.22, 3.6.21. To work around this issue, you can add a new field "newField" that represents "lookedup.name", run the match on "newField", then unset "newField".

      db.first.aggregate([
        {
          $lookup: {
            from: "second",
            localField: "field",
            foreignField: "foreignField",
            as: "lookedup",
          },
        },
        { $unwind: "$lookedup" },
        { $addFields: { newField: "$lookedup.name" }},
        { $match: { "newField": { $type: "string" } } },
        { $unset: "newField" }
      ]);
      

      Original Description

      A $lookup stage followed by a $match on the type of a looked up field does not work. It seems to be using the wrong field name when the $match gets pushed inside the $lookup stage.

      Steps to Reproduce

      1. Create two collections as follows

      • collection "first" contains: { "field" : "value" }
      • collection "second" contains: { "foreignField" : "value", "name" : "Thomas" }

      2. Run the following aggregation on collection "first" 

      db.first.aggregate(
        [
          {$lookup: {
            from: "second", 
            localField: "field", 
            foreignField: "foreignField", 
            as: "lookedup"}
          }, 
          {$unwind: "$lookedup"}, 
          {$match: {"lookedup.name": {$type: "string"}}}
      ]); 

      3. In the explain, you can see that the $match gets pushed into the lookup stage but uses the original field name "lookedup.name" which doesn't exist in the remote collection.

       { { "$lookup" : { "from" : "second", "as" : "lookedup", "localField" : "field", "foreignField" : "foreignField", "unwinding" : { "preserveNullAndEmptyArrays" : false }, "matching" : { "lookedup.name" : { "$type" : [ 2 ] } } } } 

      If a non-$type query is used (e.g. equality match), the field instead is just "name", which is the correct behavior. 

      {{ "$lookup" : { "from" : "second", "as" : "lookedup", "localField" : "field", "foreignField" : "foreignField", "unwinding" : { "preserveNullAndEmptyArrays" : false }, "matching" : { "name" : { "$eq" : "Thomas" } } } } 

      Expected Results

      Aggregation returns the document because the field lookedup.name is a string

      Actual Results

      Aggregation returns no documents.

       

       

      Hat tip to Nathan Smyth who discovered this bug.

        Attachments

          Issue Links

            Activity

              People

              Assignee:
              jacob.evans Jacob Evans
              Reporter:
              thomasr Thomas Rueckstiess
              Participants:
              Votes:
              0 Vote for this issue
              Watchers:
              15 Start watching this issue

                Dates

                Created:
                Updated:
                Resolved: