[CSHARP-3783] GroupJoin with projection yields incorrect query Created: 05/Aug/21  Updated: 28/Oct/23  Resolved: 29/Nov/21

Status: Closed
Project: C# Driver
Component/s: Linq, LINQ3
Affects Version/s: 2.13.1
Fix Version/s: 2.14.0

Type: Bug Priority: Major - P3
Reporter: Ales Jeusnik Assignee: James Kovacs
Resolution: Fixed Votes: 0
Labels: Bug
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
is duplicated by CSHARP-3787 Using Where in projection results in ... Closed
Epic Link: CSHARP-3615

 Description   

The C# driver translates the following IMongoQueryable incorrectly:

collection.AsQueryable()
                .Where(x =>
                x.ParentSerialNumber != null
                && x.ParentLevel != null)
                .GroupJoin(
                    collection.AsQueryable(),
                    c => c.ParentSerialNumber,
                    p => p.SerialNumber,
                    (c, p) => new
                    {
                        ChildSerialNumber = c.SerialNumber,
                        ChildLevel = c.Level,
                        ParentSerialNumber = c.ParentSerialNumber,
                        ParentLevel = c.ParentLevel,
                        ParentExists = p.Any(pa => pa.Level == c.ParentLevel),
                        ParentConsistent = p.Any(pa => pa.Level == c.ParentLevel && pa.Status == c.Status)
                    })
                .Where(x => !x.ParentConsistent)
                .OrderBy(x => x.ChildLevel)

 

It translates into:

aggregate([
   {
      "$match":{
         "ParentSerialNumber":{
            "$ne":null
         },
         "ParentLevel":{
            "$ne":null
         }
      }
   },
   {
      "$lookup":{
         "from":"collectionName",
         "localField":"ParentSerialNumber",
         "foreignField":"SerialNumber",
         "as":"p"
      }
   },
   {
      "$project":{
         "ChildSerialNumber":"$SerialNumber",
         "ChildLevel":"$Level",
         "ParentSerialNumber":"$ParentSerialNumber",
         "ParentLevel":"$ParentLevel",
         "ParentExists":{
            "$anyElementTrue":{
               "$map":{
                  "input":"$p",
                  "as":"pa",
                  "in":{
                     "$eq":[
                        "$$pa.Level",
                        "$$pa.ParentLevel"
                     ]
                  }
               }
            }
         },
         "ParentConsistent":{
            "$anyElementTrue":{
               "$map":{
                  "input":"$p",
                  "as":"pa",
                  "in":{
                     "$and":[
                        {
                           "$eq":[
                              "$$pa.Level",
                              "$$pa.ParentLevel"
                           ]
                        },
                        {
                           "$eq":[
                              "$$pa.Status",
                              "$$pa.Status"
                           ]
                        }
                     ]
                  }
               }
            }
         },
         "_id":0
      }
   },
   {
      "$match":{
         "ParentConsistent":{
            "$ne":true
         }
      }
   },
   {
      "$sort":{
         "ChildLevel":1
      }
   }
])

Note the map for ParentExists and ParentConsistent properties.

The correct query would be:

aggregate([
   {
      "$match":{
         "ParentSerialNumber":{
            "$ne":null
         },
         "ParentLevel":{
            "$ne":null
         }
      }
   },
   {
      "$lookup":{
         "from":"collectionName",
         "localField":"ParentSerialNumber",
         "foreignField":"SerialNumber",
         "as":"p"
      }
   },
   {
      "$project":{
         "ChildSerialNumber":"$SerialNumber",
         "ChildLevel":"$Level",
         "ParentSerialNumber":"$ParentSerialNumber",
         "ParentLevel":"$ParentLevel",
         "ParentExists":{
            "$anyElementTrue":{
               "$map":{
                  "input":"$p",
                  "as":"pa",
                  "in":{
                     "$eq":[
                        "$$pa.Level",
                        "$ParentLevel"
                     ]
                  }
               }
            }
         },
         "ParentConsistent":{
            "$anyElementTrue":{
               "$map":{
                  "input":"$p",
                  "as":"pa",
                  "in":{
                     "$and":[
                        {
                           "$eq":[
                              "$$pa.Level",
                              "$ParentLevel"
                           ]
                        },
                        {
                           "$eq":[
                              "$$pa.Status",
                              "$Status"
                           ]
                        }
                     ]
                  }
               }
            }
         },
         "_id":0
      }
   },
   {
      "$match":{
         "ParentConsistent":{
            "$ne":true
         }
      }
   },
   {
      "$sort":{
         "ChildLevel":1
      }
   }
])

Any such projection, even if you try to do a Select projection after GroupJoin and with any linq method yields the same incorrect query.



 Comments   
Comment by James Kovacs [ 30/Nov/21 ]

Hi, ales.jeusnik@inel.com,

Thank you for confirming that this particular issue is fixed in LINQ3.

We are sorry to hear that you have encountered InvalidOperationExceptions with other previously working LINQ2 queries. That is definitely not expected. In developing LINQ3 we successfully ran LINQ3 against our suite of existing LINQ2 tests. Thus you are exercising LINQ features that we do not have adequate test coverage for. It would be extremely helpful if you could open a CSHARP ticket for each type of failing query so that we can investigate and resolve these issues quickly. Thank you in advance.

Sincerely,
James

Comment by Ales Jeusnik [ 30/Nov/21 ]

Hello James, sorry for the late reply.

We have tried the LINQ3 provider and it does indeed generate the correct MQL for this example.

Sadly though we cannot use it as it still has other bugs breaking many queries that used to work.
The exception below is quite common.

  Name Value Type
InnerException {"Operation is not valid due to the current state of the object."} System.Exception {System.InvalidOperationException}

 

Comment by PM Bot [ 29/Nov/21 ]

There hasn't been any recent activity on this ticket, so we're resolving it. Thanks for reaching out! Please feel free to comment on this if you're able to provide more information.

Comment by James Kovacs [ 12/Nov/21 ]

Hi, ales.jeusnik@inel.com,

This issue has been fixed in the new LINQ provider (known as LINQ3) which will be included in the upcoming 2.14 release. 2.14.0-beta1 is available now for testing.

Configure your MongoClientSettings to use LinqProvider.V3 if you want to use this functionality.

To configure a client to use the LINQ3 provider use code like the following

var connectionString = "mongodb://localhost";
var clientSettings = MongoClientSettings.FromConnectionString(connectionString);
clientSettings.LinqProvider = LinqProvider.V3;
var client = new MongoClient(clientSettings);

Please let us know if this resolves your issue.

Sincerely,
James

Comment by James Kovacs [ 09/Aug/21 ]

Hi, ales.jeusnik@inel.com,

We have confirmed that our current LINQ implementation generates incorrect queries as you demonstrated. When performing a GroupJoin with outer and inner documents, it erroneously compares inner documents with themselves rather than inner documents with the outer document.

I have simplified the example that you provided and renamed some variables to hopefully better highlight the issue:

var query = collection.AsQueryable()
    .GroupJoin(
        collection.AsQueryable(),
        outer => outer.ParentSerialNumber,
        inner => inner.SerialNumber,
        (outer, inner) => new
        {
            ParentConsistent = inner.Any(item => item.Status == outer.Status)
        });

This generates the following MQL:

aggregate([
  { "$lookup" :
    {
      "from" : "coll",
      "localField" : "ParentSerialNumber",
      "foreignField" : "SerialNumber",
      "as" : "inner"
    }
  },
  { "$project" :
    {
      "ParentConsistent" : {
        "$anyElementTrue" : {
          "$map" : {
            "input" : "$inner",
            "as" : "item",
            "in" : { "$eq" : ["$$item.Status", "$$item.Status"] }
          }
        }
      },
      "_id" : 0
    }
  }
])

Notably the generated MQL incorrectly compares whether $$item.Status equals $$item.Status whereas the original LINQ query compared item.Status == outer.Status.

We are currently working on a new LINQ provider, which is nearing completion. Testing this same query on the new provider, we get the following MQL:

test.coll.Aggregate([
  { "$project" :
    { "_outer" : "$$ROOT",
      "_id" : 0 }
  },
  { "$lookup" :
    { "from" : "coll",
      "localField" : "_outer.ParentSerialNumber",
      "foreignField" : "SerialNumber",
      "as" : "_inner"
    }
  },
  { "$project" :
    {
      "ParentConsistent" : {
        "$anyElementTrue" : {
          "$map" : {
            "input" : "$_inner",
            "as" : "item",
            "in" : { "$eq" : ["$$item.Status", "$_outer.Status"] }
          }
        }
      },
      "_id" : 0
    }
  }
])

The new LINQ provider is generating the correct MQL comparing the inner and outer documents.

We intend to make the new LINQ provider available on an opt-in basis in the next release of the driver. We will update this issue when the new LINQ implementation is available and you can confirm that it resolves both this issue and CSHARP-3787.

Sincerely,
James

Comment by Esha Bhargava [ 06/Aug/21 ]

ales.jeusnik@inel.com Thank you for reporting this issue. We'll look into it and get back to you soon.

Comment by Ales Jeusnik [ 05/Aug/21 ]

Using the latest C# driver version (2.13.1) and latest MongoDB community server version (5.0.2).

Generated at Wed Feb 07 21:46:17 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.