[CSHARP-4507] 2.19 Projection provides unexpected results on 4.2 and lower Created: 31/Jan/23  Updated: 27/Oct/23  Resolved: 27/Jun/23

Status: Closed
Project: C# Driver
Component/s: LINQ3
Affects Version/s: 2.19.0
Fix Version/s: None

Type: Bug Priority: Unknown
Reporter: Nick Judson Assignee: James Kovacs
Resolution: Works as Designed Votes: 2
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   

Please see this post: https://www.mongodb.com/community/forums/t/2-19-breaks-projections/211242/8

 



 Comments   
Comment by James Kovacs [ 05/Apr/23 ]

You are correct, nick@innsenroute.com. Find maps to the find command, which takes a variety of parameters including filter (aka the match clause or predicate), sort, projection, skip, limit, and more. It has defined semantics when all these fields are supplied. For example, it will filter and sort the documents prior to the skip/limit.

Aggregate.Match maps to the aggregate command, which accepts an aggregation pipeline. This is a much more flexible and expressive API. You can skip documents before or after matching them. This would change the semantics of the query and also the ability to use an index. For example, if you want to skip 1000 documents prior to a match, you can't use an index because the index has no notion of how many unindexed documents appeared before the first index entry. MongoDB's query planner can perform a variety of optimizations, such as logically pulling match stages earlier in the pipeline, but it cannot violate the semantics of the aggregation pipeline as written. As the adage goes, with great power comes great responsibility. The aggregation pipeline provides a lot of flexibility to express your query intentions, but it also allows you to write queries that are difficult/impossible for the query planner to optimize.
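To make the ordering point concrete, here is a minimal sketch that renders two pipelines built from the same stages in a different order. The class and method names, the `qt` filter, and the skip/limit values are illustrative assumptions rather than anything from the original report. Rendering with ToString() should not need a running server, so the connection string is only a placeholder.

```csharp
using System;
using MongoDB.Bson;
using MongoDB.Driver;

public static class PipelineOrderSketch
{
    // Match first, then page: $match is the first stage, so an index on the
    // filtered field can be considered by the query planner.
    public static string MatchThenSkip(
        IMongoCollection<BsonDocument> coll, FilterDefinition<BsonDocument> filter)
        => coll.Aggregate().Match(filter).Skip(1000).Limit(10).ToString();

    // Page first, then match: the pipeline asks the server to skip 1000
    // documents before any filtering happens, which changes the semantics.
    public static string SkipThenMatch(
        IMongoCollection<BsonDocument> coll, FilterDefinition<BsonDocument> filter)
        => coll.Aggregate().Skip(1000).Limit(10).Match(filter).ToString();

    public static void Main()
    {
        // Hypothetical database and collection names, borrowed from the repro.
        var coll = new MongoClient("mongodb://localhost:27017")
            .GetDatabase("Test")
            .GetCollection<BsonDocument>("Test_Message");
        var filter = Builders<BsonDocument>.Filter.Eq("qt", 4);

        Console.WriteLine(MatchThenSkip(coll, filter));
        Console.WriteLine(SkipThenMatch(coll, filter));
    }
}
```

The two printed pipelines contain the same stages, but they are not equivalent queries, which is why a blanket replacement of Find with Aggregate needs the stage order reviewed case by case.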

With any query, you can call query.ToString() to view the MQL that will be sent to the server. (You can also install the MongoDB Analyzer NuGet package, which will show you the MQL as a tooltip.) You can use the MQL to run an explain plan in the mongosh shell to examine the query plan. MongoDB Atlas also includes the MongoDB Atlas Performance Advisor, which will display slow queries and provide index recommendations. It is also worth reading Optimizing MongoDB Compound Indexes.
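As a rough way to check the Find behaviour for yourself, the sketch below renders the same Find query with its fluent options chained in two different orders and compares the resulting MQL. The class name, the `qt` filter, and the paging values are assumptions made for illustration. If Find treats these calls as options on a single find command, as described above, the two strings should be identical.

```csharp
using System;
using MongoDB.Bson;
using MongoDB.Driver;

public static class FindOrderSketch
{
    // Renders one Find query twice, chaining Sort/Skip/Limit in different orders.
    public static (string First, string Second) Render(IMongoCollection<BsonDocument> coll)
    {
        var filter = Builders<BsonDocument>.Filter.Eq("qt", 4);
        var sort = Builders<BsonDocument>.Sort.Descending("_id");

        var first = coll.Find(filter).Sort(sort).Skip(20).Limit(10).ToString();
        var second = coll.Find(filter).Limit(10).Skip(20).Sort(sort).ToString();
        return (first, second);
    }

    public static void Main()
    {
        // Hypothetical database and collection names, borrowed from the repro.
        var coll = new MongoClient("mongodb://localhost:27017")
            .GetDatabase("Test")
            .GetCollection<BsonDocument>("Test_Message");

        var (first, second) = Render(coll);
        Console.WriteLine(first);
        Console.WriteLine(second);
        Console.WriteLine($"Same MQL: {first == second}");
    }
}
```

The printed MQL can then be pasted into mongosh with explain to inspect the query plan.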

Hope that helps.

Comment by Nick Judson [ 04/Apr/23 ]

james.kovacs@mongodb.com - quick question regarding replacement of .Find with .Aggregate.Match:

Is it correct that `.Find` does not care about the order of each subsequent expression? I.e., the order of `.order`, `.project`, `.limit`, etc. doesn't matter?

Is it correct that for `.Aggregate().Match`, the order does matter?

I noticed some performance regressions after doing a blanket replace of `.Find` with `.Aggregate().Match`, and traced it back to the order of the expressions when moving away from `.Find`.

Thanks.

Comment by José Massada [ 17/Feb/23 ]

I was working around the issue by projecting to an anonymous class, but I've also found that `[BsonId]` isn't taken into account, so the property mapped to the _id field comes back with its default value.

Not sure if I should open a new issue.

 

Edit: works with Aggregate + Match

 

[BsonIgnoreExtraElements]
public sealed class SomeType
{
    [BsonId]
    public int Code { get; set; }
 
    public string SomeUniqueField { get; set; }
}
 
var result = await this.collection
    .Find(st => st.SomeUniqueField == "SomeValue")
    .Project(st => new { st.Code })
    .FirstOrDefaultAsync(cancellationToken);
 
// result.Code == always the default (0)

Comment by Alessandro Giorgetti [ 16/Feb/23 ]

I also noticed the same issue even with MongoDB 6.0.4. To reproduce it, modify the repro sample code as follows:

 

Add a new property to MongoMessage and leave it uninitialized:

[DataContract(IsReference = true)]
[BsonIgnoreExtraElements]
public class MongoMessage
{
  ...
  
  public Guid? CorrelationKey2 { get; set; } 
}

Add the following code to the main test routine:

try
{
    var correlationKey2 = messageCollection.AsQueryable().Select(m => m.CorrelationKey2).FirstOrDefault();
}
catch (Exception ex)
{
    Console.WriteLine($"Exception thrown in projection: {ex.Message}");
} 

When you run the program, you can see the very same error even on the most recent version of MongoDB.

Comment by Remco Ros [ 13/Feb/23 ]

This also affects users of "Azure Cosmos DB for MongoDB" using version 2.19.0 of MongoDB.Driver.

We have not verified whether using an aggregation pipeline (as suggested) fixes this for Cosmos, as we reverted to 2.18.0 for now.

Comment by Nick Judson [ 01/Feb/23 ]

Hi James,

Thanks for the explanation. I have switched over to using the Fluent Aggregate and can confirm that fixes the issue.

Comment by James Kovacs [ 31/Jan/23 ]

[Cross-posting my Community Forums response for easier reference.]

Hi, Nick,

Thank you for your patience and thank you for filing CSHARP-4507. I was able to reproduce the issue with the provided code along with the additional detail that it only fails on MongoDB 4.2 and earlier. I will explain the problem and then some potential workarounds while we work on a fix.

To answer your first question, MongoDB 4.2 is still a supported server version though it will reach end-of-life in April 2023. See our support policy for full details.

Now let's discuss the root cause of this issue. The problem stems from the renaming of the fields in your LINQ query. The following code will display the MQL sent to the server. The same MQL is sent to MongoDB regardless of the server version.

var query = coll.Find(filter).Project(m => new DeleteInfoWrapper(m.ObjectId, m.GridFsObjectId, m.AttachmentGridFsObjectId));
Console.WriteLine(query);

The resulting MQL is:

find({ "_id" : { "$in" : [ObjectId("63d856bb965b1cf474a00a6b"), ObjectId("63d856ba965b1cf474a00a68")] } }, { "ObjectId" : "$_id", "GridFsObjectId" : "$grid", "AttachmentGridFsObjectId" : "$agrd", "_id" : 0 })

The second argument to the `find` command is the projection. Note the syntax "CsharpFieldName": "$databaseFieldName". For example "GridFsObjectId": "$grid". This is where the problem lies. In MongoDB 4.4 and later, you could use this $fieldName syntax to rename fields in projections - whether those projections were part of a find operation or an aggregation pipeline.

However, in MongoDB 4.2 and earlier, you could use this syntax only in aggregation pipelines, not in find projections. Find projections only allowed the older, simpler syntax of fieldName: 1 to include a field. This is further complicated by the fact that, for backwards compatibility, find also accepted fieldName: VALUE, where VALUE was evaluated for truthiness in the JavaScript sense. This meant that 0, false, null, and undefined were interpreted as false and pretty much everything else was interpreted as true. Thus MongoDB 4.2 (using find) sees "GridFsObjectId": "$grid" as "GridFsObjectId": true. Since there is no field in the document named GridFsObjectId, the field is omitted, leading to the observed behaviour.

In MongoDB 4.4, we enhanced the find projection to use the same syntax as the aggregation pipeline. Thus MongoDB 4.4 sees "GridFsObjectId": "$grid" as "GridFsObjectId": "$grid" and correctly renames the field grid (in the database) to GridFsObjectId in the returned document.
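The difference can be illustrated with a toy model of the pre-4.4 find projection. This is not server or driver code: the class, its methods, and the dictionary-based documents are all hypothetical, and the model only covers the truthiness rule and the _id handling described above.

```csharp
using System;
using System.Collections.Generic;

// Toy model (an assumption for illustration) of how a pre-4.4 find projection
// treats each entry as a truthy/falsy include flag rather than an expression.
public static class LegacyFindProjectionModel
{
    // JavaScript-style truthiness: null, false, and 0 are falsy; strings such
    // as "$grid" are truthy.
    public static bool IsTruthy(object value)
    {
        if (value == null) return false;
        if (value is bool b) return b;
        if (value is int i) return i != 0;
        return true;
    }

    public static Dictionary<string, object> Project(
        IReadOnlyDictionary<string, object> stored,
        IReadOnlyDictionary<string, object> projection)
    {
        var result = new Dictionary<string, object>();

        // _id is returned unless the projection explicitly turns it off.
        var includeId = !projection.TryGetValue("_id", out var idSpec) || IsTruthy(idSpec);
        if (includeId && stored.TryGetValue("_id", out var id))
            result["_id"] = id;

        foreach (var entry in projection)
        {
            if (entry.Key == "_id") continue;
            // A truthy entry means "include the stored field with this exact name".
            if (IsTruthy(entry.Value) && stored.TryGetValue(entry.Key, out var value))
                result[entry.Key] = value;
        }
        return result;
    }

    public static void Main()
    {
        var stored = new Dictionary<string, object>
        {
            ["_id"] = "63d856bb965b1cf474a00a6b", ["grid"] = "g1", ["agrd"] = "a1",
        };
        var projection = new Dictionary<string, object>
        {
            ["ObjectId"] = "$_id", ["GridFsObjectId"] = "$grid",
            ["AttachmentGridFsObjectId"] = "$agrd", ["_id"] = 0,
        };

        Console.WriteLine(Project(stored, projection).Count); // prints 0
    }
}
```

In this model the projected document comes back empty, because no stored field is literally named ObjectId, GridFsObjectId, or AttachmentGridFsObjectId and _id is switched off. That matches the default values seen after deserialization on 4.2 and earlier.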

Possible workarounds for this issue (in no particular order):

  • Upgrade to MongoDB 4.4 or later.
  • Continue using LINQ2.
  • Refactor your Fluent Find queries to Fluent Aggregate.

The first two should be self-explanatory. I will note that LINQ3 has greatly enhanced capabilities including support for new aggregation features. LINQ2 will be deprecated in an upcoming version and removed in the 3.0.0 driver. We have not announced a public timeline for the 3.0.0 driver.

Refactoring to use Fluent Aggregation is probably the most straightforward. You can use the same FilterDefinition<> and ProjectionDefinition<,> as you are currently using with Find/Project. Rather than coll.Find(filter).Project(projection) you would instead use coll.Aggregate().Match(filter).Project(projection).

var query = coll.Aggregate().Match(filter).Project(m => new DeleteInfoWrapper(m.ObjectId, m.GridFsObjectId, m.AttachmentGridFsObjectId));
Console.WriteLine(query);

The resulting MQL produces an aggregation pipeline rather than a find command, but the query results are the same:

aggregate([{ "$match" : { "_id" : { "$in" : [ObjectId("63d856bb965b1cf474a00a6b"), ObjectId("63d856ba965b1cf474a00a68")] } } }, { "$project" : { "ObjectId" : "$_id", "GridFsObjectId" : "$grid", "AttachmentGridFsObjectId" : "$agrd", "_id" : 0 } }])

This aggregation pipeline - including the projection - will work on even very old server versions. I tested it on MongoDB 3.6 and it produced the correct results.
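If many call sites need the same change, the refactoring can be wrapped in a small helper that accepts the existing filter and projection definitions. The sketch below is one possible shape, not part of the driver: `FindToAggregateSketch`, `MatchAndProject`, and the trimmed-down `Message` class are hypothetical names introduced for illustration.

```csharp
using System;
using MongoDB.Bson;
using MongoDB.Bson.Serialization.Attributes;
using MongoDB.Driver;

// Trimmed-down stand-in for the MongoMessage class in the repro.
public class Message
{
    [BsonId]
    public ObjectId ObjectId { get; set; }

    [BsonElement("grid")]
    public ObjectId GridFsObjectId { get; set; }
}

public static class FindToAggregateSketch
{
    // Builds coll.Aggregate().Match(filter).Project(projection) from the same
    // definitions that would otherwise be passed to Find(...).Project(...).
    public static IAggregateFluent<TProjection> MatchAndProject<TDocument, TProjection>(
        IMongoCollection<TDocument> coll,
        FilterDefinition<TDocument> filter,
        ProjectionDefinition<TDocument, TProjection> projection)
        => coll.Aggregate().Match(filter).Project(projection);

    public static void Main()
    {
        // Hypothetical database and collection names, borrowed from the repro.
        var coll = new MongoClient("mongodb://localhost:27017")
            .GetDatabase("Test")
            .GetCollection<Message>("Test_Message");

        var filter = Builders<Message>.Filter.Empty;
        var projection = Builders<Message>.Projection.Expression(
            m => new { m.ObjectId, m.GridFsObjectId });

        // Printing the fluent object shows the MQL that would be sent.
        Console.WriteLine(MatchAndProject(coll, filter, projection));
    }
}
```

Because the helper returns an IAggregateFluent, callers can still append further stages or call FirstOrDefaultAsync on the result. Any sort, skip, or limit should be added deliberately, since stage order is significant in a pipeline.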

Please follow CSHARP-4507 for the fix. Thank you again for reporting it. Let us know if you have any additional questions.

Sincerely,
James

Comment by Nick Judson [ 31/Jan/23 ]

Here is the repro (copied from the linked post):

 

using MongoDB.Driver.Core.Configuration;
using MongoDB.Driver.Linq;
using MongoDB.Driver;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using DnsClient;
using MongoDB.Bson;
using MongoDB.Bson.Serialization.Attributes;
using System.Runtime.Serialization;
 
namespace Mongo219Repro
{
    internal class Program
    {
        static async Task Main(string[] args)
        {
            var database = GetClient().GetDatabase("Test");
            var messageCollection = database.GetCollection<MongoMessage>($"Test_Message");
 
            var ckIndex = new CreateIndexModel<MongoMessage>(Builders<MongoMessage>.IndexKeys.Ascending(m => m.CorrelationKey), new CreateIndexOptions { Name = "IX_CorrelationKey", Sparse = true });
            await messageCollection.Indexes.CreateOneAsync(ckIndex).ConfigureAwait(false);
 
            var mongoMessage = new MongoMessage
            {
                AttachmentDataBytes = null,
                AttachmentSize = 1234,
                QueueType = 4,
                Data = "This is test data",
                DataType = typeof(string).FullName,
                DataSize = 55,
                SourceUri = "manual",
                CorrelationKey = Guid.NewGuid(),
                AccountNumber = "12345",
                ModifiedDateTime = DateTime.UtcNow,
                ExamId = "1234",
                HasProcessingHistory = true,
                Keywords = new List<string>(),
                PatientId = "1234",
                MessageType = "1234",
                MessageDateTime = DateTime.UtcNow,
                MessageDateTimeUtcOffset = 2,
                MessageIdentifier = "1234",
                SendingFacility = "1234",
                MessageExtension = "1234",
                PurgedOnProcessed = false,
                AttachmentGridFsObjectId = ObjectId.GenerateNewId(),
                GridFsObjectId = ObjectId.GenerateNewId(),
                ObjectId = ObjectId.GenerateNewId(),
            };
 
            await messageCollection.InsertOneAsync(mongoMessage).ConfigureAwait(false);
 
            var query = Builders<MongoMessage>.Filter.In(m => m.ObjectId, new[] { mongoMessage.ObjectId });
            var noProjection = await messageCollection.Find(query).FirstOrDefaultAsync().ConfigureAwait(false);
 
            Console.WriteLine($"Without projection: Expecting {mongoMessage.ObjectId}, got {noProjection.ObjectId}");
            Console.WriteLine($"Without projection: Expecting {mongoMessage.AttachmentGridFsObjectId}, got {noProjection.AttachmentGridFsObjectId}");
            Console.WriteLine($"Without projection: Expecting {mongoMessage.GridFsObjectId}, got {noProjection.GridFsObjectId}");
 
            var withProjection = await messageCollection.Find(query).Project(m => new { m.ObjectId, m.GridFsObjectId, m.AttachmentGridFsObjectId }).FirstOrDefaultAsync().ConfigureAwait(false);
 
            Console.WriteLine($"With projection: Expecting {mongoMessage.ObjectId}, got {withProjection.ObjectId}");
            Console.WriteLine($"With projection: Expecting {mongoMessage.AttachmentGridFsObjectId}, got {withProjection.AttachmentGridFsObjectId}");
            Console.WriteLine($"With projection: Expecting {mongoMessage.GridFsObjectId}, got {withProjection.GridFsObjectId}");
 
            var filter = Builders<MongoMessage>.Filter.Empty;
            try
            {
                var upperLimitObjectId = await messageCollection.Find(filter).Sort(Builders<MongoMessage>.Sort.Descending(m => m.ObjectId)).Limit(1).Project(m => m.ObjectId).FirstOrDefaultAsync().ConfigureAwait(false);
            }
            catch (Exception ex) 
            {
                Console.WriteLine($"Exception thrown in projection: {ex.Message}");
            }
 
            Console.ReadLine();
        }
 
        private static MongoClient GetClient()
        {
            var settings = MongoClientSettings.FromConnectionString("mongodb://localhost:27017");
 
            // default settings
            settings.ApplicationName = "Test";
            settings.ConnectTimeout = TimeSpan.FromSeconds(10);
            settings.SocketTimeout = TimeSpan.FromSeconds(60);  // query timeout
 
            // settings.UseSsl = false;
            settings.ServerSelectionTimeout = TimeSpan.FromSeconds(10);
            //settings.LinqProvider = LinqProvider.V2;
 
            if (!string.IsNullOrWhiteSpace(settings.ReplicaSetName))
                settings.WriteConcern = WriteConcern.WMajority;
 
            return new MongoClient(settings);
        }
 
        [DataContract(IsReference = true)]
        [BsonIgnoreExtraElements]
        public class MongoMessage
        {
            [BsonId]
            [DataMember]
            public ObjectId ObjectId { get; set; }
 
            [BsonElement("mdt")]
            [BsonIgnoreIfNull]
            [DataMember]
            public DateTime? ModifiedDateTime { get; set; }
 
            [BsonElement("kw")]
            [DataMember]
            public List<string> Keywords { get; set; }
 
            [BsonElement("grid")]
            [DataMember]
            [BsonIgnoreIfDefault]
            public ObjectId GridFsObjectId { get; set; }
 
            [BsonElement("agrd")]
            [DataMember]
            [BsonIgnoreIfDefault]
            public ObjectId AttachmentGridFsObjectId { get; set; }
 
            [BsonElement("data")]
            [DataMember]
            [BsonIgnoreIfNull]
            public string Data { get; set; }
 
            [BsonElement("sz")]
            [DataMember]
            public int DataSize { get; set; }
 
            [BsonElement("tp")]
            [DataMember]
            [BsonIgnoreIfNull]
            public string DataType { get; set; }
 
            [BsonElement("qt")]
            [DataMember]
            public int QueueType { get; set; }
 
            [BsonElement("ck")]
            [DataMember]
            [BsonIgnoreIfNull]
            public Guid? CorrelationKey { get; set; }
 
            [BsonElement("dk")]
            [DataMember]
            [BsonIgnoreIfNull]
            public Guid? SourceDeviceKey { get; set; }
 
            [BsonElement("ph")]
            [DataMember]
            public bool HasProcessingHistory { get; set; }
 
            [BsonElement("src")]
            [DataMember]
            [BsonIgnoreIfNull]
            public string SourceUri { get; set; }
 
            [BsonElement("asz")]
            [DataMember]
            [BsonIgnoreIfDefault]
            public int AttachmentSize { get; set; }
 
            [BsonElement("abt")]
            [DataMember]
            [BsonIgnoreIfNull]
            public byte[] AttachmentDataBytes { get; set; }
 
            [BsonElement("me")]
            [DataMember]
            [BsonIgnoreIfNull]
            public string MessageExtension { get; set; }
 
            [BsonElement("__1")]
            [DataMember]
            [BsonIgnoreIfNull]
            public string MessageIdentifier { get; set; }
 
            [BsonElement("__2")]
            [DataMember]
            [BsonIgnoreIfNull]
            public string MessageType { get; set; }
 
            [BsonElement("__3")]
            [DataMember]
            [BsonIgnoreIfNull]
            public string PatientId { get; set; }
 
            [BsonElement("__4")]
            [DataMember]
            [BsonIgnoreIfNull]
            public string AccountNumber { get; set; }
 
            [BsonElement("__5")]
            [DataMember]
            [BsonIgnoreIfNull]
            public string ExamId { get; set; }
 
            [BsonElement("__6")]
            [DataMember]
            [BsonIgnoreIfNull]
            // [BsonDateTimeOptions(Kind = DateTimeKind.Local)] <-- we ToLocal this value as mongoDB by default converts datetimes to UTC on the way in (but not the way out)
            public DateTime? MessageDateTime { get; set; }
 
            [BsonElement("__61")]
            [DataMember]
            [BsonIgnoreIfNull]
            public short? MessageDateTimeUtcOffset { get; set; }
 
            [BsonElement("__7")]
            [DataMember]
            [BsonIgnoreIfNull]
            public string SendingFacility { get; set; }
 
            // null out message payloads on processed/filtered etc.
            [BsonElement("pop")]
            [DataMember]
            public bool PurgedOnProcessed { get; set; }
 
            #region Deserialized Object
 
            public void ClearMessageObject()
            {
                Data = null;
            }
 
            #endregion
        }
    }
} 

And here is the output for <= v4.2 and for >= 4.4:

[output not preserved in this export]

Generated at Wed Feb 07 21:48:26 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.