[CSHARP-1582] Linq queries are getting the right result, but hydrates more elements than expected. Created: 29/Feb/16  Updated: 02/Mar/16  Resolved: 02/Mar/16

Status: Closed
Project: C# Driver
Component/s: API, Performance
Affects Version/s: 2.2.2, 2.2.3
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Juan Antonio Assignee: Robert Stam
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   

Hi all,

Using the C# driver through LINQ expressions, we've noticed that many objects are hydrated when using AsQueryable() plus a LINQ expression, even though the query returns only one result.

...
var collection = _mongoDatabase.GetCollection<User>("Users");
var usersFound = collection.AsQueryable().Where(x => x.UserName == "John Doe");
...

On the other hand, if we use the Eq method, the behavior is as expected: only one object is hydrated.

...
var filter = Builders<User>.Filter.Eq("UserName", "John Doe");
var usersFound = collection.Find(filter);
...

Is that the implemented behaviour of the driver when using LINQ?

Many thanks!



 Comments   
Comment by Robert Stam [ 02/Mar/16 ]

As Craig predicted I can reproduce what you observed if I use a Func instead of an Expression:

// Func<User, bool> predicate = x => x.UserName == "Jane Doe";
Expression<Func<User, bool>> predicate = x => x.UserName == "Jane Doe";
var first = collection.AsQueryable().Where(predicate).FirstOrDefault();

If I use the first (commented out) predicate then many extra instances are hydrated. If I use the second Expression based form then only the necessary instances are hydrated.

This can be explained because when you use a Func, part of the query is executed client side. It might help to write the code this way to understand what is happening:

Func<User, bool> predicate = x => x.UserName == "Jane Doe";
var queryable = collection.AsQueryable(); // results in a full collection scan
var enumerable = queryable.Where(predicate); // Where clause will be executed client side
var first = enumerable.FirstOrDefault();

When written this way thousands of extra instances are created client side only to be discarded by the Where clause.
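
The overload binding described above can be checked without a database. The following is a minimal sketch, assuming an in-memory list wrapped with AsQueryable() as a stand-in for collection.AsQueryable(); the class name OverloadDemo and the helper IsQueryable are hypothetical and not part of the driver. It only shows which Where overload the compiler picks, not how the driver translates the query.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public static class OverloadDemo
{
    // Hypothetical helper: reports whether a sequence is still an IQueryable
    // at run time, i.e. whether a LINQ provider can still see the query.
    public static bool IsQueryable(object sequence) => sequence is IQueryable;

    public static void Main()
    {
        // In-memory stand-in for collection.AsQueryable(); no MongoDB involved.
        IQueryable<string> source =
            new List<string> { "John Doe", "Jane Doe" }.AsQueryable();

        Func<string, bool> asFunc = x => x == "Jane Doe";
        Expression<Func<string, bool>> asExpression = x => x == "Jane Doe";

        // A Func cannot be converted to an Expression, so this call binds to
        // Enumerable.Where and the predicate runs client side.
        var filteredByFunc = source.Where(asFunc);

        // An Expression binds to Queryable.Where, so the predicate stays part
        // of the expression tree handed to the LINQ provider.
        var filteredByExpression = source.Where(asExpression);

        Console.WriteLine(IsQueryable(filteredByFunc));       // False
        Console.WriteLine(IsQueryable(filteredByExpression)); // True
    }
}
```

With a real collection the same binding applies: in the Func case the provider is only asked for the unfiltered sequence, which is consistent with the extra instances observed.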

Comment by Juan Antonio [ 02/Mar/16 ]

Ok!
Thanks for your checks and replies.

Comment by Craig Wilson [ 02/Mar/16 ]

Hi Juan,

I believe the problem you are encountering is that, in order to translate your query to MongoDB, we have to be able to see the expression tree. In your second example, you aren't providing us with an expression, but rather a compiled delegate. If you change your argument from Func<T, bool> to Expression<Func<T, bool>>, then it should be translatable.

Craig
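
As an illustration of the suggested change, here is a minimal sketch of how a generic helper like the reported ByFunc might look with an Expression parameter. The names QueryHelpers and ByExpression are hypothetical, and the source is passed in as a plain IQueryable<T> so the example runs against an in-memory list; the assumption is that collection.AsQueryable() would be passed in its place.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public static class QueryHelpers
{
    // Hypothetical counterpart of ByFunc: taking Expression<Func<T, bool>>
    // makes the call bind to Queryable.Where, and returning IQueryable<T>
    // keeps later operators (FirstOrDefault, Count, ...) on the provider.
    public static IQueryable<T> ByExpression<T>(
        IQueryable<T> source,
        Expression<Func<T, bool>> predicate)
    {
        return source.Where(predicate);
    }

    public static void Main()
    {
        // In-memory stand-in for collection.AsQueryable().
        var users = new List<string> { "John Doe", "Jane Doe" }.AsQueryable();

        // The lambda is converted to an expression tree at the call site,
        // so callers do not need to change how they write the condition.
        var query = ByExpression(users, x => x == "Jane Doe");

        Console.WriteLine(query.FirstOrDefault()); // Jane Doe
    }
}
```

Call sites keep passing the same lambda; only the parameter type of the helper changes.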

Comment by Juan Antonio [ 02/Mar/16 ]

Hi Robert,
Today I was able to access the code again and I've made some tests:

This is working as expected (instance counter is 1):

public static void Main(string[] args)
{
    var user = GetUserById(Guid.Parse("96ce5602-1554-43ef-b16a-e7774af4a938"));
}
 
public User GetUserById(Guid id)
{
    var collection = _mongoDatabase.GetCollection<User>("User");
    return collection.AsQueryable().Where(x => x.Id == id).FirstOrDefault();
}

On the other hand, this code triggers the 'multiple hydrate' behaviour. The method receives the same lambda condition as a parameter:

public static void Main(string[] args)
{
    var user = ByFunc<User>(x => x.Id == Guid.Parse("96ce5602-1554-43ef-b16a-e7774af4a938")).FirstOrDefault();
}
 
public IEnumerable<T> ByFunc<T>(Func<T, bool> lambda) 
{
    var collection = _mongoDatabase.GetCollection<T>(typeof(T).Name);
    return collection.AsQueryable().Where(lambda);
}

Excuse me for not having noticed before.
Now I really don't know if it's a driver issue or not.

Comment by Juan Antonio [ 29/Feb/16 ]

Cool,
I'll do my tests and report the results.
Thank you again

Comment by Robert Stam [ 29/Feb/16 ]

Oops... I neglected to show the change made to the User class to count how many instances have been created:

public class User
{
    public static int InstancesCreated = 0;
 
    public User()
    {
        InstancesCreated++;
    }
 
    public int Id;
    public string UserName;
}

Comment by Robert Stam [ 29/Feb/16 ]

And if I change the queries to not match any users the output is:

LINQ query returned 0 documents
Find query returned 0 documents
1001 instances of User were created

Still no sign of any extraneous User instances being created. The 1001 instances reported are the 1001 that were inserted to the collection. The two queries resulted in 0 additional instances being created.

Comment by Robert Stam [ 29/Feb/16 ]

My test database has only those two users (the code drops the collection and then inserts those two users).

I've changed the code to insert 1001 documents instead of 2 but the results still look correct.

The new code is:

public static void Main(string[] args)
{
    var client = new MongoClient("mongodb://localhost");
    var database = client.GetDatabase("test");
    var collection = database.GetCollection<User>("test");
 
    database.DropCollection("test");
    var users = new List<User>();
    for (var i = 0; i < 1000; i++)
    {
        var user = new User { Id = i, UserName = "John Doe" };
        users.Add(user);
    }
    users.Add(new User { Id = 1000, UserName = "Jane Doe" });
    collection.InsertMany(users);
 
    var usersFound = collection.AsQueryable().Where(x => x.UserName == "Jane Doe").ToList();
    Console.WriteLine($"LINQ query returned {usersFound.Count} documents");
 
    var filter = Builders<User>.Filter.Eq("UserName", "Jane Doe");
    usersFound = collection.Find(filter).ToList();
    Console.WriteLine($"Find query returned {usersFound.Count} documents");
 
    Console.WriteLine($"{User.InstancesCreated} instances of User were created");
 
    Console.ReadLine();
}

Which produced the following output:

LINQ query returned 1 documents
Find query returned 1 documents
1003 instances of User were created

So no extra instances of User are being created. 1001 documents were inserted into the collection and each query (the LINQ version and the Find version) returned 1 document for a total of 1003.

Comment by Juan Antonio [ 29/Feb/16 ]

Hi Robert,
Thanks for replying so quickly!

Does your test database have only those 2 users, or more? I can reproduce it when there are more users and the query target is not the first element in the database (if I filter for a nonexistent user I get 0 elements, but all items are still hydrated through the object constructor).

I'll try using ".ToList()" and check the Linq query again.
Also, I'll try changing code setup (ConventionRegistry, for example) and do some more checks.

Comment by Robert Stam [ 29/Feb/16 ]

I am unable to reproduce this. In both cases only one instance of the User class was instantiated.

I used the following code, and set a breakpoint on the User constructor to verify when each instance of the User class was created:

    public class User
    {
        public User()
        {
            UserName = "";
        }
 
        public int Id;
        public string UserName;
    }
 
    public static class Program
    {
        public static void Main(string[] args)
        {
            var client = new MongoClient("mongodb://localhost");
            var database = client.GetDatabase("test");
            var collection = database.GetCollection<User>("test");
 
            database.DropCollection("test");
            collection.InsertMany(new User[]
            {
                new User { Id = 1, UserName = "John Doe" },           
                new User { Id = 2, UserName = "Jane Doe" }
            });
 
            var usersFound = collection.AsQueryable().Where(x => x.UserName == "John Doe").ToList();
 
            var filter = Builders<User>.Filter.Eq("UserName", "John Doe");
            usersFound = collection.Find(filter).ToList();
        }
    }
