C# Driver / CSHARP-1679

Count using LINQ is slower than collection Count method

    • Type: Improvement
    • Resolution: Done
    • Priority: Major - P3
    • Affects Version/s: 2.2.4
    • Component/s: Linq, Operations, Performance
    • Environment:
      Windows

      Revised Description

      This ticket is the result of an inquiry regarding the performance of counting the items in a collection using LINQ. There are two ways one could count the items in a collection:

      Using the collection Count method
      var count = collection.Count(new BsonDocument());
      
      Using the LINQ Count method
      var count = collection.AsQueryable().Count();
      

      The first way is much faster. The reason is that the first method results in the count command being used, which is optimized at the server (for the case where no filter is supplied). In that case it can return the number of items in the collection in O(1) time from the collection metadata.
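
      To make the server-side behavior concrete, the sketch below issues the count command directly through RunCommand, with no query. It is a minimal illustration and not necessarily the exact command document the driver sends. The connection string, the database name "test", and the collection name "items" are placeholders, not taken from this ticket.

```csharp
using System;
using MongoDB.Bson;
using MongoDB.Driver;

public static class CountCommandSketch
{
    public static void Main()
    {
        // Hypothetical connection string and database name.
        var client = new MongoClient("mongodb://localhost:27017");
        var database = client.GetDatabase("test");

        // A count command with no query; "items" is a hypothetical collection name.
        var command = new BsonDocument("count", "items");
        var result = database.RunCommand<BsonDocument>(command);

        // The server reply carries the count in the "n" field.
        Console.WriteLine(result["n"].ToInt64());
    }
}
```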

      The second way, like all LINQ queries, is translated to an aggregation framework pipeline. The aggregation framework does not have an equivalent to the count command, so we translate the query to the best possible pipeline we know of. As it turns out, the LINQ implementation ends up having to do a full collection scan to count the items.
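
      For comparison, the sketch below hand-writes a counting pipeline: a single $group stage that adds 1 for every document it sees. The exact pipeline the LINQ provider generates may differ, and the field name "count" and the connection details are assumptions. The point is only that a stage of this kind has to visit every document.

```csharp
using System;
using MongoDB.Bson;
using MongoDB.Driver;

public static class PipelineCountSketch
{
    public static void Main()
    {
        // Hypothetical connection string, database, and collection names.
        var client = new MongoClient("mongodb://localhost:27017");
        var collection = client.GetDatabase("test").GetCollection<BsonDocument>("items");

        // One $group stage that puts all documents in a single group
        // and sums 1 per document. "count" is an illustrative field name.
        var group = new BsonDocument
        {
            { "_id", 1 },
            { "count", new BsonDocument("$sum", 1) }
        };

        var result = collection.Aggregate().Group(group).FirstOrDefault();

        // An empty collection produces no group document at all.
        var count = result == null ? 0L : result["count"].ToInt64();
        Console.WriteLine(count);
    }
}
```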

      So for this particular case of Count (with no filter), the collection method is much faster than the LINQ method.

      In general LINQ methods perform well; this is a special case. It is worth pointing out, though, that LINQ queries will always be somewhat slower than hand-crafted queries using the collection methods. One reason is that LINQ queries must be translated at run time to the equivalent aggregation framework pipeline. Another reason is that when hand-crafting queries you can often use optimizations that are outside the realm of what a general-purpose system like LINQ might be able to apply.

      We might optimize this one special-case LINQ query, but I'm not sure. In the meantime there is a very easy workaround: use the collection Count method.
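
      One way to check the difference on your own data is to time both calls side by side. The sketch below is a rough comparison, not a proper benchmark: it runs each call once, with no warm-up, against a hypothetical "items" collection on a local server. It assumes a 2.x driver in which the synchronous Count method is available; later driver versions mark it obsolete.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using MongoDB.Bson;
using MongoDB.Driver;

public static class CountComparisonSketch
{
    public static void Main()
    {
        // Hypothetical connection string, database, and collection names.
        var client = new MongoClient("mongodb://localhost:27017");
        var collection = client.GetDatabase("test").GetCollection<BsonDocument>("items");

        // Collection method: uses the count command with an empty filter.
        var stopwatch = Stopwatch.StartNew();
        var viaCollection = collection.Count(new BsonDocument());
        Console.WriteLine($"collection.Count: {viaCollection} in {stopwatch.ElapsedMilliseconds} ms");

        // LINQ method: translated at run time to an aggregation pipeline.
        stopwatch.Restart();
        var viaLinq = collection.AsQueryable().Count();
        Console.WriteLine($"LINQ Count: {viaLinq} in {stopwatch.ElapsedMilliseconds} ms");
    }
}
```

      On a small collection the two timings may be indistinguishable; the gap described in this ticket should only become visible as the collection grows.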

      General comment about performance

      Most of our users find the performance of the .NET driver good. We have put a lot of effort into improving performance over time. If there are specific areas (like this one) where you encounter performance issues, we appreciate reports of them.

      Original Description

      After a few months with the C# driver and LINQ syntax, I feel that the driver is in an undetermined state. There are big performance issues under high load, and many different ways of doing the same thing, with the worst performance in one case and good performance in another.
      The .NET ecosystem now revolves around LINQ (not Builders), and a LINQ Count query should not need a pipeline for a simple count! Try to borrow ideas from Entity Framework. Right now the driver looks strange to many developers.
      You are cool developers, but when people spend a long time writing only libraries on a stable salary, tunnel vision sets in: no improvements to the code and no motivation to make the driver better (unless improvement issues are found in Jira).
      I think the driver developers need to write high-load code for other projects with the C# MongoDB driver (maybe you already do?) and try to see the problems that cause many companies to reject MongoDB as their main database. I know many companies that rejected MongoDB because the C# driver was the bottleneck.
      Yes, many performance problems come from using the API incorrectly, but your goal should be to minimize the ways the API can be used incorrectly.
      Thank you for your work.
      Sorry for my English

            Assignee:
            Unassigned
            Reporter:
            Ivan Artemov (zoxexivo@gmail.com)
            Votes:
            2
            Watchers:
            2
