  1. Core Server
  2. SERVER-25865

$group operation is slow since MongoDB 3.2 on Windows

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Fixed
    • Affects Version/s: 3.2.9, 3.3.12
    • Fix Version/s: 3.2.12, 3.3.14
    • Component/s: Aggregation Framework
    • Labels:
      None
    • Backwards Compatibility:
      Fully Compatible
    • Operating System:
      ALL
    • Backport Requested:
      v3.2
    • Sprint:
      Query 2016-09-19

      Description

      The $group operation is much slower on MongoDB 3.2/3.3 than on MongoDB 3.0 on Windows. I don't see the issue on OS X or Linux.

      • MongoDB 3.0 on Windows:
        Run the following commands to create the collection, index and then run the aggregation.

        use test;
        db.collection.drop();
        for (var i = 0; i < 40000; ++i) { db.collection.insert({x: Math.floor(Math.random()*1000000)});}
        db.collection.createIndex({x: 1});
        var start = new Date().getTime(); db.collection.aggregate( [{$group: {_id: "$x", value: {$sum: 1}}}] ); var end = new Date().getTime(); var time = end - start; print(time);
        

        The aggregation is fast on 3.0:

        > db.collection.drop();
        false
        > for (var i = 0; i < 40000; ++i) { db.collection.insert({x: Math.floor(Math.random()*1000000)});}
        WriteResult({ "nInserted" : 1 })
        > db.collection.createIndex({x: 1});
        {
                "createdCollectionAutomatically" : false,
                "numIndexesBefore" : 1,
                "numIndexesAfter" : 2,
                "ok" : 1 
        }
        > var start = new Date().getTime(); db.collection.aggregate( [{$group: {_id: "$x", value: {$sum: 1}}}] ); var end = new Date().getTime(); var time = end - start; print(time);
        44
        

      • MongoDB 3.2 on Windows:
        Running the same commands on a MongoDB 3.2 instance on Windows is much slower (26587 ms versus 44 ms on 3.0):

        > db.collection.drop();
        false
        > for (var i = 0; i < 40000; ++i) { db.collection.insert({x: Math.floor(Math.random()*1000000)});}
        WriteResult({ "nInserted" : 1 })
        > db.collection.createIndex({x: 1});
        {
                "createdCollectionAutomatically" : false,
                "numIndexesBefore" : 1,
                "numIndexesAfter" : 2,
                "ok" : 1 
        }
        > var start = new Date().getTime(); db.collection.aggregate( [{$group: {_id: "$x", value: {$sum: 1}}}] ); var end = new Date().getTime(); var time = end - start; print(time);
        26587
        
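      The timing one-liner used in both runs above can be wrapped in a small helper so that each measurement is taken the same way. This is only a sketch: `timeIt` is a hypothetical name that does not appear in the original report, and the workload below is a stand-in so the snippet runs without a server.

```javascript
// Hypothetical helper (not from the original report): runs fn once and
// returns the elapsed wall-clock time in milliseconds plus fn's result.
function timeIt(fn) {
    var start = new Date().getTime();
    var result = fn();
    var end = new Date().getTime();
    return {millis: end - start, result: result};
}

// Use the shell's print() when it exists, otherwise fall back to console.log.
var out = (typeof print === "function") ? print : console.log;

// Stand-in workload so the sketch is runnable on its own.
var sample = timeIt(function () {
    var total = 0;
    for (var i = 0; i < 1000; ++i) { total += i; }
    return total;
});
out("elapsed ms: " + sample.millis);

// In the mongo shell, assuming the collection built by the commands above,
// the aggregation could be timed while draining the whole cursor:
//   timeIt(function () {
//       return db.collection.aggregate(
//           [{$group: {_id: "$x", value: {$sum: 1}}}]).itcount();
//   }).millis
```

      Draining the cursor with `itcount()` makes the measurement cover every result batch rather than only the first one; whether that changes the numbers reported here has not been checked.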

      The diagnostic data shows "cursor open pinned" while the aggregation command runs on Windows, but I don't see the same on OS X. Is this the cause of the slowness on Windows? The diagnostic data is attached.

      We also tested the same aggregation on MongoDB 3.2 with the MMAPv1 storage engine, and it is also slow, so the issue does not appear to be related to the storage engine.

      Also, if I change the data set from:

      for (var i = 0; i < 40000; ++i) { db.collection.insert({x: Math.floor(Math.random()*1000000)});}
                                                                                          ^^^^^^^
      

      To:

      for (var i = 0; i < 40000; ++i) { db.collection.insert({x: Math.floor(Math.random()*10000)});}
                                                                                          ^^^^^
      

      The aggregation is faster on the second data set (both on MongoDB 3.2 on Windows):

      • First data set:

        > db.collection.drop();
        false
        > for (var i = 0; i < 40000; ++i) { db.collection.insert({x: Math.floor(Math.random()*1000000)});}
        WriteResult({ "nInserted" : 1 })
        > db.collection.createIndex({x: 1});
        {
                "createdCollectionAutomatically" : false,
                "numIndexesBefore" : 1,
                "numIndexesAfter" : 2,
                "ok" : 1 
        }
        > var start = new Date().getTime(); db.collection.aggregate( [{$group: {_id: "$x", value: {$sum: 1}}}] ); var end = new Date().getTime(); var time = end - start; print(time);
        26587
        

      • Second data set:

        > db.collection.drop();
        true
        > for (var i = 0; i < 40000; ++i) { db.collection.insert({x: Math.floor(Math.random()*10000)});}
        WriteResult({ "nInserted" : 1 })
        > db.collection.createIndex({x: 1});
        {
                "createdCollectionAutomatically" : false,
                "numIndexesBefore" : 1,
                "numIndexesAfter" : 2,
                "ok" : 1 
        }
        > var start = new Date().getTime(); db.collection.aggregate( [{$group: {_id: "$x", value: {$sum: 1}}}] ); var end = new Date().getTime(); var time = end - start; print(time);
        3020
        

      It seems that $group is slow when the result set (the number of distinct groups) is large: 26587 ms for the first data set versus 3020 ms for the second. The slowdown is far more pronounced on MongoDB 3.2 on Windows.
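      To make the difference in result-set size concrete, the sketch below rebuilds both data sets in plain JavaScript and counts the groups that `{$group: {_id: "$x", value: {$sum: 1}}}` would produce. The function names are hypothetical and the grouping is an in-memory stand-in, not the server's implementation. With 40000 draws, the expected number of distinct values is roughly 39,200 for a range of 1000000 and roughly 9,800 for a range of 10000.

```javascript
// Expected number of distinct values when drawing n integers uniformly
// from [0, range): range * (1 - (1 - 1/range)^n).
function expectedDistinct(n, range) {
    return range * (1 - Math.pow(1 - 1 / range, n));
}

// In-memory stand-in for {$group: {_id: "$x", value: {$sum: 1}}}:
// counts how many documents share each x value.
function groupCounts(docs) {
    var counts = {};
    for (var i = 0; i < docs.length; ++i) {
        var key = docs[i].x;
        counts[key] = (counts[key] || 0) + 1;
    }
    return counts;
}

// Builds documents the same way as the insert loop in the report.
function makeDocs(n, range) {
    var docs = [];
    for (var i = 0; i < n; ++i) {
        docs.push({x: Math.floor(Math.random() * range)});
    }
    return docs;
}

var n = 40000;
var firstGroups = Object.keys(groupCounts(makeDocs(n, 1000000))).length;
var secondGroups = Object.keys(groupCounts(makeDocs(n, 10000))).length;

var out = (typeof print === "function") ? print : console.log;
out("first data set groups:  " + firstGroups);
out("second data set groups: " + secondGroups);
```

      The first data set yields about four times as many groups as the second. This only shows that the two result sets differ in size; it does not by itself explain why the slowdown appears on Windows in 3.2.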

        Attachments

        1. aggregation_3.2_windows.png
          29 kB
        2. diagnostic.data.tar
          25 kB
        3. group.png
          271 kB
