[CSHARP-1717] The Aggregation Query is taking a long time to finish Created: 20/Jul/16  Updated: 16/Jun/20  Resolved: 16/Jun/20

Status: Closed
Project: C# Driver
Component/s: Performance
Affects Version/s: 1.11
Fix Version/s: None

Type: Task Priority: Major - P3
Reporter: Jawad Ahmed Khan Assignee: Unassigned
Resolution: Done Votes: 0
Labels: neweng, question
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

62 GB memory on the host, Linux 7.2, VM Host, 4 cpu
The Database has 2 collection Call Log and ErrorLog.
The CallLog is Indexed on CreateDateTime and is the one that we are having performance issues with.



 Description   

The Following Query is taking almost 5 - 10 minutes to execute when date range of 1 day is specified. The MongoDB has around 19,000 transactions per hour.

Here is the MongoDB Query in C#

           string funcList = ConfigurationManager.AppSettings["FilterTransactionFunctionList"] as string;
               var filterList = funcList.Split(',');
 
               var wclLogCursor = wclMongoDB.GetCollection<CallLog>("CallLog").Find(
                    Query<CallLog>.Where(p => p.CreateDateTime >= start.ToUniversalTime() && p.CreateDateTime <= end.ToUniversalTime())
                    );
 
               var match = new BsonDocument 
                {
                    { "$match",
                       new BsonDocument {{ "CreateDateTime", new BsonDocument {
                                            {"$gte", start.ToUniversalTime() },
                                            {"$lte",  end.ToUniversalTime()  }
                                        }}
                       }
                       
                }};
 
               var group = new BsonDocument
                { 
                    { "$group", 
                        new BsonDocument
                        { 
                            { "_id", new BsonDocument {{"FunctionSessionId", "$FunctionSessionId" }}}, 
                            {"CallLogKey", new BsonDocument {{"$first", "$CallLogKey" }}}, 
                            {"CallSessionId", new BsonDocument {{"$first", "$CallSessionId" }}}, 
                            {"CreateDateTime", new BsonDocument {{"$first", "$CreateDateTime" }}}, 
                            {"LastCreateDateTime", new BsonDocument {{"$last", "$CreateDateTime" }}}, 
                            {"FunctionName", new BsonDocument {{"$first", "$FunctionName" }}}, 
                            {"Duration", new BsonDocument {{"$first", "$Duration" }}}, 
                            {"Direction", new BsonDocument {{"$first", "$Direction" }}}, 
                            {"Target", new BsonDocument {{"$first", "$Target" }}}, 
                            {"UICode", new BsonDocument {{"$first", "$UICode" }}}, 
                            {"UserName", new BsonDocument {{"$first", "$UserName" }}}, 
                            {"IsSucceed", new BsonDocument {{"$first", "$IsSucceed" }}}, 
                            {"FunctionCount", new BsonDocument {{"$sum", 1}}},
                            {"Request", new BsonDocument {{"$first", "$CallParameters" }}}, 
                            {"Response", new BsonDocument {{"$last", "$CallParameters" }}}
                        } 
                    } 
                  };
 
               var sort = new BsonDocument
                {
                    {
                        "$sort",
                        new BsonDocument
                        {
                            {"CreateDateTime", 1} , {"CreateDateTime.getTime()", 1}, 
                        }
                    }
                };
               var filterFunctions = new BsonDocument 
                {
                    { "$match",
                     new BsonDocument {{ "FunctionName", new BsonDocument {
                                            {"$nin", new BsonArray(filterList) }
                                        }}}
                       }
                       
                };
 
               var project = new BsonDocument
                { 
                    { 
                        "$project", 
                        new BsonDocument
                            { 
                                {"_id", 0}, 
                                {"CallLogKey", "$CallLogKey"},
                                {"CallSessionId", "$CallSessionId"},
                                {"CreateDateTime","$CreateDateTime"}, 
                                {"FunctionSessionId","$_id.FunctionSessionId"}, 
                                {"FunctionName","$FunctionName"},
                                {"Duration", new BsonDocument(
                                                "$subtract",
                                                new BsonArray() {"$LastCreateDateTime","$CreateDateTime"}
                                             )}, 
                                {"Direction","$Direction"}, 
                                {"Target","$Target"}, 
                                {"UICode","$UICode"}, 
                                {"UserName","$UserName"}, 
                                {"IsSucceed","$IsSucceed"},
                                {"FunctionCount","$FunctionCount"}, 
                                {"Request","$Request"}, 
                                {"Response","$Response"}, 
                            } 
                    } 
                };
 
               var limit = new BsonDocument("$limit", 1000);
               var pipeline = new AggregateArgs()
               {
                   AllowDiskUse = true,
                   OutputMode = AggregateOutputMode.Cursor,
                   Pipeline = new[] { match, sort, group, filterFunctions, project }
               };
 
               var result = wclLogCursor.Collection.Aggregate(pipeline);



 Comments   
Comment by Rachelle Palmer [ 16/Jun/20 ]

For questions like this, we recommend reaching out to our support engineering team. They are able to provide in depth analysis of issues and fine-tune queries. Here are some resources:

  • our MongoDB support portal, located at support.mongodb.com
  • our MongoDB community portal, located here
  • for Atlas users, there is a built in chat feature in the UI, accessible in the lower right hand corner (it looks like a chat icon). Chat is staffed 24/7 by support personnel.
    Folks at any of those can assist in diagnosing and/or reproducing the issue.
    Rachelle
Comment by Jawad Ahmed Khan [ 25/Jul/16 ]

Hi Craig,
Just wanted to follow-up on progress of the incidence on the Ticket we opened last week. Do you have any feedback on what steps we can take to rectify the performance of the query?

Thank you,

Jawad Ahmed Khan
Sr. Systems Analyst,
DER IS - Utility Billed Solutions
Direct Energy
12 Greenway Plaza, Suite 700
Houston, TX 77046
Phone: (713) 877-3968
www.directenergy.com<http://www.directenergy.com/>
• Please consider the environment - do you really need to print this email?

Comment by Jawad Ahmed Khan [ 21/Jul/16 ]

H Criag,
I ran the query in the noSQL Viewer and it took me more than 40 minutes to get the results . The tool by default limits the result to 100 records but it still took the query more than 40 minutes to execute .... Initially we had issue running the query because of Linix ulimit settings but then were reconfigured as per recommendation. Setting list is included below.

db.CallLog.aggregate([
{ $match: {
"CreateDateTime":

{ "$gte": ISODate("2016-07-19T00:00:00.0Z") }

,
"CreateDateTime":

{ "$lte": ISODate("2016-07-20T00:00:00.0Z") }

,
}},
{ $group: {
_id:

{ FunctionSessionId: "$FunctionSessionId" }

,
CallLogKey: {$first: "$CallLogKey"},
CallSessionId: {$first: "$CallSessionId"},
CreateDateTime: {$first: "$CreateDateTime"},
LastCreateDateTime: {$last: "$CallSessionId"},
FunctionName: {$first: "$FunctionName"},
Duration: {$first: "$Duration"},
Direction: {$first: "$Direction"},
Target: {$first: "$Target"},
UICode: {$first: "$UICode"},
UserName: {$first: "UserName"},
IsSucceed: {$first: "$IsSucceed"},
FunctionCount: {$sum: 1},
Request: {$first: "$CallParameters"},
Response: {$last: "$CallParameters"}
}
},
{ $sort: {CreateDateTime: 1}},
{ $match: {
"FunctionName": {$nin : ["login","login-service"]}
}},
{$project :

{ CallLogKey : '$CallLogKey', CallSessionId :"$CallSessionId", CreateDateTime :"$CreateDateTime", FunctionSessionId :"$FunctionSessionId", FunctionName :"$FunctionName", Duration :"$Duration", Direction :"$Direction", Target :"$Target", UICode :"$UICode", UserName :"$UserName", IsSucceed :"$IsSucceed", FunctionCount :"$FunctionCount", Request :"$Request", Response :"$Response", _id : 0 }

}

],

{ allowDiskUse: true }

)

Ulimit Stettings:

[mongo@rtplnspvwcl02 mongodb]$ ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 256993
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 65536
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 64000
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited

[mongo@rtplnspvwcl03 mongodb]$ ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 256993
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 65536
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 64000
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited

[mongo@rtplnspvwcl02 mongodb]$ ps -ef | grep mongod
mongo 1306 1 85 10:06 ? 00:06:35 /software/mongodb/bin/mongod --config /software/mongodb/mongodb.c

[mongo@rtplnspvwcl03 mongodb]$ ps -ef | grep mongod
mongo 1694 1 8 10:10 ? 00:00:23 /software/mongodb/bin/mongod --config /software/mongodb/mongodb.cfg

Thank you,
Jawad Khan

Comment by Craig Wilson [ 20/Jul/16 ]

Sure, seems right. Keep in mind that in the shell, it's only going to pull back 20 rows at a time, so you'll need to call ToArray() or something like that to exhaust the documents. Try both with and without.

Comment by Jawad Ahmed Khan [ 20/Jul/16 ]

Thank you Craig for the quick response. Let me get back to you on the performance of the query from shell.

I will be using the following query as shell equivalent . Let me know if that is not correct

db.CallLog.aggregate([
{ $match: {
"CreateDateTime":

{ "$gte": ISODate("2016-07-19T00:00:00.0Z") }

,
"CreateDateTime":

{ "$lte": ISODate("2016-07-20T00:00:00.0Z") }

,
}},
{ $group: {
_id:

{ FunctionSessionId: "$FunctionSessionId" }

,
CallLogKey: {$first: "$CallLogKey"},
CallSessionId: {$first: "$CallSessionId"},
CreateDateTime: {$first: "$CreateDateTime"},
LastCreateDateTime: {$last: "$CallSessionId"},
FunctionName: {$first: "$FunctionName"},
Duration: {$first: "$Duration"},
Direction: {$first: "$Direction"},
Target: {$first: "$Target"},
UICode: {$first: "$UICode"},
UserName: {$first: "UserName"},
IsSucceed: {$first: "$IsSucceed"},
FunctionCount: {$sum: 1},
Request: {$first: "$CallParameters"},
Response: {$last: "$CallParameters"}
}
},
{ $sort: {CreateDateTime: 1}},
{ $match: {
"FunctionName": {$nin : ["login","login-service"]}
}},
{$project :

{ CallLogKey : '$CallLogKey', CallSessionId :"$CallSessionId", CreateDateTime :"$CreateDateTime", FunctionSessionId :"$FunctionSessionId", FunctionName :"$FunctionName", Duration :"$Duration", Direction :"$Direction", Target :"$Target", UICode :"$UICode", UserName :"$UserName", IsSucceed :"$IsSucceed", FunctionCount :"$FunctionCount", Request :"$Request", Response :"$Response", _id : 0 }

}

],

{ allowDiskUse: true }

)

Thank you,
Jawad

Comment by Craig Wilson [ 20/Jul/16 ]

Hi Jawad,

Does this go faster in the shell, or...? Have you run an explain on this to see if you are hitting indexes? It looks like you are manually constructing the pipeline, so it's not the driver that would be causing this to go slow except for deserializing the results into CallLog entities.

Craig

Generated at Wed Feb 07 21:40:28 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.