[SERVER-43981] Simple database query is taking over 10 mins even with a suitable index Created: 13/Oct/19  Updated: 05/Nov/19  Resolved: 05/Nov/19

Status: Closed
Project: Core Server
Component/s: Index Maintenance
Affects Version/s: None
Fix Version/s: None

Type: Question Priority: Major - P3
Reporter: Ian Hannah Assignee: Danny Hatcher (Inactive)
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: Text File countoutput.txt     PNG File error.png     Text File find_with_itemId_ChId.txt     Text File find_with_itemId_ChId.txt     PNG File screenshot.png    
Participants:

 Description   

We have a collection (called childData) with the following indexes:

m_childDataCollection.CreateIndex(IndexKeys.Ascending(MongoChildData.ItemIdProperty).Ascending(MongoChildData.ChildIdProperty));
m_childDataCollection.CreateIndex(IndexKeys.Ascending(MongoChildData.RedundantTimeProperty));

This query runs very fast (less than a second) and returns 59631 :

db.childdata.find(\{ ItemId: ObjectId('596f322ba28f4b1240f1c2e9') }).count()

This query takes around 15 minutes:

db.childdata.find(\{ ItemId: ObjectId('596f322ba28f4b1240f1c2e9') }, \{ItemId : 1, ChId : 1}).toArray()

We are failing to understand why this second query takes so long even though it is using the index?

Please note that I have also rebuilt these indexes completely. It took over 3 days to do this!

Thanks
Ian



 Comments   
Comment by Danny Hatcher (Inactive) [ 05/Nov/19 ]

I'm glad you were able to figure it out!

Comment by Ian Hannah [ 31/Oct/19 ]

Hi Daniel,

I spoke to a Mongo developer on Slack and it turns out that the index is not used unless the sort order is specified - this is critical when the find does not contain a query.

This is my change:

MongoCursor<MongoChildData> childCursor = childDataCollection.FindAll().SetSortOrder(SortBy.Ascending(MongoChildData.ItemIdProperty, MongoChildData.ChildIdProperty));
 childCursor.SetFields(Fields.Exclude(MongoItem.IdProperty).Include(MongoChildData.ItemIdProperty, MongoChildData.ChildIdProperty));
 childCursor.SetFlags(QueryFlags.NoCursorTimeout);
 childCursor.SetBatchSize(1000);

and is very fast! You can close this issue now. Thanks for your help.

Comment by Danny Hatcher (Inactive) [ 28/Oct/19 ]

I'm unable to open the file you provided. It's possible there is something wrong with our JIRA system. Can you please reupload the explain for both the fast and slow queries to our Secure Uploader?

Comment by Ian Hannah [ 26/Oct/19 ]

Hi Daniel. Have you had a chance to look at this yet?

Just so that you can appreciate how slow this is we are using the C# driver to iterate over 2 million records like this:

MongoCursor<MongoChildData> childCursor = childDataCollection.FindAll();              childCursor.SetFields(Fields.Include(MongoChildData.ItemIdProperty, MongoChildData.ChildIdProperty));  childCursor.SetFlags(QueryFlags.NoCursorTimeout);            childCursor.SetBatchSize(1000);
            
foreach (MongoChildData childData in childCursor)            {     }   

This code is taking days to run. There is an index on itemId and childId so how can this possibly take days to run?

Comment by Ian Hannah [ 22/Oct/19 ]

find_with_itemId_ChId.txt

Attached out from the query returning itemId and ChId. It took some time to run!

Comment by Danny Hatcher (Inactive) [ 21/Oct/19 ]

Most likely it is taking some time to run the query. If it still does not return after ~15 minutes, please provide the mongod logs.

Comment by Ian Hannah [ 21/Oct/19 ]

Hi Daniel.

I cannot get this to work:

mongo -port 27057 -host 192.168.45.107 --authenticationDatabase "admin" -u "menadmin" -p "<password>" --eval "db.getSiblingDB('map').childdata.explain(true).find({ ItemId: ObjectId('596f322ba28f4b1240f1c2e9') }, {ItemId : 1, ChId : 1})"

It seems to open the server and then lock up? See the attached screenshot.

Comment by Danny Hatcher (Inactive) [ 21/Oct/19 ]

The count commands you are running are going against an empty namespace "test.childdata". I assume that the collection name is correct but the database name is wrong. You can specify the specific database to query against in the command using .getSiblingDB():

mongo -port 27057 -host 192.168.45.107 --authenticationDatabase "admin" -u "menadmin" -p "<password>" --eval "db.getSiblingDB('<database>').childdata.explain(true).find({ ItemId: ObjectId('596f322ba28f4b1240f1c2e9') }, {ItemId : 1, ChId : 1})"

Because the way the terminal interprets quotes, the quote type used for the database and the objectID should be the same.

Comment by Ian Hannah [ 21/Oct/19 ]

Hi Daniel. The output with the command without the toArray is attached.

It shows that the estimated query time is 0. This just cannot be. Running the command takes a good few seconds to return the first page.

In the driver I am trying to get all the matching rows into a list and this is taking 13-15 minutes. I assume that toArray will be just as slow. Why is the query so slow?

 

 

Comment by Danny Hatcher (Inactive) [ 17/Oct/19 ]

Can you run the command with all the same options but without the toArray call? If the difference between the explain output provided previously and the new one is small, then the slowness will lie within the toArray.

Comment by Ian Hannah [ 17/Oct/19 ]

Hi Daniel. Can you please confirm how I should use the explain method with the toArray call? Once I know this I will send you the output.

Thanks

Ian

Comment by Ian Hannah [ 15/Oct/19 ]

Output from the count() command attached.

I cannot work out where to put the explain part in the toArray call:

mongo -port 27057 -host 192.168.45.107 --authenticationDatabase "admin" -u "menadmin" -p "<password>" --eval "db.childdata.find({ ItemId: ObjectId('596f322ba28f4b1240f1c2e9') }, {ItemId : 1, ChId : 1}).toArray().explain(true)"

Wherever I put it I get an error

Comment by Danny Hatcher (Inactive) [ 15/Oct/19 ]

Both of the wrappers for the entire eval command the ObjectId use single quotes so it's matching them inappropriately. If you change either the eval quotes or the ObjectId quotes to be double quotes instead it should work.

Comment by Ian Hannah [ 15/Oct/19 ]

I am getting an error trying to run --eval (see attached picture). Any ideas?

Comment by Danny Hatcher (Inactive) [ 15/Oct/19 ]

What is the best way to get the output from the mongo shell into a file so that I can send you the information?

You can run shell commands directly from the terminal by using the --eval option.

// within the shell
db.foo.findOne()
// same command from terminal
mongo --eval 'db.foo.findOne()'

As it returns to stdout, you can easily redirect the results to a file with ">".

Running this command:
db.childdata.find({ ItemId: ObjectId('596f322ba28f4b1240f1c2e9') }).count().explain(true)
does not work

Unfortunately due to a quirk in the mongo shell explain only works at the end of some commands and "count" is not one of them. If you put the explain immediately after the collection, it will run correctly. For example:

db.childdata.explain(true).find({ ItemId: ObjectId('596f322ba28f4b1240f1c2e9') }).count()

so I tried this:

db.childdata.find({ ItemId: ObjectId('596f322ba28f4b1240f1c2e9') }).explain(true)
and this has not returned yet!

The only possible issue I see is there is an escape character between "find" and "ItemId" but they may be an artifact of JIRA and not the actual command you ran. Can you please confirm?

Given the data size you mention, it should return quickly. Can you please try some other simple queries against your data and see if any of them return explains?

Comment by Ian Hannah [ 15/Oct/19 ]

Running this command:

db.childdata.find(\{ ItemId: ObjectId('596f322ba28f4b1240f1c2e9') }).count().explain(true)

does not work so I tried this:

db.childdata.find(\{ ItemId: ObjectId('596f322ba28f4b1240f1c2e9') }).explain(true)

and this has not returned yet! Am I doing something wrong?

Comment by Ian Hannah [ 15/Oct/19 ]

Hi Daniel. What is the best way to get the output from the mongo shell into a file so that I can send you the information?

Comment by Danny Hatcher (Inactive) [ 14/Oct/19 ]

ihannah@meniscus.co.uk, would you be able to connect to the instance via the mongo shell, run both queries with explain(true) and provide the output?

Comment by Ian Hannah [ 13/Oct/19 ]

Just to explain we are using the C# driver to get all the child ids for the specified item id. There are only 60000 so I would expect this to be fast.

Generated at Thu Feb 08 05:04:40 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.