[SERVER-31714] db.collection.explain() is taking a very long time Created: 25/Oct/17  Updated: 27/Oct/23  Resolved: 02/Nov/17

Status: Closed
Project: Core Server
Component/s: Performance
Affects Version/s: 3.2.11
Fix Version/s: None

Type: Question Priority: Major - P3
Reporter: Guilherme Amaro Assignee: Dmitry Agranat
Resolution: Works as Designed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: Text File mongo_query_explain.txt     Text File mongo_query_explain_true.txt    
Backwards Compatibility: Fully Compatible
Participants:

 Description   

Some find operations on a mongodb sharded cluster we have are very slow, and when running a db.collection.explain() to figure out how which is the execution plan is also taking a very long time to execute (more than 30 minutes).

The command we are issuing to obtain the explain plan is the following:

db.ocs.explain().find({ "site": 1, $and: [ { "TID": { $gte: 195411511, $lte: 198134790 } }, { $and: [ { "adrTyp": "topup" }, { $and: [ { "strDate": { $gte: ISODate("2017-10-24T15:00:00.000+0000"), $lte: ISODate("2017-10-24T16:00:00.000+0000") } }, { $and: [ { "ocs.res.ocs": { $gte: 12, $lte: 14 } }, { $and: [ { "chn": "MULA" } ] } ] } ] } ] } ] })

Is this a normal behavior? how can we understand why the explain command is taking so long?



 Comments   
Comment by Dmitry Agranat [ 02/Nov/17 ]

Hi gamaro,

Thank you for attaching the explain plan with the true option.

However i still want to understand why it takes so long to get the execution plan without this new index. Is it supposed to be like this? shouldn't the explain plan be quick to obtain in all the situations?

  • It is expected for the explain plan method using the executionStats or allPlansExecution modes to take as long as the longest running candidate plan.
  • However, for the queryPlanner explain mode, the explain only executes the query partially in order to obtain the winning plan. At that point, execution ceases and the explain output is returned to the client. The reason that the query is taking a long time to perform plan selection is described by some of the related discussion on SERVER-24312. As documented in https://docs.mongodb.com/master/core/query-plans/, there is a trial execution period during which each plan is partially executed in order to select a winner. The trial ends when either:
  1. a plan executes the entire query,
  2. a plan generates an entire batch of results, or
  3. some large maximum amount of work has been done.

Since none of the candidate plans are producing any results, you were probably experiencing case #3. This is typically symptomatic of a bigger problem such as a missing index. However, it is certainly frustrating to users that this makes it hard to generate the debug information needed to diagnose the performance problem in the first place. There are a few open improvement requests which might help. One is SERVER-24396. This could help minimize the cost of the plan ranking trial period. Another is SERVER-17009, which would allow users to obtain some debug information without doing any query execution whatsoever. This suggested listPlans explain mode would always return quickly (typically in less than one second).

I suggest watching these tickets (SERVER-24396 and SERVER-17009) for future updates and if you are interested, I would encourage you to vote or comment on the issue as a way of indicating to the Development teams your interest.

Thank you,
Dima

Comment by Guilherme Amaro [ 30/Oct/17 ]

Hello Dmitry,

Sorry, it was taking so long to get the explain(true) that i sent you a regular explain, but i forgot to mention that on my reply.

Now i am attaching the output from explain(true), but this is from another database which has a similar issue.
mongo_query_explain_true.txt

This is because we created a new index on the the other database (on the fields adrTyp, key and TID) and after the creation of this new index the explain plan returns almost immediately and the query takes only a few seconds to execute.

However i still want to understand why it takes so long to get the execution plan without this new index. Is it supposed to be like this? shouldn't the explain plan be quick to obtain in all the situations?

Regards

Comment by Dmitry Agranat [ 30/Oct/17 ]

Hi gamaro,

Thank you for attaching the explain plan output. The output indicates that .explain(true) option was not used. Please execute the following and attach the output that includes both the .explain(true) command and the output:

db.ocs.explain(true).find({ "site": 1, $and: [ { "TID": { $gte: 195411511, $lte: 198134790 } }, { $and: [ { "adrTyp": "topup" } , { $and: [ { "strDate": { $gte: ISODate("2017-10-24T15:00:00.000+0000"), $lte: ISODate("2017-10-24T16:00:00.000+0000") } }, { $and: [ { "ocs.res.ocs": { $gte: 12, $lte: 14 } }, { $and: [ { "chn": "MULA" } ] } ] } ] } ] } ] })

Thank you,
Dima

Comment by Guilherme Amaro [ 27/Oct/17 ]

Hello, i have just attached a file with the explain plan output.

mongo_query_explain.txt

Regards,

Guilherme

Comment by Kelsey Schubert [ 26/Oct/17 ]

Hi gamaro,

Thank you for reporting this behavior. My understanding is the the explain command eventually returns, so we can continue to investigate would you please upload the complete output of .explain(true) after it completes?

Thank you,
Kelsey

Generated at Thu Feb 08 04:27:59 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.