[SERVER-31148] The executionTimeMillis in explain is a lot different from the actual execution time Created: 19/Sep/17  Updated: 07/Jan/18  Resolved: 07/Nov/17

Status: Closed
Project: Core Server
Component/s: Querying
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: MaJinliang [X] Assignee: Mark Agarunov
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Linux


Participants:

 Description   

db.Trade.find({
    'startDate': {$lte: ISODate("2017-09-08T00:00:00.000Z")},
    $or: [
         {
             'endDate': {$gte: ISODate("2017-09-08T00:00:00.000Z")}
         },
         {
             'endDate': {$exists: false}
         }
     ]
}).sort({'date': -1}).limit(100).explain('executionStats')

There are indexes in startDate_1_endData_1 and data_1
As the executionStats description, the query will use the data_1 index.
In executionStats, it run 3s, but actually, it run 80s



 Comments   
Comment by MaJinliang [X] [ 29/Nov/17 ]

Thanks Agarunov.

Comment by Mark Agarunov [ 07/Nov/17 ]

Hello MaJinliang,

The behavior you're seeing with different indexes used on the primary and secondary is due to the query planner cache using the incorrect plan. As mentioned, you can use an index filter to ensure the proper index is used, or alternatively use hint() to specify the index, which would allow you to use the same index on the primary that is being used on the secondary.

Unfortunately without the complete logs, we're unable to further diagnose this issue so I have closed the ticket. If additional information comes to light, we can reopen the ticket and continue investigating.

Thanks,
Mark

Comment by MaJinliang [X] [ 27/Oct/17 ]

Another question. The query above will use different indexes in primary node and secondary node. In primary node, it will use startDate_1_endData_1, but in secondary node, it will use data_1. Could you help to find the reason.
Thanks,
MaJinliang

Comment by Mark Agarunov [ 26/Oct/17 ]

Hello MaJinliang,

Thank you for providing the additional information. Looking over this, I believe this is is due to query plan caching. To mitigate this issue, you could use an index filter to ensure the proper index is used, which should prevent the query planner from using the slower cached plan. Alternatively, you could use hint() instead which should also prevent the slower plan from being used.

Thanks,
Mark

Comment by MaJinliang [X] [ 13/Oct/17 ]

I uploaded it. Please see again.

Comment by MaJinliang [X] [ 13/Oct/17 ]

I will upload picture when i go home.

Comment by Mark Agarunov [ 12/Oct/17 ]

Hello MaJinliang,

Unfortunately I am not seeing any images uploaded. As you cannot provide the complete logs, would it be possible to provide only the exact line from the logs where this slow query is logged? This should provide some useful information for diagnosing this issue.

Thanks,
Mark

Comment by MaJinliang [X] [ 11/Oct/17 ]

Because my company don't allow upload file to outside net. I taked picture of the log and uploaded it. If you could not see it, please tell me, i will upload again.

Comment by Mark Agarunov [ 09/Oct/17 ]

Hello MaJinliang

Thank you for providing this information. Unfortunately more information from the logs is needed to continue investigating this. Could you please upload the complete log files to the secure upload portal?

Thanks,
Mark

Comment by MaJinliang [X] [ 09/Oct/17 ]

I don't have permission to run db.getCollection('Trade').getPlanCache().clear(). But when i have permission to run db.getCollection('Trade').getPlanCache(), no results return.

Sorry for reply you so late

Comment by MaJinliang [X] [ 09/Oct/17 ]

i see the log, and i found
db.Trade.find({
'startDate': {$lte: ISODate("2017-09-08T00:00:00.000Z")},
$or: [
{
'endDate': {$gte: ISODate("2017-09-08T00:00:00.000Z")}
},
{
'endDate': {$exists: false}
}
]
}).sort(

{'tradeDate': -1}

).limit(100)
will use index startDate_1_endData_1

db.Trade.find({
'startDate': {$lte: ISODate("2017-09-08T00:00:00.000Z")},
$or: [
{
'endDate': {$gte: ISODate("2017-09-08T00:00:00.000Z")}
},
{
'endDate': {$exists: false}
}
]

}).sort(

{'tradeDate': -1}

).limit(100).explain('executionStats')
will use index data_1

Comment by Mark Agarunov [ 22/Sep/17 ]

Hello MaJinliang,

Thank you for the information. The db.getCollection('Trade').getPlanCache().clear() command shouldn't return much output, however there should be a difference in the explain() output from running your query before and after the db.getCollection('Trade').getPlanCache().clear() command and I would like to compare the two outputs. We can proceed diagnosing this issue once you've provided the logs.

Thanks,
Mark

Comment by MaJinliang [X] [ 22/Sep/17 ]

1. I execute db.getCollection('Trade').getPlanCache() but no result return
2. Our db is not a sharded cluster
3. I found that if i run the query on primary, it will cost 80 seconds.If i run the query on secondary, it will cost only 3 seconds

I will provide you log next week, because i am very busy recently
Thanks

Comment by Mark Agarunov [ 20/Sep/17 ]

Hello MaJinliang,

Looking into this futher, I suspect the behavior may be caused by the query planner cache. In addition to the previously requested information, could you also provide the following:

  1. Run the query with explain, verify that the query is running slowly and provide the output,
  2. Execute db.getCollection('Trade').getPlanCache().clear()
  3. Run the query with explain again, verify if the executionTimeMillis matches the actual time it takes for the query to run and provide the output.

Thanks,
Mark

Comment by Mark Agarunov [ 20/Sep/17 ]

Hello MaJinliang,

Thank you for providing this information. To further investigate this, I'd like to request some additional information.

  • Is this a sharded cluster?
  • If so, how is sharding set up?
  • Please provide the complete logs from all mongod instances when this issue is present,
  • Please provide an archive (tar or zip) of the $dbpath/diagnostic.data directory from all mongod instances.

I've created a secure upload portal so that you can send us this data privately.

Thanks,
Mark

Comment by MaJinliang [X] [ 20/Sep/17 ]

I deleted the rejectedPlans and some confidential information. The explain is below.

{
    "queryPlanner" : {
        "plannerVersion" : 1,
        "indexFilterSet" : false,
        "parsedQuery" : {
            "$and" : [ 
                {
                    "$or" : [ 
                        {
                            "endDate" : {
                                "$gte" : ISODate("2017-09-08T00:00:00.000Z")
                            }
                        }, 
                        {
                            "$not" : {
                                "endDate" : {
                                    "$exists" : true
                                }
                            }
                        }
                    ]
                }, 
                {
                    "startDate" : {
                        "$lte" : ISODate("2017-09-08T00:00:00.000Z")
                    }
                }
            ]
        },
        "winningPlan" : {
            "stage" : "LIMIT",
            "limitAmount" : 100,
            "inputStage" : {
                "stage" : "FETCH",
                "filter" : {
                    "$and" : [ 
                        {
                            "$or" : [ 
                                {
                                    "endDate" : {
                                        "$gte" : ISODate("2017-09-08T00:00:00.000Z")
                                    }
                                }, 
                                {
                                    "$not" : {
                                        "endDate" : {
                                            "$exists" : true
                                        }
                                    }
                                }
                            ]
                        }, 
                        {
                            "startDate" : {
                                "$lte" : ISODate("2017-09-08T00:00:00.000Z")
                            }
                        }
                    ]
                },
                "inputStage" : {
                    "stage" : "IXSCAN",
                    "keyPattern" : {
                        "date" : 1.0
                    },
                    "indexName" : "date_1",
                    "isMultiKey" : false,
                    "isUnique" : false,
                    "isSparse" : false,
                    "isPartial" : false,
                    "indexVersion" : 1,
                    "direction" : "backward",
                    "indexBounds" : {
                        "date" : [ 
                            "[MaxKey, MinKey]"
                        ]
                    }
                }
            }
        }
    },
    "executionStats" : {
        "executionSuccess" : true,
        "nReturned" : 100,
        "executionTimeMillis" : 4644,
        "totalKeysExamined" : 164752,
        "totalDocsExamined" : 164752,
        "executionStages" : {
            "stage" : "LIMIT",
            "nReturned" : 100,
            "executionTimeMillisEstimate" : 2160,
            "works" : 164753,
            "advanced" : 100,
            "needTime" : 164652,
            "needYield" : 0,
            "saveState" : 5149,
            "restoreState" : 5149,
            "isEOF" : 1,
            "invalidates" : 0,
            "limitAmount" : 100,
            "inputStage" : {
                "stage" : "FETCH",
                "filter" : {
                    "$and" : [ 
                        {
                            "$or" : [ 
                                {
                                    "endDate" : {
                                        "$gte" : ISODate("2017-09-08T00:00:00.000Z")
                                    }
                                }, 
                                {
                                    "$not" : {
                                        "endDate" : {
                                            "$exists" : true
                                        }
                                    }
                                }
                            ]
                        }, 
                        {
                            "startDate" : {
                                "$lte" : ISODate("2017-09-08T00:00:00.000Z")
                            }
                        }
                    ]
                },
                "nReturned" : 100,
                "executionTimeMillisEstimate" : 2150,
                "works" : 164752,
                "advanced" : 100,
                "needTime" : 164652,
                "needYield" : 0,
                "saveState" : 5149,
                "restoreState" : 5149,
                "isEOF" : 0,
                "invalidates" : 0,
                "docsExamined" : 164752,
                "alreadyHasObj" : 0,
                "inputStage" : {
                    "stage" : "IXSCAN",
                    "nReturned" : 164752,
                    "executionTimeMillisEstimate" : 140,
                    "works" : 164752,
                    "advanced" : 164752,
                    "needTime" : 0,
                    "needYield" : 0,
                    "saveState" : 5149,
                    "restoreState" : 5149,
                    "isEOF" : 0,
                    "invalidates" : 0,
                    "keyPattern" : {
                        "date" : 1.0
                    },
                    "indexName" : "date_1",
                    "isMultiKey" : false,
                    "isUnique" : false,
                    "isSparse" : false,
                    "isPartial" : false,
                    "indexVersion" : 1,
                    "direction" : "backward",
                    "indexBounds" : {
                        "date" : [ 
                            "[MaxKey, MinKey]"
                        ]
                    },
                    "keysExamined" : 164752,
                    "dupsTested" : 0,
                    "dupsDropped" : 0,
                    "seenInvalidated" : 0
                }
            }
        }
    },
    "ok" : 1.0
}

Comment by MaJinliang [X] [ 20/Sep/17 ]

I found that if i run the query on primary, it will cost 80 seconds.If i run the query on secondary, it will cost only 3 seconds

Comment by Mark Agarunov [ 19/Sep/17 ]

Hello MaJinliang,

Thank you for the report. To get a better idea of what may be causing this behavior, could you please provide the output of the explain() from your example query?

Thanks,
Mark

Generated at Thu Feb 08 04:26:08 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.