ISSUE DESCRIPTION AND IMPACT
Replanning may not occur when the cached plan is slow and has a high works value, even though a more efficient alternative plan is available.
DIAGNOSIS AND AFFECTED VERSIONS
To identify this issue, look in the mongod logs for slow queries that use an IXSCAN and report an excessively high keysExamined value. Then run the exact same query (with the same predicate values) through explain(true) and compare the two numbers. On an affected system, the explain(true) output will show a much smaller keysExamined value, or will use a different index entirely.
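The comparison described above can be sketched in Python. This is a minimal illustration, not an official diagnostic: the helper names and the threshold of 10 are hypothetical, and the sketch assumes the explain output exposes executionStats.totalKeysExamined, as it does for a query on an unsharded collection.

```python
def keys_examined_ratio(logged_keys_examined: int, explain_output: dict) -> float:
    """Return how many times more keys the logged query examined
    than the same query did under explain."""
    explained = explain_output["executionStats"]["totalKeysExamined"]
    # Guard against division by zero when explain examined no keys.
    return logged_keys_examined / max(explained, 1)


def looks_suspicious(logged_keys_examined: int, explain_output: dict,
                     threshold: float = 10.0) -> bool:
    """Flag the query when the logged count exceeds the explain count
    by more than `threshold` times (hypothetical cutoff, tune as needed)."""
    return keys_examined_ratio(logged_keys_examined, explain_output) > threshold
```

Here logged_keys_examined is the keysExamined value copied from the slow query log entry, and explain_output is the document returned by explain(true) for the same query. A large ratio suggests that the cached plan is worse than the plan the planner would choose now.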
This issue affects MongoDB versions 3.2, 3.4, 3.6 and 4.0.
REMEDIATIONS AND WORKAROUNDS
Users affected by suboptimal query plan selection may:
The fix for this issue introduces the notion of "inactive" and "active" cache entries. Cache entries are first created in an "inactive" state, and are not used by the planner. They are only used to keep track of the expected number of works that the query will take.
When a plan is run, it is evaluated against the existing inactive entry:
- If the plan's works value is less than or equal to the inactive cache entry's works value, the new plan is put in an "active" entry, and the works value is updated.
- If the new plan's works value is greater than the inactive entry's value, the new plan is not cached. Instead, the works value of the inactive entry is multiplied by two (by default). The multiplier can be adjusted via the internalQueryCacheWorksGrowthCoefficient server parameter.
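The two rules above can be illustrated with a small Python simulation. This is a simplified sketch of the described behavior, not the server's implementation: the class and method names are hypothetical, query shapes and plans are represented as plain strings, and the handling of entries that are already active is not modeled because it is not described here.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class CacheEntry:
    plan: str
    works: float
    active: bool = False


class PlanCacheSketch:
    """Toy model of inactive/active plan cache entries."""

    def __init__(self, growth_coefficient: float = 2.0):
        # Default of 2.0 mirrors the default multiplier described above.
        self.growth_coefficient = growth_coefficient
        self.entries: Dict[str, CacheEntry] = {}

    def record(self, shape: str, plan: str, works: float) -> str:
        entry = self.entries.get(shape)
        if entry is None:
            # First sighting: create an inactive entry that only tracks works.
            self.entries[shape] = CacheEntry(plan, works, active=False)
            return "created-inactive"
        if not entry.active:
            if works <= entry.works:
                # The new plan is at least as cheap: promote it to active.
                self.entries[shape] = CacheEntry(plan, works, active=True)
                return "activated"
            # The new plan is more expensive: do not cache it, raise the bar.
            entry.works *= self.growth_coefficient
            return "grew-inactive"
        return "unchanged"  # Active-entry handling is out of scope here.

    def cached_plan(self, shape: str) -> Optional[str]:
        """Only active entries are visible to the planner."""
        entry = self.entries.get(shape)
        return entry.plan if entry is not None and entry.active else None
```

For example, recording a plan with 100 works creates an inactive entry. Recording a plan with 150 works for the same shape is rejected and doubles the stored value to 200. Recording 150 works again then succeeds, because 150 is no greater than 200, and the entry becomes active with a works value of 150.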
This issue is fixed in MongoDB 4.1.1, and will be available in the MongoDB 4.2 production release.