CBR: Queries with $limit may pick inferior plan with residual predicate, correlation

    • Type: Improvement
    • Resolution: Unresolved
    • Priority: Minor - P4
    • None
    • Affects Version/s: None
    • Component/s: None
    • None
    • Query Optimization
    • None
    • 3
    • TBD
    • None
    • None
    • None
    • None
    • None
    • None
    • None

      In a heavily correlated dataset, a plan with a higher cost may end up finding $limit documents that match the residual predicate very quickly and terminate early, while a plan with a lower cost may need to go through all the inputs in order to find those same $limit documents.

      Note that in this case, $sort is not present, so this is not an issue of picking an index to satisfy the sort vs. picking an index to satisfy the $match.

      To reproduce:

      1. Run:

      buildscripts/resmoke.py run --installDir bazel-bin/install-dist-test/bin --suites=query_golden_classic '--mongodSetParameters={internalQueryFrameworkControl: forceClassicEngine}' jstests/query_golden/plan_stability.js --pauseAfterPopulate 

      and wait until the following is shown:

      [js_test:plan_stability] [jsTest] ----
      [js_test:plan_stability] [jsTest] TestData.pauseAfterPopulate is set. Pausing indefinitely ...
      [js_test:plan_stability] [jsTest] ----
       

      2. Connect with a mongo client to mongodb://127.0.0.1:20000 and run:

      pipeline = [{ '$match': { '$and': [{ 'a_compound': { '$elemMatch': { '$gte': 8 } } }, { 'a_noidx': { '$in': [12, 8] } }, { 'a_compound': { '$lte': 6 }  }] } }, { '$limit': 10 }];
      db.adminCommand({setParameter: 1, planRankerMode: "multiPlanning"});
      db.plan_stability.aggregate(pipeline).explain('executionStats').executionStats.executionTimeMillis;
      db.adminCommand({setParameter: 1, planRankerMode: "samplingCE"});
      db.plan_stability.aggregate(pipeline).explain('executionStats').executionStats.executionTimeMillis;
      

      You will get the following timings:

      Enterprise test> db.adminCommand({setParameter: 1, planRankerMode: "multiPlanning"});
      { was: 'samplingCE', ok: 1 }
      Enterprise test> db.plan_stability.aggregate(pipeline).explain('executionStats').executionStats.executionTimeMillis;
      0
      Enterprise test> db.adminCommand({setParameter: 1, planRankerMode: "samplingCE"});
      { was: 'multiPlanning', ok: 1 }
      Enterprise test> db.plan_stability.aggregate(pipeline).explain('executionStats').executionStats.executionTimeMillis;
      53
      

              Assignee:
              Unassigned
              Reporter:
              Philip Stoev
              Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

                Created:
                Updated: